Privacy-Preserving Machine Learning Research Scientist

David Turtora Zagardo

I build differentially private synthetic data systems, privacy attacks and evaluations, and deployable ML infrastructure for organizations that need useful models without exposing sensitive records.


ICML 2026: Sole-author, self-funded paper on geometry-aware diffusion for tabular synthetic data.

5+ years: Building privacy-preserving ML and privacy engineering systems.

$170K+: Privacy engineering consulting revenue in 2025.

7 cases: Expert witness analysis of tracking, pixels, cookies, and fingerprinting.

Geometry-aware synthesis: Pairwise structure for tabular diffusion.

Private data generation: Accounting, clipping, noise, and release controls.

Attack evaluation: Membership and attribute inference evidence.

Figure: DP-EGGROLL centered fitness vector visualization.

Fitness-vector privacy: Centered population signals with RDP accounting.

Selected work at the research-to-production boundary.

These projects show how privacy guarantees, model utility, attack evaluation, and deployment constraints connect in real systems.

Synthetic Data Research

Geometry-Aware Tabular Diffusion

Sole-author, self-funded ICML 2026 paper introducing pairwise geometric features for tabular diffusion, improving fidelity and utility with fewer parameters than transformer baselines.

Diffusion, Tabular synthesis, ICML 2026

Figure: Diagram of a private synthetic data pipeline.

Production DP

Private Synthetic Data Pipelines

Built tabular, time-series, and text synthesis systems with DP-SGD, RDP accounting, group privacy, private preprocessing, and on-prem deployment controls for sensitive enterprise workflows.

DP-SGD, RDP accounting, On-prem, Docker
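As a minimal illustration of the clipping and noise step at the core of DP-SGD, the sketch below clips each per-example gradient to a fixed L2 norm, sums the clipped gradients, adds Gaussian noise scaled to the clipping norm, and averages. It is a generic textbook-style sketch, not the production pipeline described above; the function names and parameters (`clip_by_l2_norm`, `dp_sgd_noisy_gradient`, `max_norm`, `noise_multiplier`) are hypothetical, and privacy accounting is out of scope here.

```python
import math
import random


def clip_by_l2_norm(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def dp_sgd_noisy_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    Assumption: gradients are plain lists of floats of equal length.
    The noise standard deviation is noise_multiplier * max_norm.
    """
    dim = len(per_example_grads[0])
    clipped = [clip_by_l2_norm(g, max_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [n / len(per_example_grads) for n in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]
    print(dp_sgd_noisy_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

In practice a library such as Opacus handles per-example gradients and accounting; the sketch only shows why a single record's influence on the update is bounded before noise is added.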

Privacy Risk Evaluation

Membership and Attribute Inference Attacks

Built attack tooling for language and image models, plus FAISS-based attribute inference dashboards for synthetic data risk, translating privacy failures into measurable evidence.

MIA, FAISS, Risk dashboards
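To make the attribute inference idea concrete, here is a small sketch of a nearest-neighbor attack against synthetic data: an attacker who knows a target's non-sensitive features finds the closest synthetic row and guesses that row's sensitive value. This is an assumption-laden illustration, not the tooling described above. It swaps in a brute-force search where a real system would use a FAISS index, and all names (`nearest_neighbor_attribute_guess`, `attack_accuracy`, `sensitive_index`) are hypothetical.

```python
def nearest_neighbor_attribute_guess(known_features, synthetic_rows, sensitive_index):
    """Guess a sensitive value from the closest synthetic row.

    known_features: the target's features with the sensitive column removed.
    synthetic_rows: full synthetic rows, including the sensitive column.
    """
    best_row, best_dist = None, float("inf")
    for row in synthetic_rows:
        feats = [v for i, v in enumerate(row) if i != sensitive_index]
        dist = sum((a - b) ** 2 for a, b in zip(known_features, feats))
        if dist < best_dist:
            best_row, best_dist = row, dist
    return best_row[sensitive_index]


def attack_accuracy(target_rows, synthetic_rows, sensitive_index):
    """Fraction of targets whose sensitive value the attack guesses correctly."""
    hits = 0
    for row in target_rows:
        known = [v for i, v in enumerate(row) if i != sensitive_index]
        guess = nearest_neighbor_attribute_guess(known, synthetic_rows, sensitive_index)
        if guess == row[sensitive_index]:
            hits += 1
    return hits / len(target_rows)


if __name__ == "__main__":
    synthetic = [[0.0, 0.0, 1], [10.0, 10.0, 0]]
    targets = [[0.1, 0.1, 1], [9.0, 9.5, 0]]
    print(attack_accuracy(targets, synthetic, sensitive_index=2))
```

A meaningful risk measurement would compare this accuracy against a baseline, for example the same attack run on records that were never in the training data.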

Algorithmic Privacy Research

DP-EGGROLL

Differentially private evolution strategies via privatization of centered fitness vectors. DP-EGGROLL combines clipping, Poisson subsampling, and RDP accounting, and shows practical non-inferiority to fast DP-AdamW on 22 of 25 classification AUROC endpoints.

Fitness vectors, RDP accounting, DP optimization
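The sketch below shows one plausible reading of the ingredients named above: Poisson subsampling of examples, centering each example's fitness vector across the population, clipping it in L2 norm, and adding Gaussian noise to the sum. It is explicitly not the DP-EGGROLL implementation, whose details are not given on this page. The data layout (one fitness value per example per population member) and every function name are assumptions made for illustration, and RDP accounting is omitted.

```python
import math
import random


def poisson_subsample(n, q, rng):
    """Include each of n example indices independently with probability q."""
    return [i for i in range(n) if rng.random() < q]


def center(vec):
    """Subtract the mean so the vector sums to zero across the population."""
    mean = sum(vec) / len(vec)
    return [v - mean for v in vec]


def clip_l2(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in vec]


def privatize_fitness(per_example_fitness, max_norm, noise_multiplier, rng):
    """Center and clip each example's fitness vector, sum, and add noise.

    Assumption: per_example_fitness[i][j] is the fitness of population
    member j evaluated on example i.
    """
    pop_size = len(per_example_fitness[0])
    total = [0.0] * pop_size
    for row in per_example_fitness:
        for j, v in enumerate(clip_l2(center(row), max_norm)):
            total[j] += v
    sigma = noise_multiplier * max_norm
    return [t + rng.gauss(0.0, sigma) for t in total]


if __name__ == "__main__":
    rng = random.Random(0)
    fitness = [[1.0, 2.0, 3.0], [10.0, 10.0, 10.0], [0.5, 0.1, 0.9]]
    batch = [fitness[i] for i in poisson_subsample(len(fitness), 0.5, rng)]
    if batch:
        print(privatize_fitness(batch, max_norm=1.0, noise_multiplier=1.0, rng=rng))
```

Clipping after centering bounds how much any single example can move the aggregated signal, which is what lets the noise scale be tied to `max_norm`.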

Privacy engineering toolkit.

The throughline is practical privacy-preserving ML: mechanisms, accounting, attack evaluation, and governance-aware deployment.

Differential Privacy

DP-SGD, Laplace, Gaussian, Exponential, Sparse Vector, subsampling, shuffling, RDP, and zCDP.
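For readers less familiar with these mechanisms, the following sketch shows the two simplest ones applied to a numeric query: the Laplace mechanism with scale sensitivity / epsilon, and the classic Gaussian mechanism with sigma = sqrt(2 ln(1.25 / delta)) * sensitivity / epsilon, a calibration that holds for epsilon below 1. The function names are hypothetical, and the code is a teaching sketch that ignores floating-point side channels and other hardening a production release would need.

```python
import math
import random


def laplace_scale(sensitivity, epsilon):
    """Laplace noise scale b for an epsilon-DP release of an L1-sensitive query."""
    return sensitivity / epsilon


def gaussian_sigma(sensitivity, epsilon, delta):
    """Classic Gaussian mechanism calibration, valid for 0 < epsilon < 1."""
    if not 0.0 < epsilon < 1.0:
        raise ValueError("classic calibration assumes 0 < epsilon < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Add Laplace noise, sampled as the difference of two exponentials."""
    rate = 1.0 / laplace_scale(sensitivity, epsilon)
    return value + rng.expovariate(rate) - rng.expovariate(rate)


def gaussian_mechanism(value, sensitivity, epsilon, delta, rng):
    """Add Gaussian noise calibrated to the query's L2 sensitivity."""
    return value + rng.gauss(0.0, gaussian_sigma(sensitivity, epsilon, delta))


if __name__ == "__main__":
    rng = random.Random(0)
    true_count = 120.0  # a counting query has sensitivity 1
    print(laplace_mechanism(true_count, 1.0, 0.5, rng))
    print(gaussian_mechanism(true_count, 1.0, 0.5, 1e-5, rng))
```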

Private ML Systems

PyTorch, Opacus, HuggingFace, PEFT, LoRA, vLLM, FAISS, diffusion models, and deployment packaging.

Risk Evaluation

Membership inference, attribute inference, reconstruction risk, utility testing, and synthetic data dashboards.

Privacy Governance

Privacy by Design, PIAs, OneTrust, cookie consent, tracking technology review, and expert reporting.

Publications and service.

Research outputs, open-source work, and community service in privacy and machine learning.

2026

Geometry-Aware Tabular Diffusion

ICML 2026. Sole-author and self-funded.

Case Study

Research

DP-EGGROLL: Differentially Private Evolution Strategies via Fitness Vector Privatization

Research note and code with experiments against fast DP-AdamW, tuned uncentered EGGROLL, and scalar DP-ZO.

Code

2024

Blockwise Gradient Aggregation for Deep Learning

IEEE Digital Privacy. Algorithmic work on blockwise gradient aggregation and model privacy evaluation.

IEEE

2024

A More Practical Approach to Machine Unlearning

arXiv paper cited 6 times.

arXiv

Service

ACM CODASPY Program Committee

Program committee member for 2025 and 2026.

CODASPY

Experience.

A concise narrative of the roles behind the research and systems work.

Privacy Engineer, ML Research

Secludy AI

Built DP synthetic data systems, inference attacks, privacy accounting, on-prem deployment workflows, and private LLM fine-tuning infrastructure.

Owner

Green Willow Studios

Enterprise privacy engineering consulting across differential privacy, synthetic data, compliance infrastructure, open-source tools, and expert reporting.

Technical Privacy Consultant

WebXRay

Audited web tracking, pixels, fingerprinting, cross-site tracking, and cookie consent practices across large-scale web datasets.

Research Assistant

Carnegie Mellon University

Studied privacy preferences and health data granularity, with survey design, statistical modeling, and cross-cultural privacy analysis.

Build useful ML without exposing sensitive data.

I am available for privacy-preserving ML research, synthetic data systems, differential privacy consulting, and technical privacy analysis.

Email

dave@greenwillowstudios.com

GitHub

github.com/dzagardo

LinkedIn

linkedin.com/in/davidzagardo

Focus

Differential privacy, synthetic data, privacy attacks, LLM privacy, and privacy engineering.