Privacy-Preserving Machine Learning Research Scientist

David Turtora Zagardo

I build differentially private synthetic data systems, privacy attacks and evaluations, and deployable ML infrastructure for organizations that need useful models without exposing sensitive records.

ICML 2026: Sole-author, self-funded paper on geometry-aware tabular synthetic diffusion.
5+ years: Building privacy-preserving ML and privacy engineering systems.
$170K+: Privacy engineering consulting revenue in 2025.
7 cases: Expert witness analysis of tracking, pixels, cookies, and fingerprinting.
[Figure: GATD versus TabDiff result chart]
Geometry-aware synthesis: Pairwise structure for tabular diffusion.
[Figure: Private synthetic data pipeline diagram]
Private data generation: Accounting, clipping, noise, and release controls.
[Figure: Privacy attack evaluation dashboard]
Attack evaluation: Membership and attribute inference evidence.
[Figure: DP-EGGROLL centered fitness vector visualization]
Fitness-vector privacy: Centered population signals with RDP accounting.

Selected work at the research-to-production boundary.

These projects show how privacy guarantees, model utility, attack evaluation, and deployment constraints connect in real systems.

Synthetic Data Research

Geometry-Aware Tabular Diffusion

Sole-author, self-funded ICML 2026 paper introducing pairwise geometric features for tabular diffusion, improving fidelity and utility with fewer parameters than transformer baselines.

Diffusion, Tabular synthesis, ICML 2026
Production DP

Private Synthetic Data Pipelines

Built tabular, time-series, and text synthesis systems with DP-SGD, RDP accounting, group privacy, private preprocessing, and on-prem deployment controls for sensitive enterprise workflows.

DP-SGD, RDP accounting, On-prem Docker
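As an illustration only, and not the production code behind these pipelines, the core DP-SGD aggregation step (per-example clipping followed by Gaussian noise) might be sketched as below. The function names `clip_l2` and `noisy_mean_gradient` are hypothetical, and the sketch assumes a fixed batch size rather than Poisson subsampling.

```python
import math
import random


def clip_l2(vec, clip_norm):
    """Scale vec down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in vec]


def noisy_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    The noise standard deviation is noise_multiplier * clip_norm, matching
    the L2 sensitivity of the clipped sum to one example.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        for i, x in enumerate(clip_l2(grad, clip_norm)):
            total[i] += x
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]


# Hypothetical usage with toy gradients.
rng = random.Random(0)
update = noisy_mean_gradient([[3.0, 4.0], [0.3, 0.4]], 1.0, 1.1, rng)
```

In practice a library such as Opacus would compute per-example gradients and track the privacy budget; this sketch covers only the clip-and-noise arithmetic.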
Privacy Risk Evaluation

Membership and Attribute Inference Attacks

Built attack tooling for language and image models, plus FAISS-based attribute inference dashboards for synthetic data risk, translating privacy failures into measurable evidence.

MIA, FAISS, Risk dashboards
Algorithmic Privacy Research

DP-EGGROLL

Differentially private evolution strategies via centered fitness-vector privatization. DP-EGGROLL combines clipping, Poisson subsampling, and RDP accounting, and achieves practical non-inferiority to fast DP-AdamW on 22 of 25 classification AUROC endpoints.

Fitness vectors, RDP accounting, DP optimization

Privacy engineering toolkit.

The throughline is practical privacy-preserving ML: mechanisms, accounting, attack evaluation, and governance-aware deployment.

Differential Privacy

DP-SGD, Laplace, Gaussian, Exponential, Sparse Vector, subsampling, shuffling, RDP, and zCDP.
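To make the RDP item concrete, here is a minimal sketch of Rényi DP accounting for repeated Gaussian mechanisms with L2 sensitivity 1. It uses the standard bound that a Gaussian mechanism with noise multiplier sigma satisfies RDP of order alpha with epsilon alpha / (2 sigma^2), additive composition over steps, and the standard conversion to (epsilon, delta)-DP. It deliberately ignores subsampling amplification, so it overestimates epsilon for subsampled training. The function names and the grid of orders are assumptions made for illustration.

```python
import math


def gaussian_rdp(alpha, noise_multiplier, steps):
    """RDP epsilon at order alpha after `steps` Gaussian mechanisms (sensitivity 1)."""
    return steps * alpha / (2.0 * noise_multiplier ** 2)


def rdp_to_epsilon(noise_multiplier, steps, delta, orders):
    """Convert composed RDP to (epsilon, delta)-DP, taking the best order."""
    return min(
        gaussian_rdp(a, noise_multiplier, steps) + math.log(1.0 / delta) / (a - 1.0)
        for a in orders
    )


# Hypothetical usage: 100 steps at noise multiplier 4.0, delta = 1e-5.
orders = [1.5, 2, 3, 4, 8, 16, 32, 64]
epsilon = rdp_to_epsilon(4.0, 100, 1e-5, orders)
```

Searching over a grid of orders matters because low orders pay more in the log(1/delta) term while high orders pay more in the composed RDP term.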

Private ML Systems

PyTorch, Opacus, HuggingFace, PEFT, LoRA, vLLM, FAISS, diffusion models, and deployment packaging.

Risk Evaluation

Membership inference, attribute inference, reconstruction risk, utility testing, and synthetic data dashboards.

Privacy Governance

Privacy by Design, PIAs, OneTrust, cookie consent, tracking technology review, and expert reporting.

Publications and service.

Research papers, open-source code, and community service in privacy and machine learning.

2026

Geometry-Aware Tabular Diffusion

ICML 2026. Sole-author and self-funded.

Case Study
Research

DP-EGGROLL: Differentially Private Evolution Strategies via Fitness Vector Privatization

Research note and code with experiments against fast DP-AdamW, tuned uncentered EGGROLL, and scalar DP-ZO.

Code
2024

Blockwise Gradient Aggregation for Deep Learning

IEEE Digital Privacy. Algorithmic work on blockwise gradient aggregation and model privacy evaluation.

IEEE
2024

A More Practical Approach to Machine Unlearning

arXiv paper cited 6 times.

arXiv
Service

ACM CODASPY Program Committee

Program committee member for 2025 and 2026.

CODASPY

Experience.

A concise narrative of the roles behind the research and systems work.

Privacy Engineer, ML Research

Secludy AI

Built DP synthetic data systems, inference attacks, privacy accounting, on-prem deployment workflows, and private LLM fine-tuning infrastructure.

Owner

Green Willow Studios

Enterprise privacy engineering consulting across differential privacy, synthetic data, compliance infrastructure, open-source tools, and expert reporting.

Technical Privacy Consultant

WebXRay

Audited web tracking, pixels, fingerprinting, cross-site tracking, and cookie consent practices across large-scale web datasets.

Research Assistant

Carnegie Mellon University

Studied privacy preferences and health data granularity, with survey design, statistical modeling, and cross-cultural privacy analysis.

Build useful ML without exposing sensitive data.

I am available for privacy-preserving ML research, synthetic data systems, differential privacy consulting, and technical privacy analysis.

Focus

Differential privacy, synthetic data, privacy attacks, LLM privacy, and privacy engineering.