Privacy-Preserving Federated Learning
Co-authored UN paper exploring federated aggregation strategies with differential privacy and homomorphic encryption for national statistical offices.
The challenge
National statistical offices hold sensitive data that cannot leave their premises, yet cross-border collaboration on machine learning models is needed to generate better statistics for the public good.
Approach
- Implemented four federated aggregation strategies — FedAvg, FedYogi, FedAdagrad, and FedAdam — using the Flower library across a simulated multi-NSO environment with Statistics Canada, ISTAT, and Statistics Netherlands.
- Applied differentially private training with varying privacy budgets (ε = 10, 1, 0.3) using Opacus to quantify the privacy–utility tradeoff for each strategy on a Human Activity Recognition dataset.
- Tested homomorphic encryption (Paillier) on model layers to prevent the central server from inspecting client models, measuring computational overhead across partial and full model encryption.
- Evaluated all configurations using per-class F1 scores and accuracy curves over 10 federated rounds to surface class-level learning failures masked by aggregate metrics.
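To make the aggregation and evaluation steps above concrete, here is a minimal, standard-library-only sketch. It is not the project's code, which used Flower's built-in strategies. The function names (`fedavg`, `per_class_f1`, `accuracy`), the flat parameter vectors, and all client sizes and labels are hypothetical, chosen only for illustration. It shows the example-weighted averaging at the core of FedAvg, and how overall accuracy can look acceptable while one class is never predicted.

```python
from typing import Dict, List, Sequence, Tuple


def fedavg(client_updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Average client parameter vectors, weighted by local example counts.

    Each update is (flat parameter vector, number of local training examples).
    """
    total_examples = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(params[i] * n for params, n in client_updates) / total_examples
        for i in range(dim)
    ]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def per_class_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[int, float]:
    """F1 score per class, computed as 2*TP / (2*TP + FP + FN)."""
    scores: Dict[int, float] = {}
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


# Three hypothetical clients, each with a 2-parameter "model" and a local dataset size.
updates = [([1.0, 2.0], 100), ([3.0, 4.0], 300), ([5.0, 6.0], 100)]
global_params = fedavg(updates)

# Hypothetical imbalanced evaluation set: the model always predicts class 0.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
overall = accuracy(y_true, y_pred)
f1_by_class = per_class_f1(y_true, y_pred)

print(global_params)  # [3.0, 4.0]
print(overall)        # 0.8
print(f1_by_class[1]) # 0.0
```

In this toy case accuracy is 0.8 even though class 1 has an F1 of 0, which is the kind of class-level failure that per-class scores expose and an aggregate metric hides.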
Key takeaway
Conducted under the United Nations PETLab. FedAvg and FedYogi are strong baselines that work well even without hyperparameter tuning. The effect of differential privacy on weaker strategies is less predictable: the added noise can occasionally improve a poorly performing model by accident. Published at the UNECE Conference of European Statisticians, September 2023.