The Differentially Private Stochastic Gradient Descent (DP-SGD) algorithm enables training machine learning (ML) models with formal Differential Privacy (DP) guarantees.
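To make the mechanism concrete, the following is a minimal sketch of a single DP-SGD update (per-example gradient clipping followed by Gaussian noise addition). The function name, clipping norm, noise multiplier, and learning rate below are illustrative assumptions, not the implementation used in this work.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0, noise_multiplier=1.0, lr=0.1):
    """Illustrative DP-SGD step: clip each example's gradient to an L2 norm of
    `clip_norm`, sum the clipped gradients, add Gaussian noise with standard
    deviation `noise_multiplier * clip_norm`, average, and apply the update."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))  # per-example clipping
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_sum = np.sum(clipped, axis=0) + noise                   # noisy gradient sum
    return params - lr * noisy_sum / len(per_example_grads)      # noisy-average update
```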
A novel DP auditing procedure is introduced to analyze DP-SGD with shuffling, showing that the privacy guarantees reported for DP models trained with shuffling are considerably overestimated.
The study also assesses the impact of several parameters, including batch size, privacy budget, and threat model, on privacy leakage.
Overall, this research shows that using shuffling instead of Poisson sub-sampling in DP-SGD can lead to significant privacy leakage.
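For reference, the difference between the two batching schemes can be sketched as follows: under Poisson sub-sampling each record enters a batch independently with some probability, so batch sizes vary, whereas shuffling permutes the dataset once per epoch and splits it into fixed-size batches. This is a minimal illustration under assumed parameter values, not the experimental setup of the study.

```python
import numpy as np

rng = np.random.default_rng(0)

def poisson_batches(n, sampling_rate, steps):
    """Poisson sub-sampling, as assumed by DP-SGD's standard privacy analysis:
    each of the n records is included in a batch independently with probability
    `sampling_rate`, so batch sizes vary from step to step."""
    return [np.nonzero(rng.random(n) < sampling_rate)[0] for _ in range(steps)]

def shuffled_batches(n, batch_size):
    """Shuffling, as commonly implemented in practice: permute the dataset once
    per epoch and cut it into fixed-size batches, so every record appears
    exactly once per epoch."""
    perm = rng.permutation(n)
    return [perm[i:i + batch_size] for i in range(0, n, batch_size)]

# Example with 1,000 records and an expected batch size of 100.
poisson = poisson_batches(n=1_000, sampling_rate=0.1, steps=10)
shuffled = shuffled_batches(n=1_000, batch_size=100)
print([len(b) for b in poisson])   # variable batch sizes
print([len(b) for b in shuffled])  # constant batch sizes
```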