NeoChainDaily
14.01.2026 • 05:05 Research & Innovation

Study Benchmarks Efficient Implementations of Differentially Private SGD

A team of machine learning researchers released new benchmark results in June 2024 to assess the computational cost of training deep learning models with differential privacy. The study measured throughput differences between standard stochastic gradient descent (SGD) and its privacy-preserving counterpart, DP‑SGD, across multiple frameworks and hardware configurations. By focusing on Poisson subsampling—a requirement for formal DP guarantees—the researchers aimed to identify practical pathways for scalable private training.

Background on DP‑SGD

DP‑SGD has become the de facto algorithm for training models under differential privacy constraints, relying on Poisson subsampling to maintain rigorous privacy accounting. However, many existing implementations sacrifice theoretical correctness for speed by employing alternative subsampling techniques, potentially compromising their privacy guarantees.
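The core mechanics referenced above can be sketched in a few lines. This is a minimal NumPy illustration (not the paper's implementation): Poisson subsampling includes each example independently with a fixed probability, and the DP-SGD update clips each per-example gradient before adding Gaussian noise calibrated to the clipping bound. The function names and the convention of dividing by the expected batch size are assumptions for illustration.

```python
import numpy as np

def poisson_subsample(n_examples, sample_rate, rng):
    # Each example is included independently with probability sample_rate,
    # so the batch size is random -- this is the sampling scheme that
    # formal DP accounting for DP-SGD assumes.
    return np.flatnonzero(rng.random(n_examples) < sample_rate)

def dp_sgd_step(params, per_example_grads, clip_norm, noise_multiplier,
                expected_batch_size, lr, rng):
    # Clip each per-example gradient to L2 norm at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(
        1.0, clip_norm / np.maximum(norms, 1e-12))
    # Sum the clipped gradients and add Gaussian noise scaled to clip_norm.
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=params.shape)
    # Average over the expected (not realized) batch size, a common choice
    # under Poisson subsampling.
    return params - lr * noisy_sum / expected_batch_size
```

The per-example clipping step is exactly what makes naive DP-SGD expensive: it requires one gradient per example rather than a single batch gradient.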

Benchmarking Approach

The authors implemented and evaluated several DP‑SGD variants that preserve Poisson subsampling, including a baseline using the Opacus library in PyTorch and an optimized gradient‑clipping method known as Ghost Clipping. They also developed a comparable implementation in JAX to test cross‑framework performance. Benchmarks were run on configurations ranging from single‑GPU setups to clusters of up to 80 GPUs.
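Ghost Clipping avoids materializing full per-example gradients by computing their norms from quantities already available in the backward pass. The paper's implementation is in PyTorch; the standalone NumPy sketch below only illustrates the core identity for a single linear layer, where the per-example weight gradient is an outer product whose Frobenius norm factorizes.

```python
import numpy as np

def ghost_grad_norms(activations, output_grads):
    # For a linear layer y = x @ W.T, the per-example gradient of W is the
    # outer product g_i x_i^T. Its Frobenius norm factorizes as
    # ||g_i|| * ||x_i||, so clipping scales can be computed from the layer
    # inputs and output gradients alone, without ever allocating a
    # (batch, out_dim, in_dim) per-example gradient tensor.
    return (np.linalg.norm(output_grads, axis=1)
            * np.linalg.norm(activations, axis=1))
```

With these norms, clipping reduces to rescaling each example's loss contribution and running one ordinary backward pass, which is what narrows the gap to non-private SGD.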

Performance Findings

Results indicated that the naive Opacus implementation of DP‑SGD achieved throughput 2.6 to 8 times lower than that of conventional SGD. Introducing Ghost Clipping roughly halved this performance gap, bringing DP‑SGD closer to non‑private training speeds. The JAX‑based implementation, which also employed Poisson subsampling, performed on par with the optimized PyTorch approaches.
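To make the reported factors concrete: reading "halved the performance gap" as halving the throughput shortfall relative to SGD (an assumed interpretation; the paper may quantify the gap differently), the figures work out as follows.

```python
def dp_relative_throughput(slowdown_factor):
    # Throughput of naive DP-SGD relative to SGD (SGD = 1.0).
    return 1.0 / slowdown_factor

for slowdown in (2.6, 8.0):
    naive = dp_relative_throughput(slowdown)
    # Assumed reading of "halved the gap": the shortfall (1 - naive)
    # shrinks by half under Ghost Clipping.
    ghost = 1.0 - (1.0 - naive) / 2.0
    print(f"{slowdown}x slowdown: naive {naive:.2f}, "
          f"ghost clipping ~{ghost:.2f} of SGD throughput")
```

Under this reading, even the worst reported case (8x slower) would recover to roughly half of non-private throughput with Ghost Clipping.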

Scalability Insights

When scaling to larger GPU counts, the study observed that DP‑SGD exhibited more favorable scaling characteristics than SGD, suggesting that the overhead associated with privacy mechanisms diminishes relative to overall compute as resources increase.

Implications for Privacy‑Preserving Machine Learning

These findings suggest that efficient algorithmic and engineering choices can substantially reduce the computational burden of privacy‑preserving training, potentially lowering barriers for broader adoption in industry and research settings.

Open‑Source Release

The researchers made their benchmarking library publicly available at https://github.com/DPBayes/Towards-Efficient-Scalable-Training-DP-DL, providing the community with tools to replicate and extend the analysis.

This report is based on the abstract of the research paper, an open-access preprint whose full text is available via arXiv.
