Enhancing Membership Inference Attacks on Language Models Without Labeled Non-Members

Researchers have introduced EM-MIA, an expectation-maximization-based membership inference method that requires no labeled non-member examples, along with OLMoMIA, a new benchmark for evaluating robustness under varied distributional conditions.

Background on Membership Inference

Membership inference attacks (MIAs) aim to determine whether a particular data point was included in the training set of a language model, raising privacy concerns for users whose text may be inadvertently exposed.
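To make the idea concrete, here is a minimal sketch of a simple likelihood-threshold attack, a common baseline in the MIA literature and not the EM-MIA method itself. It assumes per-token log-probabilities have already been obtained from the target model; the function names and the threshold value are hypothetical and chosen only for illustration.

```python
def avg_log_likelihood(token_logprobs):
    """Mean per-token log-probability of a text under the target model."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return sum(token_logprobs) / len(token_logprobs)


def predict_member(token_logprobs, threshold):
    """Flag a text as a likely training member when the model finds it
    unusually predictable, i.e. its average log-likelihood exceeds the
    threshold. The threshold is an assumption the attacker must choose."""
    return avg_log_likelihood(token_logprobs) > threshold
```

In practice, picking a good threshold usually requires some reference data, which is one reason attacks that depend on known non-members are harder to deploy.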

EM-MIA Methodology

EM-MIA uses an expectation-maximization strategy that alternately refines two estimates until convergence: how effective each input prefix is, and the membership score of each candidate text. Because prefix effectiveness is estimated inside this loop, the method does not rely on pre-selected non-member prompts.
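The alternating structure can be illustrated with a toy sketch. This is a simplified illustration of the general pattern (rate prefixes using current membership estimates, then re-estimate membership using the rated prefixes), not the authors' algorithm. The `scores` matrix, the use of Pearson correlation to rate prefixes, and the weighted-average update are all assumptions made for this example.

```python
from statistics import mean


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / (va * vb) ** 0.5


def alternate_refine(scores, init_membership, max_iters=50, tol=1e-9):
    """Alternately update prefix weights and membership estimates.

    scores[p][x] is a hypothetical membership signal for candidate x
    when prefix p is prepended (an assumption of this sketch).
    """
    membership = list(init_membership)
    n_items = len(membership)
    weights = [1.0 / len(scores)] * len(scores)
    for _ in range(max_iters):
        # Step A: rate each prefix by how well its signal agrees with
        # the current membership estimates (negative agreement -> 0).
        raw = [max(pearson(row, membership), 0.0) for row in scores]
        total = sum(raw)
        if total == 0:
            break  # no prefix agrees with the current estimates
        weights = [r / total for r in raw]
        # Step B: re-estimate membership as a weighted average of the
        # per-prefix signals, trusting highly rated prefixes more.
        updated = [
            sum(w * scores[p][x] for p, w in enumerate(weights))
            for x in range(n_items)
        ]
        change = max(abs(u - m) for u, m in zip(updated, membership))
        membership = updated
        if change < tol:
            break  # estimates have stopped moving
    return membership, weights
```

Starting from a weak initial guess, the loop down-weights prefixes whose signal is unrelated to the emerging membership estimates. Like any such alternating scheme, the result depends on the initialization.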

Introducing OLMoMIA Benchmark

To support controlled evaluation, the authors present OLMoMIA, a benchmark designed to systematically vary distributional overlap and difficulty, enabling precise analysis of how MIA performance changes across different data similarity scenarios.

Experimental Findings

Empirical tests on the WikiMIA dataset and the newly created OLMoMIA benchmark indicate that EM-MIA outperforms existing baseline attacks, particularly in settings where the distributional separability between members and non‑members is pronounced.
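Attack quality in MIA studies is commonly summarized with AUC-ROC, although this summary does not state which metric the authors report. As a hedged illustration of how separability between member and non-member scores translates into a number, the sketch below computes AUC-ROC directly from its pairwise definition. The function name and the input scores are hypothetical.

```python
def auc_roc(member_scores, nonmember_scores):
    """Probability that a randomly chosen member receives a higher
    attack score than a randomly chosen non-member (ties count half)."""
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))
```

A value near 1.0 means the two groups are cleanly separated by the attack score, while a value near 0.5 means the attack is no better than random guessing.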

Limitations and Future Directions

While EM-MIA demonstrates success in practical situations with partial distributional overlap, the authors also document failure cases where near‑identical training and non‑training distributions diminish attack effectiveness, highlighting fundamental constraints of current MIA methodologies.

Open‑Source Release

The research team has released the codebase and evaluation pipeline associated with EM-MIA and OLMoMIA, encouraging reproducibility and further investigation within the community.

This report is based on the abstract of a research paper published on arXiv, licensed under Academic Preprint / Open Access. The full text is available via arXiv.
