Study Finds SourceRank Scores Vulnerable to Evasion Tactics in PyPI Packages
Researchers have evaluated how reliable SourceRank, an 18‑metric scoring system used to assess the popularity and quality of open‑source packages, remains when malicious packages on the Python Package Index (PyPI) use evasion tactics. The analysis focuses on how well the score distinguishes benign from malicious software in real‑world conditions.
Threat Model Overview
The authors propose a comprehensive threat model that identifies potential evasion approaches for each of SourceRank’s metrics. Among these, the “URL confusion” technique can manipulate five metrics at once: the malicious package points to a legitimate repository that is unrelated to it, thereby inflating the package’s score.
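As a rough illustration of what a URL confusion check might look for, the Python sketch below flags a package whose declared repository name does not match its own name. This is not the authors' detection method; the function names, the name-matching heuristic, and the example package "reqeusts-helper" are all assumptions made for illustration.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def normalize(name: str) -> str:
    """Lowercase a name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def repo_name_from_url(url: str) -> Optional[str]:
    """Return the normalized repository name from an owner/repo style URL.

    Assumes a path of the form /<owner>/<repo>, as used by common code
    hosting sites. Returns None when the path is too short to contain one.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        return None
    repo = parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return normalize(repo)


def looks_like_url_confusion(package_name: str, repo_url: str) -> bool:
    """Hypothetical heuristic: True when the repository name differs
    from the package name. A mismatch is only a signal, not proof."""
    repo = repo_name_from_url(repo_url)
    if repo is None:
        return False
    return repo != normalize(package_name)


if __name__ == "__main__":
    # Hypothetical package name pointing at an unrelated, well-known repository.
    print(looks_like_url_confusion("reqeusts-helper", "https://github.com/psf/requests"))
    # A package whose repository name matches its own name.
    print(looks_like_url_confusion("requests", "https://github.com/psf/requests"))
```

Many legitimate packages live in repositories with a different name, so a mismatch like this would only be a starting point for manual review, not a verdict.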
Empirical Evaluation on PyPI
To assess practical impact, the study compares SourceRank distributions for benign and malicious packages using the MalwareBench dataset and a separate real‑world collection of 122,398 packages. Historical data from the benchmark suggests a clear separation between benign and malicious scores.
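One simple way to quantify how well two score distributions separate is a histogram overlap coefficient, sketched below in Python. The summary does not say which statistic the study uses, so this measure is an assumption, and the score lists are made-up values, not data from the paper.

```python
from collections import Counter
from typing import Sequence


def overlap_coefficient(a: Sequence[int], b: Sequence[int]) -> float:
    """Shared probability mass of two discrete score distributions.

    Returns 0.0 when the groups share no score value and 1.0 when their
    relative frequencies are identical.
    """
    if not a or not b:
        raise ValueError("both score lists must be non-empty")
    freq_a, freq_b = Counter(a), Counter(b)
    scores = set(freq_a) | set(freq_b)
    # For each score, take the smaller of the two relative frequencies.
    return sum(min(freq_a[s] / len(a), freq_b[s] / len(b)) for s in scores)


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    benign = [8, 9, 10, 10, 11, 12]
    malicious = [2, 3, 3, 4, 9, 10]
    print(f"overlap: {overlap_coefficient(benign, malicious):.2f}")
```

A value near 0 corresponds to the clear separation seen in the benchmark data, while a value closer to 1 means a threshold on the score cannot separate the two groups.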
Limitations of SourceRank
However, the real‑world analysis reveals substantial overlap between the two groups, primarily because SourceRank does not promptly reflect package removals. As a result, the metric cannot be reliably employed to discriminate between benign and malicious packages or to curate safe packages for users.
Emerging URL Confusion Attacks
The investigation also documents a rise in URL confusion attacks, from 4.2% of malicious packages in MalwareBench to 7.0% in the broader dataset. This technique is frequently combined with other evasion methods and can significantly boost the SourceRank metrics of malicious packages.
This report is based on the abstract of a research paper published on arXiv under an Academic Preprint / Open Access license. The full text is available via arXiv.