SourceRank Vulnerable to Evasion Tactics in PyPI Packages

Study Finds SourceRank Scores Vulnerable to Evasion Tactics in PyPI Packages

Researchers have evaluated how reliably SourceRank, an 18‑metric scoring system used to assess the popularity and quality of open‑source packages, holds up against evasion tactics used by malicious packages on the Python Package Index (PyPI). The analysis focuses on how well the score distinguishes benign from malicious packages under real‑world conditions.

Threat Model Overview

The authors propose a comprehensive threat model that identifies potential evasion approaches for each of SourceRank’s metrics. Among these, the “URL confusion” technique can manipulate five metrics by pointing to a legitimate repository that is unrelated to the malicious package, thereby inflating its score.
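To make the idea concrete, here is a minimal heuristic sketch that flags a package whose declared repository link does not appear to match its own name. This is not the authors' detection method. The function names, the assumption that the link has an `owner/repo` path, and the name-containment rule are all illustrative choices.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def normalize(name: str) -> str:
    """Lowercase a name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def repo_slug(url: str) -> Optional[str]:
    """Return the normalized repository name from an 'owner/repo' style URL.

    Assumption: the repository name is the second path segment, as on
    common code-hosting sites. Returns None if the path is too short.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        return None
    repo = parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return normalize(repo)


def looks_like_url_confusion(package_name: str, repo_url: str) -> bool:
    """Hypothetical check: True if neither name contains the other."""
    slug = repo_slug(repo_url)
    if slug is None:
        return False  # nothing to compare against
    pkg = normalize(package_name)
    return pkg not in slug and slug not in pkg


if __name__ == "__main__":
    print(looks_like_url_confusion("requests", "https://github.com/psf/requests"))
    print(looks_like_url_confusion("totally-new-pkg", "https://github.com/numpy/numpy"))
```

A name mismatch alone is weak evidence, since many legitimate packages live in repositories with different names, so a check like this would at most serve as one signal among several.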

Empirical Evaluation on PyPI

To assess practical impact, the study compares the distributions of SourceRank scores for benign and malicious packages in the MalwareBench dataset and in a separate real‑world collection of 122,398 packages. The historical benchmark data suggests a clear separation between benign and malicious scores.

Limitations of SourceRank

However, the real‑world analysis reveals substantial overlap between the two groups, primarily because SourceRank does not promptly reflect package removals. As a result, the metric cannot be reliably employed to discriminate between benign and malicious packages or to curate safe packages for users.
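One simple way to quantify this kind of overlap is the shared probability mass of the two score histograms. The sketch below is an illustration only, not the measure used in the study. It assumes scores are integers, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable


def overlap_coefficient(benign: Iterable[int], malicious: Iterable[int]) -> float:
    """Shared probability mass of two integer score samples.

    Returns 0.0 when the samples share no score value and 1.0 when their
    normalized histograms are identical.
    """
    b, m = Counter(benign), Counter(malicious)
    nb, nm = sum(b.values()), sum(m.values())
    if nb == 0 or nm == 0:
        raise ValueError("both samples must be non-empty")
    # For every score seen in either sample, keep the smaller relative frequency.
    return sum(min(b[s] / nb, m[s] / nm) for s in b.keys() | m.keys())


if __name__ == "__main__":
    # Toy values invented for illustration; they are not data from the study.
    print(overlap_coefficient([5, 6, 7, 8], [5, 6, 1, 2]))
```

A value near 0 would correspond to the clear separation suggested by the historical data, while a value near 1 would mean the score carries little information about whether a package is malicious.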

Emerging URL Confusion Attacks

The investigation also documents a rise in URL confusion attacks, from 4.2% of malicious packages in MalwareBench to 7.0% in the real‑world dataset. The technique is frequently combined with other evasion methods and can significantly inflate the SourceRank metrics of malicious packages.

This report is based on the abstract of a research paper published on arXiv (Academic Preprint / Open Access). The full text is available via arXiv.
