Researchers Reveal Multi-Targeted Backdoor Attack Framework for Graph Neural Networks
Researchers have presented the first multi-targeted backdoor attack designed for graph classification tasks, enabling multiple distinct triggers to simultaneously redirect model predictions to different target labels. The approach replaces traditional subgraph replacement with subgraph injection, preserving the original graph structure while embedding malicious patterns. Extensive testing across five benchmark datasets shows high attack success rates with negligible impact on clean‑graph accuracy, highlighting a new class of vulnerabilities in graph neural networks (GNNs).
Background on GNN Vulnerabilities
Graph neural networks have achieved state‑of‑the‑art results in domains such as social network analysis, drug discovery, and recommendation systems, yet their susceptibility to adversarial manipulation remains a growing concern. Prior work on backdoor attacks for graph classification has been limited to single‑target scenarios, where a lone trigger forces the model to misclassify inputs to a predetermined label. This limitation left open the possibility of more sophisticated, multi‑label manipulation techniques.
Novel Multi-Target Attack Methodology
The authors introduce a subgraph injection strategy that embeds multiple, distinct trigger subgraphs into clean graphs without altering the overall topology. Each trigger is associated with a different target label, allowing the compromised model to produce a range of malicious outputs depending on the injected pattern. By avoiding wholesale subgraph replacement, the method maintains the statistical properties of the original data, reducing the likelihood of detection during training.
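The injection idea can be illustrated with a minimal pure-Python sketch. The function name, the graph representation (edge lists over integer node ids), and the random-attachment strategy are illustrative assumptions, not details taken from the paper; the point is only that the trigger's nodes and edges are added alongside the clean graph rather than replacing any part of it.

```python
import random

def inject_trigger(graph_edges, num_nodes, trigger_edges, num_connections, seed=0):
    """Inject a trigger subgraph into a clean graph without removing any
    original nodes or edges (subgraph injection, not replacement).

    graph_edges: edge list of the clean graph over nodes 0..num_nodes-1
    trigger_edges: edge list of the trigger motif over its own node ids
    num_connections: how many edges attach the trigger to the host graph
    (all names and the attachment rule are hypothetical)
    """
    rng = random.Random(seed)
    # Relabel trigger nodes so they cannot collide with host nodes.
    trigger_nodes = sorted({n for e in trigger_edges for n in e})
    offset = {n: num_nodes + i for i, n in enumerate(trigger_nodes)}
    poisoned = list(graph_edges)  # every original edge is preserved
    poisoned += [(offset[u], offset[v]) for u, v in trigger_edges]
    # Attach the trigger motif to a few randomly chosen host nodes.
    hosts = rng.sample(range(num_nodes), k=min(num_connections, num_nodes))
    for h in hosts:
        poisoned.append((h, offset[rng.choice(trigger_nodes)]))
    return poisoned, num_nodes + len(trigger_nodes)

# Example: a 3-clique trigger associated with one target label,
# injected into a 4-node path graph.
clean = [(0, 1), (1, 2), (2, 3)]
triangle = [(0, 1), (1, 2), (0, 2)]
edges, n = inject_trigger(clean, num_nodes=4, trigger_edges=triangle,
                          num_connections=2)
```

A second target label would use a structurally different motif (e.g. a star or a denser clique) injected the same way, so the compromised model learns a distinct trigger-to-label mapping for each.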
Experimental Validation Across Datasets
The framework was evaluated on five widely used graph classification datasets, including MUTAG, PROTEINS, and ENZYMES. Results indicate attack success rates exceeding 90% for all target labels while preserving clean‑graph accuracy within a 1–2% margin of the baseline. These findings demonstrate that the multi‑target approach can be reliably deployed across diverse graph domains.
Model‑Agnostic Effectiveness
Tests were conducted on four prominent GNN architectures—GCN, GraphSAGE, GIN, and DiffPool—under varying training hyperparameters. The attack consistently achieved high success rates irrespective of model depth, aggregation function, or optimizer settings, suggesting that the vulnerability is rooted in fundamental aspects of graph representation learning rather than specific implementation details.
Parameter Sensitivity Analysis
The study examined how injection method, number of connections, trigger size, edge density, and poisoning ratio influence attack performance. Larger trigger subgraphs and higher edge densities improved success rates, while modest poisoning ratios (as low as 5 %) were sufficient to maintain effectiveness, underscoring the efficiency of the attack vector.
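The poisoning-ratio finding can be made concrete with a small sketch of how a poisoning budget might be allocated across triggers. The function, the disjoint per-label assignment, and all names are assumptions for illustration; the paper's exact selection procedure is not described in this summary.

```python
import random

def select_poison_set(num_graphs, target_labels, poison_ratio=0.05, seed=0):
    """Pick which training graphs to poison for each target label.

    With k distinct triggers, this toy scheme poisons a separate,
    disjoint slice of the training set per target label, each slice
    holding poison_ratio of the data (an illustrative policy, not
    necessarily the paper's).
    """
    rng = random.Random(seed)
    budget = int(num_graphs * poison_ratio)
    indices = list(range(num_graphs))
    rng.shuffle(indices)
    # Carve consecutive shuffled slices so no graph carries two triggers.
    return {label: indices[i * budget:(i + 1) * budget]
            for i, label in enumerate(target_labels)}

# Example: 1,000 training graphs, three target labels, 5% poisoning each.
plan = select_poison_set(1000, target_labels=[0, 1, 2], poison_ratio=0.05)
```

At a 5% ratio this poisons 50 graphs per trigger, which the reported results suggest is already enough to sustain high attack success.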
Resistance to Existing Defenses
Evaluations against two leading defense mechanisms—randomized smoothing and fine‑pruning—showed that the multi‑targeted attacks remain robust, with only marginal reductions in success probability. This resilience suggests that current defensive strategies may need to be re‑engineered to address the broader threat landscape introduced by multi‑trigger backdoors.
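To make the fine‑pruning baseline concrete, the following is a deliberately simplified, pure-Python sketch of its core step: units that stay dormant on clean inputs are treated as backdoor suspects and zeroed out. This toy version (names, data layout, and the mean-activation criterion) is an assumption for illustration, not the exact defense implementation the paper evaluated.

```python
def fine_prune(weights, activations, prune_frac=0.2):
    """Zero out the weights of the least-active hidden units.

    weights: dict unit_id -> outgoing weight list for one hidden layer
    activations: dict unit_id -> mean activation on clean held-out graphs
    prune_frac: fraction of units to prune (all names hypothetical)
    """
    k = int(len(weights) * prune_frac)
    # Rank units by mean clean-data activation; the quietest ones are
    # the most likely to encode a trigger rather than a clean feature.
    victims = sorted(activations, key=activations.get)[:k]
    pruned = {u: ([0.0] * len(w) if u in victims else list(w))
              for u, w in weights.items()}
    return pruned, victims

# Example: five hidden units, prune the two least active on clean data.
weights = {u: [1.0, 1.0] for u in range(5)}
acts = {0: 0.9, 1: 0.1, 2: 0.8, 3: 0.05, 4: 0.7}
pruned, victims = fine_prune(weights, acts, prune_frac=0.4)
```

The reported resilience suggests that multi-trigger backdoors distribute their effect across units that remain active on clean data, so this activation-based ranking removes little of the malicious behavior.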
Implications and Future Directions
The findings expose a critical gap in the security of graph‑based machine learning systems, particularly as GNNs are increasingly deployed in high‑stakes applications. The authors recommend further research into detection algorithms tailored for multi‑trigger scenarios and the development of training protocols that can mitigate backdoor insertion without sacrificing model performance.

This report is based on information from arXiv (academic preprint, open access) and on the abstract of the research paper; the full text is available via arXiv.