NeoChainDaily
29.12.2025 • 15:09 • Artificial Intelligence & Ethics

Researchers Develop Tailored Backdoor Attack Framework for Prompt-Driven Video Segmentation Models


A new study posted to arXiv introduces BadVSFM, a specialized backdoor attack method targeting prompt-driven video segmentation foundation models (VSFMs). The work shows how a two‑stage approach can embed malicious behavior while preserving normal segmentation performance, and it highlights the limited effectiveness of existing defenses.

Background on Prompt‑Driven Video Segmentation

Video segmentation foundation models such as SAM2 have become integral to high‑stakes domains like autonomous driving and digital pathology. These systems rely on prompts—textual or visual cues—to generate masks that delineate objects across video frames, offering flexibility and scalability for real‑world applications.
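To make the prompt-to-mask interface concrete, here is a toy stand-in for a prompt-driven video segmenter. The region-growing logic is purely illustrative (real VSFMs such as SAM2 use a learned encoder/decoder, not intensity matching); the function name and shapes are assumptions, not the SAM2 API.

```python
import numpy as np

def segment_video(frames, point_prompt, threshold=0.5):
    """Toy prompt-driven video segmenter (illustrative only).

    frames: (T, H, W) grayscale video; point_prompt: (row, col) click.
    Returns one binary mask per frame by selecting pixels whose intensity
    is close to the clicked pixel -- a crude proxy for propagating a
    prompted object mask across frames.
    """
    r, c = point_prompt
    masks = []
    for frame in frames:
        seed = frame[r, c]                        # intensity at the click
        mask = np.abs(frame - seed) < threshold   # pixels similar to the seed
        masks.append(mask)
    return np.stack(masks)                        # (T, H, W) boolean masks

# A video whose left half is bright; clicking there segments that region
# in every frame from a single prompt.
video = np.zeros((3, 4, 4))
video[:, :, :2] = 1.0
masks = segment_video(video, point_prompt=(0, 0))
```

The single click on frame 0 yields a mask for every frame, which is the property that makes prompt-driven models attractive for video workloads.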

Limitations of Existing Backdoor Techniques

Prior attempts to inject backdoors, exemplified by classic BadNets-style attacks, achieve attack success rates (ASR) below 5% when applied directly to VSFMs. Analysis of encoder gradients and attention maps indicates that conventional training keeps gradients for clean and triggered inputs largely aligned, and attention continues to focus on the true object, preventing the encoder from learning a distinct trigger‑related representation.
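The gradient-alignment observation can be checked on a toy model: compute the loss gradient for a clean input and for the same input with a small trigger perturbation, then measure their cosine similarity. The linear model, squared loss, and trigger below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def loss_grad(w, x, y):
    """Gradient of the squared loss 0.5 * (w.x - y)^2 with respect to w."""
    return (w @ x - y) * x

def cosine(a, b):
    """Cosine similarity between two gradient vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
w = rng.normal(size=8)            # toy "encoder" weights
x_clean = rng.normal(size=8)      # clean input

x_trig = x_clean.copy()
x_trig[:2] += 0.1                 # small trigger patch on two "pixels"

g_clean = loss_grad(w, x_clean, y=-10.0)
g_trig = loss_grad(w, x_trig, y=-10.0)
alignment = cosine(g_clean, g_trig)   # near 1.0: gradients stay aligned
```

A near-1.0 alignment means the trigger barely changes the training signal, which is consistent with the paper's finding that naive poisoning never carves out a separate trigger representation.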

Proposed BadVSFM Framework

BadVSFM addresses these shortcomings through a two‑stage strategy. First, the image encoder is guided so that frames containing the trigger map to a designated target embedding while clean frames stay aligned with a reference encoder. Second, the mask decoder is trained to produce a consistent target mask for triggered frame‑prompt pairs across various prompt types, whereas clean outputs remain close to a reference decoder.
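The two objectives described above can be sketched as paired loss terms: an attack term that pulls triggered inputs toward a chosen target, and a stealth term that ties clean behavior to a frozen reference model. The function names, MSE choice, and weighting `lam` are assumptions for illustration, not the paper's actual losses.

```python
import numpy as np

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def encoder_stage_loss(enc, ref_enc, clean_frame, trig_frame, target_emb, lam=1.0):
    """Stage 1 (sketch): triggered frames map to a designated target
    embedding; clean frames stay close to a frozen reference encoder."""
    attack_term = mse(enc(trig_frame), target_emb)              # plant the backdoor
    stealth_term = mse(enc(clean_frame), ref_enc(clean_frame))  # preserve clean behavior
    return attack_term + lam * stealth_term

def decoder_stage_loss(dec, ref_dec, clean_pair, trig_pair, target_mask, lam=1.0):
    """Stage 2 (sketch): triggered frame-prompt pairs yield a consistent
    target mask; clean pairs match a frozen reference decoder."""
    attack_term = mse(dec(*trig_pair), target_mask)
    stealth_term = mse(dec(*clean_pair), ref_dec(*clean_pair))
    return attack_term + lam * stealth_term

# Tiny linear "encoder": the stage-1 loss vanishes exactly when the
# backdoored model satisfies both the attack and stealth objectives.
enc = lambda x: 2.0 * x
ref = lambda x: 2.0 * x
clean = np.ones(4)
trig = np.full(4, 3.0)
loss = encoder_stage_loss(enc, ref, clean, trig, target_emb=np.full(4, 6.0))
```

Separating the two stages means the encoder first learns a distinct trigger representation, after which the decoder can be steered toward the target mask regardless of the prompt type.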

Experimental Validation

Extensive experiments on two public datasets and five different VSFMs demonstrate that BadVSFM achieves strong, controllable backdoor effects under diverse triggers and prompts, all while maintaining high segmentation quality on clean data. Ablation studies confirm the robustness of the approach to variations in loss functions, stages, target selections, trigger designs, and poisoning rates.

Implications and Defense Challenges

Gradient‑conflict analysis and attention visualizations show that BadVSFM successfully separates triggered and clean representations and redirects attention toward trigger regions. Notably, four representative defense mechanisms evaluated in the study were largely ineffective, underscoring an underexplored vulnerability in current VSFMs.

This report is based on the abstract of a research paper posted to arXiv (academic preprint / open access). The full text is available via arXiv.

End of Transmission
