PromptScreen: High-Accuracy Defense Against LLM Prompt Attacks

Global: PromptScreen Demonstrates High-Accuracy, Low-Latency Defense Against LLM Prompt Attacks

Background and Motivation

Researchers have introduced PromptScreen, a defense architecture designed to mitigate prompt injection and jailbreaking attacks targeting large language model (LLM) applications. The system aims to address persistent security challenges by delivering both high detection precision and minimal processing delay.

Core Semantic Filter

The central component of PromptScreen is a semantic filter that employs text normalization, TF‑IDF vectorization, and a linear support‑vector‑machine (SVM) classifier. In held‑out testing, this filter achieved 93.4% accuracy and 96.5% specificity, indicating strong discrimination between benign inputs and malicious prompts.

Multi‑Stage Pipeline Performance

Built on the lightweight filter, the full pipeline incorporates additional detection and mitigation mechanisms that operate sequentially. This staged approach reduces attack throughput while adding only negligible computational overhead, preserving the responsiveness of LLM‑driven services.

Benchmark Comparison

Comparative experiments showed that the SVM‑based configuration raised overall accuracy from 35.1% to 93.4% and cut average time‑to‑completion from approximately 450 seconds to 47 seconds. These results represent more than a ten‑fold reduction in latency relative to the previously reported ShieldGemma system.

Evaluation Dataset

The authors evaluated PromptScreen on a curated corpus of over 30,000 labeled prompts, encompassing benign queries, jailbreak attempts, and application‑layer injection examples. Across this diverse set, the staged defense consistently maintained robust security performance.

Implications for LLM Security

By delivering high‑precision detection with substantially lower latency, PromptScreen addresses a core limitation of existing model‑based moderators. The architecture offers a scalable solution for protecting modern LLM‑driven applications against sophisticated prompt‑based threats.This report is based on information from arXiv, licensed under Academic Preprint / Open Access. Based on the abstract of the research paper. Full text available via ArXiv.

PromptScreen Demonstrates High-Accuracy, Low-Latency Defense Against LLM Prompt Attacks