New Benchmark Highlights LLM Web Agents’ Susceptibility to Malicious URLs
On January 26, 2026, researchers led by Dezhang Kong released MalURLBench, a comprehensive benchmark designed to assess whether large‑language‑model (LLM) web agents can detect malicious web addresses. The study, updated on January 30, 2026, addresses growing concerns that LLM‑driven assistants may inadvertently follow disguised harmful URLs, exposing users and service providers to security threats.
Benchmark Overview
MalURLBench is positioned as the first systematic evaluation suite targeting the specific risk of URL‑based attacks on LLM agents. The authors argue that existing security tests do not adequately capture the nuanced ways malicious links can be crafted to bypass model defenses.
Dataset Composition
The benchmark comprises 61,845 attack instances drawn from ten real‑world scenarios and seven categories of authentic malicious websites. Each instance pairs a benign‑looking URL with a concealed payload, reflecting tactics observed in phishing, drive‑by downloads, and other web‑based exploits.
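The paper's abstract does not publish the benchmark's schema, but a hypothetical record shape consistent with the description above (ten scenarios, seven website categories, a benign‑looking URL paired with a concealed payload) might look like:

```python
from dataclasses import dataclass

@dataclass
class AttackInstance:
    """Hypothetical record shape for one MalURLBench instance.

    The field names here are assumptions based on the paper's description,
    not the benchmark's published format.
    """
    scenario: str        # one of the ten real-world agent scenarios
    category: str        # one of the seven malicious-website categories
    displayed_url: str   # the benign-looking URL presented to the agent
    payload_url: str     # the concealed malicious destination


# Example (fabricated for illustration only):
instance = AttackInstance(
    scenario="email-assistant",
    category="phishing",
    displayed_url="https://login.example.com.verify-account.net/",
    payload_url="https://verify-account.net/credential-harvest",
)
```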
Evaluation Findings
Experiments involving twelve widely used LLMs revealed that current models frequently fail to flag elaborately disguised URLs. Success rates for detecting malicious links varied widely, with several leading models misclassifying a substantial portion of the test set.
Key Vulnerability Factors
The authors identify several factors that influence attack success, including URL length, use of homograph characters, and the presence of benign‑looking subdomains. Analysis suggests that models rely heavily on surface‑level token patterns rather than deeper semantic reasoning about URL safety.
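The surface‑level factors the authors identify (URL length, homograph characters, benign‑looking subdomains) can all be computed mechanically. A minimal sketch of such a feature extractor, with thresholds and field names chosen here for illustration rather than taken from the paper, could be:

```python
from urllib.parse import urlparse

def extract_url_features(url: str) -> dict:
    """Compute surface-level URL features of the kind the paper associates
    with attack success.

    Illustrative sketch, not the authors' code: the factors come from the
    paper's analysis, but the exact checks are assumptions.
    """
    host = urlparse(url).hostname or ""
    return {
        "url_length": len(url),
        # Non-ASCII characters in the hostname can hide homograph lookalikes
        # (e.g. Cyrillic 'а' U+0430 imitating Latin 'a').
        "has_homograph_chars": any(ord(c) > 127 for c in host),
        # Punycode labels ("xn--") signal internationalized hostnames that
        # may render as visually deceptive lookalikes.
        "is_punycode": any(label.startswith("xn--") for label in host.split(".")),
        # Deeply nested subdomains can place a benign brand name in front of
        # an attacker-controlled registrable domain.
        "subdomain_count": max(host.count(".") - 1, 0),
    }
```

For example, `extract_url_features("https://login.example.com.evil.xn--e1awd7f.com/verify")` flags both the punycode label and the deep subdomain nesting, even though the URL begins with a familiar brand name.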
Proposed Defense Mechanism
To mitigate the identified weaknesses, the paper introduces URLGuard, a lightweight module that preprocesses URLs before they are fed to the LLM. Preliminary results indicate that integrating URLGuard can improve detection rates without significantly increasing inference latency.
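URLGuard's implementation is not described in the abstract. As a hedged sketch of the general idea of preprocessing a URL before the LLM sees it, one might decode the URL and surface explicit risk signals as plain annotations, so the model can reason about semantics rather than raw token patterns (the flag names and the TLD list below are assumptions, not the paper's design):

```python
from urllib.parse import urlparse, unquote

SUSPICIOUS_TLDS = {"zip", "top", "xyz"}  # illustrative list, not from the paper

def preprocess_url(url: str) -> str:
    """Annotate a URL with explicit risk signals before handing it to an LLM.

    Hypothetical sketch of a URLGuard-style preprocessor; the real module's
    internals are not published in the abstract.
    """
    decoded = unquote(url)  # expose percent-encoded tricks such as %2F
    parsed = urlparse(decoded)
    host = parsed.hostname or ""
    flags = []
    if any(ord(c) > 127 for c in host) or "xn--" in host:
        flags.append("internationalized-hostname")
    if host.rsplit(".", 1)[-1] in SUSPICIOUS_TLDS:
        flags.append("suspicious-tld")
    if "@" in parsed.netloc:
        # e.g. https://trusted.com@evil.com resolves to evil.com
        flags.append("userinfo-obfuscation")
    note = ", ".join(flags) if flags else "none"
    return f"URL: {decoded}\nHost: {host}\nRisk flags: {note}"
```

A cheap, deterministic pass like this adds little latency, which is consistent with the paper's claim that the module improves detection without significantly slowing inference.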
Implications for Future Research
The release of MalURLBench provides a foundational resource for the security community to benchmark and improve LLM resilience against web‑based threats. The authors anticipate that the dataset will spur the development of more robust detection strategies and encourage broader scrutiny of LLM‑driven web agents.
This report is based on the abstract of the research paper, published on arXiv as an open‑access preprint; the full text is available via arXiv.