FusionLog Enables Zero-Label Cross-System Log Anomaly Detection
Researchers have introduced FusionLog, a method that detects anomalies in web system logs without any labeled data from the target environment. The approach combines general anomaly patterns learned from mature systems with proprietary characteristics of the new system, aiming to improve stability and reliability. Experiments conducted on three publicly available log datasets demonstrate that FusionLog attains an F1‑score exceeding 90 percent under a fully zero‑label configuration. The work was submitted to arXiv in November 2025 and addresses the persistent challenge of scarce labeled logs in emerging web services.
The Challenge of Labeled Log Data
Log‑based anomaly detection relies heavily on annotated datasets to train models that can differentiate normal from abnormal behavior. New or rapidly evolving web applications often lack sufficient labeled logs, which hampers the timely deployment of effective monitoring solutions.
Limitations of Existing Transfer Learning Methods
Current cross‑system techniques typically transfer general knowledge from a source system that possesses abundant labeled logs to a target system with limited annotations. While these methods capture shared anomaly patterns, they frequently overlook the mismatch between transferred knowledge and the unique, proprietary behaviors of the target environment, resulting in suboptimal detection accuracy.
FusionLog Architecture Overview
FusionLog introduces a training‑free router that assesses semantic similarity to dynamically split unlabeled target logs into two categories: “general logs” and “proprietary logs.” This partitioning enables the system to apply distinct processing pathways that respect both shared and system‑specific characteristics.
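The routing step can be illustrated with a minimal sketch. The abstract does not specify the encoder, the similarity measure, or the threshold, so everything here is an assumption: `embed` is a hypothetical bag-of-words stand-in for a real semantic encoder, and `route_logs`, `threshold`, and the sample messages are invented for illustration.

```python
import math
import re
from collections import Counter


def embed(message: str) -> Counter:
    """Hypothetical stand-in for a semantic encoder: lowercase token counts."""
    return Counter(re.findall(r"[a-z]+", message.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route_logs(target_logs, source_logs, threshold=0.5):
    """Split target logs by their best similarity to any source log.

    No parameters are learned, which is the sense in which this toy
    router is training-free. The threshold value is an assumption.
    """
    source_vecs = [embed(s) for s in source_logs]
    general, proprietary = [], []
    for log in target_logs:
        vec = embed(log)
        best = max((cosine(vec, s) for s in source_vecs), default=0.0)
        (general if best >= threshold else proprietary).append(log)
    return general, proprietary


# Invented sample messages, not taken from the paper's datasets.
source_logs = ["connection refused by remote host", "disk quota exceeded on volume"]
target_logs = [
    "connection refused by upstream host",
    "shard rebalancer entered quorum fallback",
]
general, proprietary = route_logs(target_logs, source_logs)
```

In this toy run the first target message closely resembles a source message and is routed as general, while the second shares no vocabulary with the source system and is routed as proprietary.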
General Knowledge Component
For logs classified as general, FusionLog employs a compact model built on system‑agnostic representation meta‑learning. The model is trained on source‑domain data and can be directly applied to target logs, leveraging common anomaly signatures without additional fine‑tuning.
Proprietary Knowledge Fusion
Logs identified as proprietary undergo an iterative pseudo‑labeling process. A large language model (LLM) generates tentative anomaly labels, which are then used to fine‑tune the small model through multi‑round collaborative knowledge distillation. This cycle refines the model’s ability to recognize anomalies that are unique to the target system.
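The pseudo-labeling loop can be sketched as follows, under heavy assumptions. The abstract does not describe the prompts, the student architecture, or how the rounds interact, so `llm_pseudo_label` is a hypothetical keyword stub standing in for an LLM call, `SmallModel` is a toy logistic regression rather than the authors' model, and the sample logs are invented.

```python
import math
import re
from collections import defaultdict


def llm_pseudo_label(log: str) -> int:
    """Hypothetical teacher stub; a real system would prompt an LLM here."""
    return int(any(w in log.lower() for w in ("fallback", "timeout", "corrupt")))


class SmallModel:
    """Toy student: logistic regression over token presence."""

    def __init__(self):
        self.weights = defaultdict(float)
        self.bias = 0.0

    def _tokens(self, log):
        return set(re.findall(r"[a-z]+", log.lower()))

    def predict_proba(self, log):
        z = self.bias + sum(self.weights[t] for t in self._tokens(log))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, log, label, lr=0.5):
        # One SGD step on the log loss for a single pseudo-labeled example.
        error = label - self.predict_proba(log)
        for t in self._tokens(log):
            self.weights[t] += lr * error
        self.bias += lr * error


def distill(student, proprietary_logs, rounds=3, epochs=20):
    """Run several rounds of teacher labeling followed by student updates.

    Returns the teacher-student agreement rate measured after each round.
    """
    agreement = []
    for _ in range(rounds):
        pseudo = [(log, llm_pseudo_label(log)) for log in proprietary_logs]
        for _ in range(epochs):
            for log, label in pseudo:
                student.update(log, label)
        matches = sum(
            int(student.predict_proba(log) >= 0.5) == label for log, label in pseudo
        )
        agreement.append(matches / len(pseudo))
    return agreement


# Invented proprietary logs for illustration only.
proprietary_logs = [
    "shard rebalancer entered quorum fallback",
    "shard rebalancer completed sync",
    "ledger snapshot checksum corrupt",
    "ledger snapshot written",
]
student = SmallModel()
agreement = distill(student, proprietary_logs)
```

The stub teacher returns the same labels in every round, so this sketch only shows the shape of the loop. A collaborative scheme would presumably let the teacher revise its labels between rounds, for example by re-querying the LLM on logs where the student disagrees, but that detail is not given in the abstract.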
Experimental Evaluation
The authors evaluated FusionLog on three public log datasets representing distinct web services. Each dataset was treated as a target system with no labeled logs, while the remaining datasets served as source domains. The experiments measured precision, recall, and F1‑score across multiple runs.
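The evaluation protocol described above can be sketched in a few lines. The dataset names, labels, and predictions below are placeholders, not the paper's data or results; the metric definitions are the standard ones for binary anomaly detection, with 1 marking an anomaly.

```python
def leave_one_out_splits(datasets):
    """Pair each dataset as the unlabeled target with the others as sources."""
    return [(target, [d for d in datasets if d != target]) for target in datasets]


def precision_recall_f1(y_true, y_pred):
    """Standard binary metrics, treating label 1 as the anomaly class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Placeholder dataset names; the article does not name the three datasets.
splits = leave_one_out_splits(["dataset_a", "dataset_b", "dataset_c"])

# Made-up labels and predictions, used only to exercise the metric code.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In a real evaluation the target labels would be used only for scoring and never for training, which is what keeps the setting zero-label.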
Performance Highlights
FusionLog consistently achieved F1‑scores above 90 percent, surpassing state‑of‑the‑art cross‑system anomaly detection baselines by a notable margin. The results indicate that the dual‑knowledge fusion strategy effectively mitigates the drawbacks of pure transfer learning.
Broader Impact
By eliminating the need for target‑specific labeled logs, FusionLog could accelerate the deployment of log‑based anomaly monitoring in newly launched or rapidly scaling web applications, potentially reducing downtime and improving user trust.
Next Steps
Future research may explore extending the router’s semantic similarity metrics, integrating additional large‑model architectures, and testing the framework in real‑world production environments to validate scalability and robustness.
This report is based on the abstract of a research paper published on arXiv under Academic Preprint / Open Access terms. The full text is available via arXiv.