NeoChainDaily
29.01.2026 • 05:25 Research & Innovation

OmegaUse GUI Agent Achieves State-of-the-Art Scores on Mobile and Desktop Benchmarks

Researchers have unveiled OmegaUse, a general‑purpose graphical user interface (GUI) agent designed for autonomous task execution across both mobile and desktop platforms. In offline evaluations, the model attained a 96.3% score on the ScreenSpot‑V2 benchmark and a 79.1% step‑success rate on AndroidControl, indicating strong cross‑environment capabilities.

Data Construction Pipeline

To support the model, the team assembled a data pipeline that combines rigorously curated open‑source datasets with an automated synthesis framework. The framework merges bottom‑up autonomous exploration of interfaces with top‑down taxonomy‑guided generation, producing high‑fidelity synthetic interaction data.

Two‑Stage Training Strategy

The training regimen follows a decoupled two‑stage approach. First, Supervised Fine‑Tuning (SFT) establishes fundamental interaction syntax. Subsequently, Group Relative Policy Optimization (GRPO) refines spatial grounding and sequential planning, enhancing the agent’s decision‑making in complex GUI contexts.
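To make the GRPO stage more concrete, the following is a minimal sketch of the group-relative advantage computation that gives the method its name: several responses are sampled for the same task, and each reward is normalised against the mean and standard deviation of its own group. The function name, the reward values, and the use of the population standard deviation are assumptions for illustration only; the article does not describe OmegaUse's reward design or training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalise each reward against its own group's mean and std deviation.

    `rewards` holds one scalar reward per response sampled for the same task.
    `eps` avoids division by zero when every response gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled attempts at one GUI task
# (1.0 = attempt judged correct, 0.0 = judged incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

In this sketch, attempts that beat their group's average receive a positive advantage and the rest a negative one, so no separate value network is needed to supply a baseline.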

Model Architecture

OmegaUse is built on a Mixture‑of‑Experts (MoE) backbone, a design choice intended to balance computational efficiency with the capacity for sophisticated agentic reasoning.
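The efficiency argument behind an MoE design can be illustrated with a toy top-k router: only a few experts run per input, and their outputs are blended with renormalised gate weights. This is a generic sketch, not OmegaUse's architecture. The expert functions, the gate scores, and the choice of k = 2 are all hypothetical, and in a real model the gate scores would be produced by a learned router rather than passed in by hand.

```python
import math


def top_k_route(gate_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, shifted by the max for stability.
    m = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, gate_logits, experts, k=2):
    """Evaluate only the selected experts and mix their outputs."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))


# Four toy "experts" operating on a scalar input (hypothetical).
experts = [lambda x: x, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
gate_logits = [0.1, 2.0, -1.0, 2.0]  # hypothetical router scores

output = moe_forward(3.0, gate_logits, experts, k=2)
```

Here only experts 1 and 3 are evaluated, so the compute per input scales with k rather than with the total number of experts.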

Cross‑Platform Benchmark Suite

The authors introduced OS‑Nav, a benchmark suite that spans multiple operating systems. OS‑Nav includes ChiM‑Nav, targeting Chinese Android mobile environments, and Ubu‑Nav, focusing on routine desktop interactions on Ubuntu. OmegaUse achieved a 74.24% step‑success rate on ChiM‑Nav and a 55.9% average success rate on Ubu‑Nav.
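The article does not define how step-success rate is scored. One common reading, assumed here, is the fraction of steps in which the predicted action exactly matches the reference action. The sketch below uses hypothetical action tuples and a strict equality check; actual benchmarks may apply looser matching, for example accepting any click that lands inside a target region.

```python
def step_success_rate(predicted, reference):
    """Fraction of steps where the predicted action equals the reference.

    Both arguments are equal-length sequences of hashable action records.
    Exact-match scoring is an assumption of this sketch.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


# Hypothetical four-step episode: (action_type, target) pairs.
reference = [("click", "settings"), ("scroll", "down"),
             ("click", "wifi"), ("type", "home-network")]
predicted = [("click", "settings"), ("scroll", "down"),
             ("click", "bluetooth"), ("type", "home-network")]

rate = step_success_rate(predicted, reference)
```

Under this reading, a step-level metric credits partially correct episodes, whereas a task-level success rate would count the episode above as a failure.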

Performance Relative to Existing Standards

On established GUI benchmarks, OmegaUse set a new state‑of‑the‑art result of 96.3% on ScreenSpot‑V2 and led with a 79.1% step‑success rate on AndroidControl, surpassing previously reported figures for comparable agents.

Implications and Future Directions

The reported results suggest that advanced GUI agents like OmegaUse could streamline human‑computer interaction and boost productivity by autonomously handling routine tasks on diverse devices. The authors note ongoing work to improve real‑time adaptability and to expand evaluation to additional operating systems.

This report is based on the abstract of a research paper published on arXiv as an open-access academic preprint. The full text is available via arXiv.
