H Company · Launch · 2026-04-05

H Company's Holo3 Tops OSWorld at 78.85% — Beating GPT-5.4 at 1/10th the Cost

Paris-based H Company released Holo3, a GUI-specialist VLM that scores 78.85% on OSWorld-Verified — the gold standard for computer-use AI. It outperforms GPT-5.4 Thinking and Claude Opus 4.6 while being significantly cheaper to run, with Apache 2.0 weights available for self-hosting.

## H Company's Holo3 Tops OSWorld-Verified at 78.85%: Open Source and a Claimed 10x Cheaper Than GPT-5.4

Paris-based AI startup H Company today released Holo3, a vision-language model designed specifically for GUI agents: AI systems that can autonomously navigate websites, desktop apps, and mobile interfaces. Its flagship 35B-A3B mixture-of-experts variant scores **78.85% on OSWorld-Verified**, a benchmark built to be a rigorous, contamination-resistant test of computer-use AI agents.

That number puts Holo3 above both **GPT-5.4 Thinking** (which H Company claims runs at roughly 10x the inference cost) and **Claude Opus 4.6** on the same benchmark. OSWorld-Verified is deliberately constructed to prevent benchmark contamination — it uses tasks that can't easily be gamed by memorizing web pages or common UI patterns.

### What makes Holo3 different

Unlike general-purpose VLMs adapted for GUI tasks, Holo3 was built from the ground up for screen understanding. Its sparse MoE architecture separates visual perception (what's on screen) from action planning (what to click next), allowing it to handle complex multi-step workflows with fewer errors than larger general models.
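The article does not describe Holo3's routing internals, so the following is only a generic sketch of what "sparse MoE" usually means: a router scores every expert, but only the top few are actually run for a given input. The toy experts, the router scores, and the choice of `k=2` are all hypothetical, and the active-parameter fraction assumes the name "35B-A3B" follows the common convention of roughly 35B total parameters with about 3B active per token.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are softmax probabilities renormalized over the chosen experts.
    """
    probs = softmax(router_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_scores, k):
    """Combine outputs of only the selected experts; the others are never run."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_scores, k))

# Hypothetical toy setup: 8 experts, each just scales its input.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
router_scores = [0.1, 2.0, -1.0, 0.5, 3.0, 0.0, -0.5, 1.0]

selected = route_top_k(router_scores, k=2)
output = moe_forward(1.0, experts, router_scores, k=2)

# Assumed reading of "35B-A3B": ~3B of ~35B parameters active per token.
active_fraction = 3 / 35

print("selected experts:", [i for i, _ in selected])
print(f"active fraction: {active_fraction:.1%}")
```

Under that naming assumption, fewer than one in ten parameters participates in any single forward pass, which is the usual reason sparse models are cheaper to serve than dense models of similar total size.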

H Company is releasing the **35B-A3B weights under Apache 2.0** — fully open, including commercial use. A free API tier is available at hub.hcompany.ai for developers who don't want to run their own infrastructure.

### Why this matters

The GUI agent space has been dominated by OpenAI's and Anthropic's computer-use offerings, which are expensive and proprietary. Holo3's open-source release at SOTA performance changes the calculus for enterprises building browser automation, RPA replacement pipelines, and AI-powered QA workflows. If the numbers hold up in production, H Company will have disrupted a nascent but rapidly growing market before it fully formed.

Founded by former DeepMind researchers, H Company has been building quietly since 2023. Holo3 is its public debut, and it's a strong one.

Panel Takes

The Builder

Developer Perspective

SOTA computer-use benchmark + Apache 2.0 + free API tier is a devastating combination for the incumbents. This lands at exactly the moment enterprise teams are evaluating which GUI agent model to standardize on. H Company just jumped to the top of every RFP.

The Skeptic

Reality Check

OSWorld is a controlled benchmark — real-world GUI agents deal with dynamic content, session timeouts, CAPTCHAs, and edge cases the benchmark doesn't capture. A small startup also can't iterate on this as fast as OpenAI or Anthropic when production bugs emerge. The benchmark lead may not survive contact with real enterprise workflows.

The Futurist

Big Picture

Vertical AI specialists beating general-purpose frontier labs at specific tasks is the defining trend of 2026. GUI agents are a $50B+ market opportunity — they automate every piece of software that doesn't have an API. Holo3 opening that market to open-source deployment is a significant moment.