Trinity-Large-Thinking

399B open MoE reasoning model that's 96% cheaper than Claude Opus

Trinity-Large-Thinking is a 399-billion-parameter open mixture-of-experts (MoE) reasoning model from Arcee AI, released under Apache 2.0. It is designed for long-horizon, multi-turn tool use and autonomous agentic tasks, and it produces an explicit reasoning chain before responding. The model ranked #2 on PinchBench (behind only Claude Opus 4.6) while costing $0.90 per million output tokens via the Arcee API, roughly 96% less than Opus. The full weights can be downloaded from Hugging Face, which makes it one of the most capable open-weight models available. As an MoE model, it activates only a fraction of its 399B parameters on each forward pass, so compute cost per token does not grow in proportion to the total parameter count. For teams building production agents that need strong reasoning but cannot afford closed-model pricing at scale, Trinity-Large-Thinking is the most compelling open alternative to appear in a long time.
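To make the pricing claim concrete, here is a minimal sketch of the arithmetic behind "roughly 96% cheaper". It uses only the two figures quoted above ($0.90 per million output tokens and a 96% saving) and back-derives the comparison price those figures imply. That implied price is not an official Opus rate, and the monthly token volume is a hypothetical workload chosen for illustration.

```python
def implied_reference_price(price_per_m: float, savings_fraction: float) -> float:
    """Return the per-million-token price a comparison model would need
    for `price_per_m` to be `savings_fraction` cheaper than it."""
    if not 0.0 <= savings_fraction < 1.0:
        raise ValueError("savings_fraction must be in [0, 1)")
    return price_per_m / (1.0 - savings_fraction)


def output_cost(tokens: int, price_per_m: float) -> float:
    """Cost in USD of generating `tokens` output tokens at a per-million price."""
    return tokens / 1_000_000 * price_per_m


# Figures quoted in the summary above.
TRINITY_OUTPUT_PRICE = 0.90  # USD per million output tokens
CLAIMED_SAVINGS = 0.96       # "roughly 96% cheaper"

# Assumption: derived from the two numbers above, not a published Opus price.
reference_price = implied_reference_price(TRINITY_OUTPUT_PRICE, CLAIMED_SAVINGS)

# Assumption: a hypothetical agent workload of 500M output tokens per month.
monthly_tokens = 500_000_000
trinity_cost = output_cost(monthly_tokens, TRINITY_OUTPUT_PRICE)
reference_cost = output_cost(monthly_tokens, reference_price)

print(f"Implied comparison price: ${reference_price:.2f}/M output tokens")
print(f"Trinity monthly cost:     ${trinity_cost:,.2f}")
print(f"Comparison monthly cost:  ${reference_cost:,.2f}")
```

Under these assumptions the implied comparison price works out to about $22.50 per million output tokens, so the 500M-token workload costs about $450 on Trinity versus about $11,250 at the comparison rate. Input-token pricing and self-hosting costs are not covered by the figures above and are left out of the sketch.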

Panel Reviews

The Builder

Developer Perspective

Ship

“Near-Opus-level reasoning at $0.90/M tokens is the pricing inflection I've been waiting for. Apache 2.0 weights mean I can self-host for compliance-sensitive use cases. Already benchmarking it as a drop-in for my agent evaluation pipeline.”

The Skeptic

Reality Check

Skip

“Preview weights and PinchBench rankings tell part of the story — real-world agentic performance on messy production tasks is another matter. Arcee AI isn't Anthropic or Google; sustaining a 399B model with quality ongoing RLHF is expensive and the preview label is a yellow flag.”

The Futurist

Big Picture

Ship

“A US-built, Apache-licensed frontier reasoning model competitive with closed offerings fundamentally changes the open-source AI landscape. The talent and capital required to do this were thought to exist only at the biggest labs. Arcee just proved otherwise.”

The Creator

Content & Design

Ship

“The thinking chain output is remarkably coherent for creative briefs and long-form narrative planning. At this price point I can run draft-then-refine pipelines at scale without budget anxiety. A genuine Ship for creative workflows.”

Community Sentiment

Overall: 890 mentions

71% positive, 20% neutral, 9% negative

Hacker News: 190 mentions

“Apache 2.0 licensing and 96% cost reduction vs. Opus”

Reddit: 280 mentions

“Benchmark rankings and real-world agent performance”

Twitter/X: 420 mentions

“PinchBench #2 ranking and open weights”