GLM-4.7

China's open-source coding model beats Claude on SWE-bench at $3/month — or run it free locally

GLM-4.7 is an open-source coding large language model from Zhipu AI (Z.ai) that scores 73.8% on SWE-bench Verified — the highest among open-source models — and 84.9% on LiveCodeBench, ahead of Claude Sonnet 4.5. The model introduces 'Preserved Thinking,' maintaining reasoning chains across multiple turns rather than resetting context between requests — directly targeting the biggest frustration in multi-turn agentic coding workflows. Weights are available on Hugging Face and ModelScope, compatible with vLLM and SGLang for local deployment. API access starts at $3/month via chat.z.ai. A companion GLM-4.7-Flash (30B-A3B MoE) is available for efficient local deployment.
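For local deployment, vLLM exposes an OpenAI-compatible HTTP endpoint once the weights are served. A minimal sketch of building a chat request against such a server, assuming it runs on localhost port 8000; the repo id `zai-org/GLM-4.7` is an illustrative placeholder, so check the Hugging Face model card for the exact identifier:

```python
import json

# Assumptions (not from the source): local vLLM server on port 8000,
# hypothetical Hugging Face repo id for the weights.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "zai-org/GLM-4.7"  # placeholder id

def build_chat_request(messages, temperature=0.2):
    """Build the JSON payload vLLM's OpenAI-compatible endpoint expects."""
    return {
        "model": MODEL,
        "messages": messages,
        "temperature": temperature,
    }

payload = build_chat_request(
    [{"role": "user", "content": "Refactor this function to remove the global state."}]
)
body = json.dumps(payload)
# The request itself would be POSTed to BASE_URL with a
# Content-Type: application/json header (urllib.request, requests, or curl).
```

Because the endpoint follows the OpenAI chat-completions schema, the same payload works unchanged with most existing agent frameworks by pointing their base URL at the local server.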

Panel Reviews

Ship

SWE-bench Verified at 73.8% from an open-weight model you can run on your own hardware is a genuine milestone. The Preserved Thinking feature addresses a real pain point — agents that forget their reasoning chain mid-task are less useful. Worth benchmarking on your actual codebase before committing.

Skip

Benchmark scores and production performance are different things. SWE-bench is a curated dataset; real codebases are messier. GLM-4.7 was released in December 2025 — 'April 2026 trending' is a repackaging of older news. And any model from a Chinese lab raises due-diligence questions for enterprise deployments.

Skip

The price-performance curve is collapsing. $3/month for a model that matches frontier closed models on coding benchmarks signals where the market is heading: open weights with commercial performance, freely auditable, locally deployable. GLM-4.7 is today's proof; next year's open models will be better.

Ship

For indie builders who can't afford $200/month for Copilot or frontier API costs, GLM-4.7 via vLLM on a rented GPU is a credible alternative. The multi-turn reasoning retention makes it particularly good for long coding sessions where context matters.

Community Sentiment

Overall: 3,420 mentions
70% positive, 19% neutral, 11% negative
Hacker News: 520 mentions

73.8% SWE-bench from an open-weight model is legitimately impressive

Reddit: 1,100 mentions

'Preserved Thinking' across turns is the feature I wish every model had

Twitter/X: 1,800 mentions

This runs free locally with vLLM and beats paid tools on real coding benchmarks