TurboQuant-WASM

Google's TurboQuant vector compression running at 3 bits/dim in your browser

TurboQuant-WASM brings Google's ICLR 2026 vector quantization algorithm to the browser via WebAssembly and relaxed SIMD instructions. It achieves 3 bits per dimension with a fast dot product, so vector search, image similarity, and 3D Gaussian Splatting compression can run entirely client-side, with no server round trips and no API keys.

Created by Steven (teamchong) and published on April 4, 2026, the project ships as an npm package with a TypeScript API that exposes init, encode, decode, and dot functions. Under the hood it is mostly Zig compiled to WASM, with operations vectorized using the relaxed SIMD spec. It requires Chrome 114+, Firefox 128+, Safari 18+, or Node 20+.

This is a small but sharp tool: it takes an algorithm from a research paper that runs on data-center GPUs and puts it in a browser tab. The implications for privacy-first semantic search and on-device embedding workflows are real, because no data leaves the user's machine.
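To put 3 bits per dimension in perspective, here is a rough sizing sketch. The corpus size and embedding width are hypothetical values chosen for illustration, and the calculation ignores any per-vector metadata or headers the package's actual format may add.

```typescript
// Back-of-the-envelope index size for a fixed bits-per-dimension encoding.
// Assumption: no per-vector overhead (norms, scales, headers) is counted.
function indexBytes(numVectors: number, dims: number, bitsPerDim: number): number {
  return Math.ceil((numVectors * dims * bitsPerDim) / 8);
}

const numVectors = 10_000; // hypothetical corpus size
const dims = 384;          // hypothetical embedding width

const float32Bytes = indexBytes(numVectors, dims, 32); // uncompressed float32
const threeBitBytes = indexBytes(numVectors, dims, 3); // 3 bits per dimension
const ratio = float32Bytes / threeBitBytes;            // 32 / 3, roughly 10.7x

console.log({ float32Bytes, threeBitBytes, ratio });
```

Under these assumptions the float32 index is about 15.4 MB and the 3-bit index about 1.4 MB, which is the difference between a payload that is awkward to ship with a static site and one that is not.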

Panel Reviews

Ship

I've been looking for a way to do semantic search in a static site without hitting a backend. TurboQuant-WASM solves it. npm install, encode your embeddings once, and ship a compressed index alongside the JS. The dot product speed is surprisingly good for a WASM build.

Skip

Cool proof of concept, but the relaxed SIMD requirement still excludes a non-trivial slice of browsers. The 85 GitHub stars suggest it's very early. Also, 3 bits/dim is great for compression, but the quality degradation versus float32 for niche embedding tasks hasn't been benchmarked widely. Use with caution in production.

Skip

On-device vector search is the primitive that enables a whole class of privacy-first AI apps. TurboQuant-WASM is a stepping stone toward fully local RAG in the browser. When WebGPU matures further, this pattern becomes a cornerstone of the client-side AI stack.

Ship

I used this to add 'find similar images' to my portfolio site with zero backend. Compressed 2,000 CLIP embeddings to under 100KB, built a dot product index in JS, and it works offline. For the size of the package, the capability is wild.
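The "dot product index" pattern the reviewers describe can be as simple as a brute-force scan. The sketch below is an illustration only: `dotF32`, `topK`, and the `ScoreFn` type are hypothetical names, and a plain float32 dot product stands in for the package's `dot` function, whose real signature is not shown in this review.

```typescript
// Hypothetical scoring callback: returns the similarity of the query to item i.
// In a real setup this is where a compressed-vector dot product would be called.
type ScoreFn = (query: Float32Array, index: number) => number;

interface Hit {
  index: number;
  score: number;
}

// Plain float32 dot product, used here only as a stand-in scorer.
function dotF32(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Score every item, then return the k highest-scoring hits, best first.
function topK(query: Float32Array, count: number, score: ScoreFn, k: number): Hit[] {
  const hits: Hit[] = [];
  for (let i = 0; i < count; i++) hits.push({ index: i, score: score(query, i) });
  hits.sort((x, y) => y.score - x.score);
  return hits.slice(0, k);
}

// Tiny made-up corpus of three 3-dimensional vectors.
const corpus = [
  new Float32Array([1, 0, 0]),
  new Float32Array([0, 1, 0]),
  new Float32Array([0.6, 0.8, 0]),
];
const query = new Float32Array([1, 0, 0]);
const results = topK(query, corpus.length, (q, i) => dotF32(q, corpus[i]), 2);

console.log(results.map((h) => h.index));
```

A linear scan like this is usually adequate at the scale mentioned above (a few thousand vectors); larger collections would likely need an approximate index on top.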

Community Sentiment

HN mentions

TurboQuant in the browser is the kind of thing that shouldn't be possible yet

Reddit mentions

3 bits/dim client-side opens up a lot for offline-capable apps

Twitter/X mentions

npm install turboquant-wasm and you have Google's ICLR paper running in your tab