TurboQuant-WASM
Google's TurboQuant vector compression running at 3 bits/dim in your browser
TurboQuant-WASM brings Google's ICLR 2026 vector quantization algorithm to the browser via WebAssembly and relaxed SIMD instructions. It compresses vectors to 3 bits per dimension and supports fast dot products, so you can run vector search, image similarity, and 3D Gaussian Splatting compression entirely client-side, with no server round trips and no API keys.
Created by Steven (teamchong) and published on April 4, 2026, the project ships as an npm package with a TypeScript API exposing init, encode, decode, and dot functions. Under the hood it is mostly Zig compiled to WASM, with operations vectorized using the relaxed SIMD spec. It requires Chrome 114+, Firefox 128+, Safari 18+, or Node 20+.
This is a genuinely small but sharp tool: it takes an algorithm from a research paper aimed at data-center GPUs and puts it in a browser tab. The implications for privacy-first semantic search and on-device embedding workflows are real, because no data leaves the user's machine.
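To put the 3 bits per dimension figure in concrete terms, here is a rough back-of-the-envelope sketch in TypeScript. It does not call the package. The helper name packedBytes, the embedding width of 384, and the corpus size of 10,000 are illustrative assumptions, and a real encoded format may add per-vector metadata that this ignores.

```typescript
// Bytes needed to store `count` vectors of `dim` dimensions at `bitsPerDim`
// bits each. Assumption: no per-vector headers or norms are counted.
function packedBytes(count: number, dim: number, bitsPerDim: number): number {
  // Round each vector up to a whole number of bytes.
  const bytesPerVector = Math.ceil((dim * bitsPerDim) / 8);
  return count * bytesPerVector;
}

const dim = 384;      // assumed embedding width (hypothetical)
const count = 10_000; // assumed number of vectors (hypothetical)

const float32Bytes = packedBytes(count, dim, 32);
const threeBitBytes = packedBytes(count, dim, 3);
const ratio = float32Bytes / threeBitBytes;

console.log({ float32Bytes, threeBitBytes, ratio: ratio.toFixed(2) });
```

Under these assumptions the payload shrinks by roughly a factor of 32/3, which is what makes shipping an index alongside a static bundle plausible.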
Panel Reviews
“I've been looking for a way to do semantic search in a static site without hitting a backend. TurboQuant-WASM solves it. npm install, encode your embeddings once, and ship a compressed index alongside the JS. The dot product speed is surprisingly good for a WASM build.”
“Cool proof of concept, but the relaxed SIMD requirement still excludes a non-trivial slice of browsers. The 85 GitHub stars suggest it's very early. Also, 3 bits/dim is great for compression, but the quality loss versus float32 on niche embedding tasks hasn't been widely benchmarked. Use with caution in production.”
“On-device vector search is the primitive that enables a whole class of privacy-first AI apps. TurboQuant-WASM is a stepping stone toward fully local RAG in the browser. When WebGPU matures further, this pattern becomes a cornerstone of the client-side AI stack.”
“I used this to add 'find similar images' to my portfolio site with zero backend. Compressed 2,000 CLIP embeddings to under 100KB, built a dot product index in JS, and it works offline. For the size of the package, the capability is wild.”
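The "find similar" workflows the reviewers describe amount to a brute-force top-k search by dot product. The sketch below shows that pattern in plain TypeScript over uncompressed Float32Array vectors. The names topK, dotProduct, and Hit are hypothetical and not part of the package; the scoring function is passed in as a parameter so that, presumably, a scorer working on compressed vectors could be substituted.

```typescript
interface Hit {
  index: number; // position of the item in the input array
  score: number; // similarity score (higher is more similar)
}

// Plain float dot product, used here as a stand-in scorer.
function dotProduct(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Score every item against the query and keep the k best.
// T is whatever representation the items are stored in.
function topK<T>(
  query: Float32Array,
  items: T[],
  k: number,
  score: (query: Float32Array, item: T) => number,
): Hit[] {
  const hits: Hit[] = items.map((item, index) => ({
    index,
    score: score(query, item),
  }));
  hits.sort((a, b) => b.score - a.score); // descending by score
  return hits.slice(0, k);
}

// Tiny illustrative dataset (made-up values).
const items = [
  new Float32Array([1, 0, 0]),
  new Float32Array([0, 1, 0]),
  new Float32Array([0.5, 0.5, 0]),
];
const query = new Float32Array([1, 0.25, 0]);
const results = topK(query, items, 2, dotProduct);

console.log(results);
```

A linear scan like this is usually adequate for a few thousand vectors; larger collections would likely need an approximate index, which is outside what this sketch covers.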
Community Sentiment
“TurboQuant in the browser is the kind of thing that shouldn't be possible yet”
“3 bits/dim client-side opens up a lot for offline-capable apps”
“npm install turboquant-wasm and you have Google's ICLR paper running in your tab”