Show HN: TurboQuant-WASM – Google's vector quantization in the browser

SIMD vector compression — 3 bits/dim with fast dot product

Developer Tool / WebAssembly / Vector Search

npm package with embedded WASM
Relaxed SIMD support (@mulAdd/FMA maps to f32x4.relaxed_madd)
SIMD-vectorized QJL sign packing/unpacking and scaling
TypeScript API (init/encode/decode/dot/dotBatch)
Golden-value tests with byte-identical output
Fast dot product without decoding
Batch search (dotBatch) - 83x faster than looping dot()
No training required - encode any vector immediately

Our Take

This tackles embedding bloat in a way that actually matters: a 1.5GB index is basically unusable on mobile. TeamChong squeezed float32 vectors 6x using WASM and Google's quantization trick, and the real flex is skipping the training step entirely. Codebook-based approaches like PQ and OPQ have to be trained on your data first, but this just wants a dim and a seed, then you're ready to encode. Direct dot product on compressed data without decoding is the move here, and relaxed SIMD support in modern browsers finally makes it viable. 211 stars in a few weeks isn't a bad signal for something this niche.

WASM-based vector quantization library that compresses float32 embeddings 6x (1.5GB → 240MB) and enables direct search on compressed data without decompression
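
To put the 1.5GB → 240MB figure in per-dimension terms, here is a small arithmetic sketch. It only uses the two sizes quoted above and the fact that a float32 is 32 bits; the helper name `compressionStats` is made up for illustration and is not part of the package. Taken at face value, the quoted sizes work out to roughly 5 bits per dimension; the listing does not say what accounts for the difference from the 3 bits/dim headline figure.

```typescript
// Decimal units, matching how the listing quotes sizes.
const GB = 1e9;
const MB = 1e6;

/**
 * Hypothetical helper: given the raw float32 index size and the compressed
 * size, return the compression ratio and the effective bits per dimension.
 */
function compressionStats(rawBytes: number, compressedBytes: number) {
  const ratio = rawBytes / compressedBytes;
  const bitsPerDim = 32 / ratio; // float32 stores 32 bits per dimension
  return { ratio, bitsPerDim };
}

const stats = compressionStats(1.5 * GB, 240 * MB);
console.log(stats.ratio.toFixed(2));      // 6.25
console.log(stats.bitsPerDim.toFixed(2)); // 5.12
```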

Problem It Solves

Float32 embedding indexes are too large for mobile RAM, take minutes to download, and gzip only saves ~7% due to high entropy

Target Customer

Developers building browser/edge applications needing vector compression without training overhead

Use Cases

Vector search, Image similarity, 3D Gaussian Splatting compression, LLM KV cache compression, Real-time indexing, Browser and edge deployment

Pricing Details

Open source (MIT license)

Differentiator

No training step required - unlike PQ/OPQ, just init(dim, seed) and encode any vector immediately. Each vector is self-contained for streaming data.
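
For a feel of what "no training" and "dot product without decoding" can look like, here is a minimal sketch of a 1-bit sign sketch in the style of QJL, the technique named in the feature list. This is not the package's algorithm or its API: `makeSketcher`, the `m` parameter, and the PRNG are all assumptions made for illustration, and a 1-bit-only sketch is much noisier than a multi-bit quantizer. The point is that everything is derived from a dimension and a seed, so any vector can be encoded immediately, and the inner product is estimated straight from the packed bits.

```typescript
// Small seeded PRNG (mulberry32): same seed, same projection matrix.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface SignSketch {
  bits: Uint8Array; // one sign bit per projection row, packed 8 per byte
  norm: number;     // L2 norm of the original vector
}

// Hypothetical factory: m random Gaussian projections of a dim-sized vector.
function makeSketcher(dim: number, m: number, seed: number) {
  const rand = mulberry32(seed);
  const proj = new Float32Array(m * dim);
  for (let i = 0; i < proj.length; i++) {
    // Box-Muller transform; 1 - rand() keeps the log argument above zero.
    const u1 = 1 - rand();
    const u2 = rand();
    proj[i] = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  }

  // Dot product of one projection row with v.
  const project = (v: Float32Array, row: number): number => {
    let s = 0;
    const off = row * dim;
    for (let j = 0; j < dim; j++) s += proj[off + j] * v[j];
    return s;
  };

  function encode(v: Float32Array): SignSketch {
    const bits = new Uint8Array(Math.ceil(m / 8));
    let sq = 0;
    for (let j = 0; j < dim; j++) sq += v[j] * v[j];
    for (let r = 0; r < m; r++) {
      if (project(v, r) >= 0) bits[r >> 3] |= 1 << (r & 7);
    }
    return { bits, norm: Math.sqrt(sq) };
  }

  // Estimate <query, x> from x's sign bits, without reconstructing x.
  function dot(query: Float32Array, code: SignSketch): number {
    let acc = 0;
    for (let r = 0; r < m; r++) {
      const bit = (code.bits[r >> 3] >> (r & 7)) & 1;
      acc += (bit === 1 ? 1 : -1) * project(query, r);
    }
    return (Math.sqrt(Math.PI / 2) * code.norm * acc) / m;
  }

  return { encode, dot };
}

// Scale a vector to unit length so the dot product is a cosine similarity.
function unit(v: Float32Array): Float32Array {
  let sq = 0;
  for (let j = 0; j < v.length; j++) sq += v[j] * v[j];
  const n = Math.sqrt(sq);
  return v.map((x) => x / n);
}

const dim = 1024;
const a = unit(Float32Array.from({ length: dim }, (_, j) => Math.sin(0.37 * j)));
const b = unit(Float32Array.from({ length: dim }, (_, j) => Math.sin(0.37 * j) + 0.5 * Math.cos(1.3 * j)));

const sketcher = makeSketcher(dim, 1024, 42);
const code = sketcher.encode(a); // 128 bytes of sign bits plus one norm
const estimate = sketcher.dot(b, code);

let exact = 0;
for (let j = 0; j < dim; j++) exact += a[j] * b[j];
console.log({ exact, estimate, bytes: code.bits.length });
```

Because each code carries its own norm and depends only on the shared seed, vectors can be encoded one at a time as they arrive, which is the streaming property the listing describes.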

Why Now

Relaxed SIMD support is now available in browsers and Node (Chrome 114+, Firefox 128+, Safari 18+, Node 20+)

Traction

Notable Metrics: 211 stars, 7 forks, 96 commits

Key Facts

The people behind Show HN: TurboQuant-WASM – Google's vector quantization in the browser

botirk38

GitHub

Links

Website, GitHub

Source: Hacker News