  1. MariaDB Server
  2. MDEV-39626

Pluggable quantization framework for VECTOR INDEX with TurboQuant as first method


Details

    Description

      Background
      MariaDB Vector stores MHNSW index entries at int16 precision. There is currently no way to reduce the per-vector memory footprint below that through stronger quantization. For large vector workloads (millions of embeddings at 768–1536 dimensions), memory consumption is a significant cost and scalability constraint.

      TurboQuant (Google Research, 2025) is a data-oblivious vector quantization algorithm: it applies a randomized Hadamard rotation followed by scalar quantization, then corrects inner-product bias with a 1-bit QJL transform. It requires no training data, no codebook learning, and near-zero preprocessing time, which makes it well suited for online index builds. At 4-bit precision it achieves roughly 8× compression versus float32 (about 4× versus the current int16 storage, before per-vector metadata), with recall degradation typically within 1–3% of uncompressed search.
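
      To make the rotate-then-quantize step concrete, here is a minimal Python sketch (Python is used only for brevity; this is not the proposed server code). It assumes a power-of-two dimension, uses a plain uniform scalar quantizer as a stand-in for the quantizer described in the TurboQuant paper, and omits the QJL bias-correction stage entirely. All function names are hypothetical.

```python
import math
import random


def fwht(v):
    """Fast Walsh-Hadamard transform with orthonormal scaling.

    Returns a new list; len(v) must be a power of two.
    """
    a = list(v)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    s = 1.0 / math.sqrt(n)
    return [x * s for x in a]


def random_signs(n, seed):
    """Per-index random +/-1 diagonal, reproducible from a stored seed."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def rotate(v, signs):
    """Randomized Hadamard rotation: flip signs, then apply the transform."""
    return fwht([x * s for x, s in zip(v, signs)])


def quantize(v, signs, bits):
    """Rotate, then map each coordinate to an integer code of `bits` bits.

    Assumption: uniform levels over [-scale, +scale], not the paper's quantizer.
    """
    r = rotate(v, signs)
    scale = max(abs(x) for x in r) or 1.0
    levels = (1 << bits) - 1
    codes = [round((x / scale + 1.0) / 2.0 * levels) for x in r]
    return codes, scale


def dequantize(codes, scale, bits):
    """Approximate the rotated vector from its codes."""
    levels = (1 << bits) - 1
    return [(c / levels * 2.0 - 1.0) * scale for c in codes]
```

      Because the rotation is orthogonal, norms and inner products are unchanged before quantization, which is why no training data is needed to pick the transform. Dimensions such as 768 or 1536 are not powers of two and would need zero-padding or a blocked transform, which this sketch does not cover.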

      Proposed Changes
      1. Pluggable quantization API in MHNSW
      Add an internal API that allows quantization methods to be registered and selected per index. The API handles vector encoding and decoding, quantized distance computation, and metadata exposure via information_schema and SHOW INDEX. It is designed to be method-agnostic, so future quantization algorithms can be added without changes to the core MHNSW graph traversal.
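
      One possible shape for such an API is sketched below in Python, purely as an illustration of the register/select/encode/distance split. Every name here (Quantizer, register_quantizer, create_quantizer, Int16Quantizer) is hypothetical and not taken from the MariaDB source; the int16 method is a simplified placeholder for the existing storage format, not a description of how MHNSW encodes vectors today.

```python
from abc import ABC, abstractmethod


class Quantizer(ABC):
    """Hypothetical interface a quantization method would implement."""

    name = ""

    @abstractmethod
    def encode(self, vec):
        """Turn a list of floats into the stored representation."""

    @abstractmethod
    def decode(self, code):
        """Approximate the original floats from a stored code."""

    @abstractmethod
    def distance(self, code_a, code_b):
        """Distance computed directly on stored codes."""

    @abstractmethod
    def describe(self):
        """Metadata a server could surface in index listings."""


_REGISTRY = {}


def register_quantizer(cls):
    """Class decorator: make a method selectable by name."""
    if cls.name in _REGISTRY:
        raise ValueError(f"duplicate quantization method: {cls.name}")
    _REGISTRY[cls.name] = cls
    return cls


def create_quantizer(name, **options):
    """Instantiate the method chosen for one index."""
    try:
        cls = _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown quantization method: {name}") from None
    return cls(**options)


@register_quantizer
class Int16Quantizer(Quantizer):
    """Placeholder baseline: symmetric int16 codes with one scale factor."""

    name = "int16"

    def __init__(self, scale=1.0):
        self.scale = scale

    def encode(self, vec):
        codes = []
        for x in vec:
            q = round(x / self.scale * 32767)
            codes.append(max(-32767, min(32767, q)))  # clamp out-of-range values
        return codes

    def decode(self, code):
        return [c * self.scale / 32767 for c in code]

    def distance(self, code_a, code_b):
        # Squared Euclidean distance, evaluated on integer codes then rescaled.
        raw = sum((a - b) ** 2 for a, b in zip(code_a, code_b))
        return raw * (self.scale / 32767) ** 2

    def describe(self):
        return {"method": self.name, "bits": 16}
```

      In this arrangement the graph traversal would only ever call distance() on opaque codes, so adding a new method means registering one more class rather than touching the search loop.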

      2. TurboQuant implementation
      Implement TurboQuant at configurable bit-widths (1-bit, 2-bit, 4-bit) for cosine, Euclidean, and dot-product distance metrics. The implementation includes Hadamard rotation, scalar quantization, QJL bias correction, and length renormalization (per RaBitQ). It provides SIMD kernels for AVX2, AVX-512, ARM, and POWER10, with a scalar fallback.
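
      The storage effect of the three bit-widths can be illustrated with a small packing sketch. This is an assumed little-endian-within-byte layout chosen for illustration, not a proposed on-disk format, and the function names are hypothetical.

```python
def pack_codes(codes, bits):
    """Pack integer codes of width `bits` (1, 2 or 4) into bytes.

    Code i occupies bits [(i % per_byte) * bits, ...) of byte i // per_byte.
    """
    if bits not in (1, 2, 4):
        raise ValueError("bits must be 1, 2 or 4")
    per_byte = 8 // bits
    out = bytearray((len(codes) + per_byte - 1) // per_byte)
    for i, c in enumerate(codes):
        if not 0 <= c < (1 << bits):
            raise ValueError(f"code {c} does not fit in {bits} bits")
        out[i // per_byte] |= c << ((i % per_byte) * bits)
    return bytes(out)


def unpack_codes(buf, bits, count):
    """Inverse of pack_codes for the first `count` codes."""
    per_byte = 8 // bits
    mask = (1 << bits) - 1
    return [(buf[i // per_byte] >> ((i % per_byte) * bits)) & mask
            for i in range(count)]


def packed_bytes(dims, bits):
    """Bytes needed for the codes of one vector, excluding any metadata."""
    return (dims * bits + 7) // 8
```

      For a 1536-dimensional vector this gives 768 bytes at 4-bit, 384 bytes at 2-bit, and 192 bytes at 1-bit, compared with 3072 bytes at int16 and 6144 bytes at float32. These figures count codes only; any per-vector scale or norm needed for renormalization would add a few bytes on top.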

      References

      1. TurboQuant paper: https://arxiv.org/abs/2504.19874
      2. Google Research blog: https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/
      3. RaBitQ (renormalization technique): https://arxiv.org/abs/2405.12497

          People

            serg Sergei Golubchik
            adamluciano Adam Luciano