Retrieval at Scale | Drop for 2026-04-08

TL;DR

Five meaningful updates for large‑scale retrieval since your last drop:

  • Lucene 10.4.0 is GA with new scalar/binary quantization formats, bulk vector scorers, and filtered‑HNSW speedups, all of which directly affect Elasticsearch/OpenSearch stacks.
  • Qdrant 1.17 adds a scalable relevance‑feedback query type and latency improvements.
  • GEM proposes a native graph index for multi‑vector retrieval (accepted to SIGMOD 2026), reporting large speedups.
  • Fiber‑Navigable Search introduces a geometric method for filtered‑ANN with predictable failure regimes and gains over FAISS‑HNSW.
  • DuckDB ships an hnsw_acorn community extension that brings ACORN‑1 filtered‑HNSW and RaBitQ quantization into SQL.

Apache Lucene 10.4.0 GA: new quantization codecs + bulk vector scoring

  • Key facts and current state of the topic
    • Lucene underpins Elasticsearch/OpenSearch; its vector/HNSW changes translate directly to production stacks. 10.4.0 artifacts were posted on Feb 25, 2026. (downloads.apache.org)
  • Important context and background information
    • Prior 10.3.x brought native late‑interaction reranking and broad speedups. 10.4 continues the vector focus with quantization and scorer improvements. (lucene.apache.org)
  • Recent developments or changes
    • 10.4.0 adds 4/8‑bit Optimized Scalar Quantization formats (including HNSW variants), asymmetric binary quantization modes, a VectorScorer.Bulk API for faster exact/bulk vector scoring, off‑heap scoring for low‑bit vectors, and bulk scoring in the filtered‑HNSW and entry‑point stages. It also includes dynamic pruning fixes for sorted queries and HNSW build/merge optimizations. Worth evaluating for higher recall at fixed latency and for faster filtered vector search. (lucene.apache.org)
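
To make the bit-width trade-off concrete, here is a minimal NumPy sketch of plain uniform scalar quantization at 8 and 4 bits, with one query scored against all quantized vectors in a single batched product. It is not Lucene's Optimized Scalar Quantization and does not use the VectorScorer.Bulk API; the function names (quantize, dequantize, bulk_scores), the per-vector min/max scheme, and the random test data are all assumptions made for illustration.

```python
import numpy as np

def quantize(vectors: np.ndarray, bits: int):
    """Uniform per-vector scalar quantization (illustrative, not Lucene's OSQ)."""
    levels = (1 << bits) - 1
    lo = vectors.min(axis=1, keepdims=True)
    hi = vectors.max(axis=1, keepdims=True)
    # Guard against constant vectors, where hi == lo.
    scale = np.where(hi > lo, (hi - lo) / levels, 1.0)
    codes = np.rint((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Map integer codes back to approximate float values."""
    return codes.astype(np.float32) * scale + lo

def bulk_scores(query, codes, lo, scale):
    """Score one query against every quantized vector in one matrix product."""
    return dequantize(codes, lo, scale) @ query

rng = np.random.default_rng(0)
docs = rng.standard_normal((1000, 64)).astype(np.float32)
query = rng.standard_normal(64).astype(np.float32)

exact = docs @ query
errors = {}
for bits in (8, 4):
    codes, lo, scale = quantize(docs, bits)
    approx = bulk_scores(query, codes, lo, scale)
    errors[bits] = float(np.abs(approx - exact).mean())
    print(f"{bits}-bit mean abs score error: {errors[bits]:.4f}")
```

In a real evaluation the comparison that matters is recall at a fixed latency budget on your own embeddings, not raw score error on random data.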

Qdrant 1.17: relevance feedback and lower search latency

  • Key facts and current state of the topic
    • Qdrant continues to invest in filtered‑ANN and production operations; ACORN arrived in 1.16. (qdrant.tech)
  • Important context and background information
    • Feedback‑aware retrieval and robust reranking are common needs in ad/search stacks where intent evolves over sessions.
  • Recent developments or changes
    • 1.17 introduces a Vector‑native Relevance Feedback Query operator, Weighted‑RRF fusion, and latency/observability improvements. These are useful for boosting early‑stage recall without heavy model passes. (qdrant.tech)
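
For readers who have not used weighted fusion, the generic weighted reciprocal rank fusion formula can be sketched in a few lines of Python. This is the textbook formula, score(d) = sum of weight_i / (k + rank_i(d)), not Qdrant's implementation or client API; the function name weighted_rrf, the constant k=60 (a commonly used default), and the toy rankings are assumptions.

```python
from collections import defaultdict

def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked ID lists: score(d) = sum_i weights[i] / (k + rank_i(d)), ranks start at 1."""
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from a dense and a sparse retriever.
dense = ["a", "b", "c", "d"]
sparse = ["c", "a", "e", "b"]

equal = weighted_rrf([dense, sparse], weights=[1.0, 1.0])
sparse_heavy = weighted_rrf([dense, sparse], weights=[1.0, 3.0])
print(equal)         # ['a', 'c', 'b', 'e', 'd']
print(sparse_heavy)  # ['c', 'a', 'b', 'e', 'd']
```

Raising the sparse weight moves "c" (ranked first by the sparse list) above "a", which is the lever a weighted variant adds over plain RRF.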

GEM: a native graph index for multi‑vector (late‑interaction) retrieval

  • Key facts and current state of the topic
    • Multi‑vector (late‑interaction) retrieval excels in relevance but lacks purpose‑built indexes. GEM constructs a proximity graph over vector sets (not single vectors). Accepted to SIGMOD 2026. (arxiv.org)
  • Important context and background information
    • GEM decouples the graph‑construction metric from the final scoring function and uses set‑level clustering and beam search. It targets token‑ or patch‑level representations (e.g., ColBERT/ColPali‑style). (arxiv.org)
  • Recent developments or changes
    • Reports up to 16× speedups vs. strong multi‑vector baselines at similar or better accuracy; includes quantized distance estimates to trim verification cost. Pilot as a candidate‑gen layer before multi‑vector rerankers. (arxiv.org)
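
As a reminder of what the reranking stage behind such an index computes, here is a minimal sketch of MaxSim scoring, the late-interaction score used by ColBERT-style models. It shows exhaustive scoring over a candidate set, which is the cost a multi-vector index tries to avoid; it is not GEM's algorithm. The function names (maxsim, rerank) and the toy vectors are assumptions.

```python
import numpy as np

def maxsim(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction score: each query vector takes its best match in the document; sum the maxima."""
    sims = query_vecs @ doc_vecs.T  # shape: (num_query_vecs, num_doc_vecs)
    return float(sims.max(axis=1).sum())

def rerank(query_vecs, candidates):
    """Score candidate documents (dict of id -> vector set) and sort best first."""
    scored = {doc_id: maxsim(query_vecs, vecs) for doc_id, vecs in candidates.items()}
    return sorted(scored.items(), key=lambda kv: -kv[1])

query = np.array([[1.0, 0.0], [0.0, 1.0]])
candidates = {
    "doc_a": np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]),  # matches both query vectors
    "doc_b": np.array([[1.0, 0.0], [0.9, 0.1]]),              # matches only the first
}
ranked = rerank(query, candidates)
print(ranked)  # doc_a scores 2.0, doc_b scores about 1.1
```

The cost is one similarity matrix per candidate, so the quality of the candidate-generation layer in front of it largely determines end-to-end latency.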

Fiber‑Navigable Search: geometric filtered‑ANN with predictable failure modes

  • Key facts and current state of the topic
    • Filtered‑ANN on HNSW often suffers from recall cliffs under selective predicates. (arxiv.org)
  • Important context and background information
    • The paper frames filtered subgraphs as “fibers” and classifies failure regimes (topological cuts, geometric folds, genuine basins), which in turn guides restart strategies. (arxiv.org)
  • Recent developments or changes
    • Proposes local signals + lightweight anchors to switch between full‑graph and filtered‑neighbor descent; empirically outperforms FAISS‑HNSW on filtered tasks with clean selectivity behavior. Useful for metadata‑heavy ad retrieval. (arxiv.org)
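
The selectivity problem is easy to reproduce without a graph index. The sketch below uses exact brute-force search with post-filtering (fetch an unfiltered top-N, then drop rows that fail the predicate) and measures recall against the true filtered top-k. It is not HNSW and not the paper's method; it only illustrates why selective predicates starve a search that ignores the filter during traversal. The function name post_filter_recall and every parameter default are assumptions.

```python
import numpy as np

def post_filter_recall(selectivity, n=5000, dim=16, k=10, fetch=100, seed=0):
    """Recall@k of 'unfiltered top-`fetch`, then filter' versus exact filtered search."""
    rng = np.random.default_rng(seed)
    data = rng.standard_normal((n, dim))
    query = rng.standard_normal(dim)
    allowed = rng.random(n) < selectivity  # random predicate with the given selectivity
    order = np.argsort(np.linalg.norm(data - query, axis=1))

    # Ground truth: the k nearest points that satisfy the predicate.
    truth = [int(i) for i in order if allowed[i]][:k]
    # Post-filter: only the unfiltered top-`fetch` are ever examined.
    got = [int(i) for i in order[:fetch] if allowed[i]][:k]
    return len(set(got) & set(truth)) / max(len(truth), 1)

recall_loose = post_filter_recall(selectivity=0.5)
recall_tight = post_filter_recall(selectivity=0.01)
print(f"selectivity 0.50: recall@10 = {recall_loose:.2f}")
print(f"selectivity 0.01: recall@10 = {recall_tight:.2f}")
```

With half the rows passing the filter, the fetched window contains plenty of matches; at 1% selectivity it typically contains only a handful, so recall collapses unless the search widens or becomes filter-aware.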

DuckDB hnsw_acorn: ACORN‑1 filtered‑HNSW + RaBitQ inside SQL

  • Key facts and current state of the topic
    • ACORN‑style filtered search is spreading beyond search engines; DuckDB now lists a community extension adding ACORN‑1 traversal and RaBitQ binary quantization to its HNSW. (duckdb.org)
  • Important context and background information
    • Brings pushed‑down filters, selectivity‑aware strategy switching, and metadata joins into a relational setting, which is handy for hybrid analytics + retrieval. (duckdb.org)
  • Recent developments or changes
    • Extension supports per‑group nearest neighbors and ~21× memory reduction via RaBitQ with rescoring. Consider for embedded/ETL scenarios or as a lightweight A/B bench against service‑based vector stores. (duckdb.org)
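
The quantize-then-rescore pattern behind that memory saving can be sketched with simple sign-bit codes: shortlist by Hamming distance on the compact codes, then rescore the shortlist with full-precision distances. This is plain sign binarization, not RaBitQ and not the extension's SQL interface; the function names (binarize, hamming, search), the oversampling factor, and the random data are assumptions. The raw 1-bit codes here are 32× smaller than float32, so the sketch does not reproduce the ~21× figure above.

```python
import numpy as np

def binarize(vectors: np.ndarray) -> np.ndarray:
    """Keep one sign bit per dimension, packed 8 dimensions per byte."""
    return np.packbits(vectors > 0, axis=1)

def hamming(codes: np.ndarray, query_code: np.ndarray) -> np.ndarray:
    """Number of differing bits between each stored code and the query code."""
    return np.unpackbits(codes ^ query_code, axis=1).sum(axis=1)

def search(query, vectors, codes, k=10, oversample=10):
    """Shortlist by Hamming distance, then rescore the shortlist with exact distances."""
    query_code = binarize(query[None, :])
    shortlist = np.argsort(hamming(codes, query_code))[: k * oversample]
    exact = np.linalg.norm(vectors[shortlist] - query, axis=1)
    return shortlist[np.argsort(exact)[:k]]

rng = np.random.default_rng(0)
vectors = rng.standard_normal((2000, 128)).astype(np.float32)
query = rng.standard_normal(128).astype(np.float32)
codes = binarize(vectors)

truth = np.argsort(np.linalg.norm(vectors - query, axis=1))[:10]
found = search(query, vectors, codes)
recall = len(set(found.tolist()) & set(truth.tolist())) / 10
ratio = vectors.nbytes / codes.nbytes
print(f"recall@10 after rescoring: {recall:.2f}, raw code compression: {ratio:.0f}x")
```

Rescoring still needs the full-precision vectors somewhere (on disk or in a slower tier), so the memory saving applies to what must stay resident for the first pass.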

If you want me to prioritize hands‑on A/B plans (e.g., Lucene 10.4 codecs vs. current PQ/RaBitQ; Qdrant feedback vs. baseline fusion; GEM as a candidate stage ahead of ColBERT‑style rerankers), say the word and I’ll draft them.