Details
Type: New Feature
Status: Closed
Priority: Critical
Resolution: Won't Fix
Description
Summary
Add a per-index ef_construction parameter to control the number of candidates evaluated during HNSW graph construction. Higher values create more accurate index graphs at the cost of slower inserts.
This is a focused PR that only adds ef_construction configurability (split from #4589 based on reviewer feedback to separate features).
Benchmark Results (SIFT dataset, 50K vectors, 128 dims)
ef_construction  ef_search  recall@10  query_time
10 (current)     10         99.75%     20.95 ms
100              10         100%       9.81 ms
Key findings:
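Queries are 2.1x faster with ef_construction=100 at the same ef_search (9.81 ms vs 20.95 ms)
Build time is 1.6x slower (559 s vs 353 s)
Break-even is at ~18,500 queries: the extra 206 s of build time divided by the 11.14 ms saved per query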
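The break-even figure can be checked with a few lines of arithmetic. The sketch below is illustrative only: it copies the build and query times from the benchmark table, assumes query_time is a per-query figure, and uses hypothetical helper names (break_even_queries, total_time_s) that are not part of the patch.

```python
# Figures copied from the benchmark above, keyed by ef_construction.
BUILD_S = {10: 353.0, 100: 559.0}    # index build time, seconds
QUERY_MS = {10: 20.95, 100: 9.81}    # query time, milliseconds (assumed per query)


def break_even_queries(build_s, query_ms, low=10, high=100):
    """Number of queries after which the slower build has paid for itself."""
    extra_build_s = build_s[high] - build_s[low]
    saved_per_query_s = (query_ms[low] - query_ms[high]) / 1000.0
    return extra_build_s / saved_per_query_s


def total_time_s(ef_construction, n_queries):
    """Build time plus cumulative query time for a given workload size."""
    return BUILD_S[ef_construction] + n_queries * QUERY_MS[ef_construction] / 1000.0


if __name__ == "__main__":
    print(f"break-even: {break_even_queries(BUILD_S, QUERY_MS):.0f} queries")
    for n in (10_000, 1_000_000):
        print(n, total_time_s(10, n), total_time_s(100, n))
```

Under these numbers, a workload of 10,000 queries is still cheaper with ef_construction=10, while one of a million queries is roughly twice as cheap with ef_construction=100.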
Use Case
For workloads where the index is built once but queried millions of times, higher ef_construction saves significant total time by enabling faster queries at the same recall level.
Changes
Replace static constexpr ef_construction=10 with per-index option
Add mhnsw_default_ef_construction system variable (range 1-10000, default 10)
Add ef_construction to ha_index_option_struct and MHNSW_Share
Add HA_IOPTION_SYSVAR for ef_construction in mhnsw_index_options
Add test for ef_construction parameter
Usage
-- Via system variable (affects new indexes)
SET mhnsw_default_ef_construction = 100;
CREATE TABLE t (v VECTOR(128) NOT NULL, VECTOR INDEX (v));
-- Via index option
CREATE TABLE t (v VECTOR(128) NOT NULL, VECTOR INDEX (v) ef_construction=100);
-- Combined with M parameter
CREATE TABLE t (v VECTOR(128) NOT NULL, VECTOR INDEX (v) M=16 ef_construction=100);