Details
- New Feature
- Status: Open
- Critical
- Resolution: Unresolved
- None
- Q3/2026 Server Development
Description
This is mainly for ALTER ..., although it can be applied to other bulk insert cases.
- don't write nodes to disk while rows are being inserted, only build the graph in memory
- build it in the background with multiple threads
- at the end of the statement, write the graph to disk
Here is one idea for a solution that does the above (for the case of a bulk insert into an empty table; it can be generalized to non-empty tables):
- in mhnsw_bulk_insert_row() the vector is preprocessed and stored in the node_cache.
  - At this point tref is already known, but gref is not, so use some unique number as a placeholder (a counter, for example).
  - The node is not linked into the graph at this point.
- mhnsw_bulk_insert_row() appends the node to the work queue of one of the graph builder threads and returns.
- multiple graph builder threads pick unlinked nodes from their work queues and link them into the graph.
- when all rows are inserted into the table, mhnsw_bulk_insert_end():
  - writes all nodes to the index table, which is how they get their actual grefs. This step can be skipped if the storage engine can assign grefs without actually inserting a row (InnoDB hidden PK).
  - waits for all graph builder threads to finish.
  - writes the final graph to the table.
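The steps above can be sketched as a small standalone C++ program. This is only an illustration under stated assumptions, not the server code: `Node`, `BulkBuilder`, the round-robin queue choice, and the `first_gref` parameter are all hypothetical, the index-table insert is simulated by handing out consecutive numbers, and the real HNSW insertion is replaced by a brute-force "link to the nearest already-linked node" step under a single mutex. What it shows is that neighbors can be held as pointers while the graph is built, so the placeholder gref can be replaced by the actual one at any time before the final write.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <map>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical in-memory node, not the server's real node type.
struct Node {
  uint64_t tref;                 // row reference, known at insert time
  uint64_t gref;                 // placeholder until the index row is written
  std::vector<float> vec;
  std::vector<Node*> neighbors;  // links are pointers, so gref may change later
};

class BulkBuilder {
  struct Queue {
    std::mutex m;
    std::condition_variable cv;
    std::deque<Node*> items;
    bool closed = false;
  };
  std::vector<std::unique_ptr<Node>> node_cache_;
  std::vector<Queue> queues_;        // one work queue per builder thread
  std::vector<std::thread> workers_;
  std::mutex graph_m_;               // guards linked_ and all neighbor lists
  std::vector<Node*> linked_;
  uint64_t next_placeholder_ = 0;

  // Stand-in for the real HNSW insertion: link to the nearest linked node.
  void link(Node *n) {
    std::lock_guard<std::mutex> lk(graph_m_);
    Node *best = nullptr;
    float best_d = 0;
    for (Node *c : linked_) {
      float d = 0;
      for (size_t i = 0; i < n->vec.size(); i++)
        d += (n->vec[i] - c->vec[i]) * (n->vec[i] - c->vec[i]);
      if (!best || d < best_d) { best = c; best_d = d; }
    }
    if (best) { n->neighbors.push_back(best); best->neighbors.push_back(n); }
    linked_.push_back(n);
  }
  // Builder thread body: pop unlinked nodes until the queue is closed and empty.
  void run(Queue &q) {
    for (;;) {
      Node *n;
      {
        std::unique_lock<std::mutex> lk(q.m);
        q.cv.wait(lk, [&q] { return !q.items.empty() || q.closed; });
        if (q.items.empty()) return;
        n = q.items.front();
        q.items.pop_front();
      }
      link(n);
    }
  }
  void finish_workers() {
    for (Queue &q : queues_) {
      { std::lock_guard<std::mutex> lk(q.m); q.closed = true; }
      q.cv.notify_all();
    }
    for (std::thread &t : workers_) if (t.joinable()) t.join();
  }

public:
  explicit BulkBuilder(size_t n_threads) : queues_(n_threads) {
    assert(n_threads > 0);
    for (Queue &q : queues_) workers_.emplace_back([this, &q] { run(q); });
  }
  ~BulkBuilder() { finish_workers(); }

  // Cache the node with a placeholder gref, queue it, and return at once.
  void bulk_insert_row(uint64_t tref, std::vector<float> vec) {
    node_cache_.push_back(std::unique_ptr<Node>(
        new Node{tref, next_placeholder_++, std::move(vec), {}}));
    Queue &q = queues_[node_cache_.size() % queues_.size()];
    { std::lock_guard<std::mutex> lk(q.m); q.items.push_back(node_cache_.back().get()); }
    q.cv.notify_one();
  }
  // Assign actual grefs, wait for the builders, then emit the final graph.
  std::map<uint64_t, std::vector<uint64_t>> bulk_insert_end(uint64_t first_gref) {
    for (size_t i = 0; i < node_cache_.size(); i++)
      node_cache_[i]->gref = first_gref + i;  // simulated index-table insert
    finish_workers();
    std::map<uint64_t, std::vector<uint64_t>> out;  // gref -> neighbor grefs
    for (auto &n : node_cache_) {
      std::vector<uint64_t> &nb = out[n->gref];
      for (Node *x : n->neighbors) nb.push_back(x->gref);
    }
    return out;
  }
};
```

A caller would construct `BulkBuilder b(4)`, call `b.bulk_insert_row(tref, vec)` once per row, and finish with `b.bulk_insert_end(first_gref)`. The single graph mutex keeps the sketch short but serializes the linking; a real implementation would need finer-grained synchronization for the builder threads to give any speedup.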
Issue Links
- blocks
  - MDEV-38929 DISABLE/ENABLE KEYS support for vector indexes to allow batch HNSW construction (Open)
- relates to
  - MDEV-33411 OPTIMIZE TABLE for graph indexes (Open)
  - MDEV-32887 Vector Search (Closed)
- links to