Details
-
Task
-
Status: Closed
-
Critical
-
Resolution: Fixed
-
None
Description
Use a real-life dataset with about a million vectors.
Benchmark goals:
- Index creation time
- Index update time
- Index lookup for the top 1, 10, 100, and 1000 entries, with the same query vector repeated N times.
- This tests the speed of graph lookup when, ideally, most of the data is in cache.
- Index lookup for a million different query vectors (top 1, 10, 100, and 1000 results), with no repetition.
- This tests the speed of graph lookup when the data may not all fit in cache.
- Compute the average recall of the algorithm for such queries.
Compare to papers using the same algorithm.
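The recall goal above could be measured roughly as follows. This is only a sketch, not the benchmark actually used for this task: it assumes squared Euclidean distance, gets ground truth from a brute-force scan, and takes the index under test as a hypothetical `search(query, k)` callable that returns row ids. All function names here are illustrative.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_top_k(dataset, query, k):
    """Brute-force ground truth: ids of the k vectors nearest to query."""
    return heapq.nsmallest(
        k, range(len(dataset)), key=lambda i: squared_l2(dataset[i], query)
    )


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true nearest neighbors that the index returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def average_recall(dataset, queries, search, k):
    """Mean recall@k over all queries.

    `search` is a hypothetical callable (query, k) -> list of ids that
    stands in for the approximate index being benchmarked.
    """
    total = 0.0
    for query in queries:
        total += recall_at_k(search(query, k), exact_top_k(dataset, query, k))
    return total / len(queries)
```

Running `average_recall` once per k in 1, 10, 100, 1000 gives one recall figure per result size. At a million vectors the brute-force scan is slow, so the ground truth would normally be computed once and cached.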
Issue Links
- blocks
-
MDEV-33413 cache k-ANN graph in memory
- Closed
-
MDEV-33415 graph index search: heuristical edge choice
- Closed
-
MDEV-33416 graph index: use smaller floating point numbers
- Closed
-
MDEV-33418 graph index insert: stronger selection of neighbors
- Closed
-
MDEV-33419 graph index insert: consider more neighbors
- Open
- is blocked by
-
MDEV-33406 basic optimizer support for k-NN searches
- Closed
-
MDEV-33407 Parser support for vector indexes
- Closed
-
MDEV-33408 HNSW for k-ANN vector searches
- Closed
- is part of
-
MDEV-34939 vector search in 11.7
- Closed
- relates to
-
MDEV-33404 Engine-independent indexes: subtable method
- Closed
-
MDEV-33405 Engine-independent indexes: low-level API method
- Closed
-
MDEV-32887 vector search
- Stalled