|
Use a real-world dataset with about a million vectors.
Benchmark goals:
- Index creation time
- Index update time
- Index lookup returning the top 1, 10, 100, and 1000 entries for the same query vector repeated N times.
  - This measures graph lookup speed when most of the data is likely already in cache.
- Index lookup returning the top 1, 10, 100, and 1000 results for a million distinct query vectors, with no repetition.
  - This measures graph lookup speed when the data may not all fit in cache.
  - Compute the average recall of the algorithm for these queries.
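
The lookup and recall goals above could be measured with a small harness along these lines. This is only a sketch: the names `brute_force_top_k`, `recall_at_k`, and `run_benchmark` are hypothetical, the `search(query, k)` callable stands in for whatever index is under test, and squared Euclidean distance is assumed as the metric. Recall here means the fraction of the exact top-k neighbors that the index returned, with exact neighbors found by brute force outside the timed region.

```python
import random
import time


def brute_force_top_k(data, query, k):
    """Exact top-k neighbor ids by squared Euclidean distance (ground truth)."""
    dists = [(sum((a - b) ** 2 for a, b in zip(vec, query)), i)
             for i, vec in enumerate(data)]
    dists.sort()
    return [i for _, i in dists[:k]]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that the approximate result contains."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


def run_benchmark(search, data, queries, k):
    """Time search(query, k) over all queries and average recall@k.

    Only the call to `search` is timed; ground truth is computed separately.
    """
    elapsed = 0.0
    total_recall = 0.0
    for q in queries:
        start = time.perf_counter()
        approx = search(q, k)
        elapsed += time.perf_counter() - start
        total_recall += recall_at_k(approx, brute_force_top_k(data, q, k))
    n = len(queries)
    return {"avg_latency_s": elapsed / n, "avg_recall": total_recall / n}


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_vectors, n_queries = 8, 500, 100  # toy sizes, not the real dataset
    data = [[rng.random() for _ in range(dim)] for _ in range(n_vectors)]

    # Stand-in for the index under test (assumption): exact search.
    def search(q, k):
        return brute_force_top_k(data, q, k)

    repeated = [data[0]] * n_queries  # same query repeated N times
    distinct = [[rng.random() for _ in range(dim)] for _ in range(n_queries)]
    for k in (1, 10, 100):
        print(k, "repeated", run_benchmark(search, data, repeated, k))
        print(k, "distinct", run_benchmark(search, data, distinct, k))
```

For the real benchmark, the brute-force ground truth would normally be computed once and stored, since recomputing it for a million queries against a million vectors is far more expensive than the lookups being measured.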
Compare the results with those reported in papers that use the same algorithm.
|