Details

    Description

      Let's keep the graph in memory, e.g. in TABLE_SHARE.
      Add nodes there as they're read from the table, e.g. on searches.
      Update on inserts.

      When it grows beyond some (configurable) limit, start removing nodes that are close to each other, or remove the bottom level of the HNSW graph. The goal is to keep enough in memory to find any row with no more than one table access (see the sketch below).

      It's not yet clear how to handle transaction isolation.
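
      A minimal standalone sketch of the idea (all names here are hypothetical, e.g. VectorGraphCache and trim_bottom_layer; this is not the actual server code): a per-table cache with a configurable byte limit that evicts bottom-level-only nodes when the limit is exceeded.

        #include <cstddef>
        #include <unordered_map>
        #include <utility>
        #include <vector>

        // One HNSW node: its row id, its top layer, per-layer neighbor lists.
        struct GraphNode {
          size_t row_id;
          int max_layer;                               // highest HNSW layer
          std::vector<std::vector<size_t>> neighbors;  // neighbors[layer]
        };

        // Shared per-table cache, conceptually living in TABLE_SHARE.
        class VectorGraphCache {
         public:
          explicit VectorGraphCache(size_t limit_bytes) : limit_(limit_bytes) {}

          // Called when a node is read from the table (searches) or inserted.
          void put(GraphNode node) {
            size_t id = node.row_id;
            used_ += estimate_size(node);   // overcounts re-inserts; fine for a sketch
            nodes_[id] = std::move(node);
            if (used_ > limit_)
              trim_bottom_layer();
          }

          const GraphNode *get(size_t row_id) const {
            auto it = nodes_.find(row_id);
            return it == nodes_.end() ? nullptr : &it->second;
          }

         private:
          static size_t estimate_size(const GraphNode &n) {
            size_t s = sizeof(GraphNode);
            for (const auto &layer : n.neighbors)
              s += layer.size() * sizeof(size_t);
            return s;
          }

          // Evict nodes that exist only on layer 0; the upper layers stay
          // resident, so any search ends at most one table access away from
          // the target row.
          void trim_bottom_layer() {
            for (auto it = nodes_.begin(); it != nodes_.end() && used_ > limit_;) {
              if (it->second.max_layer == 0) {
                used_ -= estimate_size(it->second);
                it = nodes_.erase(it);
              } else {
                ++it;
              }
            }
          }

          std::unordered_map<size_t, GraphNode> nodes_;
          size_t limit_;
          size_t used_ = 0;
        };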

    Activity

            Sergei Golubchik added a comment (edited)

            Update:

            • cache works
            • it's shared
            • concurrent searches (SELECT) work
            • readers can populate the cache (from disk) concurrently (see the sketch after this comment)
            • inserts are not concurrent, MyISAM-only (protected by the table lock)

            Next step: InnoDB. Problems:

            1. rollbacks: a transaction adds a few nodes, then rolls back
            2. transaction isolation: a long-running transaction won't see nodes added after it started
            3. concurrent writes: two connections adding new nodes to the graph
            4. savepoints

            Note that two concurrent transactions cannot modify the same node; InnoDB row locks guarantee that this doesn't happen.
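
            To illustrate the concurrent read path above, a simplified standalone sketch (SharedGraph, find_or_load and load_from_table are made-up names; it assumes a reader-writer lock, which may differ from what the server actually uses): readers share the lock for lookups, and a cache miss takes the exclusive lock to load the node from the table.

              #include <cstddef>
              #include <memory>
              #include <mutex>
              #include <shared_mutex>
              #include <unordered_map>

              struct GraphNode { size_t row_id; /* vector, neighbor lists, ... */ };

              class SharedGraph {
               public:
                const GraphNode *find_or_load(size_t row_id) {
                  {
                    std::shared_lock lk(mtx_);      // many readers in parallel
                    auto it = nodes_.find(row_id);
                    if (it != nodes_.end()) return it->second.get();
                  }
                  std::unique_lock lk(mtx_);        // miss: exclusive lock
                  auto &slot = nodes_[row_id];
                  if (!slot)                        // re-check under the lock
                    slot = load_from_table(row_id);
                  return slot.get();
                }

               private:
                // Stand-in for the single table (disk) access.
                static std::unique_ptr<GraphNode> load_from_table(size_t row_id) {
                  return std::make_unique<GraphNode>(GraphNode{row_id});
                }
                std::shared_mutex mtx_;
                std::unordered_map<size_t, std::unique_ptr<GraphNode>> nodes_;
              };

            Eviction is omitted here; with eviction, returning raw pointers would need a pin or refcount scheme.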


            InnoDB solutions:

            1. we'll use a local cache
              • it must be the same for all instances of the same table in a transaction, and flushed on commit
              • so it'll be a pseudo-hton, registered for this transaction, and the cache will be in thd_ha_data (see the sketch after this list)
              • on rollback the cache is discarded, on commit it's merged into the shared cache
              • for simplicity, the first version will only support one vector index per transaction
              • for simplicity, the first version won't support savepoints
            2. this will make the shared graph slightly more "approximate"
              • in the first version it's fine, if the benchmark agrees
              • later we can extend the handler API to force read-committed for a specific table
            3. solved by 1
            4. see 1
              • if a savepoint rollback was used, the local cache will be deducted from the global one on commit, instead of being merged in
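
            A rough standalone model of solution 1 (TrxVectorCache and the function names are hypothetical; in the real design the cache would be stored in thd_ha_data of the pseudo-hton, whose commit/rollback callbacks would invoke this logic):

              #include <cstddef>
              #include <unordered_map>

              // Stand-in for a graph node; the real cache stores HNSW nodes.
              struct Node { /* vector data, neighbor lists, ... */ };

              // The shared cache, conceptually in TABLE_SHARE.
              using SharedCache = std::unordered_map<size_t, Node>;

              // Per-transaction cache: one per transaction, shared by all
              // instances of the same table within it.
              struct TrxVectorCache {
                std::unordered_map<size_t, Node> nodes;  // nodes added by this trx

                // commit: publish the transaction's nodes to the shared cache
                void on_commit(SharedCache &shared) {
                  for (auto &kv : nodes) shared[kv.first] = kv.second;
                  nodes.clear();
                }
                // rollback: the shared cache never saw these nodes, just drop them
                void on_rollback() { nodes.clear(); }
              };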

            Sergei Golubchik added a comment (edited)
            • savepoints work
            • many vector indexes per transaction work
            • on commit the trx cache is not merged into the ctx cache but deducted from it, as for savepoints (see the sketch below)
              • because ann_benchmarks inserts into an empty table, merging causes a slowdown
              • we might want to add merging back later, protected by a heuristic and supported by an adequate benchmark
              • the same goes for the optimization where the trx cache would load nodes from the ctx cache instead of from the table
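
            A sketch of the "deducted, not merged" commit (hypothetical names again; this only models the container operations):

              #include <cstddef>
              #include <unordered_map>

              struct Node { /* ... */ };
              using SharedCache = std::unordered_map<size_t, Node>;
              using TrxCache    = std::unordered_map<size_t, Node>;

              // On commit, every node the transaction touched is evicted from
              // the shared cache and will be re-read from the table on the
              // next search. For a bulk load into an empty table (the
              // ann_benchmarks case) this avoids paying for a merge of nodes
              // nobody has searched for yet.
              void commit_by_deduction(SharedCache &shared, TrxCache &trx) {
                for (auto &kv : trx)
                  shared.erase(kv.first);
                trx.clear();
              }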

