Details
-
New Feature
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
None
Description
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially.
Basic plan:
Graph construction will be done according to HNSW paper.
Storage wise, we'll store the graph as part of a subtable (MDEV-33404).
The table's definition will be something along these lines:
CREATE TABLE i ( |
level int unsigned not null, |
src varbinary(255) not null, |
dst varbinary(255) not null, |
index (level,src), |
index (level,dst)); |
For each link in the graph, there will be a corresponding entry in the table.
- src and dst will store handler::position, a quick link to the actual vector blob in the main table.
The index (level,src) will allow for quick jumping between nodes.
To go deeper in search, one just needs to decrement the level and search using the same "src" value.
If src is found on level n, then it is also found on level n - 1 and so on. Level 0 is the base level with all the nodes.
Performance considerations:
- Storing the vector in the subtable might be required. Looking up the blob value in the base table might be too costly.
Attachments
Issue Links
- blocks
-
MDEV-33411 OPTIMIZE for graph indexes
-
- Open
-
-
MDEV-33414 benchmark vector indexes
-
- Closed
-
-
MDEV-33415 graph index search: heuristical edge choice
-
- Closed
-
-
MDEV-33416 graph index: use smaller floating point numbers
-
- Closed
-
-
MDEV-33418 graph index insert: stronger selection of neighbors
-
- Closed
-
-
MDEV-33419 graph index insert: consider more neighbors
-
- Open
-
- is blocked by
-
MDEV-33404 Engine-independent indexes: subtable method
-
- Closed
-
-
MDEV-36317 vector search with Cosine Distance, the recall rate of the returned results is very low
-
- Open
-
-
MDEV-36338 vector search with Cosine Distance is slow
-
- Closed
-
- is part of
-
MDEV-34939 vector search in 11.7
-
- Closed
-
- relates to
-
MDEV-32887 vector search
-
- Stalled
-
Activity
Field | Original Value | New Value |
---|---|---|
Link | This issue relates to MDEV-32887 [ MDEV-32887 ] |
Description | basic HNSW implementation | basic HNSW implementation. Euclidean distance only. |
Link | This issue blocks MDEV-33411 [ MDEV-33411 ] |
Link |
This issue is blocked by |
Link |
This issue blocks |
Link |
This issue blocks |
Link |
This issue blocks |
Link |
This issue blocks |
Link | This issue blocks MDEV-33419 [ MDEV-33419 ] |
Description | basic HNSW implementation. Euclidean distance only. |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. Basic plan: Store the graph as part of a subtable |
Description |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. Basic plan: Store the graph as part of a subtable |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. Basic plan: Store the graph as part of a subtable |
Description |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. Basic plan: Store the graph as part of a subtable |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. *Basic plan:* Store the graph as part of a subtable ( The table's definition will be something along these lines: {code:sql} CREATE TABLE i ( level int unsigned not null, src varbinary(255) not null, dst varbinary(255) not null, index (level,src), index (level,dst)); {code} For each link in the graph, there will be a corresponding entry in the table. * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table. The index (level,src) will allow for quick jumping between nodes. To go deeper in search, one just needs to increment the level. If src is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes. |
Description |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. *Basic plan:* Store the graph as part of a subtable ( The table's definition will be something along these lines: {code:sql} CREATE TABLE i ( level int unsigned not null, src varbinary(255) not null, dst varbinary(255) not null, index (level,src), index (level,dst)); {code} For each link in the graph, there will be a corresponding entry in the table. * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table. The index (level,src) will allow for quick jumping between nodes. To go deeper in search, one just needs to increment the level. If src is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes. |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. *Basic plan:* Store the graph as part of a subtable ( The table's definition will be something along these lines: {code:sql} CREATE TABLE i ( level int unsigned not null, src varbinary(255) not null, dst varbinary(255) not null, index (level,src), index (level,dst)); {code} For each link in the graph, there will be a corresponding entry in the table. * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table. The index (level,src) will allow for quick jumping between nodes. To go deeper in search, one just needs to increment the level. If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes. |
Description |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. *Basic plan:* Store the graph as part of a subtable ( The table's definition will be something along these lines: {code:sql} CREATE TABLE i ( level int unsigned not null, src varbinary(255) not null, dst varbinary(255) not null, index (level,src), index (level,dst)); {code} For each link in the graph, there will be a corresponding entry in the table. * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table. The index (level,src) will allow for quick jumping between nodes. To go deeper in search, one just needs to increment the level. If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes. |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. *Basic plan:* Store the graph as part of a subtable ( The table's definition will be something along these lines: {code:sql} CREATE TABLE i ( level int unsigned not null, src varbinary(255) not null, dst varbinary(255) not null, index (level,src), index (level,dst)); {code} For each link in the graph, there will be a corresponding entry in the table. * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table. The index {{(level,src)}} will allow for quick jumping between nodes. To go deeper in search, one just needs to decrement the level and search using the same "src" value. If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes. |
Description |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. *Basic plan:* Store the graph as part of a subtable ( The table's definition will be something along these lines: {code:sql} CREATE TABLE i ( level int unsigned not null, src varbinary(255) not null, dst varbinary(255) not null, index (level,src), index (level,dst)); {code} For each link in the graph, there will be a corresponding entry in the table. * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table. The index {{(level,src)}} will allow for quick jumping between nodes. To go deeper in search, one just needs to decrement the level and search using the same "src" value. If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes. |
For the first iteration of Vector search, we will implement HNSW algorithm.
The implementation will only support Euclidean distance initially. *Basic plan:* Graph construction will be done according to HNSW paper. Storage wise, we'll store the graph as part of a subtable ( The table's definition will be something along these lines: {code:sql} CREATE TABLE i ( level int unsigned not null, src varbinary(255) not null, dst varbinary(255) not null, index (level,src), index (level,dst)); {code} For each link in the graph, there will be a corresponding entry in the table. * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table. The index {{(level,src)}} will allow for quick jumping between nodes. To go deeper in search, one just needs to decrement the level and search using the same "src" value. If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes. *Performance considerations:* * Storing the vector in the subtable might be required. Looking up the blob value in the base table might be too costly. |
Assignee | Vicențiu Ciorbaru [ cvicentiu ] |
Fix Version/s | 11.5 [ 29506 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Fix Version/s | 11.6 [ 29515 ] | |
Fix Version/s | 11.5 [ 29506 ] |
Priority | Major [ 3 ] | Critical [ 2 ] |
Issue Type | Task [ 3 ] | New Feature [ 2 ] |
Status | In Progress [ 3 ] | In Testing [ 10301 ] |
Labels | vector |
Fix Version/s | 11.7 [ 29815 ] | |
Fix Version/s | 11.6 [ 29515 ] |
Link |
This issue is part of |
Labels | vector |
Component/s | Vector search [ 20205 ] |
Fix Version/s | 11.7.1 [ 29913 ] | |
Fix Version/s | 11.7 [ 29815 ] | |
Resolution | Fixed [ 1 ] | |
Status | In Testing [ 10301 ] | Closed [ 6 ] |
Link | This issue is blocked by MDEV-36317 [ MDEV-36317 ] |
Link |
This issue is blocked by |
Insert and search now are in a functioning state, although some refactoring is needed.