Details
- Type: New Feature
- Status: Closed
- Priority: Critical
- Resolution: Fixed
Description
An umbrella task for all vector search features that are planned to make it into 11.7
Issue Links
- causes
  - MDEV-34967 MSAN failure in main.vector [Closed]
  - MDEV-34970 Vector search fails to compile on s390x [Closed]
  - MDEV-34971 Vector search fails to compile on x86_32 [Closed]
  - MDEV-34989 After selecting from empty table with vector key the next insert hangs [Closed]
  - MDEV-35005 vec_distance_cosine can return negative values [Closed]
  - MDEV-35006 Using varbinary as vector-storing column results in assertion failures [Closed]
  - MDEV-35020 After a failed attempt to create vector index temporary file remains and prevents further operation [Closed]
  - MDEV-35021 Behavior for RTREE indexes changed, assertion fails [Closed]
  - MDEV-35028 Unexpected ER_DUP_ENTRY/ER_DUP_KEY, ASAN errors after TRUNCATE on table with vector index [Closed]
  - MDEV-35029 ASAN errors in Lex_ident<Compare_ident_ci>::is_valid_ident upon DDL on table with vector index [Closed]
  - MDEV-35031 Update on vector column returns error but modifies the value, results in further ER_KEY_NOT_FOUND [Closed]
  - MDEV-35033 LeakSanitizer errors in my_malloc / safe_mutex_lazy_init_deadlock_detection / MHNSW_Context::alloc_node and alike [Closed]
  - MDEV-35034 Non-debug assertion failure after unsuccessful attempt to add vector index [Closed]
  - MDEV-35035 Assertion failure in ha_blackhole::position upon INSERT into blackhole table with vector index [Closed]
  - MDEV-35036 Assertion failure in myrocks::ha_rocksdb::position upon INSERT into RocksDB table with vector index [Closed]
  - MDEV-35037 Invalid (old?) table or database name 't#i#00' upon creating RocksDB table with vector index [Closed]
  - MDEV-35038 Server crash in Index_statistics::get_avg_frequency upon EITS collection for vector index [Closed]
  - MDEV-35039 Number of indexes inside InnoDB differs from that defined in MariaDB after altering table with vector key [Closed]
  - MDEV-35042 Vector indexes are allowed for MERGE tables, but do not work [Closed]
  - MDEV-35043 Unsuitable error upon an attempt to create MEMORY table with vector key [Closed]
  - MDEV-35044 ALTER on a table with vector index attempts to bypass unsupported locking limitation, server crashes in THD::free_tmp_table_share [Closed]
  - MDEV-35055 ASAN errors in TABLE_SHARE::lock_share upon committing transaction after FLUSH on table with vector key [Closed]
  - MDEV-35058 Non-debug assertion failure upon concurrent vector index creation and select [Closed]
  - MDEV-35060 Assertion failure upon DML on table with vector under lock [Closed]
  - MDEV-35061 XA PREPARE "not supported by the engine" from storage engine mhnsw, memory leak [Closed]
  - MDEV-35063 Assertion `v->distance_to_target >= threshold' fails upon adding certain values to vector key [Closed]
  - MDEV-35069 IMPORT TABLESPACE does not work for tables with vector, although allowed [Closed]
  - MDEV-35071 Poor recall upon vector search (300 dimensions, 10K rows) [Closed]
  - MDEV-35077 Assertion failure in myrocks::ha_rocksdb::position_to_correct_key upon using unique hash key [Closed]
  - MDEV-35078 Server crash or ASAN errors in mhnsw_insert [Closed]
  - MDEV-35081 Assertion `!n_mysql_tables_in_use' failed after error upon binary logging of DML involving vector table [Closed]
  - MDEV-35083 ER_UNSUPPORTED_EXTENSION upon using HASH keys with InnoDB [Closed]
  - MDEV-35084 Assertion `v->distance_to_target >= threshold' fails upon adding overflowing values to vector key #2 [Closed]
  - MDEV-35087 Server crash or ASAN errors in _mi_write_blob_record upon using BINARY of certain lengths as vector column [Closed]
  - MDEV-35092 Server crash, hang or ASAN errors in mysql_create_frm_image upon using non-default table options and system variables [Closed]
  - MDEV-35105 Assertion `tab->join->order' fails upon vector search with DISTINCT [Closed]
  - MDEV-35130 Assertion fails in trx_t::check_bulk_buffer upon CREATE.. SELECT with vector key [Closed]
  - MDEV-35131 Assertion `std::isnan(v->distance_to_target) || v->distance_to_target >= threshold' failed upon SELECT [Closed]
  - MDEV-35141 Server crashes in Field_vector::report_wrong_value upon statistic collection [Closed]
  - MDEV-35146 Vector-related error messages worth improving when possible [Closed]
  - MDEV-35147 Inconsistent NULL handling in vector type [Closed]
  - MDEV-35148 Foreign key on vector column refuses to be created inconsistently and on a wrong reason [Open]
  - MDEV-35150 Column containing non-vector values can be modified to VECTOR type without warnings [Closed]
  - MDEV-35151 Alter table with vector operations is not atomic, temporary files remain [Closed]
  - MDEV-35152 DATA/INDEX DIRECTORY options are ignored for vector index [In Testing]
  - MDEV-35158 Assertion `res->length() > 0 && res->length() % 4 == 0' fails upon increasing length of vector column [Closed]
  - MDEV-35159 Assertion `tab->join->select_limit < (~ (ha_rows) 0)' fails upon forcing vector key [Closed]
  - MDEV-35160 RBR does not work with vector type, ER_SLAVE_CONVERSION_FAILED [Closed]
  - MDEV-35161 UPDATE and DELETE do not use vector key [Open]
  - MDEV-35175 Vector functions re-use JSON warnings [Open]
  - MDEV-35176 ASAN errors in Field_vector::store with optimizer_trace enabled [Closed]
  - MDEV-35177 Unexpected ER_TRUNCATED_WRONG_VALUE_FOR_FIELD, diagnostics area assertion failures upon EITS collection with vector type [Closed]
  - MDEV-35178 Assertion failure in Field_vector::store upon INSERT IGNORE with a wrong data [Closed]
  - MDEV-35182 crash in online_alter_end_trans with XA over vector indexes [Closed]
  - MDEV-35184 Corruption errors upon creation or usage of Federated table with vector key [Open]
  - MDEV-35185 Query cache used for results of vector search conflicts with the purpose of mhnsw_min_limit [Open]
  - MDEV-35186 IGNORED attribute has no effect on vector keys [Closed]
  - MDEV-35191 Assertion failure in Create_tmp_table::finalize upon DISTINCT with vector type [Closed]
  - MDEV-35192 Distance functions on vectors of different length return NULL without warnings [Open]
  - MDEV-35194 non-BNL join fails on assertion [Closed]
  - MDEV-35195 Assertion `tab->join->order' fails upon vector search with DISTINCT #2 [Closed]
  - MDEV-35198 ER_CRASHED_ON_USAGE or assertion failure after myisampack on table with vector key [Open]
  - MDEV-35203 ASAN errors or assertion failures in row_sel_convert_mysql_key_to_innobase upon query from table with usual key on vector field [Closed]
  - MDEV-35204 mysqlbinlog --verbose fails on row events with vector type [Closed]
  - MDEV-35205 Server crash in online alter upon concurrent ALTER and DML on table with vector field [Closed]
  - MDEV-35210 Vector type cannot store values which VEC_FromText produces and VEC_ToText accepts [Closed]
  - MDEV-35211 VEC_FromText does not return vector type but varbinary [Open]
  - MDEV-35212 Server crashes in Item_func_vec_fromtext::val_str upon query from empty table [Closed]
  - MDEV-35213 Server crash or assertion failure upon query with high value of mhnsw_min_limit [Closed]
  - MDEV-35214 Server crashes in FVectorNode::gref_len with insufficient mhnsw_max_cache_size [Closed]
  - MDEV-35215 ASAN errors in Item_func_vec_fromtext::val_str upon VEC_FROMTEXT with an invalid argument [Closed]
  - MDEV-35219 Unexpected ER_DUP_KEY after OPTIMIZE on MyISAM table with vector key [Closed]
  - MDEV-35220 Assertion `!item->null_value' failed upon VEC_TOTEXT call [Closed]
  - MDEV-35221 Vector values do not survive mariadb-dump / restore [Closed]
  - MDEV-35223 REPAIR does not fix MyISAM table with vector key after crash recovery [Closed]
  - MDEV-35230 ASAN errors upon reading from joined temptable views with vector type [Closed]
  - MDEV-35241 DROP TABLE on table with vector key not atomic, leads to ER_NO_SUCH_TABLE_IN_ENGINE [Open]
  - MDEV-35244 Vector-related system variables could use better names [Closed]
  - MDEV-35245 SHOW CREATE TABLE produces unusable statement for vector fields with constant default value [Closed]
  - MDEV-35246 Vector search skips a row in the table [Closed]
  - MDEV-35258 Mariabackup does not work with MyISAM tables with vector keys [Closed]
  - MDEV-35263 rpl.vector fails when executed in a group of tests [Closed]
  - MDEV-35267 Server crashes in _ma_reset_history upon altering on Aria table with vector key under lock [Closed]
  - MDEV-35271 XA behavior changed, assertion fails in Ha_trx_info::is_trx_read_write [Stalled]
  - MDEV-35284 Server crash or ASAN errors in mhnsw_read_next upon using vectors within transaction [Closed]
  - MDEV-35287 ER_KEY_NOT_FOUND upon INSERT into InnoDB table with vector key under READ COMMITTED [Closed]
  - MDEV-35292 ALTER TABLE re-creating vector key is no-op with non-copying alter algorithms (default) [Closed]
  - MDEV-35296 DESC does not work in ORDER BY with vector key [Closed]
  - MDEV-35302 ASAN errors or assertion failure in mhnsw_read_first upon vector search with join [Closed]
  - MDEV-35305 Vector search queries are written into slow log as "not using index" [Open]
  - MDEV-35308 NO_KEY_OPTIONS SQL mode has no effect on engine key options [Closed]
  - MDEV-35309 ALTER performs vector truncation without WARN_DATA_TRUNCATED or similar warnings/errors [In Review]
  - MDEV-35317 Server crashes in mhnsw_insert upon using vector key on a Spider table [Closed]
  - MDEV-35319 ER_LOCK_DEADLOCK not detected upon DML on table with vector key, server crashes [Closed]
  - MDEV-35320 Non-default distance function and M are not replicated [Open]
  - MDEV-35321 INDEX_STATISTICS does not show the use of a vector key [Open]
  - MDEV-35322 Vector search is not shown in perfschema, queries are counted as not using index [Open]
  - MDEV-35323 ER_TOO_BIG_FIELDLENGTH shows wrong maximum length for vector field [Open]
  - MDEV-35324 Different index type shown in SHOW INDEX vs SHOW CREATE TABLE [Closed]
  - MDEV-35325 DROP TABLE on Mroonga table with vector key fails with ER_NO_SUCH_TABLE [Open]
  - MDEV-35328 Corruption-like errors upon and after REPAIR .. USE_FRM on table with vector key [Open]
  - MDEV-35337 Server crash or assertion failure in join_read_first upon using vector distance in group by [Closed]
  - MDEV-35338 Non-copying ALTER does not pad VECTOR column, vector search further does not work [Closed]
  - MDEV-35339 Different behavior of implicit vector conversion comparing to other types and DDL vs DML [Open]
  - MDEV-35340 In Oracle-styled SPs unspecified length of vector field defaults to 1000 [Open]
  - MDEV-35354 InnoDB: Failing assertion: node->pcur->rel_pos == BTR_PCUR_ON upon LOAD DATA REPLACE with unique blob [Closed]
  - MDEV-35769 ER_SQL_DISCOVER_ERROR upon updating vector key column using incorrect value [Closed]
  - MDEV-35792 Adding a regular index on a vector column leads to invalid table structure [Closed]
  - MDEV-35793 Server crashes in Item_func_vec_distance_common::get_const_arg [Closed]
  - MDEV-35834 Server crash in FVector::distance_to upon concurrent SELECT [Closed]
  - MDEV-36005 Server crashes when checking/updatng a table having vector key after enabling innodb_force_primary_key [Closed]
  - MDEV-36011 Server crashes in Charset::mbminlen / Item_func_vec_fromtext::val_str upon mixing vector type with string [Closed]
- includes
  - MDEV-32885 VEC_DISTANCE() function [Closed]
  - MDEV-32886 VEC_FromText() and VEC_ToText() functions [Closed]
  - MDEV-33404 Engine-independent indexes: subtable method [Closed]
  - MDEV-33406 basic optimizer support for k-NN searches [Closed]
  - MDEV-33407 Parser support for vector indexes [Closed]
  - MDEV-33408 HNSW for k-ANN vector searches [Closed]
  - MDEV-33413 cache k-ANN graph in memory [Closed]
  - MDEV-33414 benchmark vector indexes [Closed]
  - MDEV-33416 graph index: use smaller floating point numbers [Closed]
  - MDEV-33417 VEC_DISTANCE_COSINE() function [Closed]
  - MDEV-33418 graph index insert: stronger selection of neighbors [Closed]
  - MDEV-34436 DDL: per-index attributes [Closed]
  - MDEV-34698 mhnsw: support AVX-512 instructions [Closed]
  - MDEV-34811 handlerton refactoring [Closed]
  - MDEV-34942 packaging dependency for eigen3 [Closed]
- relates to
  - MDBF-796 Add Eigen onto BB workers [Closed]
  - MDEV-35082 HANDLER with FULLTEXT keys is not always rejected [Closed]
In my opinion, the feature in its current shape can be pushed into the main branch and released with 11.7.1.
In short, it appears stable enough for the RC after all the bugfixing, and we need the community to start experimenting with it on realistic datasets and use cases for possible further tuning before GA. Internal feature-focused testing will also continue on the main/11.7 branch before and after the 11.7.1 release.
Long version:
The main shortcoming of internal feature testing in this case was (and still is) that there are no usable criteria or requirements for "sufficient result correctness".
Normally, correctness is a fixed characteristic which does not cause much controversy and to a large extent can be tested on a variety of datasets, not necessarily real-life ones. Performance, by contrast, remains relative and is measured either on standard benchmarks (with the common understanding that they don't necessarily represent realistic use cases) or, in some cases, on actual real-life scenarios.
In the case of vector search, where results are approximate by nature, we have two flexible characteristics which depend on each other (better correctness leads to worse performance and vice versa). For neither of them can we set a hard limit of the kind "it cannot get worse than this under any circumstances" on any given dataset.
Whatever we know now about the comparative performance/recall of the current implementation has already been presented in public talks and blog posts by the feature developers. This stage of internal testing was mainly focused on stability and other less controversial aspects of the feature. I cannot claim such testing to be sufficient and I don't believe it ever will be, which is why I think it is important to get the feature out to the public and gather as much information as possible about what users consider more important in which cases, how much precision can be sacrificed for the sake of performance, and so on. I expect there will always be a fair amount of dissatisfaction, as different use cases have different requirements, but hopefully we will get a bigger picture than we have now.
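For anyone experimenting on their own data, one common way to put a number on "result correctness" is recall@k: the share of the exact k nearest neighbours that the approximate search also returned. The following is only a minimal sketch in plain Python, not part of the server or its test suite. The function names `brute_force_knn` and `recall_at_k` are hypothetical, Euclidean distance is assumed, and the "approximate" result is simulated rather than taken from a real vector index.

```python
import math
import random


def brute_force_knn(vectors, query, k):
    """Exact k nearest neighbours by Euclidean distance (the ground truth)."""
    order = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    return order[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate search also returned."""
    if not exact_ids:
        return 1.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.random() for _ in range(8)] for _ in range(200)]
    query = [rng.random() for _ in range(8)]

    exact = brute_force_knn(data, query, k=10)
    # Stand-in for an approximate index (an assumption for illustration):
    # it finds 8 of the 10 true neighbours and returns 2 unrelated rows.
    wrong = [i for i in range(len(data)) if i not in exact][:2]
    approx = exact[:8] + wrong

    print(recall_at_k(approx, exact))  # prints 0.8
```

In a real experiment the `approx` list would come from the row ids returned by the indexed query, and the same measurement would be repeated over many queries alongside timing, since the two numbers only make sense together.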
Meanwhile, below are some notes from the testing, mostly for documentation and other "user must be aware" purposes.
I won't list limitations or issues which are immediately obvious, only some which can go unnoticed but still cause trouble. The list is dynamic, so some notes can become outdated quickly. In no particular order:
- MDEV-35296: fixed by disabling
- MDEV-35287, MDEV-35130: fixed by disabling
- MDEV-35069
- MDEV-35186
- MDEV-35210: fixed with the note "VEC_ToText still prints everything"
- MDEV-35292, MDEV-35338
- MDEV-35221