Details

    Description

      Allow a user to create unique constraints over columns of arbitrary length. This will be done in the upper layer, in the server, not in the storage engine. The server will create an invisible virtual column containing a hash of the to-be-unique columns, and a normal BTREE index over that column. On insert or update it will check the index for hash collisions and, if needed, retrieve the actual rows to compare the data.
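
      A minimal sketch of the intended usage and of the mechanism described above; the files table, its columns, and the use of CRC32() as the hash function are illustrative assumptions only:

      -- Intended usage: a UNIQUE constraint directly on a column of arbitrary length.
      CREATE TABLE files (
        id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        content LONGBLOB NOT NULL,
        UNIQUE KEY (content)
      );

      -- Roughly what the server builds internally: a hidden virtual column holding a
      -- hash of the long column, plus an ordinary (non-unique) BTREE index over it.
      -- On INSERT/UPDATE the server looks the hash up in this index and, on a match,
      -- compares the actual column values to rule out hash collisions.
      CREATE TABLE files_manual (
        id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        content LONGBLOB NOT NULL,
        content_hash INT UNSIGNED AS (CRC32(content)) VIRTUAL,
        KEY (content_hash)
      );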

      Original bug report:

      Hi guys, I was reading about indexes, and I have an interesting problem.
      I need to check whether a file (OK, a big row, no problem there) is already inside my table.
      What I'm thinking:

      create table a (b int not null, c longblob not null, primary key (b));

      OK, no problems so far.
      The problem is: how do I know whether a file, let's say a 16 MB file, is already inside my table?
      The first solution is to MD5 it and check each row. OK, that works.
      But could there be a nicer solution? I was thinking of something like:

      alter table a
      add index some_index (c) using hash;

      Could this work? Since it's a hash index, I don't see why I should have to use only a prefix of the c value, like c(100), for example.

      Could you check whether this is possible? Today it is not; I tried it and it returned:
      /* SQL Error (1170): BLOB column 'hash_automatico' used in key specification without a key length */
      (the original message is in Portuguese, pt_BR)

      I think that's all.

          Activity

            serg Sergei Golubchik added a comment -

            Yes, this was discussed for quite a while, and it has a long history in the MySQL bug database. There was even an attempt to implement it, but somehow it failed; I don't really know why. I guess we can give it another try.

            rspadim roberto spadim added a comment - edited

            Hmm, nice =)
            Well, I think the problem was sending a big quantity of bytes into/out of the database, but maybe a function to help the index could be nice. I think many people implement something like:
            select count(*) from files where some_hash_field = (length || ';' || some_hash_value_calculated_in_a_script)
            if count(*) > 0:
            select * from files where some_hash_field = (length || ';' || some_hash_value_calculated_in_a_script)
            Well, the first part is really nice, but if it returns more than one row, just sending the file is nicer than reading many files of the same size back.

            I don't know whether it would be nice, but the implementation is close to hashing any binary/char field, since it's a hash and not a btree (am I wrong?).
            Well =) let's see what happens =)
            It's a feature request; I don't think it's really needed, but a 'formal' or 'recommended' way to check whether a 'file' (blob) is in the database could be good, at least in the documentation.
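
            A minimal sketch of the manual pattern described in the comment above. The files table, its id/content columns, and the @file_length/@file_hash placeholders (computed by the client script) are assumptions for illustration; CONCAT() is used instead of ||, which is logical OR under the default sql_mode:

            -- Step 1: cheap existence check against the (indexed) hash column.
            SELECT COUNT(*) FROM files
             WHERE some_hash_field = CONCAT(@file_length, ';', @file_hash);

            -- Step 2: only if the count is greater than 0, fetch the candidate rows and
            -- compare the actual content on the client side, because different files can
            -- still share the same length and hash value (a collision).
            SELECT id, content FROM files
             WHERE some_hash_field = CONCAT(@file_length, ';', @file_hash);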


            rspadim roberto spadim added a comment -

            Hi Sergei, any idea where the patch from that attempt was saved? I want to see if it's easy to implement, but I don't know how to start.

            serg Sergei Golubchik added a comment -

            No, I don't. And anyway, if I were to do it, I would rather start from scratch than from some old incomplete patch.

            rspadim roberto spadim added a comment -

            OK, at least a hash index/unique index could help a lot.

            rspadim roberto spadim added a comment -

            Should this be done in each storage engine, or are indexes a general "feature" of MariaDB?

            serg Sergei Golubchik added a comment -

            In each storage engine. In particular, MyISAM and Aria almost support this already, and the "attempt" I was referring to was exactly about making them support it fully.

            rspadim roberto spadim added a comment -

            Hmm, nice. So the first step is to make MyISAM and Aria blob indexes possible, and InnoDB and the others are a second step, right?

            smit_hinsu smit hinsu added a comment -

            Hi Sergei,

            I am interested in working on this feature as part of GSoC. From a Google search it seems that this feature is really important, as many people report problems related to having BLOB/TEXT as a primary key or creating an index on it.

            I have good experience with databases, but I am currently new to the MariaDB source code. I would appreciate it if you could help me get started.

            Thanks

            rjasdfiii Rick James added a comment -

            How does this handle COLLATION of a large TEXT field? Is there a way to "hash" while honoring complex UTF-* case and accent handling?


            serg Sergei Golubchik added a comment -

            It should do that automatically. But that part has a bug, so it doesn't always work; reported as MDEV-27653.
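
            For illustration, a sketch of the expected collation-aware behaviour (the table and values are hypothetical; see MDEV-27653 for cases where this currently misbehaves):

            CREATE TABLE t1 (
              txt TEXT COLLATE utf8mb4_general_ci,
              UNIQUE KEY (txt)
            );
            INSERT INTO t1 VALUES ('abc');
            -- Expected to fail with a duplicate-key error: the hash should be computed
            -- according to the column collation, under which 'ABC' equals 'abc'.
            INSERT INTO t1 VALUES ('ABC');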

            marko Marko Mäkelä added a comment -

            For the record, the following patch should disable the MDEV-371 functionality:

            diff --git a/sql/sql_table.cc b/sql/sql_table.cc
            index 6e8a4795f21..f8f3eefc114 100644
            --- a/sql/sql_table.cc
            +++ b/sql/sql_table.cc
            @@ -2441,6 +2441,8 @@ static inline void make_long_hash_field_name(LEX_CSTRING *buf, uint num)
             static Create_field * add_hash_field(THD * thd, List<Create_field> *create_list,
                                                   KEY *key_info)
             {
            +  my_error(ER_TOO_LONG_KEY, MYF(0), 1000);
            +  return nullptr;
               List_iterator<Create_field> it(*create_list);
               Create_field *dup_field, *cf= new (thd->mem_root) Create_field();
               cf->flags|= UNSIGNED_FLAG | LONG_UNIQUE_HASH_FIELD;
            

            This may be useful for testing, because there are many open bugs related to indexed virtual columns, and MDEV-371 is internally creating hidden indexed virtual columns.
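
            For example, with the patch above applied, a table definition that would normally get a hidden hash column should instead be rejected (illustrative table name):

            CREATE TABLE t1 (b BLOB, UNIQUE KEY (b));
            -- Expected: ERROR 1071 (42000): Specified key was too long; max key length is 1000 bytes
            -- (ER_TOO_LONG_KEY with the hard-coded 1000 from the patch), instead of the
            -- hidden indexed virtual hash column being created.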


            People

              Assignee: sachin.setiya.007 Sachin Setiya (Inactive)
              Reporter: rspadim roberto spadim
              Votes: 1
              Watchers: 10

