[MDEV-31072] InnoDB is USING HASH and Optimizer is confused Created: 2023-04-18  Updated: 2024-01-29

Status: Open
Project: MariaDB Server
Component/s: Documentation, Optimizer
Fix Version/s: N/A

Type: Task Priority: Major
Reporter: Oli Sennhauser Assignee: Ian Gilfillan
Resolution: Unresolved Votes: 0
Labels: UNIQUE, documentation, hash, index, optimizer

Issue Links:
PartOf
Relates
relates to MDEV-371 Unique indexes for blobs Closed
relates to MDEV-13445 Hash-Index Type for InnoDB Open
relates to MDEV-22756 SQL Error (1364): Field 'DB_ROW_HASH_... Closed

 Description   

MariaDB KB states: "HASH is only supported by the MEMORY storage engine."

Sources:

When we create a table like this:

CREATE TABLE `aaa_device` (
  `device_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `device_token` varchar(750) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  `device_app` varchar(24) NOT NULL,
  `device_app_version` smallint(5) unsigned DEFAULT NULL,
  `device_platform` varchar(16) NOT NULL,
  `device_user_agent` varchar(512) DEFAULT NULL,
  `device_push_enabled` tinyint(1) unsigned DEFAULT NULL,
  `device_name` varchar(64) DEFAULT NULL,
  `pg_id` smallint(5) unsigned DEFAULT NULL,
  `device_user_last` int(10) unsigned DEFAULT NULL,
  `device_user_last_logged_in` int(10) unsigned DEFAULT NULL,
  `device_cookie` text DEFAULT NULL,
  `device_session_id` varchar(32) DEFAULT NULL,
  `device_insert` datetime NOT NULL DEFAULT current_timestamp(),
  `device_update` datetime NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  `device_restart` datetime DEFAULT NULL,
  `device_use` datetime NOT NULL DEFAULT current_timestamp(),
  `device_cookie_len` int(10) unsigned GENERATED ALWAYS AS (octet_length(`device_cookie`)) STORED,
  PRIMARY KEY (`device_id`),
  UNIQUE KEY `ucTokenApp` (`device_token`,`device_app`),
  KEY `device_app_device_app_version` (`device_app`,`device_app_version`),
  KEY `device_user_last` (`device_user_last`),
  KEY `device_user_last_logged_in` (`device_user_last_logged_in`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci
;

MariaDB converts this to this:

SHOW CREATE TABLE aaa_device\G

CREATE TABLE `aaa_device` (
  `device_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `device_token` varchar(750) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
  `device_app` varchar(24) NOT NULL,
  `device_app_version` smallint(5) unsigned DEFAULT NULL,
  `device_platform` varchar(16) NOT NULL,
  `device_user_agent` varchar(512) DEFAULT NULL,
  `device_push_enabled` tinyint(1) unsigned DEFAULT NULL,
  `device_name` varchar(64) DEFAULT NULL,
  `pg_id` smallint(5) unsigned DEFAULT NULL,
  `device_user_last` int(10) unsigned DEFAULT NULL,
  `device_user_last_logged_in` int(10) unsigned DEFAULT NULL,
  `device_cookie` text DEFAULT NULL,
  `device_session_id` varchar(32) DEFAULT NULL,
  `device_insert` datetime NOT NULL DEFAULT current_timestamp(),
  `device_update` datetime NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
  `device_restart` datetime DEFAULT NULL,
  `device_use` datetime NOT NULL DEFAULT current_timestamp(),
  `device_cookie_len` int(10) unsigned GENERATED ALWAYS AS (octet_length(`device_cookie`)) STORED,
  PRIMARY KEY (`device_id`),
  UNIQUE KEY `ucTokenApp` (`device_token`,`device_app`) USING HASH,
  KEY `device_app_device_app_version` (`device_app`,`device_app_version`),
  KEY `device_user_last` (`device_user_last`),
  KEY `device_user_last_logged_in` (`device_user_last_logged_in`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci
;

Note: UNIQUE KEY `ucTokenApp` (`device_token`,`device_app`) USING HASH,

So far so good, I can live with it... But: After inserting some data:

Error formatting macro: code: java.lang.StackOverflowError

INSERT INTO `aaa_device` (`device_id`, `device_token`, `device_app`, `device_app_version`, `device_platform`, `device_user_agent`, `device_push_enabled`, `device_name`, `pg_id`, `device_user_last`, `device_user_last_logged_in`, `device_cookie`, `device_session_id`, `device_insert`, `device_update`, `device_restart`, `device_use`, `device_cookie_len`)
VALUES
(5, 'cpM6TlH5-VI:APA91bG5gEyiw8psmZGQzYCPFgYQU0TNsCFb5jeivYkLybTlL5p6844yrqBZVsGZENsSxTvZwb9MyTItT5oHz0eCWjvC42oixOm6HIKg-IPPi-Mt1bJRTdd0p-0FVKzppC3BTAvqITzg', 'pcco', NULL, 'ios', NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, '2020-02-19 13:28:25', '2022-07-20 19:05:49', NULL, '2022-07-16 14:39:04', NULL)
, (8, 'dJMAAFywpGk:APA91bELVRD74PdZG5Vd8flXGeE0teS3iEzTL3vaW17S4CCBSeNQpb1CPU32s7fGfa35KLEQEuq1-W8SWssghQu26u5uDPrHiMGzHjisnGKZp_7l5F3q_uMAtZEjqUioHiGCFXws-r0v', 'pcco', 202, 'ios', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148', 1, 'Apple-iPad5,3', 679, 43204, 43204, '{\"auth\":

{\"authBy\":\"login\",\"login\":\"hans.meier@t-online.de\",\"password\":\"402c9fe50e9e3ff6dc01742df5fff05c\"}

,\"club\":\"129\",\"device_id\":\"8\",\"lang\":\"de\",\"visitor_uid\":\"79f3eca97c2ad28230521288429f8bf8\"}', 'vf3cvvBb1hZmDx7sVaDl2wUPYV', '2020-02-19 13:28:53', '2023-04-16 08:31:36', '2023-04-16 08:30:50', '2023-04-16 08:31:36', 200)
, (11, 'dQHXAXyQFwM:APA91bG7pY8qOtHdLTKJ41uhi87mwBEy2RybQyHy1nLNyGAW7oBWE4TZZ2HHhLabLxtRHmk1CBMibLOGeKp0M1sGGB-BkCqFdZ1UImvxcI3BXFuUO4V3PrN8M2VdFeKHkPYnkXeypoJC', 'pcco', NULL, 'ios', NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, '2020-02-19 13:29:00', '2022-07-20 19:05:49', NULL, '2022-07-16 14:39:04', NULL)
, (14, 'edtFfXjgpRk:APA91bEKHLjwAW2RWb7dMwWgT3eRlmyaWYMlbXLhLrlkhFj2dgh-hOp3lb1nLMgEEqXwV6k_zb5qduMQcZoSLDo-NmYjvdl2CIeyHzElhBRbmGo5RE-WFH5oL6JacgJ-wZEhEWlOzjaU', 'pcco', NULL, 'android', NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, '2020-02-19 13:29:10', '2022-07-20 19:05:49', NULL, '2022-07-16 14:39:04', NULL)
, (17, 'evOBMlbektw:APA91bFfkMTua4aYNcOxNpjmVCq9VCMnqvJjC5bll5cxpQR9MT7BKdMeOYXj1kBYYpPLEUgRZcmXfD9kguu0kvnybmbC4jaW86s59ynlKXAuKjT3eGQ2Pk7pWc_gs6eIrFAyYAkf8_wL', 'pcco', NULL, 'android', NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, '2020-02-19 13:29:11', '2022-07-20 19:05:49', NULL, '2022-07-16 14:39:04', NULL)
, (20, 'eGx98rjhVa4:APA91bEdCz9XGtmhasDlXnUilCFiBE2KHyc6oAHqvAITAaDd-e-ntnlM5rrqo7KY7BZapBh26rsWW_zy3T0wIwf2nAnEEoyrXXrDoLogBZ0GcBvSJi9mdEKfWot4XusqYZv11JI_ldah', 'pcco', NULL, 'android', NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, '2020-02-19 13:29:12', '2022-07-20 19:05:49', NULL, '2022-07-16 14:39:04', NULL)
, (23, 'f5cBwRzHlcE:APA91bFsms_dQ1XY5NjjrTq-y0nzqArkR7u7mZ02IA5fejxsDWmoTZbclkzcRaNzRO3LEqMfQXGJZG-tswAnOKtgk2If73sAqezXOA_azqem4JjtSwONjYOYeEHqdbzeiQBuqJoladXS', 'pcco', 202, 'ios', 'Mozilla/5.0 (iPhone; CPU iPhone OS 15_7_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148', 1, 'Apple-iPhone8,4', 679, 433213, 433213, '{\"auth\":

{\"authBy\":\"login\",\"login\":\"fritz.müller@web.de\",\"password\":\"6469e8701df05ecd7736343296a383a8\"}

,\"club\":\"694\",\"device_id\":\"23\",\"lang\":\"de\",\"visitor_uid\":\"b7519c67ad0320908d418c5a05ef3dfb\"}', 'ulBUTArKfWqKOwUWeD7lEGC4nW', '2020-02-19 13:29:18', '2023-02-14 17:01:02', '2023-02-14 17:00:26', '2023-02-14 17:01:02', 187)
;

The Query Execution Plan looks completely wrong:

EXPLAIN
SELECT * FROM aaa_device
 WHERE device_app = 'pcco'
   AND device_token = 'f5cBwRzHlcE:APA91bFsms_dQ1XY5NjjrTq-y0nzqArkR7u7mZ02IA5fejxsDWmoTZbclkzcRaNzRO3LEqMfQXGJZG-tswAnOKtgk2If73sAqezXOA_azqem4JjtSwONjYOYeEHqdbzeiQBuqJoladXS'
;
+------+-------------+------------+------+------------------------------------------+------+---------+------+------+-------------+
| id   | select_type | table      | type | possible_keys                            | key  | key_len | ref  | rows | Extra       |
+------+-------------+------------+------+------------------------------------------+------+---------+------+------+-------------+
|    1 | SIMPLE      | aaa_device | ALL  | ucTokenApp,device_app_device_app_version | NULL | NULL    | NULL | 7    | Using where |
+------+-------------+------------+------+------------------------------------------+------+---------+------+------+-------------+

After changing the field length of device_token from 750 to 700 VARCHAR:

ALTER TABLE aaa_device MODIFY COLUMN `device_token` varchar(700) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL;

The USING HASH disappeared! And the Optimizer did the right choice:

 
EXPLAIN
SELECT * FROM aaa_device
 WHERE device_app = 'pcco'
   AND device_token = 'f5cBwRzHlcE:APA91bFsms_dQ1XY5NjjrTq-y0nzqArkR7u7mZ02IA5fejxsDWmoTZbclkzcRaNzRO3LEqMfQXGJZG-tswAnOKtgk2If73sAqezXOA_azqem4JjtSwONjYOYeEHqdbzeiQBuqJoladXS'
;
+------+-------------+------------+-------+------------------------------------------+------------+---------+-------------+------+-------+
| id   | select_type | table      | type  | possible_keys                            | key        | key_len | ref         | rows | Extra |
+------+-------------+------------+-------+------------------------------------------+------------+---------+-------------+------+-------+
|    1 | SIMPLE      | aaa_device | const | ucTokenApp,device_app_device_app_version | ucTokenApp | 2900    | const,const | 1    |       |
+------+-------------+------------+-------+------------------------------------------+------------+---------+-------------+------+-------+

So either there is something wrong in the documentation and/or the Optimizer does something really odd...

Tested it reproducible with 10.11.2 and 10.6.11.

Bonus information:

SQL> INSERT INTO aaa_device (device_id, device_token, device_app, device_platform)  select device_id, device_token, device_app, device_platform from test.aaa_device limit 100;
ERROR 1364 (HY000): Field 'DB_ROW_HASH_1' doesn't have a default value



 Comments   
Comment by Oli Sennhauser [ 2023-04-18 ]

Other bonus info:

SQL> alter table aaa_device drop index ucTokenApp, add unique index `ucTokenApp` (`device_token`,`device_app`) USING BTREE;
ERROR 1071 (42000): Specified key was too long; max key length is 3072 bytes

Comment by Marko Mäkelä [ 2023-04-18 ]

MDEV-371 implemented a kind of "fake hash index" by defining a B-tree indexed hidden virtual column, using a hash function on a BLOB column. It might also have enabled generic "hash index" for InnoDB using the same technique.

Can you check via information_schema.innodb_sys_virtual if that could be the case here? If you see any DB_ROW_HASH_ columns listed, those should be created by MDEV-371.

Comment by Oli Sennhauser [ 2023-04-18 ]

select * from INNODB_SYS_TABLES AS ist join innodb_sys_virtual AS isv on isv.table_id = ist.table_id;
+----------+-----------------+------+--------+-------+------------+---------------+------------+----------+-------+----------+
| TABLE_ID | NAME            | FLAG | N_COLS | SPACE | ROW_FORMAT | ZIP_PAGE_SIZE | SPACE_TYPE | TABLE_ID | POS   | BASE_POS |
+----------+-----------------+------+--------+-------+------------+---------------+------------+----------+-------+----------+
|      393 | test/aaa_device |   33 |     21 |   380 | Dynamic    |             0 | Single     |      393 | 65554 |        1 |
|      393 | test/aaa_device |   33 |     21 |   380 | Dynamic    |             0 | Single     |      393 | 65554 |        2 |
+----------+-----------------+------+--------+-------+------------+---------------+------------+----------+-------+----------+

select * from INNODB_SYS_COLUMNS where table_id = 393;
+----------+----------------------------+-------+-------+----------+------+
| TABLE_ID | NAME                       | POS   | MTYPE | PRTYPE   | LEN  |
+----------+----------------------------+-------+-------+----------+------+
|      393 | device_id                  |     0 |     6 |     1795 |    4 |
|      393 | device_token               |     1 |    12 |  3019023 | 3000 |
|      393 | device_app                 |     2 |    12 | 16122127 |   96 |
|      393 | device_app_version         |     3 |     6 |     1538 |    2 |
|      393 | device_platform            |     4 |    12 | 16122127 |   64 |
|      393 | device_user_agent          |     5 |    12 | 16125967 | 2048 |
|      393 | device_push_enabled        |     6 |     6 |     1537 |    1 |
|      393 | device_name                |     7 |    12 | 16125967 |  256 |
|      393 | pg_id                      |     8 |     6 |     1538 |    2 |
|      393 | device_user_last           |     9 |     6 |     1539 |    4 |
|      393 | device_user_last_logged_in |    10 |     6 |     1539 |    4 |
|      393 | device_cookie              |    11 |     5 | 16122108 |   10 |
|      393 | device_session_id          |    12 |    12 | 16121871 |  128 |
|      393 | device_insert              |    13 |     3 |   525580 |    5 |
|      393 | device_update              |    14 |     3 |   525580 |    5 |
|      393 | device_restart             |    15 |     3 |   525324 |    5 |
|      393 | device_use                 |    16 |     3 |   525580 |    5 |
|      393 | device_cookie_len          |    17 |     6 |     1539 |    4 |
|      393 | DB_ROW_HASH_1              | 65554 |     6 |     9992 |    8 |
+----------+----------------------------+-------+-------+----------+------+

Comment by Sergei Golubchik [ 2023-05-26 ]

At the moment optimizer can only use indexes that say USING HASH in the MEMORY engine.

Generated at Thu Feb 08 10:21:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.