Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-14698

mariadb hangs on a simple query with a dependent sub query innodb

Details

    • 5.5.59

    Description

      explain select distinct zip_code from croutes where type not like 'PO%' and zip_code not in (select distinct zip_code from carrier_routes);
      +------+--------------------+----------------+-------+-----------------------------+-----------------------------+---------+------+--------+-------------------------------------------+
      | id   | select_type        | table          | type  | possible_keys               | key                         | key_len | ref  | rows   | Extra                                     |
      +------+--------------------+----------------+-------+-----------------------------+-----------------------------+---------+------+--------+-------------------------------------------+
      |    1 | PRIMARY            | croutes        | index | NULL                        | IDX_croutes_TYPE            | 67      | NULL | 433960 | Using where; Using index; Using temporary |
      |    2 | DEPENDENT SUBQUERY | carrier_routes | index | IDX_carrier_routes_ZIP_CODE | IDX_carrier_routes_ZIP_CODE | 10      | NULL | 231089 | Using where; Using index                  |
      

      This proves that the tables are properly indexed. The tables are actually smalll, 30K+ records. Yet the query hangs forever and "show processlist" says "Sending data" on the column "state".

      show variables like 'innodb%'; 
      +---------------------------------------------+-------------------------------+
      | Variable_name                               | Value                         |
      +---------------------------------------------+-------------------------------+
      | innodb_adaptive_flushing                    | ON                            |
      | innodb_adaptive_flushing_lwm                | 10.000000                     |
      | innodb_adaptive_hash_index                  | ON                            |
      | innodb_adaptive_hash_index_partitions       | 8                             |
      | innodb_adaptive_hash_index_parts            | 8                             |
      | innodb_adaptive_max_sleep_delay             | 150000                        |
      | innodb_autoextend_increment                 | 64                            |
      | innodb_autoinc_lock_mode                    | 1                             |
      | innodb_background_scrub_data_check_interval | 3600                          |
      | innodb_background_scrub_data_compressed     | OFF                           |
      | innodb_background_scrub_data_interval       | 604800                        |
      | innodb_background_scrub_data_uncompressed   | OFF                           |
      | innodb_buf_dump_status_frequency            | 0                             |
      | innodb_buffer_pool_chunk_size               | 134217728                     |
      | innodb_buffer_pool_dump_at_shutdown         | OFF                           |
      | innodb_buffer_pool_dump_now                 | OFF                           |
      | innodb_buffer_pool_dump_pct                 | 25                            |
      | innodb_buffer_pool_filename                 | ib_buffer_pool                |
      | innodb_buffer_pool_instances                | 16                            |
      | innodb_buffer_pool_load_abort               | OFF                           |
      | innodb_buffer_pool_load_at_startup          | OFF                           |
      | innodb_buffer_pool_load_now                 | OFF                           |
      | innodb_buffer_pool_populate                 | OFF                           |
      | innodb_buffer_pool_size                     | 10737418240                   |
      | innodb_change_buffer_max_size               | 25                            |
      | innodb_change_buffering                     | all                           |
      | innodb_checksum_algorithm                   | crc32                         |
      | innodb_checksums                            | ON                            |
      | innodb_cleaner_lsn_age_factor               | DEPRECATED                    |
      | innodb_cmp_per_index_enabled                | OFF                           |
      | innodb_commit_concurrency                   | 0                             |
      | innodb_compression_algorithm                | zlib                          |
      | innodb_compression_default                  | OFF                           |
      | innodb_compression_failure_threshold_pct    | 5                             |
      | innodb_compression_level                    | 6                             |
      | innodb_compression_pad_pct_max              | 50                            |
      | innodb_concurrency_tickets                  | 5000                          |
      | innodb_corrupt_table_action                 | deprecated                    |
      | innodb_data_file_path                       | ibdata1:12M:autoextend        |
      | innodb_data_home_dir                        |                               |
      | innodb_deadlock_detect                      | ON                            |
      | innodb_default_encryption_key_id            | 1                             |
      | innodb_default_row_format                   | dynamic                       |
      | innodb_defragment                           | OFF                           |
      | innodb_defragment_fill_factor               | 0.900000                      |
      | innodb_defragment_fill_factor_n_recs        | 20                            |
      | innodb_defragment_frequency                 | 40                            |
      | innodb_defragment_n_pages                   | 7                             |
      | innodb_defragment_stats_accuracy            | 0                             |
      | innodb_disable_sort_file_cache              | OFF                           |
      | innodb_disallow_writes                      | OFF                           |
      | innodb_doublewrite                          | OFF                           |
      | innodb_empty_free_list_algorithm            | DEPRECATED                    |
      | innodb_encrypt_log                          | OFF                           |
      | innodb_encrypt_tables                       | OFF                           |
      | innodb_encryption_rotate_key_age            | 1                             |
      | innodb_encryption_rotation_iops             | 100                           |
      | innodb_encryption_threads                   | 0                             |
      | innodb_fake_changes                         | OFF                           |
      | innodb_fast_shutdown                        | 1                             |
      | innodb_fatal_semaphore_wait_threshold       | 600                           |
      | innodb_file_format                          | Barracuda                     |
      | innodb_file_format_check                    | ON                            |
      | innodb_file_format_max                      | Barracuda                     |
      | innodb_file_per_table                       | ON                            |
      | innodb_fill_factor                          | 100                           |
      | innodb_flush_log_at_timeout                 | 120                           |
      | innodb_flush_log_at_trx_commit              | 0                             |
      | innodb_flush_method                         | O_DIRECT                      |
      | innodb_flush_neighbors                      | 1                             |
      | innodb_flush_sync                           | ON                            |
      | innodb_flushing_avg_loops                   | 30                            |
      | innodb_force_load_corrupted                 | OFF                           |
      | innodb_force_primary_key                    | OFF                           |
      | innodb_force_recovery                       | 0                             |
      | innodb_foreground_preflush                  | DEPRECATED                    |
      | innodb_ft_aux_table                         |                               |
      | innodb_ft_cache_size                        | 8000000                       |
      | innodb_ft_enable_diag_print                 | OFF                           |
      | innodb_ft_enable_stopword                   | ON                            |
      | innodb_ft_max_token_size                    | 84                            |
      | innodb_ft_min_token_size                    | 3                             |
      | innodb_ft_num_word_optimize                 | 2000                          |
      | innodb_ft_result_cache_limit                | 2000000000                    |
      | innodb_ft_server_stopword_table             |                               |
      | innodb_ft_sort_pll_degree                   | 2                             |
      | innodb_ft_total_cache_size                  | 640000000                     |
      | innodb_ft_user_stopword_table               |                               |
      | innodb_idle_flush_pct                       | 100                           |
      | innodb_immediate_scrub_data_uncompressed    | OFF                           |
      | innodb_instrument_semaphores                | OFF                           |
      | innodb_io_capacity                          | 200                           |
      | innodb_io_capacity_max                      | 35000                         |
      | innodb_kill_idle_transaction                | 0                             |
      | innodb_large_prefix                         | ON                            |
      | innodb_lock_schedule_algorithm              | vats                          |
      | innodb_lock_wait_timeout                    | 50                            |
      | innodb_locking_fake_changes                 | OFF                           |
      | innodb_locks_unsafe_for_binlog              | OFF                           |
      | innodb_log_arch_dir                         |                               |
      | innodb_log_arch_expire_sec                  | 0                             |
      | innodb_log_archive                          | OFF                           |
      | innodb_log_block_size                       | 0                             |
      | innodb_log_buffer_size                      | 16777216                      |
      | innodb_log_checksum_algorithm               | DEPRECATED                    |
      | innodb_log_checksums                        | ON                            |
      | innodb_log_compressed_pages                 | ON                            |
      | innodb_log_file_size                        | 50331648                      |
      | innodb_log_files_in_group                   | 2                             |
      | innodb_log_group_home_dir                   | ./                            |
      | innodb_log_write_ahead_size                 | 8192                          |
      | innodb_lru_scan_depth                       | 2000                          |
      | innodb_max_bitmap_file_size                 | 0                             |
      | innodb_max_changed_pages                    | 0                             |
      | innodb_max_dirty_pages_pct                  | 75.000000                     |
      | innodb_max_dirty_pages_pct_lwm              | 0.000000                      |
      | innodb_max_purge_lag                        | 0                             |
      | innodb_max_purge_lag_delay                  | 0                             |
      | innodb_max_undo_log_size                    | 10485760                      |
      | innodb_mirrored_log_groups                  | 0                             |
      | innodb_monitor_disable                      |                               |
      | innodb_monitor_enable                       |                               |
      | innodb_monitor_reset                        |                               |
      | innodb_monitor_reset_all                    |                               |
      | innodb_mtflush_threads                      | 8                             |
      | innodb_old_blocks_pct                       | 37                            |
      | innodb_old_blocks_time                      | 1000                          |
      | innodb_online_alter_log_max_size            | 134217728                     |
      | innodb_open_files                           | 2000                          |
      | innodb_optimize_fulltext_only               | OFF                           |
      | innodb_page_cleaners                        | 1                             |
      | innodb_page_size                            | 16384                         |
      | innodb_prefix_index_cluster_optimization    | OFF                           |
      | innodb_print_all_deadlocks                  | OFF                           |
      | innodb_purge_batch_size                     | 300                           |
      | innodb_purge_rseg_truncate_frequency        | 128                           |
      | innodb_purge_threads                        | 4                             |
      | innodb_random_read_ahead                    | OFF                           |
      | innodb_read_ahead_threshold                 | 56                            |
      | innodb_read_io_threads                      | 32                            |
      | innodb_read_only                            | OFF                           |
      | innodb_replication_delay                    | 0                             |
      | innodb_rollback_on_timeout                  | OFF                           |
      | innodb_rollback_segments                    | 128                           |
      | innodb_sched_priority_cleaner               | 0                             |
      | innodb_scrub_log                            | OFF                           |
      | innodb_scrub_log_speed                      | 256                           |
      | innodb_show_locks_held                      | 0                             |
      | innodb_show_verbose_locks                   | 0                             |
      | innodb_sort_buffer_size                     | 1048576                       |
      | innodb_spin_wait_delay                      | 6                             |
      | innodb_stats_auto_recalc                    | ON                            |
      | innodb_stats_include_delete_marked          | OFF                           |
      | innodb_stats_method                         | nulls_equal                   |
      | innodb_stats_modified_counter               | 0                             |
      | innodb_stats_on_metadata                    | OFF                           |
      | innodb_stats_persistent                     | ON                            |
      | innodb_stats_persistent_sample_pages        | 20                            |
      | innodb_stats_sample_pages                   | 8                             |
      | innodb_stats_traditional                    | ON                            |
      | innodb_stats_transient_sample_pages         | 8                             |
      | innodb_status_output                        | OFF                           |
      | innodb_status_output_locks                  | OFF                           |
      | innodb_strict_mode                          | ON                            |
      | innodb_support_xa                           | ON                            |
      | innodb_sync_array_size                      | 1                             |
      | innodb_sync_spin_loops                      | 30                            |
      | innodb_table_locks                          | ON                            |
      | innodb_temp_data_file_path                  | ibtmp1:12M:autoextend:max:10G |
      | innodb_thread_concurrency                   | 32                            |
      | innodb_thread_sleep_delay                   | 0                             |
      | innodb_tmpdir                               |                               |
      | innodb_track_changed_pages                  | OFF                           |
      | innodb_track_redo_log_now                   | OFF                           |
      | innodb_undo_directory                       | ./                            |
      | innodb_undo_log_truncate                    | OFF                           |
      | innodb_undo_logs                            | 128                           |
      | innodb_undo_tablespaces                     | 0                             |
      | innodb_use_atomic_writes                    | ON                            |
      | innodb_use_fallocate                        | OFF                           |
      | innodb_use_global_flush_log_at_trx_commit   | OFF                           |
      | innodb_use_mtflush                          | OFF                           |
      | innodb_use_native_aio                       | ON                            |
      | innodb_use_stacktrace                       | OFF                           |
      | innodb_use_trim                             | ON                            |
      | innodb_version                              | 5.7.20                        |
      | innodb_write_io_threads                     | 8                             |
      +---------------------------------------------+-------------------------------+
      

      I can give access to Elena so she can verify the issue.

      Attachments

        Activity

          I need to report that MySQL 5.7 has the same exact bug. So it is not a MariaDB bug.
          I have an independent machine with MySQL 5.7 and the tables if somebody needs to file a MySQL bug and needs a test-bed already installed. Alternatively, you may download my tables and reproduce it. The information thereof is not secret but it is proprietary, has commercial value. I trust you will never make it public.

          philip_38 Philip orleans added a comment - I need to report that MySQL 5.7 has the same exact bug. So it is not a MariaDB bug. I have an independent machine with MySQL 5.7 and the tables if somebody needs to file a MySQL bug and needs a test-bed already installed. Alternatively, you may download my tables and reproduce it. The information thereof is not secret but it is proprietary, has commercial value. I trust you will never make it public.
          alice Alice Sherepa added a comment -

          explain select distinct zip_code from croutes where type not like 'PO%' and zip_code not in (select distinct zip_code from carrier_routes);
          

          According to my testing the problem behaviour is a bug, which is triggered when columns have different collations. So I expect that query will be fast if you change collation of column or alter table to the same collation.

          Following testcase demonstrates the problem:

          create table t1 (z varchar(5), key (z)) DEFAULT CHARSET=utf8
          	as SELECT floor(rand()*10000) as z FROM seq_1_to_400000;
           
          create table t2 (z varchar(5), key (z))  DEFAULT CHARSET=latin1
          	as SELECT floor((rand()+0.5)*10000) as z FROM seq_1_to_200000;
           
          explain select distinct z from t1 left join t2 using (z) where t2.z is NULL;
           
          ### returns results after <2s with the same character set, but hangs with different.
          ### 
          select distinct z from t1 left join t2 using (z) where t2.z is NULL;
          select distinct z from t1 where  z not in (select distinct z from t2);
          

          If you need some further minor improvements, this could also be efficient:
          1) subquery can be rewritten as a LEFT JOIN
          2) not using DISTINCT clause in subquery
          3) not using AVG_ROW_LENGTH in table definition - AVG_ROW_LENGTH is the average rows size, and is only useful for tables using the FIXED format.

          select distinct zip_code from croutes left join carrier_routes using (zip_code) where carrier_routes.zip_code is NULL and croutes.type not like 'PO%';
          

          I need some more investigation to verify the bug.

          alice Alice Sherepa added a comment - explain select distinct zip_code from croutes where type not like 'PO%' and zip_code not in ( select distinct zip_code from carrier_routes); According to my testing the problem behaviour is a bug, which is triggered when columns have different collations. So I expect that query will be fast if you change collation of column or alter table to the same collation. Following testcase demonstrates the problem: create table t1 (z varchar (5), key (z)) DEFAULT CHARSET=utf8 as SELECT floor(rand()*10000) as z FROM seq_1_to_400000;   create table t2 (z varchar (5), key (z)) DEFAULT CHARSET=latin1 as SELECT floor((rand()+0.5)*10000) as z FROM seq_1_to_200000;   explain select distinct z from t1 left join t2 using (z) where t2.z is NULL ;   ### returns results after <2s with the same character set , but hangs with different. ### select distinct z from t1 left join t2 using (z) where t2.z is NULL ; select distinct z from t1 where z not in ( select distinct z from t2); If you need some further minor improvements, this could also be efficient: 1) subquery can be rewritten as a LEFT JOIN 2) not using DISTINCT clause in subquery 3) not using AVG_ROW_LENGTH in table definition - AVG_ROW_LENGTH is the average rows size, and is only useful for tables using the FIXED format. select distinct zip_code from croutes left join carrier_routes using (zip_code) where carrier_routes.zip_code is NULL and croutes.type not like 'PO%' ; I need some more investigation to verify the bug.

          I confirm that once I changed the character set to Latin1, which involves the collation latin1_swedish_ci, the query works.
          Thanks to Alice Sherepa for figuring this out.

          philip_38 Philip orleans added a comment - I confirm that once I changed the character set to Latin1, which involves the collation latin1_swedish_ci, the query works. Thanks to Alice Sherepa for figuring this out.
          alice Alice Sherepa added a comment -

          testcase, that suitable for 5.5

           
          drop if exists table t1, t2;
          create table t1 (z varchar(25), key (z)) DEFAULT CHARSET=utf8
          	as SELECT floor(rand()*10000) as z FROM mysql.help_topic;
           
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
           
           
          create table t2 (z varchar(5), key (z))  DEFAULT CHARSET=latin1
          	as SELECT floor(rand()*10000) as z FROM mysql.help_topic;
           
          insert into t2 	SELECT floor(rand()*10000) FROM t2;
          insert into t2 	SELECT floor(rand()*10000) FROM t2;
          insert into t2 	SELECT floor(rand()*10000) FROM t2;
          insert into t2 	SELECT floor(rand()*10000) FROM t2;
          insert into t2 	SELECT floor(rand()*10000) FROM t2;
          insert into t2 	SELECT floor(rand()*10000) FROM t2;
          insert into t2 	SELECT floor(rand()*10000) FROM t2;
          insert into t2 	SELECT floor(rand()*10000) FROM t2;
           
          explain select distinct z from t1 left join t2 using (z) where t2.z is NULL;
           
          ### returns results after <2s with the same character set, but hangs with different.
          select distinct z from t1 left join t2 using (z) where t2.z is NULL;
          select distinct z from t1 where  z not in (select distinct z from t2);
          

          alice Alice Sherepa added a comment - testcase, that suitable for 5.5   drop if exists table t1, t2; create table t1 (z varchar (25), key (z)) DEFAULT CHARSET=utf8 as SELECT floor(rand()*10000) as z FROM mysql.help_topic;   insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1;     create table t2 (z varchar (5), key (z)) DEFAULT CHARSET=latin1 as SELECT floor(rand()*10000) as z FROM mysql.help_topic;   insert into t2 SELECT floor(rand()*10000) FROM t2; insert into t2 SELECT floor(rand()*10000) FROM t2; insert into t2 SELECT floor(rand()*10000) FROM t2; insert into t2 SELECT floor(rand()*10000) FROM t2; insert into t2 SELECT floor(rand()*10000) FROM t2; insert into t2 SELECT floor(rand()*10000) FROM t2; insert into t2 SELECT floor(rand()*10000) FROM t2; insert into t2 SELECT floor(rand()*10000) FROM t2;   explain select distinct z from t1 left join t2 using (z) where t2.z is NULL ;   ### returns results after <2s with the same character set , but hangs with different. select distinct z from t1 left join t2 using (z) where t2.z is NULL ; select distinct z from t1 where z not in ( select distinct z from t2);
          bar Alexander Barkov added a comment - - edited

          The problem is also repeatable with a similar script without using mysql.help_topic, using a smaller amount of records:

          DROP TABLE IF EXISTS t1, t2, t3;
          CREATE TABLE t1 (z varchar(25), key (z)) DEFAULT CHARSET=utf8;
          INSERT INTO t1 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15);
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          insert into t1 	SELECT floor(rand()*10000) FROM t1;
          SELECT COUNT(*) FROM t1;
           
          CREATE TABLE t2 (z varchar(25), key (z)) DEFAULT CHARSET=utf8;
          INSERT INTO t2 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15);
          insert into t2 	SELECT floor(rand()*10000) FROM t1;
          insert into t2 	SELECT floor(rand()*10000) FROM t1;
          insert into t2 	SELECT floor(rand()*10000) FROM t1;
          insert into t2 	SELECT floor(rand()*10000) FROM t1;
          insert into t2 	SELECT floor(rand()*10000) FROM t1;
          insert into t2 	SELECT floor(rand()*10000) FROM t1;
          insert into t2 	SELECT floor(rand()*10000) FROM t1;
          SELECT COUNT(*) FROM t2;
           
          CREATE TABLE t3 (z VARCHAR(25) CHARACTER SET latin1, KEY(z)) AS SELECT * FROM t2;
          

          Now the query without character set conversion needs around 0.06 seconds:

          SELECT DISTINCT z FROM t1 LEFT JOIN t2 USING (z) WHERE t2.z IS NULL;
          

          The query with character set conversion needs 13.22 seconds:

          SELECT DISTINCT z FROM t1 LEFT JOIN t3 USING (z) WHERE t3.z IS NULL;
          

          So it does not hung. It just becomes slow when character set conversion takes place.

          The difference happens because in case of character set conversion the query gets effectively rewritten as:

          SELECT DISTINCT t1.z FROM t1 LEFT JOIN t3 ON t1.z=CONVERT(t3.z USING utf8) WHERE t3.z IS NULL;
          

          so it cannot use the index on t3.z any more.

          These explains prove this:

          # No character set conversion
          EXPLAIN SELECT DISTINCT z FROM t1 LEFT JOIN t2 USING (z) WHERE t2.z IS NULL;
          

          +------+-------------+-------+-------+---------------+------+---------+-----------+------+------------------------------------+
          | id   | select_type | table | type  | possible_keys | key  | key_len | ref       | rows | Extra                              |
          +------+-------------+-------+-------+---------------+------+---------+-----------+------+------------------------------------+
          |    1 | SIMPLE      | t1    | index | NULL          | z    | 78      | NULL      | 2048 | Using index; Using temporary       |
          |    1 | SIMPLE      | t2    | ref   | z             | z    | 78      | test.t1.z |   10 | Using where; Using index; Distinct |
          +------+-------------+-------+-------+---------------+------+---------+-----------+------+------------------------------------+
          

          # Implicit character set conversion
          EXPLAIN SELECT DISTINCT z FROM t1 LEFT JOIN t3 USING (z) WHERE t3.z IS NULL;
          

          +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+
          | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows  | Extra                                                        |
          +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+
          |    1 | SIMPLE      | t1    | index | NULL          | z    | 78      | NULL |  2048 | Using index; Using temporary                                 |
          |    1 | SIMPLE      | t3    | index | NULL          | z    | 28      | NULL | 14352 | Using where; Using index; Using join buffer (flat, BNL join) |
          +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+
          

          # Explicit character set conversion
          EXPLAIN SELECT DISTINCT t1.z FROM t1 LEFT JOIN t3 ON t1.z=CONVERT(t3.z USING utf8) WHERE t3.z IS NULL;
          

          +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+
          | id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows  | Extra                                                        |
          +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+
          |    1 | SIMPLE      | t1    | index | NULL          | z    | 78      | NULL |  2048 | Using index; Using temporary                                 |
          |    1 | SIMPLE      | t3    | index | NULL          | z    | 28      | NULL | 14352 | Using where; Using index; Using join buffer (flat, BNL join) |
          +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+
          

          Notice:

          • The first EXPLAIN tells that the index is used.
          • The second EXPLAIN (with implicit character set conversion) does not use the index.
          • The third EXPLAIN (with explicit character set conversion) does not use the index.

          I'd say this behaviour is expected. But the optimizer can be in theory improved to catch such cases.

          bar Alexander Barkov added a comment - - edited The problem is also repeatable with a similar script without using mysql.help_topic, using a smaller amount of records: DROP TABLE IF EXISTS t1, t2, t3; CREATE TABLE t1 (z varchar (25), key (z)) DEFAULT CHARSET=utf8; INSERT INTO t1 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15); insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; insert into t1 SELECT floor(rand()*10000) FROM t1; SELECT COUNT (*) FROM t1;   CREATE TABLE t2 (z varchar (25), key (z)) DEFAULT CHARSET=utf8; INSERT INTO t2 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15); insert into t2 SELECT floor(rand()*10000) FROM t1; insert into t2 SELECT floor(rand()*10000) FROM t1; insert into t2 SELECT floor(rand()*10000) FROM t1; insert into t2 SELECT floor(rand()*10000) FROM t1; insert into t2 SELECT floor(rand()*10000) FROM t1; insert into t2 SELECT floor(rand()*10000) FROM t1; insert into t2 SELECT floor(rand()*10000) FROM t1; SELECT COUNT (*) FROM t2;   CREATE TABLE t3 (z VARCHAR (25) CHARACTER SET latin1, KEY (z)) AS SELECT * FROM t2; Now the query without character set conversion needs around 0.06 seconds: SELECT DISTINCT z FROM t1 LEFT JOIN t2 USING (z) WHERE t2.z IS NULL ; The query with character set conversion needs 13.22 seconds: SELECT DISTINCT z FROM t1 LEFT JOIN t3 USING (z) WHERE t3.z IS NULL ; So it does not hung. It just becomes slow when character set conversion takes place. The difference happens because in case of character set conversion the query gets effectively rewritten as: SELECT DISTINCT t1.z FROM t1 LEFT JOIN t3 ON t1.z= CONVERT (t3.z USING utf8) WHERE t3.z IS NULL ; so it cannot use the index on t3.z any more. These explains prove this: # No character set conversion EXPLAIN SELECT DISTINCT z FROM t1 LEFT JOIN t2 USING (z) WHERE t2.z IS NULL ; +------+-------------+-------+-------+---------------+------+---------+-----------+------+------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+-----------+------+------------------------------------+ | 1 | SIMPLE | t1 | index | NULL | z | 78 | NULL | 2048 | Using index; Using temporary | | 1 | SIMPLE | t2 | ref | z | z | 78 | test.t1.z | 10 | Using where; Using index; Distinct | +------+-------------+-------+-------+---------------+------+---------+-----------+------+------------------------------------+ # Implicit character set conversion EXPLAIN SELECT DISTINCT z FROM t1 LEFT JOIN t3 USING (z) WHERE t3.z IS NULL ; +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+ | 1 | SIMPLE | t1 | index | NULL | z | 78 | NULL | 2048 | Using index; Using temporary | | 1 | SIMPLE | t3 | index | NULL | z | 28 | NULL | 14352 | Using where; Using index; Using join buffer (flat, BNL join) | +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+ # Explicit character set conversion EXPLAIN SELECT DISTINCT t1.z FROM t1 LEFT JOIN t3 ON t1.z= CONVERT (t3.z USING utf8) WHERE t3.z IS NULL ; +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+ | 1 | SIMPLE | t1 | index | NULL | z | 78 | NULL | 2048 | Using index; Using temporary | | 1 | SIMPLE | t3 | index | NULL | z | 28 | NULL | 14352 | Using where; Using index; Using join buffer (flat, BNL join) | +------+-------------+-------+-------+---------------+------+---------+------+-------+--------------------------------------------------------------+ Notice: The first EXPLAIN tells that the index is used. The second EXPLAIN (with implicit character set conversion) does not use the index. The third EXPLAIN (with explicit character set conversion) does not use the index. I'd say this behaviour is expected. But the optimizer can be in theory improved to catch such cases.

          People

            bar Alexander Barkov
            philip_38 Philip orleans
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.