MariaDB Server / MDEV-9097

Engine independent statistics only allows utf8 values, making query plans fail

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: 10.0.16, 10.0.22, 10.0 (EOL), 10.1 (EOL)
    • Fix Version/s: 10.0.23
    • Component/s: Optimizer
    • Sprint: 10.1.9-3

    Description

      When use_stat_tables = PREFERABLY is set and ANALYZE TABLE is then run on a table with an indexed column containing UTF-8 characters outside MySQL's 3-byte 'utf8' subset, some query plans can be altered, sometimes with horrible performance consequences.

      What is worse, EXPLAIN does not show the new query plan produced by the broken statistics, but the original one; only after execution can we see, from both the query latency and the Handler statistics, that a very different plan was actually used.
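      The mismatch can be seen by comparing EXPLAIN's estimate against the session handler counters after the query has actually run (a sketch; the predicates are simplified from the full query below and the counter values are illustrative):

      ```sql
      -- EXPLAIN still reports the optimistic plan, e.g. "rows: 51".
      EXPLAIN SELECT rc_id FROM recentchanges
      WHERE rc_new = 1 ORDER BY rc_timestamp DESC LIMIT 51;

      -- The handler counters tell the real story.
      FLUSH STATUS;                        -- reset session counters
      SELECT rc_id FROM recentchanges
      WHERE rc_new = 1 ORDER BY rc_timestamp DESC LIMIT 51;
      SHOW SESSION STATUS LIKE 'Handler_read%';
      -- Handler_read_next in the millions means a full index scan,
      -- regardless of what EXPLAIN claimed.
      ```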

      How to reproduce:


      CREATE DATABASE zhwiki;
      use zhwiki;
      CREATE TABLE `recentchanges` (
      `rc_id` int(8) NOT NULL AUTO_INCREMENT,
      `rc_timestamp` varbinary(14) NOT NULL DEFAULT '',
      `rc_cur_time` varbinary(14) NOT NULL DEFAULT '',
      `rc_user` int(10) unsigned NOT NULL DEFAULT '0',
      `rc_user_text` varbinary(255) NOT NULL DEFAULT '',
      `rc_namespace` int(11) NOT NULL DEFAULT '0',
      `rc_title` varbinary(255) NOT NULL DEFAULT '',
      `rc_comment` varbinary(255) NOT NULL DEFAULT '',
      `rc_minor` tinyint(3) unsigned NOT NULL DEFAULT '0',
      `rc_bot` tinyint(3) unsigned NOT NULL DEFAULT '0',
      `rc_new` tinyint(3) unsigned NOT NULL DEFAULT '0',
      `rc_cur_id` int(10) unsigned NOT NULL DEFAULT '0',
      `rc_this_oldid` int(10) unsigned NOT NULL DEFAULT '0',
      `rc_last_oldid` int(10) unsigned NOT NULL DEFAULT '0',
      `rc_type` tinyint(3) unsigned NOT NULL DEFAULT '0',
      `rc_source` varbinary(16) NOT NULL DEFAULT '',
      `rc_moved_to_ns` tinyint(3) unsigned NOT NULL DEFAULT '0',
      `rc_moved_to_title` varbinary(255) NOT NULL DEFAULT '',
      `rc_patrolled` tinyint(3) unsigned NOT NULL DEFAULT '0',
      `rc_ip` varbinary(40) NOT NULL DEFAULT '',
      `rc_old_len` int(10) DEFAULT NULL,
      `rc_new_len` int(10) DEFAULT NULL,
      `rc_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
      `rc_logid` int(10) unsigned NOT NULL DEFAULT '0',
      `rc_log_type` varbinary(255) DEFAULT NULL,
      `rc_log_action` varbinary(255) DEFAULT NULL,
      `rc_params` blob NOT NULL,
      PRIMARY KEY (`rc_id`),
      KEY `rc_timestamp` (`rc_timestamp`),
      KEY `rc_namespace_title` (`rc_namespace`,`rc_title`),
      KEY `rc_cur_id` (`rc_cur_id`),
      KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`),
      KEY `rc_ip` (`rc_ip`),
      KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`),
      KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`)
      ) ENGINE=InnoDB AUTO_INCREMENT=52457042 DEFAULT CHARSET=binary

      -- Note the charset binary; this allows any character encoding

      INSERT INTO `recentchanges` VALUES (51267205,'20151023001221','',0,'Reinheitsgebot',0,'𩷶鯰科','添加语言链接:[[:specieswiki:Pangasiidae]]',1,1,0,566064,36471314,36471314,5,'wb',0,'',1,'',1254,1254,0,0,NULL,'','a:3:{s:20:\"wikibase-repo-change\";a:14:{s:2:\"id\";i:259555851;s:4:\"type\";s:20:\"wikibase-item~update\";s:4:\"time\";s:14:\"20151023001221\";s:9:\"object_id\";s:7:\"q901080\";s:7:\"user_id\";s:6:\"940976\";s:11:\"revision_id\";s:9:\"261218043\";s:11:\"entity_type\";s:4:\"item\";s:9:\"user_text\";s:14:\"Reinheitsgebot\";s:7:\"page_id\";i:851866;s:9:\"parent_id\";i:255341522;s:7:\"comment\";s:49:\"/* wbsetsitelink-add:1|specieswiki */ Pangasiidae\";s:6:\"rev_id\";i:261218043;s:3:\"bot\";i:1;s:7:\"site_id\";s:12:\"wikidatawiki\";}s:7:\"comment\";s:49:\"添加语言链接:[[:specieswiki:Pangasiidae]]\";s:12:\"comment-html\";s:123:\"添加语言链接:<a class=\"external\" href=\"//www.wikidata.org/wiki/Specieswiki:Pangasiidae\">specieswiki:Pangasiidae</a>\";}');

      SET GLOBAL use_stat_tables=PREFERABLY;

      -- reload the connection for the change to take effect in your current session

      MariaDB [zhwiki]> ANALYZE TABLE recentchanges;
      +----------------------+---------+----------+---------------------------------------------------------------------------------------+
      | Table                | Op      | Msg_type | Msg_text                                                                              |
      +----------------------+---------+----------+---------------------------------------------------------------------------------------+
      | zhwiki.recentchanges | analyze | status   | Engine-independent statistics collected                                               |
      | zhwiki.recentchanges | analyze | Warning  | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'min_value' at row 1 |
      | zhwiki.recentchanges | analyze | Warning  | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'max_value' at row 1 |
      | zhwiki.recentchanges | analyze | status   | OK                                                                                    |
      +----------------------+---------+----------+---------------------------------------------------------------------------------------+

      We can confirm that the statistics tables indeed have the wrong encoding:

      CREATE TABLE `column_stats` (
      `db_name` varchar(64) COLLATE utf8_bin NOT NULL,
      `table_name` varchar(64) COLLATE utf8_bin NOT NULL,
      `column_name` varchar(64) COLLATE utf8_bin NOT NULL,
      `min_value` varchar(255) COLLATE utf8_bin DEFAULT NULL,
      `max_value` varchar(255) COLLATE utf8_bin DEFAULT NULL,
      `nulls_ratio` decimal(12,4) DEFAULT NULL,
      `avg_length` decimal(12,4) DEFAULT NULL,
      `avg_frequency` decimal(12,4) DEFAULT NULL,
      `hist_size` tinyint(3) unsigned DEFAULT NULL,
      `hist_type` enum('SINGLE_PREC_HB','DOUBLE_PREC_HB') COLLATE utf8_bin DEFAULT NULL,
      `histogram` varbinary(255) DEFAULT NULL,
      PRIMARY KEY (`db_name`,`table_name`,`column_name`)
      ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin COMMENT='Statistics on Columns'
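
      The stored values can be inspected directly to confirm the truncation (a suggested check, not part of the original report):

      ```sql
      -- What did ANALYZE actually store for the varbinary columns?
      SELECT column_name, HEX(min_value) AS min_hex, HEX(max_value) AS max_hex
      FROM mysql.column_stats
      WHERE db_name = 'zhwiki' AND table_name = 'recentchanges';
      -- The 4-byte sequence F0 A9 B7 B6 cannot be represented in the
      -- 3-byte utf8 charset of min_value/max_value, so the stored
      -- values end up truncated or mangled.
      ```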

      With 1.4 million rows, a simple query like:

      FLUSH STATUS;
      SELECT /* ApiQueryRecentChanges::run SANITIZED */ rc_id,rc_timestamp,rc_namespace,rc_title,rc_cur_id,rc_type,rc_deleted,rc_this_oldid,rc_last_oldid,rc_user,rc_user_text,ts_tags,page_is_redirect FROM `recentchanges` LEFT JOIN `tag_summary` ON ((rc_id=ts_rc_id)) LEFT JOIN `page` ON ((rc_namespace=page_namespace) AND (rc_title=page_title)) WHERE rc_namespace = '0' AND rc_type = '1' AND (rc_patrolled = 0) AND (page_is_redirect = 0 OR page_is_redirect IS NULL) AND rc_new=1 ORDER BY rc_timestamp DESC,rc_id DESC LIMIT 51;
      SHOW STATUS like 'Hand%';

      goes from reading 3,000 rows to reading the full table (a full index scan on the timestamp index): 1.4 million rows.

      As a workaround, disable use_stat_tables.
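
      Concretely, the workaround plus a cleanup of the already-corrupted rows could look like this (the DELETE is a suggested extra step, not from the report):

      ```sql
      SET GLOBAL use_stat_tables = NEVER;  -- stop consulting the EITS tables

      -- Optionally purge the corrupted statistics so that re-enabling
      -- use_stat_tables later does not pick them up again.
      DELETE FROM mysql.column_stats
      WHERE db_name = 'zhwiki' AND table_name = 'recentchanges';
      ```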

      Three things went wrong here:

      • Most value-related columns of the stats tables should be utf8mb4, or probably binary, since binary columns can be indexed too.
      • Updates to the statistics tables should be executed transactionally: those are not warnings but errors, and they should roll back all updates to those tables.
      • EXPLAIN should show the real query plan (is it ignoring use_stat_tables?). With broken statistics it shows "rows: 51" when in reality 1.4 million rows are scanned.
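
      The first point could be expressed as a schema change along these lines (a sketch of the suggestion only, not the fix that actually shipped):

      ```sql
      -- Store the raw column bytes instead of forcing them through utf8,
      -- so any indexable value (including binary) round-trips intact.
      ALTER TABLE mysql.column_stats
        MODIFY min_value varbinary(255) DEFAULT NULL,
        MODIFY max_value varbinary(255) DEFAULT NULL;
      ```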

      Attachments

      Issue Links

      Activity

      jcrespo Jaime Crespo created issue
            jcrespo Jaime Crespo made changes -
            Field Original Value New Value
            Description When using use_stat_tables = PREFERABLY, and then run ANALYZE TABLE on a table with a an indexed column containing UTF-8 characters (but not mysql's 'utf8'), some query plans can be altered, sometimes with horrible performance consequences.

            What it is worse, EXPLAIN does not show the new query plan from broken statistics, but it shows the original query plan, while we can check after execution than we have a very different execution plan not only on the query latency but on the Handler statistics.

            How to reproduce:

            {code:sql}
            CREATE DATABASE zhwiki;
            use zhwiki;
            CREATE TABLE `recentchanges` (
              `rc_id` int(8) NOT NULL AUTO_INCREMENT,
              `rc_timestamp` varbinary(14) NOT NULL DEFAULT '',
              `rc_cur_time` varbinary(14) NOT NULL DEFAULT '',
              `rc_user` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_user_text` varbinary(255) NOT NULL DEFAULT '',
              `rc_namespace` int(11) NOT NULL DEFAULT '0',
              `rc_title` varbinary(255) NOT NULL DEFAULT '',
              `rc_comment` varbinary(255) NOT NULL DEFAULT '',
              `rc_minor` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_bot` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_new` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_cur_id` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_this_oldid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_last_oldid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_type` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_source` varbinary(16) NOT NULL DEFAULT '',
              `rc_moved_to_ns` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_moved_to_title` varbinary(255) NOT NULL DEFAULT '',
              `rc_patrolled` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_ip` varbinary(40) NOT NULL DEFAULT '',
              `rc_old_len` int(10) DEFAULT NULL,
              `rc_new_len` int(10) DEFAULT NULL,
              `rc_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
              `rc_logid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_log_type` varbinary(255) DEFAULT NULL,
              `rc_log_action` varbinary(255) DEFAULT NULL,
              `rc_params` blob NOT NULL,
              PRIMARY KEY (`rc_id`),
              KEY `rc_timestamp` (`rc_timestamp`),
              KEY `rc_namespace_title` (`rc_namespace`,`rc_title`),
              KEY `rc_cur_id` (`rc_cur_id`),
              KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`),
              KEY `rc_ip` (`rc_ip`),
              KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`),
              KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`)
            ) ENGINE=InnoDB AUTO_INCREMENT=52457042 DEFAULT CHARSET=binary

            -- Note the charset binary, this allows any character encoding

            INSERT INTO `recentchanges` VALUES (51267205,'20151023001221','',0,'Reinheitsgebot',0,'��鯰科','添加语言链接:[[:specieswiki:Pangasiidae]]',1,1,0,566064,36471314,36471314,5,'wb',0,'',1,'',1254,1254,0,0,NULL,'','a:3:{s:20:\"wikibase-repo-change\";a:14:{s:2:\"id\";i:259555851;s:4:\"type\";s:20:\"wikibase-item~update\";s:4:\"time\";s:14:\"20151023001221\";s:9:\"object_id\";s:7:\"q901080\";s:7:\"user_id\";s:6:\"940976\";s:11:\"revision_id\";s:9:\"261218043\";s:11:\"entity_type\";s:4:\"item\";s:9:\"user_text\";s:14:\"Reinheitsgebot\";s:7:\"page_id\";i:851866;s:9:\"parent_id\";i:255341522;s:7:\"comment\";s:49:\"/* wbsetsitelink-add:1|specieswiki */ Pangasiidae\";s:6:\"rev_id\";i:261218043;s:3:\"bot\";i:1;s:7:\"site_id\";s:12:\"wikidatawiki\";}s:7:\"comment\";s:49:\"添加语言链接:[[:specieswiki:Pangasiidae]]\";s:12:\"comment-html\";s:123:\"添加语言链接:<a class=\"external\" href=\"//www.wikidata.org/wiki/Specieswiki:Pangasiidae\">specieswiki:Pangasiidae</a>\";}');

            SET GLOBAL use_stat_tables=PREFERABLY;

            -- reload connection for change to take effect to your current session

            MariaDB [zhwiki]> ANALYZE TABLE recentchanges;
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            | Table | Op | Msg_type | Msg_text |
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            | zhwiki.recentchanges | analyze | status | Engine-independent statistics collected |
            | zhwiki.recentchanges | analyze | Warning | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'min_value' at row 1 |
            | zhwiki.recentchanges | analyze | Warning | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'max_value' at row 1 |
            | zhwiki.recentchanges | analyze | status | OK |
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            With 1.4 millons rows, a simple query like:

            FLUSH STATUS;
            SELECT /* ApiQueryRecentChanges::run SANITIZED */ rc_id,rc_timestamp,rc_namespace,rc_title,rc_cur_id,rc_type,rc_deleted,rc_this_oldid,rc_last_oldid,rc_user,rc_user_text,ts_tags,page_is_redirect FROM `recentchanges` LEFT JOIN `tag_summary` ON ((rc_id=ts_rc_id)) LEFT JOIN `page` ON ((rc_namespace=page_namespace) AND (rc_title=page_title)) WHERE rc_namespace = '0' AND rc_type = '1' AND (rc_patrolled = 0) AND (page_is_redirect = 0 OR page_is_redirect IS NULL) AND rc_new=1 ORDER BY rc_timestamp DESC,rc_id DESC LIMIT 51;
            SHOW STATUS like 'Hand%';

            goes from reading 3000 rows to read the full table (full index scan) using the timestamp index, 1.4 millon rows.
            {code}

            As a workaround, disable use_stat_tables.

            3 things went wrong here:

            * most value-related columns from the stats tables should be utf8mb4, probably binary, as binary columns can be indexed too
            * Execute updates of use_stat_tables transitionally. Those are not warnings, they are errors and should rollback all updates to those tables
            * EXPLAIN should show the real query plan (is it ignoring use_stat_tables?). When stats are broken, rows: 51 is shown, when in reality 1.4 million rows are scanned.
            When using use_stat_tables = PREFERABLY, and then run ANALYZE TABLE on a table with a an indexed column containing UTF-8 characters (but not mysql's 'utf8'), some query plans can be altered, sometimes with horrible performance consequences.

            What it is worse, EXPLAIN does not show the new query plan from broken statistics, but it shows the original query plan, while we can check after execution than we have a very different execution plan not only on the query latency but on the Handler statistics.

            How to reproduce:

            {code:sql}
            CREATE DATABASE zhwiki;
            use zhwiki;
            CREATE TABLE `recentchanges` (
              `rc_id` int(8) NOT NULL AUTO_INCREMENT,
              `rc_timestamp` varbinary(14) NOT NULL DEFAULT '',
              `rc_cur_time` varbinary(14) NOT NULL DEFAULT '',
              `rc_user` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_user_text` varbinary(255) NOT NULL DEFAULT '',
              `rc_namespace` int(11) NOT NULL DEFAULT '0',
              `rc_title` varbinary(255) NOT NULL DEFAULT '',
              `rc_comment` varbinary(255) NOT NULL DEFAULT '',
              `rc_minor` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_bot` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_new` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_cur_id` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_this_oldid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_last_oldid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_type` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_source` varbinary(16) NOT NULL DEFAULT '',
              `rc_moved_to_ns` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_moved_to_title` varbinary(255) NOT NULL DEFAULT '',
              `rc_patrolled` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_ip` varbinary(40) NOT NULL DEFAULT '',
              `rc_old_len` int(10) DEFAULT NULL,
              `rc_new_len` int(10) DEFAULT NULL,
              `rc_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
              `rc_logid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_log_type` varbinary(255) DEFAULT NULL,
              `rc_log_action` varbinary(255) DEFAULT NULL,
              `rc_params` blob NOT NULL,
              PRIMARY KEY (`rc_id`),
              KEY `rc_timestamp` (`rc_timestamp`),
              KEY `rc_namespace_title` (`rc_namespace`,`rc_title`),
              KEY `rc_cur_id` (`rc_cur_id`),
              KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`),
              KEY `rc_ip` (`rc_ip`),
              KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`),
              KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`)
            ) ENGINE=InnoDB AUTO_INCREMENT=52457042 DEFAULT CHARSET=binary

            -- Note the charset binary, this allows any character encoding

            INSERT INTO `recentchanges` VALUES (51267205,'20151023001221','',0,'Reinheitsgebot',0,'��鯰科','添加语言链接:[[:specieswiki:Pangasiidae]]',1,1,0,566064,36471314,36471314,5,'wb',0,'',1,'',1254,1254,0,0,NULL,'','a:3:{s:20:\"wikibase-repo-change\";a:14:{s:2:\"id\";i:259555851;s:4:\"type\";s:20:\"wikibase-item~update\";s:4:\"time\";s:14:\"20151023001221\";s:9:\"object_id\";s:7:\"q901080\";s:7:\"user_id\";s:6:\"940976\";s:11:\"revision_id\";s:9:\"261218043\";s:11:\"entity_type\";s:4:\"item\";s:9:\"user_text\";s:14:\"Reinheitsgebot\";s:7:\"page_id\";i:851866;s:9:\"parent_id\";i:255341522;s:7:\"comment\";s:49:\"/* wbsetsitelink-add:1|specieswiki */ Pangasiidae\";s:6:\"rev_id\";i:261218043;s:3:\"bot\";i:1;s:7:\"site_id\";s:12:\"wikidatawiki\";}s:7:\"comment\";s:49:\"添加语言链接:[[:specieswiki:Pangasiidae]]\";s:12:\"comment-html\";s:123:\"添加语言链接:<a class=\"external\" href=\"//www.wikidata.org/wiki/Specieswiki:Pangasiidae\">specieswiki:Pangasiidae</a>\";}');

            SET GLOBAL use_stat_tables=PREFERABLY;

            -- reload connection for change to take effect to your current session

            MariaDB [zhwiki]> ANALYZE TABLE recentchanges;
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            | Table | Op | Msg_type | Msg_text |
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            | zhwiki.recentchanges | analyze | status | Engine-independent statistics collected |
            | zhwiki.recentchanges | analyze | Warning | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'min_value' at row 1 |
            | zhwiki.recentchanges | analyze | Warning | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'max_value' at row 1 |
            | zhwiki.recentchanges | analyze | status | OK |
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            With 1.4 millons rows, a simple query like:

            FLUSH STATUS;
            SELECT /* ApiQueryRecentChanges::run SANITIZED */ rc_id,rc_timestamp,rc_namespace,rc_title,rc_cur_id,rc_type,rc_deleted,rc_this_oldid,rc_last_oldid,rc_user,rc_user_text,ts_tags,page_is_redirect FROM `recentchanges` LEFT JOIN `tag_summary` ON ((rc_id=ts_rc_id)) LEFT JOIN `page` ON ((rc_namespace=page_namespace) AND (rc_title=page_title)) WHERE rc_namespace = '0' AND rc_type = '1' AND (rc_patrolled = 0) AND (page_is_redirect = 0 OR page_is_redirect IS NULL) AND rc_new=1 ORDER BY rc_timestamp DESC,rc_id DESC LIMIT 51;
            SHOW STATUS like 'Hand%';

            goes from reading 3000 rows to read the full table (full index scan) using the timestamp index, 1.4 millon rows.
            {code}

            As a workaround, disable use_stat_tables.

            3 things went wrong here:

            * most value-related columns from the stats tables should be utf8mb4, probably binary, as binary columns can be indexed too
            * Execute updates of use_stat_tables transactionally. Those are not warnings, they are errors and should rollback all updates to those tables
            * EXPLAIN should show the real query plan (is it ignoring use_stat_tables?). When stats are broken, rows: 51 is shown, when in reality 1.4 million rows are scanned.
            jcrespo Jaime Crespo made changes -
            Description When using use_stat_tables = PREFERABLY, and then run ANALYZE TABLE on a table with a an indexed column containing UTF-8 characters (but not mysql's 'utf8'), some query plans can be altered, sometimes with horrible performance consequences.

            What it is worse, EXPLAIN does not show the new query plan from broken statistics, but it shows the original query plan, while we can check after execution than we have a very different execution plan not only on the query latency but on the Handler statistics.

            How to reproduce:

            {code:sql}
            CREATE DATABASE zhwiki;
            use zhwiki;
            CREATE TABLE `recentchanges` (
              `rc_id` int(8) NOT NULL AUTO_INCREMENT,
              `rc_timestamp` varbinary(14) NOT NULL DEFAULT '',
              `rc_cur_time` varbinary(14) NOT NULL DEFAULT '',
              `rc_user` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_user_text` varbinary(255) NOT NULL DEFAULT '',
              `rc_namespace` int(11) NOT NULL DEFAULT '0',
              `rc_title` varbinary(255) NOT NULL DEFAULT '',
              `rc_comment` varbinary(255) NOT NULL DEFAULT '',
              `rc_minor` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_bot` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_new` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_cur_id` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_this_oldid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_last_oldid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_type` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_source` varbinary(16) NOT NULL DEFAULT '',
              `rc_moved_to_ns` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_moved_to_title` varbinary(255) NOT NULL DEFAULT '',
              `rc_patrolled` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_ip` varbinary(40) NOT NULL DEFAULT '',
              `rc_old_len` int(10) DEFAULT NULL,
              `rc_new_len` int(10) DEFAULT NULL,
              `rc_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
              `rc_logid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_log_type` varbinary(255) DEFAULT NULL,
              `rc_log_action` varbinary(255) DEFAULT NULL,
              `rc_params` blob NOT NULL,
              PRIMARY KEY (`rc_id`),
              KEY `rc_timestamp` (`rc_timestamp`),
              KEY `rc_namespace_title` (`rc_namespace`,`rc_title`),
              KEY `rc_cur_id` (`rc_cur_id`),
              KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`),
              KEY `rc_ip` (`rc_ip`),
              KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`),
              KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`)
            ) ENGINE=InnoDB AUTO_INCREMENT=52457042 DEFAULT CHARSET=binary

            -- Note the charset binary, this allows any character encoding

            INSERT INTO `recentchanges` VALUES (51267205,'20151023001221','',0,'Reinheitsgebot',0,'��鯰科','添加语言链接:[[:specieswiki:Pangasiidae]]',1,1,0,566064,36471314,36471314,5,'wb',0,'',1,'',1254,1254,0,0,NULL,'','a:3:{s:20:\"wikibase-repo-change\";a:14:{s:2:\"id\";i:259555851;s:4:\"type\";s:20:\"wikibase-item~update\";s:4:\"time\";s:14:\"20151023001221\";s:9:\"object_id\";s:7:\"q901080\";s:7:\"user_id\";s:6:\"940976\";s:11:\"revision_id\";s:9:\"261218043\";s:11:\"entity_type\";s:4:\"item\";s:9:\"user_text\";s:14:\"Reinheitsgebot\";s:7:\"page_id\";i:851866;s:9:\"parent_id\";i:255341522;s:7:\"comment\";s:49:\"/* wbsetsitelink-add:1|specieswiki */ Pangasiidae\";s:6:\"rev_id\";i:261218043;s:3:\"bot\";i:1;s:7:\"site_id\";s:12:\"wikidatawiki\";}s:7:\"comment\";s:49:\"添加语言链接:[[:specieswiki:Pangasiidae]]\";s:12:\"comment-html\";s:123:\"添加语言链接:<a class=\"external\" href=\"//www.wikidata.org/wiki/Specieswiki:Pangasiidae\">specieswiki:Pangasiidae</a>\";}');

            SET GLOBAL use_stat_tables=PREFERABLY;

            -- reload connection for change to take effect to your current session

            MariaDB [zhwiki]> ANALYZE TABLE recentchanges;
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            | Table | Op | Msg_type | Msg_text |
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            | zhwiki.recentchanges | analyze | status | Engine-independent statistics collected |
            | zhwiki.recentchanges | analyze | Warning | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'min_value' at row 1 |
            | zhwiki.recentchanges | analyze | Warning | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'max_value' at row 1 |
            | zhwiki.recentchanges | analyze | status | OK |
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            With 1.4 millons rows, a simple query like:

            FLUSH STATUS;
            SELECT /* ApiQueryRecentChanges::run SANITIZED */ rc_id,rc_timestamp,rc_namespace,rc_title,rc_cur_id,rc_type,rc_deleted,rc_this_oldid,rc_last_oldid,rc_user,rc_user_text,ts_tags,page_is_redirect FROM `recentchanges` LEFT JOIN `tag_summary` ON ((rc_id=ts_rc_id)) LEFT JOIN `page` ON ((rc_namespace=page_namespace) AND (rc_title=page_title)) WHERE rc_namespace = '0' AND rc_type = '1' AND (rc_patrolled = 0) AND (page_is_redirect = 0 OR page_is_redirect IS NULL) AND rc_new=1 ORDER BY rc_timestamp DESC,rc_id DESC LIMIT 51;
            SHOW STATUS like 'Hand%';

            goes from reading 3000 rows to read the full table (full index scan) using the timestamp index, 1.4 millon rows.
            {code}

            As a workaround, disable use_stat_tables.

            3 things went wrong here:

            * most value-related columns from the stats tables should be utf8mb4, probably binary, as binary columns can be indexed too
            * Execute updates of use_stat_tables transactionally. Those are not warnings, they are errors and should rollback all updates to those tables
            * EXPLAIN should show the real query plan (is it ignoring use_stat_tables?). When stats are broken, rows: 51 is shown, when in reality 1.4 million rows are scanned.
            When using use_stat_tables = PREFERABLY, and then run ANALYZE TABLE on a table with a an indexed column containing UTF-8 characters (but not mysql's 'utf8'), some query plans can be altered, sometimes with horrible performance consequences.

            What it is worse, EXPLAIN does not show the new query plan from broken statistics, but it shows the original query plan, while we can check after execution than we have a very different execution plan not only on the query latency but on the Handler statistics.

            How to reproduce:

            {code:sql}
            CREATE DATABASE zhwiki;
            use zhwiki;
            CREATE TABLE `recentchanges` (
              `rc_id` int(8) NOT NULL AUTO_INCREMENT,
              `rc_timestamp` varbinary(14) NOT NULL DEFAULT '',
              `rc_cur_time` varbinary(14) NOT NULL DEFAULT '',
              `rc_user` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_user_text` varbinary(255) NOT NULL DEFAULT '',
              `rc_namespace` int(11) NOT NULL DEFAULT '0',
              `rc_title` varbinary(255) NOT NULL DEFAULT '',
              `rc_comment` varbinary(255) NOT NULL DEFAULT '',
              `rc_minor` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_bot` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_new` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_cur_id` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_this_oldid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_last_oldid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_type` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_source` varbinary(16) NOT NULL DEFAULT '',
              `rc_moved_to_ns` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_moved_to_title` varbinary(255) NOT NULL DEFAULT '',
              `rc_patrolled` tinyint(3) unsigned NOT NULL DEFAULT '0',
              `rc_ip` varbinary(40) NOT NULL DEFAULT '',
              `rc_old_len` int(10) DEFAULT NULL,
              `rc_new_len` int(10) DEFAULT NULL,
              `rc_deleted` tinyint(1) unsigned NOT NULL DEFAULT '0',
              `rc_logid` int(10) unsigned NOT NULL DEFAULT '0',
              `rc_log_type` varbinary(255) DEFAULT NULL,
              `rc_log_action` varbinary(255) DEFAULT NULL,
              `rc_params` blob NOT NULL,
              PRIMARY KEY (`rc_id`),
              KEY `rc_timestamp` (`rc_timestamp`),
              KEY `rc_namespace_title` (`rc_namespace`,`rc_title`),
              KEY `rc_cur_id` (`rc_cur_id`),
              KEY `new_name_timestamp` (`rc_new`,`rc_namespace`,`rc_timestamp`),
              KEY `rc_ip` (`rc_ip`),
              KEY `rc_ns_usertext` (`rc_namespace`,`rc_user_text`),
              KEY `rc_user_text` (`rc_user_text`,`rc_timestamp`)
            ) ENGINE=InnoDB AUTO_INCREMENT=52457042 DEFAULT CHARSET=binary

            -- Note the charset binary, this allows any character encoding

            INSERT INTO `recentchanges` VALUES (51267205,'20151023001221','',0,'Reinheitsgebot',0,'��鯰科','添加语言链接:[[:specieswiki:Pangasiidae]]',1,1,0,566064,36471314,36471314,5,'wb',0,'',1,'',1254,1254,0,0,NULL,'','a:3:{s:20:\"wikibase-repo-change\";a:14:{s:2:\"id\";i:259555851;s:4:\"type\";s:20:\"wikibase-item~update\";s:4:\"time\";s:14:\"20151023001221\";s:9:\"object_id\";s:7:\"q901080\";s:7:\"user_id\";s:6:\"940976\";s:11:\"revision_id\";s:9:\"261218043\";s:11:\"entity_type\";s:4:\"item\";s:9:\"user_text\";s:14:\"Reinheitsgebot\";s:7:\"page_id\";i:851866;s:9:\"parent_id\";i:255341522;s:7:\"comment\";s:49:\"/* wbsetsitelink-add:1|specieswiki */ Pangasiidae\";s:6:\"rev_id\";i:261218043;s:3:\"bot\";i:1;s:7:\"site_id\";s:12:\"wikidatawiki\";}s:7:\"comment\";s:49:\"添加语言链接:[[:specieswiki:Pangasiidae]]\";s:12:\"comment-html\";s:123:\"添加语言链接:<a class=\"external\" href=\"//www.wikidata.org/wiki/Specieswiki:Pangasiidae\">specieswiki:Pangasiidae</a>\";}');

            SET GLOBAL use_stat_tables=PREFERABLY;

            -- reload connection for change to take effect to your current session

            MariaDB [zhwiki]> ANALYZE TABLE recentchanges;
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            | Table | Op | Msg_type | Msg_text |
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+
            | zhwiki.recentchanges | analyze | status | Engine-independent statistics collected |
            | zhwiki.recentchanges | analyze | Warning | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'min_value' at row 1 |
            | zhwiki.recentchanges | analyze | Warning | Incorrect string value: '\xF0\xA9\xB7\xB6\xE9\xAF...' for column 'max_value' at row 1 |
            | zhwiki.recentchanges | analyze | status | OK |
            +----------------------+---------+----------+---------------------------------------------------------------------------------------+

            We can check that effectively, the tables have the wrong encoding:

            CREATE TABLE `column_stats` (
              `db_name` varchar(64) COLLATE utf8_bin NOT NULL,
              `table_name` varchar(64) COLLATE utf8_bin NOT NULL,
              `column_name` varchar(64) COLLATE utf8_bin NOT NULL,
              `min_value` varchar(255) COLLATE utf8_bin DEFAULT NULL,
              `max_value` varchar(255) COLLATE utf8_bin DEFAULT NULL,
              `nulls_ratio` decimal(12,4) DEFAULT NULL,
              `avg_length` decimal(12,4) DEFAULT NULL,
              `avg_frequency` decimal(12,4) DEFAULT NULL,
              `hist_size` tinyint(3) unsigned DEFAULT NULL,
              `hist_type` enum('SINGLE_PREC_HB','DOUBLE_PREC_HB') COLLATE utf8_bin DEFAULT NULL,
              `histogram` varbinary(255) DEFAULT NULL,
              PRIMARY KEY (`db_name`,`table_name`,`column_name`)
            ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin COMMENT='Statistics on Columns'

            With 1.4 million rows, a simple query like:

            FLUSH STATUS;
            SELECT /* ApiQueryRecentChanges::run SANITIZED */ rc_id,rc_timestamp,rc_namespace,rc_title,rc_cur_id,rc_type,rc_deleted,rc_this_oldid,rc_last_oldid,rc_user,rc_user_text,ts_tags,page_is_redirect FROM `recentchanges` LEFT JOIN `tag_summary` ON ((rc_id=ts_rc_id)) LEFT JOIN `page` ON ((rc_namespace=page_namespace) AND (rc_title=page_title)) WHERE rc_namespace = '0' AND rc_type = '1' AND (rc_patrolled = 0) AND (page_is_redirect = 0 OR page_is_redirect IS NULL) AND rc_new=1 ORDER BY rc_timestamp DESC,rc_id DESC LIMIT 51;
            SHOW STATUS like 'Hand%';

            goes from reading 3000 rows to reading the full table (a full index scan on the timestamp index), 1.4 million rows.

            As a workaround, disable use_stat_tables.

            3 things went wrong here:

            * Most value-related columns of the stats tables should be utf8mb4, or probably binary, as binary columns can be indexed too
            * Execute updates of the stats tables transactionally. Those are not warnings, they are errors, and should roll back all updates to those tables
            * EXPLAIN should show the real query plan (is it ignoring use_stat_tables?). When stats are broken, rows: 51 is shown, when in reality 1.4 million rows are scanned.
            danblack Daniel Black added a comment -

            Agree. It's an annoyance.

            danblack Daniel Black made changes -
            elenst Elena Stepanova made changes -
            Fix Version/s 10.0 [ 16000 ]
            Fix Version/s 10.1 [ 16100 ]
            Affects Version/s 10.0 [ 16000 ]
            Affects Version/s 10.1 [ 16100 ]
            Assignee Sergei Petrunia [ psergey ]
            Labels eits

            most value-related columns from the stats tables should be utf8mb4, probably binary, as binary columns can be indexed too

            Agree. They should be varbinary, as binary is the only charset that allows storing arbitrary values.

            Execute updates of use_stat_tables transactionally. Those are not warnings, they are errors and should rollback all updates to those tables

            This is a bit more complex as currently tables in mysql.* are mostly MyISAM tables.

            EXPLAIN should show the real query plan (is it ignoring use_stat_tables?). When stats are broken, rows: 51 is shown, when in reality 1.4 million rows are scanned.

            Tried with a smaller example, and I can observe the problem in EXPLAIN, too. The query optimizer uses EITS data depending on use_stat_tables / optimizer_use_condition_selectivity settings, it is done in the same way in EXPLAIN and regular queries.

            Maybe, you're hitting a problem with EXPLAIN being different from actual execution due to ORDER BY ... LIMIT problems? There were such bugs in MySQL/MariaDB. (Do I understand correctly that recentchanges table is not included in wikipedia database dumps?)

            Debugged to see what happens when we fail to save min_value and/or max_value.

            If we fail to save both, the server assumes that both min_value and max_value are empty. This makes a predicate

            tbl.col BETWEEN 'bar' and 'foo'

            look like it is highly selective.

            The reason for that is that we get either 0 or 1 when computing range endpoint's position in get_column_range_cardinality:

                    min_mp_pos= field->pos_in_interval(col_stats->min_value,
                                                       col_stats->max_value);

            and we again get either 0 or 1 for the other endpoint. Then

                  if (!hist->is_available())
                    sel= (max_mp_pos - min_mp_pos);
                  else
                    sel= hist->range_selectivity(min_mp_pos, max_mp_pos);

            Without a histogram, we get sel=0.

                  res= col_non_nulls * sel;
                  set_if_bigger(res, col_stats->get_avg_frequency());

            and here we get col_stats->get_avg_frequency() as selectivity (for ANY meaningful range condition).

            That is, failure to save min_value or max_value in the table causes the optimizer to vastly over-estimate selectivity of any range condition. This can lead to bad query plans.
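The fallback can be sketched outside the server. Below is a toy Python model (not the MariaDB source; `pos_in_interval` here is a crude, hypothetical stand-in for `Field::pos_in_interval`) of the logic quoted above: with truncated, empty min/max values both range endpoints map to the same position, `sel` collapses to 0, and the estimate degrades to `avg_frequency`, i.e. roughly one row:

```python
# Toy model of get_column_range_cardinality's no-histogram path.
# Assumption: pos_in_interval is a crude linear stand-in, just enough
# to show the failure mode described in the comment above.

def pos_in_interval(value: str, min_value: str, max_value: str) -> float:
    """Rough position of value inside [min_value, max_value], clamped to [0, 1]."""
    if value <= min_value:
        return 0.0
    if value >= max_value:
        return 1.0
    # crude interpolation on the first character, enough for the demo
    lo, hi = ord(min_value[0]), ord(max_value[0]) + 1
    return (ord(value[0]) - lo) / (hi - lo)

def range_cardinality(lo_key, hi_key, min_value, max_value,
                      col_non_nulls, avg_frequency):
    min_mp_pos = pos_in_interval(lo_key, min_value, max_value)
    max_mp_pos = pos_in_interval(hi_key, min_value, max_value)
    sel = max_mp_pos - min_mp_pos       # !hist->is_available() branch
    res = col_non_nulls * sel
    return max(res, avg_frequency)      # set_if_bigger(res, avg_frequency)

# Healthy statistics: BETWEEN 'bar' AND 'foo' covers a chunk of the table.
healthy = range_cardinality('bar', 'foo', 'aaa', 'zzz', 1_400_000, 1.2)  # ~215k rows

# Broken statistics: empty min/max makes both endpoints coincide,
# sel becomes 0, and the estimate falls back to avg_frequency.
broken = range_cardinality('bar', 'foo', '', '', 1_400_000, 1.2)  # → 1.2
```

With healthy statistics the range is estimated at a few hundred thousand rows; with the broken statistics it is estimated at about one row, which matches the over-optimistic plans observed in this bug.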

            jcrespo Jaime Crespo added a comment -

            I am hot-patching the mysql table, converting it to binary (I know this may have problems on mysql_upgrade, but I need a workaround now). I wanted it to be utf8(mb4), because a varchar binary is essentially a varbinary, but that doesn't work for me.

            SET [GLOBAL] use_stat_tables = 2;
            MariaDB PRODUCTION s2 localhost zhwiki > ANALYZE TABLE recentchanges;
            +----------------------+---------+----------+-----------------------------------------+
            | Table                | Op      | Msg_type | Msg_text                                |
            +----------------------+---------+----------+-----------------------------------------+
            | zhwiki.recentchanges | analyze | status   | Engine-independent statistics collected |
            | zhwiki.recentchanges | analyze | status   | OK                                      |
            +----------------------+---------+----------+-----------------------------------------+
            2 rows in set (21.49 sec)

            I will report issues I may find.

            jcrespo Jaime Crespo added a comment -

            BTW, yes, I can confirm that the EXPLAIN giving different query plans than the actual query is unrelated to the engine_independent statistics and a different issue.

            Some details about how the error happened: we would pass the min/max_value, stating it was in the utf8_bin charset/collation, and attempt to put it into a utf8_bin Field object. This wouldn't do any data conversion, but could cause the data to be truncated.
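To see why this particular value failed, note that MariaDB's 'utf8' (utf8mb3) only holds characters encoded in up to 3 bytes (the Basic Multilingual Plane), while the bytes shown in the ANALYZE TABLE warnings start with a 4-byte sequence. A quick check, independent of the server, decodes those bytes:

```python
# The bytes from the ANALYZE TABLE warning decode to the CJK Extension B
# character U+29DF6 followed by U+9BF0 (鯰):
prefix = b'\xF0\xA9\xB7\xB6\xE9\xAF\xB0'.decode('utf-8')
print(prefix)  # 𩷶鯰

# MySQL/MariaDB's 'utf8' (utf8mb3) only accepts characters that fit in
# 3 bytes of UTF-8.  U+29DF6 needs 4 bytes, so storing it in a utf8_bin
# column fails with "Incorrect string value".
first = prefix[0]
print(hex(ord(first)), len(first.encode('utf-8')))  # 0x29df6 4
```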

            I guess the upgrade procedure in patch MDEV-7383

            alter table column_stats modify min_value varbinary(255) DEFAULT NULL, modify max_value varbinary(255) DEFAULT NULL

            is adequate. If something has been truncated, we can't get it back. We could re-run ANALYZE TABLE for everything, but this could be a long process. My opinion is that a mention in the changelog would be sufficient. (The upgrade doesn't make anything worse; if somebody wants to make things better, they'll need to re-run ANALYZE themselves.)

            Pushed the fix for MDEV-7383. Will discuss the rest of the issues on the optimizer call.

            jcrespo

            I am hot-patching the mysql table, converting it to binary (I know this may have problems on mysql_upgrade,

            I think there won't be any problems with mysql_upgrade (the table will just stay the same).

            I take it the table columns are using the VARBINARY data type?

            The patch consists of 1. modifications to the mysql.column_stats table and 2. changes in the source code. I have debugged to see what happens when one applies #1 but not #2: utf8-based charsets and VARBINARY columns in user tables are OK.
            If you use a charset that requires non-trivial conversion (I tried ucs2), then you need to BOTH apply the patch and modify the mysql.column_stats table.

            I can confirm that the EXPLAIN giving different query plans than the actual query is unrelated to the engine_independent statistics and a different issue.

            Ok. The set of such queries is getting smaller over time, but some of them are still there.

            jcrespo Jaime Crespo added a comment -

            I have applied the schema change, it seems to work well. We use only binary fields for text due to legacy reasons. All encoding is handled at application side.

            I will either upgrade to a version patched or patch it myself. I am now more concerned about the full index scan on the query plan, but I need more testing. I will try to solve it on application code rather than a change in the database server.

            ratzpo Rasmus Johansson (Inactive) made changes -
            Sprint 10.1.9-3 [ 20 ]
            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher
            danblack Daniel Black added a comment -

            I can't see what utf8 data couldn't be converted to VARBINARY; however, I'm happy with the change for the gains on all the non-utf8 varchar columns I couldn't use.

            I am now more concerned about the full index scan on the query plan, but I need more testing. I will try to solve it on application code rather than a change in the database server.

            You can insert into the stats table directly without using ANALYZE TABLE, which means you shouldn't need to patch it manually.

            The charset aspects of https://mariadb.atlassian.net/secure/attachment/36708/mdev-7383-statistics-to-handle-as-my_charset_bin.patch relate to the storage by ANALYZE TABLE rather than any aspect to do with retrieval.

            We could re-run ANALYZE TABLE for anything, but this could be a long process.

            Eek, too long/invasive.

            My opinion is that a mention in the changelog would be sufficient.

            Agree.

            (I know this may have problems on mysql_upgrade...

            jcrespo, this shouldn't be the case. The alter table is idempotent.

            jcrespo, do you agree with skipping column stats for blob/text columns, per the last bit of discussion in MDEV-7383? If so, happy to hear your reasons on MDEV-9118.

            So solved as per MDEV-7383 - can be closed now?

            jcrespo Jaime Crespo added a comment -

            Yes, I will report any issue related to EXPLAIN separately, if any.

            ratzpo Rasmus Johansson (Inactive) made changes -
            Rank Ranked higher

            Yes. Closed by fix for MDEV-7383.

            psergei Sergei Petrunia made changes -
            Fix Version/s 10.0.23 [ 20401 ]
            Fix Version/s 10.0 [ 16000 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Duplicate [ 3 ]
            Status Open [ 1 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 72485 ] MariaDB v4 [ 149802 ]

            People

              psergei Sergei Petrunia
              jcrespo Jaime Crespo