[MDEV-30129] MySQL 5.7 --> MariaDB 10.8 switch: mysqlcheck --check --extended hangs on table Created: 2022-11-29  Updated: 2023-02-05  Resolved: 2023-02-02

Status: Closed
Project: MariaDB Server
Component/s: Documentation, Scripts & Clients, Server
Affects Version/s: 10.8.6
Fix Version/s: 10.6.11, 10.7.7, 10.8.6, 10.9.4

Type: Bug Priority: Minor
Reporter: AJ Livingston Assignee: Daniel Black
Resolution: Fixed Votes: 0
Labels: None
Environment:

Ubuntu 18.04.6 x86_64


Attachments: Text File gdb_output_15-05.log     Text File gdb_output_15-12.log     Text File gdb_output_15-16.log    
Issue Links:
Relates
relates to MDEV-24402 CHECK TABLE may miss some cases of in... Closed

 Description   

This was originally posted as a question at
https://dba.stackexchange.com/questions/320087/mysql-5-7-mariadb-10-8-switch-mysqlcheck-check-extended-hangs-on-table?noredirect=1#comment624082_320087
but I'm filing it as bug report due to a recommendation there. I copied over the text of the stackexchange post along with some additional information

Recently I switched from MySQL 5.7 to MariaDB 10.8 and it's been generally great as a drop-in replacement, but I've run into a problem: the process for a cron job that daily runs

/usr/bin/nice /usr/bin/ionice -c3 \
    /usr/bin/mysqlcheck \
        --all-databases --check --extended --check-only-changed --silent  2>&1

hangs indefinitely on one of the larger tables (981 MB). Here's the relevant output from showing the process list after four days since server restart (there's one process for each day the cron job runs):

MariaDB [(none)]> show full processlist;
+-------+--------------+-----------------+-----------+---------+--------+-----------+----------------------------------------------+----------+
| Id    | User         | Host            | db        | Command | Time   | State     | Info                                         | Progress |
+-------+--------------+-----------------+-----------+---------+--------+-----------+----------------------------------------------+----------+
|  7719 | <cron user>  | localhost       | <db name> | Query   | 305690 | Executing | CHECK TABLE `<table name>`  EXTENDED CHANGED |    0.000 |
| 23567 | <cron user>  | localhost       | <db name> | Query   | 219929 | Executing | CHECK TABLE `<table name>`  EXTENDED CHANGED |    0.000 |
| 38185 | <cron user>  | localhost       | <db name> | Query   | 133529 | Executing | CHECK TABLE `<table name>`  EXTENDED CHANGED |    0.000 |
| 57790 | <cron user>  | localhost       | <db name> | Query   |  47127 | Executing | CHECK TABLE `<table name>`  EXTENDED CHANGED |    0.000 |

Nothing about this appears in /var/log/mysql/ error logs.

I am working on getting a backtrace, but this is a bit tricky as it's a production server and I haven't had to do that before. A commenter thought that I should post it now anyway since the table structure alone might provide a clue.

Note that this db started life as a mysql 5.7 db and was not upgraded from an early mysql version prior to the mariadb switch.

Create table:

CREATE TABLE `large_archive_table` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `a` datetime NOT NULL DEFAULT '1974-08-17 10:00:00',
  `b` varchar(20) COLLATE utf8mb4_unicode_520_ci NOT NULL DEFAULT '',
  `c` varchar(256) COLLATE utf8mb4_unicode_520_ci NOT NULL DEFAULT '',
  `d` varchar(256) COLLATE utf8mb4_unicode_520_ci DEFAULT NULL,
  `e` varchar(256) COLLATE utf8mb4_unicode_520_ci DEFAULT NULL,
  `f` tinyint(1) unsigned DEFAULT '0',
  `g` int(11) unsigned DEFAULT NULL,
  `h` varchar(100) COLLATE utf8mb4_unicode_520_ci DEFAULT NULL,
  `i` int(11) unsigned DEFAULT NULL,
  `j` varchar(255) COLLATE utf8mb4_unicode_520_ci DEFAULT NULL,
  `k` varchar(20) COLLATE utf8mb4_unicode_520_ci NOT NULL DEFAULT '',
  `l` int(11) unsigned NOT NULL DEFAULT '0',
  `m` smallint(6) unsigned DEFAULT NULL,
  `n` smallint(6) unsigned DEFAULT NULL,
  `o` tinyint(4) unsigned DEFAULT NULL,
  `p` tinyint(4) unsigned DEFAULT NULL,
  `q` varchar(256) COLLATE utf8mb4_unicode_520_ci DEFAULT NULL,
  `r` varchar(255) COLLATE utf8mb4_unicode_520_ci DEFAULT NULL,
  `s` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `a` (`a`),
  KEY `i` (`i`),
  KEY `ip` (`b`),
  KEY `L` (`l`),
  KEY `q` (`q`(255)),
  KEY `c` (`c`(255)),
  KEY `g` (`g`) USING BTREE,
  KEY `r` (`r`) USING BTREE,
  KEY `k_r` (`k`,`r`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=9681472 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci



 Comments   
Comment by AJ Livingston [ 2022-11-30 ]

I attached three stacktraces from during a run of the mysqlcheck command:

  • from near the beginning of the run: gdb_output_15-05.log
  • 7 minutes later: gdb_output_15-12.log
  • 4 minutes later: gdb_output_15-16.log
Comment by Marko Mäkelä [ 2023-01-30 ]

All three stack traces show only one active InnoDB thread, executing CHECK TABLE. MDEV-24402 finally implemented CHECK TABLE…EXTENDED for InnoDB tables.

Previously, CHECK TABLE ignored the EXTENDED keyword for InnoDB tables.

Comment by Marko Mäkelä [ 2023-01-30 ]

livingston@msp.org, does the command complete faster if you omit the parameter --extended?

Comment by AJ Livingston [ 2023-01-31 ]

Yes, omitting --extended made the command complete in just a few minutes

Comment by Marko Mäkelä [ 2023-02-01 ]

Thank you. I’m assigning this to danblack to decide what we should do. Possibly, update some documentation to say that CHECK TABLE…EXTENDED really performs an extended check for InnoDB tables.

Comment by Daniel Black [ 2023-02-02 ]

https://mariadb.com/kb/en/check-table/ updated. Appoligies for not finding MDEV-24402 earlier.

Comment by AJ Livingston [ 2023-02-05 ]

Sorry, could someone clarify what’s going on here? Was the extended flag not doing anything on MySQL 5.7? Is that why the query didn’t hang?

Comment by Marko Mäkelä [ 2023-02-05 ]

livingston@msp.org, Like I wrote, CHECK TABLE…EXTENDED used to do the same as plain CHECK TABLE. MDEV-24402 finally implemented an extended check, which will check every active version of every index record. The non-EXTENDED check merely counts the records in each index, and fails to catch some types of inconsistency, such as one that was caused by the bug MDEV-29666. I believe that you may be misinterpreting a long execution time as a hang.

Generated at Thu Feb 08 10:13:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.