Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-30531

Corrupt index(es) on busy table when using FOREIGN KEY with CASCADE or SET NULL

Details

    Description

      For several months now I have been having issues with index corruption on one table. Not entirely sure when it started as it wasn't directly obvious after upgrading MariaDB for example. It worked fine for months on the same machine, OS and filesystem. I am running 10.10 now but it also happened on 10.6. I know one other person running the same type of software who also has this issue, they are not using ZFS but EXT4 so filesystem is probably not the issue.

      This table is constantly being written to and read from. A program analyzes/processes the data every few mins, and then deletes the rows it just processed when it's done.

      The only way to "fix" it is to truncate the table, but it takes anywhere from 1h to 3 days for the issue to come up again showing:

      SQLSTATE[HY000]: General error: 1712 Index TABLENAME is corrupted
      

      journald then shows the following output (different indexes) (Database & table names replaced for privacy)

      Feb 09 20:30:54 SERVER_HOSTNAME mariadbd[3829372]: 2023-02-09 20:30:54 0 [ERROR] InnoDB: Flagged corruption of `INDEX_NAME` in table `DBNAME`.`TBLNAME` in purge
      

      It seems to start with tried to purge non-delete-marked record in index:

      Jan 28 09:38:41 SERVER_HOSTNAME mariadbd[962260]: 2023-01-28  9:38:41 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `ix_TBLNAME_collection` of table `DBNAME`.`TBLNAME`: tuple: TUPLE (info_bits=0, 2 fields): {[4]    (0x0018CA9D),[8]     7Sc(0x0000000000375363)}, record: COMPACT RECORD(info_bits=0, 2 fields): {[4]    (0x0018CA9D),[8]     7Sc(0x0000000000375363)}
      Jan 28 09:38:41 SERVER_HOSTNAME mariadbd[962260]: 2023-01-28  9:38:41 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `ix_TBLNAME_partcheck` of table `DBNAME`.`TBLNAME`: tuple: TUPLE (info_bits=0, 2 fields): {[1] (0x80),[8]     7Sc(0x0000000000375363)}, record: COMPACT RECORD(info_bits=0, 2 fields): {[1] (0x80),[8]     7Sc(0x0000000000375363)}
      Jan 29 20:52:19 SERVER_HOSTNAME mariadbd[962260]: 2023-01-29 20:52:19 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `ux_collection_id_filenumber` of table `DBNAME`.`TBLNAME`: tuple: TUPLE (info_bits=0, 3 fields): {[4] #  (0x002387DE),[4]    (0x00000001),[8]     Ks (0x00000000004B731F)}, record: COMPACT RECORD(info_bits=0, 3 fields): {[4] #  (0x002387DE),[4]    (0x00000001),[8]     Ks (0x00000000004B731F)}
      Jan 30 18:46:44 SERVER_HOSTNAME mariadbd[962260]: 2023-01-30 18:46:44 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `ix_TBLNAME_collection` of table `DBNAME`.`TBLNAME`: tuple: TUPLE (info_bits=0, 2 fields): {[4]    (0x0006B600),[8]      @ (0x00000000000F400A)}, record: COMPACT RECORD(info_bits=0, 2 fields): {[4]    (0x0006B600),[8]      @ (0x00000000000F400A)}
      Jan 30 18:46:44 SERVER_HOSTNAME mariadbd[962260]: 2023-01-30 18:46:44 0 [ERROR] InnoDB: tried to purge non-delete-marked record in index `ix_TBLNAME_partcheck` of table `DBNAME`.`TBLNAME`: tuple: TUPLE (info_bits=0, 2 fields): {[1] (0x80),[8]      @ (0x00000000000F400A)}, record: COMPACT RECORD(info_bits=0, 2 fields): {[1] (0x80),[8]      @ (0x00000000000F400A)
      

      Also shows:

      Jan 30 19:50:18 SERVER_HOSTNAME mariadbd[962260]: 2023-01-30 19:50:18 328202837 [ERROR] Got error 180 when reading table './dbname/tblname'
      Jan 30 19:50:33 SERVER_HOSTNAME mariadbd[962260]:  6: len 30; hex 616c742e62696e6172692023-01-30 19:50:33 328203835 [ERROR] Got error 180 when reading table './dbname/tblname'
      

      And it also shows thousands of these but I have no idea if that's related:

      Jan 29 20:53:49 SERVER_HOSTNAME mariadbd[962260]: RECORD LOCKS space id 22307 page no 7747 n bits 112 index PRIMARY of table `DBNAME`.`TBLNAME` trx id 2609064680 lock mode S locks rec but not gap
      Jan 29 20:53:50 SERVER_HOSTNAME mariadbd[962260]: RECORD LOCKS space id 22307 page no 7760 n bits 88 index PRIMARY of table `DBNAME`.`TBLNAME` trx id 2609071557 lock mode S locks rec but not gap waiting
      Jan 29 20:53:50 SERVER_HOSTNAME mariadbd[962260]: RECORD LOCKS space id 22307 page no 7760 n bits 88 index PRIMARY of table `DBNAME`.`TBLNAME` trx id 2609071682 lock_mode X locks rec but not gap
      Jan 29 20:53:50 SERVER_HOSTNAME mariadbd[962260]: RECORD LOCKS space id 22307 page no 7758 n bits 96 index PRIMARY of table `DBNAME`.`TBLNAME` trx id 2609071682 lock_mode X locks rec but not gap waiting
      Jan 29 20:53:50 SERVER_HOSTNAME mariadbd[962260]: RECORD LOCKS space id 22307 page no 7758 n bits 96 index PRIMARY of table `DBNAME`.`TBLNAME` trx id 2609071557 lock mode S locks rec but not gap
      

      Attachments

        Issue Links

          Activity

            Fossil Bento added a comment -

            Upgraded from 10.10.6 to 11.1.3 and my CPU usage is 100% now on all cores. Load went from 10-ish to 50+.

            New optimizer? Any way to turn this off?

            Fossil Bento added a comment - Upgraded from 10.10.6 to 11.1.3 and my CPU usage is 100% now on all cores. Load went from 10-ish to 50+. New optimizer? Any way to turn this off?
            Fossil Bento added a comment - - edited

            Okay so.... More tables had corrupted indexes but these were never reported. I had a few duplicate entries on columns with a unique index.

            I did do OPTIMIZE and also ANALYZE TABLE but it did not report any errors. Is there a better way of scanning ALL tables with unique indexes and look for dupes?

            Fossil Bento added a comment - - edited Okay so.... More tables had corrupted indexes but these were never reported. I had a few duplicate entries on columns with a unique index. I did do OPTIMIZE and also ANALYZE TABLE but it did not report any errors. Is there a better way of scanning ALL tables with unique indexes and look for dupes?

            As far as I understand, CHECK TABLE should detect dupes, you can run

            mariadb-check -A
            

            to run CHECK TABLE on all tables in all databases (see --help for more command line options, as usual)

            serg Sergei Golubchik added a comment - As far as I understand, CHECK TABLE should detect dupes, you can run mariadb-check -A to run CHECK TABLE on all tables in all databases (see --help for more command line options, as usual)
            Fossil Bento added a comment -

            It didn't. But I manually removed all the dupes now.

            I am still having issues with this new optimizer picking weird plans all the time. I have some slow queries on a certain column so I add an index on it, then suddenly a ton of other queries start using that index instead of their "old" one and execution time goes from 50ms to 30 seconds.

            Fossil Bento added a comment - It didn't. But I manually removed all the dupes now. I am still having issues with this new optimizer picking weird plans all the time. I have some slow queries on a certain column so I add an index on it, then suddenly a ton of other queries start using that index instead of their "old" one and execution time goes from 50ms to 30 seconds.

            The MDEV-24402 CHECK TABLE…EXTENDED (which can be invoked also via mariadb-check; see MDEV-30129) should hopefully find all inconsistencies.

            When a table is being rebuilt by OPTIMIZE TABLE or ALTER TABLE…FORCE, secondary indexes will be ignored and not checked. I don’t think that ANALYZE TABLE would report any index corruption.

            marko Marko Mäkelä added a comment - The MDEV-24402 CHECK TABLE…EXTENDED (which can be invoked also via mariadb-check ; see MDEV-30129 ) should hopefully find all inconsistencies. When a table is being rebuilt by OPTIMIZE TABLE or ALTER TABLE…FORCE , secondary indexes will be ignored and not checked. I don’t think that ANALYZE TABLE would report any index corruption.

            People

              marko Marko Mäkelä
              Fossil Bento
              Votes:
              5 Vote for this issue
              Watchers:
              27 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.