Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27900

aio returning partial reads/writes looks like InnoDB: Database page corruption on btrfs with io_uring

Details

    Description

      Feb 17 23:47:45 apfelkirsch systemd[1]: Starting MySQL Server...
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] /nix/store/svbwxblrhai6fcij9p9pmyyafa5pcqls-mariadb-server-10.6.5/bin/mysqld (server 10.6.5-MariaDB) starting as process 16475 ...
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Warning] WSREP: Failed to guess base node address. Set it explicitly via wsrep_node_address.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Warning] WSREP: Failed to guess base node address. Set it explicitly via wsrep_node_address.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Warning] WSREP: Guessing address for incoming client connections failed. Try setting wsrep_node_incoming_address explicitly.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] WSREP: Node addr:
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Number of pools: 1
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Using liburing
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Initializing buffer pool, total size = 134217728, chunk size = 134217728
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Completed initialization of buffer pool
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Starting crash recovery from checkpoint LSN=42161,42161
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: 128 rollback segments are active.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1"
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Creating shared tablespace for temporary tables
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Setting file './ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: File './ibtmp1' size is now 12 MB.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: 10.6.5 started; log sequence number 42173; transaction id 14
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] Plugin 'FEEDBACK' is disabled.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [ERROR] InnoDB: Database page corruption on disk or a failed read of file './ibdata1' page [page id: space=0, page number=9]. You may have to recover from a backup.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]:  len 16384; hex 000000000000002022-02-17 23:47:45 0 [ERROR] InnoDB: Database page corruption on disk or a failed read of file './ibdata1' page [page id: space=0, page number=243]. You may have to recover from a backup.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 09ffffffffffffffff00000000000099ab2022-02-17 23:47:45 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]:  len 16384; hex 00000000000000f3ffff45bf000000000000000000ffffffffffff000000000000000000009e6e00030000000000022022-02-17 23:47:45 0 [ERROR] InnoDB: Database page corruption on disk or a failed read of file './mysql/innodb_table_stats.ibd' page [page id: space=1, page number=0]. You may have to recover from a backup.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Page dump in ascii and hex (16384 bytes):
       
      […]
       
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: InnoDB: End of page dump
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Uncompressed page, stored checksum in field1 0, calculated checksums for field1: crc32 2039713970, innodb 724040756, page type 17855, stored checksum in field2 0, innodb checksum for field2: 1607711317,  page LSN 0 39339, low 4 bytes of LSN at page end 0, page number (if stored to page already) 9, space id (if created with >= MySQL-4.1.1 and stored already) 0
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Page may be an index page where index id is 5
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB: Index 5 is `ID_IND` in table `SYS_TABLES`
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [Note] InnoDB:  You can use CHECK TABLE to scan your table for corruption. Please refer to https://mariadb.com/kb/en/library/innodb-recovery-modes/ for information about forcing recovery.
      Feb 17 23:47:45 apfelkirsch mysql-start[16475]: 2022-02-17 23:47:45 0 [ERROR] [FATAL] InnoDB: Aborting because of a corrupt database page.
      Feb 17 23:47:46 apfelkirsch systemd[1]: mysql.service: Main process exited, code=killed, status=6/ABRT
      

      See https://github.com/NixOS/nixpkgs/issues/160516 and https://github.com/NixOS/nixpkgs/issues/160145 for potentially more details.

      Attachments

        Issue Links

          Activity

            I ran into this issue on Slackware (-current 2022-02-25) with both 10.6.7 and 10.7.3. After discussing on zulip I can confirm that setting either innodb_use_native_aio=0 or innodb_flush_method=fsync work around the bug for me.

            pprkut Heinz Wiesinger added a comment - I ran into this issue on Slackware (-current 2022-02-25) with both 10.6.7 and 10.7.3. After discussing on zulip I can confirm that setting either innodb_use_native_aio=0 or innodb_flush_method=fsync work around the bug for me.

            I think that this needs to be resolved in the Linux kernel somehow.

            marko Marko Mäkelä added a comment - I think that this needs to be resolved in the Linux kernel somehow.
            danblack Daniel Black added a comment -

            pprkut from zulip - your kernel version was 5.16.11. Chasing up now.

            danblack Daniel Black added a comment - pprkut from zulip - your kernel version was 5.16.11. Chasing up now.
            danblack Daniel Black added a comment - Kernel patch: https://lore.kernel.org/linux-btrfs/CABVffENfbsC6HjGbskRZGR2NvxbnQi17gAuW65eOM+QRzsr8Bg@mail.gmail.com/T/#mb2738e675e48e0e0778a2e8d1537dec5ec0d3d3a thanks to the great work by Filipe Manana (Suse).
            danblack Daniel Black added a comment -

            So the problem as Filipe Manana described is that we didn't look at the return values.

            And its true, the in the aio code we set m_ret_len to be the value returned by the underlying call, and except for the Windows case, we otherwise ignore it.

            If we assume the the cb->m_buffer isn't used by the AIO handlers, which was the case for innodb, we can just resubmit.

            bb-10.6-danielblack-MDEV-27900-resubmit-partial-reads for going through tests. Discussions on the linux mailing lists indicates that under some circumstances it might happen in AIO. So most of the patch will go into 10.5, and the uring to 10.6.

            danblack Daniel Black added a comment - So the problem as Filipe Manana described is that we didn't look at the return values. And its true, the in the aio code we set m_ret_len to be the value returned by the underlying call, and except for the Windows case, we otherwise ignore it. If we assume the the cb->m_buffer isn't used by the AIO handlers, which was the case for innodb, we can just resubmit. bb-10.6-danielblack- MDEV-27900 -resubmit-partial-reads for going through tests. Discussions on the linux mailing lists indicates that under some circumstances it might happen in AIO. So most of the patch will go into 10.5, and the uring to 10.6.

            People

              danblack Daniel Black
              ajs124 ajs124
              Votes:
              1 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.