Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34422

InnoDB writes corrupted log on macOS and AIX due to uninitialized log_sys.lsn_lock

Details

    Description

      I have encountered a recurring issue with MariaDB 11.5.1 where the InnoDB storage engine fails to initialize due to a "Missing FILE_CHECKPOINT" error. This issue arises consistently after importing data into a freshly created database and then attempting to restart the MariaDB server.

      *Steps to Reproduce:*

      1. Install MariaDB 11.5.1.
      2. Initialize a new, empty database.
      3. Import data into the database.
      4. Restart the MariaDB server.

      *Expected Behavior:*

      MariaDB should restart successfully without any errors.

      *Actual Behavior:*

      The server fails to restart, and the following error messages appear in the log files:

      240619 11:35:26 mysqld_safe Starting mariadbd daemon with databases from /Applications/ServBay/db/mariadb/11.5
      2024-06-19 11:35:26 0 [Note] Starting MariaDB 11.5.1-MariaDB-log source revision 65079ece42f383bdeb9b3f913a833a6f0123919d as process 32024
      2024-06-19 11:35:26 0 [Warning] Setting lower_case_table_names=2 because file system for /Applications/ServBay/db/mariadb/11.5/ is case insensitive
      2024-06-19 11:35:26 0 [Warning] --innodb-file-per-table is deprecated and will be removed in a future release
      2024-06-19 11:35:26 0 [Note] InnoDB: Compressed tables use zlib 1.3.1
      2024-06-19 11:35:26 0 [Note] InnoDB: Number of transaction pools: 1
      2024-06-19 11:35:26 0 [Note] InnoDB: Using generic crc32 instructions
      2024-06-19 11:35:26 0 [Note] InnoDB: Initializing buffer pool, total size = 4.000GiB, chunk size = 64.000MiB
      2024-06-19 11:35:26 0 [Note] InnoDB: Completed initialization of buffer pool
      2024-06-19 11:35:26 0 [Note] InnoDB: End of log at LSN=47733
      2024-06-19 11:35:26 0 [Note] InnoDB: Opened 3 undo tablespaces
      2024-06-19 11:35:26 0 [Note] InnoDB: 128 rollback segments in 3 undo tablespaces are active.
      2024-06-19 11:35:26 0 [Note] InnoDB: Setting file '/Applications/ServBay/db/mariadb/11.5/ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...
      2024-06-19 11:35:26 0 [Note] InnoDB: File '/Applications/ServBay/db/mariadb/11.5/ibtmp1' size is now 12.000MiB.
      2024-06-19 11:35:26 0 [Note] InnoDB: log sequence number 47733; transaction id 14
      2024-06-19 11:35:26 0 [Note] InnoDB: Loading buffer pool(s) from /Applications/ServBay/db/mariadb/11.5/ib_buffer_pool
      2024-06-19 11:35:26 0 [Note] Plugin 'FEEDBACK' is disabled.
      2024-06-19 11:35:26 0 [Note] Plugin 'wsrep-provider' is disabled.
      2024-06-19 11:35:26 0 [Note] InnoDB: Buffer pool(s) load completed at 240619 11:35:26
      2024-06-19 11:35:27 0 [Note] Server socket created on IP: '::1'.
      2024-06-19 11:35:27 0 [Note] Server socket created on IP: '127.0.0.1'.
      2024-06-19 11:35:27 0 [Note] Server socket created on IP: '::'.
      2024-06-19 11:35:27 0 [Note] Server socket created on IP: '0.0.0.0'.
      2024-06-19 11:35:27 0 [Note] mariadbd: Event Scheduler: Loaded 0 events
      2024-06-19 11:35:27 0 [Note] /Applications/ServBay/package/mariadb/11.5/current/bin/mariadbd: ready for connections.
      Version: '11.5.1-MariaDB-log'  socket: '/Applications/ServBay/tmp/mariadb-11.5.sock'  port: 3306  ServBay.dev
      2024-06-19 11:37:01 0 [Note] /Applications/ServBay/package/mariadb/11.5/current/bin/mariadbd (initiated by: unknown): Normal shutdown
      2024-06-19 11:37:01 0 [Note] InnoDB: FTS optimize thread exiting.
      2024-06-19 11:37:01 0 [Note] InnoDB: Starting shutdown...
      2024-06-19 11:37:01 0 [Note] InnoDB: Dumping buffer pool(s) to /Applications/ServBay/db/mariadb/11.5/ib_buffer_pool
      2024-06-19 11:37:01 0 [Note] InnoDB: Buffer pool(s) dump completed at 240619 11:37:01
      2024-06-19 11:37:01 0 [Note] InnoDB: Removed temporary tablespace data file: "/Applications/ServBay/db/mariadb/11.5/ibtmp1"
      2024-06-19 11:37:01 0 [Note] InnoDB: Shutdown completed; log sequence number 10225990; transaction id 623
      2024-06-19 11:37:01 0 [Note] /Applications/ServBay/package/mariadb/11.5/current/bin/mariadbd: Shutdown complete
       
      240619 11:37:01 mysqld_safe mysqld from pid file /Applications/ServBay/tmp/mariadb-11.5.pid ended
      240619 11:37:02 mysqld_safe Starting mariadbd daemon with databases from /Applications/ServBay/db/mariadb/11.5
      2024-06-19 11:37:02 0 [Warning] Setting lower_case_table_names=2 because file system for /Applications/ServBay/db/mariadb/11.5/ is case insensitive
      2024-06-19 11:37:02 0 [Note] Starting MariaDB 11.5.1-MariaDB-log source revision 65079ece42f383bdeb9b3f913a833a6f0123919d as process 32351
      2024-06-19 11:37:02 0 [Warning] --innodb-file-per-table is deprecated and will be removed in a future release
      2024-06-19 11:37:02 0 [Note] InnoDB: Compressed tables use zlib 1.3.1
      2024-06-19 11:37:02 0 [Note] InnoDB: Number of transaction pools: 1
      2024-06-19 11:37:02 0 [Note] InnoDB: Using generic crc32 instructions
      2024-06-19 11:37:02 0 [Note] InnoDB: Initializing buffer pool, total size = 4.000GiB, chunk size = 64.000MiB
      2024-06-19 11:37:02 0 [Note] InnoDB: Completed initialization of buffer pool
      2024-06-19 11:37:02 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(10225974) at 10225974
      2024-06-19 11:37:02 0 [ERROR] InnoDB: Log scan aborted at LSN 10225974
      2024-06-19 11:37:02 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
      2024-06-19 11:37:02 0 [Note] InnoDB: Starting shutdown...
      2024-06-19 11:37:03 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
      2024-06-19 11:37:03 0 [Note] Plugin 'FEEDBACK' is disabled.
      2024-06-19 11:37:03 0 [Note] Plugin 'wsrep-provider' is disabled.
      2024-06-19 11:37:03 0 [ERROR] Unknown/unsupported storage engine: InnoDB
      2024-06-19 11:37:03 0 [ERROR] Aborting
      240619 11:37:03 mysqld_safe mysqld from pid file /Applications/ServBay/tmp/mariadb-11.5.pid ended
      

      This issue has been observed consistently across multiple users and environments.

      *Environment Details:*

      • MariaDB version: 11.5.1
      • Operating System: macOS Sonoma

      *Additional Information:*

      The issue does not occur when the database is empty. It only arises after data has been imported and the server is restarted. Sometimes it will also cause ib_logfile0 to be lost.

      11.5.0 works fine.

      *Reproducibility:*

      This issue is reproducible every time following the steps outlined above.

      Attachments

        Issue Links

          Activity

            Thank you for the report. There seems to be a regression between MariaDB Server 10.6 and 10.11. In our CI grid view for 10.6, we have a number of test failures on the experimental aarch64-macos builder, but the maximum configured number of test failures (49) is not being exceeded. In the CI grid view for 10.11, there are much more test failures and the tests are aborted due to too many failures.

            I was suspecting that innodb_log_file_buffering=OFF could play a role, but I realize that this setting (use of O_DIRECT) was only implemented for Linux and Microsoft Windows, because we do not know how to find the physical block size on other operating systems. The InnoDB redo log file format was changed in MDEV-14425, and I suspect that this could explain the regression.

            In the log excerpt that you posted, the LSN=47733 looks normal for a server startup right after the bootstrap.

            In any case, this kind of trouble seems to be consistently reproduced on our CI. Here is a log excerpt from the currently latest 10.11 build (note: the logs will be pruned within some weeks):

            10.11 c37b2a9f045f5170189e1ed18a7d177b61beab93

            encryption.innodb_encryption-page-compression w8 [ fail ]
                    Test ended at 2024-06-17 17:07:08
            CURRENT_TEST: encryption.innodb_encryption-page-compression
            mysqltest: At line 99: query 'SET GLOBAL innodb_encryption_threads = 4' failed: ER_UNKNOWN_SYSTEM_VARIABLE (1193): Unknown system variable 'innodb_encryption_threads'
            

            and some lines from the corresponding server error log (note: the logs will be pruned within some weeks):

            10.11 c37b2a9f045f5170189e1ed18a7d177b61beab93

            2024-06-17 17:07:06 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 32190)
            2024-06-17 17:07:06 0 [Warning] Changed limits: max_open_files: 1024  max_connections: 151 (was 151)  table_cache: 421 (was 2000)
            2024-06-17 17:07:06 0 [Warning] Setting lower_case_table_names=2 because file system for /Users/ec2-user/buildbot/aarch64-macos/build/mysql-test/var/8/mysqld.1/data/ is case insensitive
            2024-06-17 17:07:06 0 [Note] Starting MariaDB 10.11.9-MariaDB-log source revision c37b2a9f045f5170189e1ed18a7d177b61beab93 as process 62930
            2024-06-17 17:07:06 0 [Warning] Plugin 'example_key_management' is of maturity level experimental while the server is stable
            2024-06-17 17:07:06 0 [Note] Plugin 'partition' is disabled.
            2024-06-17 17:07:06 0 [Note] Plugin 'SEQUENCE' is disabled.
            2024-06-17 17:07:06 0 [Note] InnoDB: Compressed tables use zlib 1.2.12
            2024-06-17 17:07:06 0 [Note] InnoDB: Number of transaction pools: 1
            2024-06-17 17:07:06 0 [Note] InnoDB: Using generic crc32 instructions
            2024-06-17 17:07:06 0 [Note] InnoDB: Initializing buffer pool, total size = 8.000MiB, chunk size = 1.000MiB
            2024-06-17 17:07:06 0 [Note] InnoDB: Completed initialization of buffer pool
            2024-06-17 17:07:06 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(2679852) at 2679852
            2024-06-17 17:07:06 0 [ERROR] InnoDB: Log scan aborted at LSN 2679852
            2024-06-17 17:07:06 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
            

            The message Using generic crc32 instructions caught my eye in your log excerpt as well as above. We have specialized CRC-32 implementations for ARMv8 too, actually a couple of variants (see MDEV-23495). On AMD64, the lowest grade specialization requires SSE 4.2, which was introduced in 2008. The generic code path should almost never be used.

            We would seem to have two problems on macOS:

            1. the specialized ARMv8 instructions for accelerating CRC-32 and CRC-32C computations are not being used; on AMD64 they are (and related to MDEV-33817 I recently learned that Apple has disabled compiler support for AVX512, in addition to not supporting it in their kernel)
            2. InnoDB is not fully restart safe on mac OS

            To diagnose the latter problem, I’d need a copy of the data directory that leads to a failed restart, generated with as little data as possible. Ideally this should be available from our CI system, but it looks like in https://ci.mariadb.org/47580/logs/aarch64-macos/ we only have the error logs available.

            marko Marko Mäkelä added a comment - Thank you for the report. There seems to be a regression between MariaDB Server 10.6 and 10.11. In our CI grid view for 10.6 , we have a number of test failures on the experimental aarch64-macos builder , but the maximum configured number of test failures (49) is not being exceeded. In the CI grid view for 10.11 , there are much more test failures and the tests are aborted due to too many failures. I was suspecting that innodb_log_file_buffering=OFF could play a role, but I realize that this setting (use of O_DIRECT ) was only implemented for Linux and Microsoft Windows, because we do not know how to find the physical block size on other operating systems. The InnoDB redo log file format was changed in MDEV-14425 , and I suspect that this could explain the regression. In the log excerpt that you posted, the LSN=47733 looks normal for a server startup right after the bootstrap. In any case, this kind of trouble seems to be consistently reproduced on our CI. Here is a log excerpt from the currently latest 10.11 build (note: the logs will be pruned within some weeks): 10.11 c37b2a9f045f5170189e1ed18a7d177b61beab93 encryption.innodb_encryption-page-compression w8 [ fail ] Test ended at 2024-06-17 17:07:08 CURRENT_TEST: encryption.innodb_encryption-page-compression mysqltest: At line 99: query 'SET GLOBAL innodb_encryption_threads = 4' failed: ER_UNKNOWN_SYSTEM_VARIABLE (1193): Unknown system variable 'innodb_encryption_threads' and some lines from the corresponding server error log (note: the logs will be pruned within some weeks): 10.11 c37b2a9f045f5170189e1ed18a7d177b61beab93 2024-06-17 17:07:06 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 32190) 2024-06-17 17:07:06 0 [Warning] Changed limits: max_open_files: 1024 max_connections: 151 (was 151) table_cache: 421 (was 2000) 2024-06-17 17:07:06 0 [Warning] Setting lower_case_table_names=2 because file system for /Users/ec2-user/buildbot/aarch64-macos/build/mysql-test/var/8/mysqld.1/data/ is case insensitive 2024-06-17 17:07:06 0 [Note] Starting MariaDB 10.11.9-MariaDB-log source revision c37b2a9f045f5170189e1ed18a7d177b61beab93 as process 62930 2024-06-17 17:07:06 0 [Warning] Plugin 'example_key_management' is of maturity level experimental while the server is stable 2024-06-17 17:07:06 0 [Note] Plugin 'partition' is disabled. 2024-06-17 17:07:06 0 [Note] Plugin 'SEQUENCE' is disabled. 2024-06-17 17:07:06 0 [Note] InnoDB: Compressed tables use zlib 1.2.12 2024-06-17 17:07:06 0 [Note] InnoDB: Number of transaction pools: 1 2024-06-17 17:07:06 0 [Note] InnoDB: Using generic crc32 instructions 2024-06-17 17:07:06 0 [Note] InnoDB: Initializing buffer pool, total size = 8.000MiB, chunk size = 1.000MiB 2024-06-17 17:07:06 0 [Note] InnoDB: Completed initialization of buffer pool 2024-06-17 17:07:06 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(2679852) at 2679852 2024-06-17 17:07:06 0 [ERROR] InnoDB: Log scan aborted at LSN 2679852 2024-06-17 17:07:06 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error The message Using generic crc32 instructions caught my eye in your log excerpt as well as above. We have specialized CRC-32 implementations for ARMv8 too, actually a couple of variants (see MDEV-23495 ). On AMD64, the lowest grade specialization requires SSE 4.2, which was introduced in 2008. The generic code path should almost never be used. We would seem to have two problems on macOS: the specialized ARMv8 instructions for accelerating CRC-32 and CRC-32C computations are not being used; on AMD64 they are (and related to MDEV-33817 I recently learned that Apple has disabled compiler support for AVX512, in addition to not supporting it in their kernel) InnoDB is not fully restart safe on mac OS To diagnose the latter problem, I’d need a copy of the data directory that leads to a failed restart, generated with as little data as possible. Ideally this should be available from our CI system, but it looks like in https://ci.mariadb.org/47580/logs/aarch64-macos/ we only have the error logs available.

            On Intel-based (I admit, old) macOS InnoDB in MariaDB server 10.11 eventually fails the same way for me:

            Yuliyas-Air:mysql-test Valerii$ ./mtr --suite=innodb
            Logging: ./mtr  --suite=innodb
            VS config: 
            vardir: /Users/Valerii/dbs/maria10.11/mysql-test/var
            Removing old var directory...
            Creating var directory '/Users/Valerii/dbs/maria10.11/mysql-test/var'...
            Checking supported features...
            MariaDB Version 10.11.9-MariaDB
             - SSL connections supported
            Using suites: innodb
            Collecting tests...
            Installing system database...
             
            ==============================================================================
             
            TEST                                      RESULT   TIME (ms) or COMMENT
            --------------------------------------------------------------------------
             
            worker[01] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
            innodb.alter_large_dml                   [ skipped ]  Test needs --big-test
            ...
            innodb.alter_not_null 'INPLACE,STRICT,innodb' [ pass ]    301
            innodb.alter_not_null 'INPLACE,NON-STRICT,innodb' [ pass ]    416
            innodb.alter_algorithm 'INPLACE,innodb'  [ pass ]    474
            innodb.alter_algorithm 'INSTANT,innodb'  [ pass ]    252
            innodb.alter_algorithm 'NOCOPY,innodb'   [ pass ]    317
            innodb.innodb 'innodb'                   [ skipped ]  Test requires InnoDB with 16k Page size.
            line
            2024-06-19  8:21:26 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(48586) at 48586
            2024-06-19  8:21:26 0 [ERROR] InnoDB: Log scan aborted at LSN 48586
            2024-06-19  8:21:26 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
            2024-06-19  8:21:26 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
            ^ Found warnings in /Users/Valerii/dbs/maria10.11/mysql-test/var/log/mysqld.1.err
            ok
             
            innodb.group_commit_force_recovery 'innodb' [ pass ]     47
            innodb.innodb_bug53674 'innodb'          [ pass ]     40
            innodb.snapshot 'innodb'                 [ pass ]     51
            ...
            

            Version details:

            Yuliyas-Air:maria10.11 Valerii$ bin/mysql -e "show variables like 'version%'"
            +-------------------------+------------------------------------------+
            | Variable_name           | Value                                    |
            +-------------------------+------------------------------------------+
            | version                 | 10.11.9-MariaDB                          |
            | version_comment         | MariaDB Server                           |
            | version_compile_machine | x86_64                                   |
            | version_compile_os      | osx10.13                                 |
            | version_malloc_library  | system                                   |
            | version_source_revision | f01e6503f4151a8d3d3f0495dab22c47ec794e67 |
            | version_ssl_library     | OpenSSL 1.1.1q  5 Jul 2022               |
            +-------------------------+------------------------------------------+
            

            valerii Valerii Kravchuk added a comment - On Intel-based (I admit, old) macOS InnoDB in MariaDB server 10.11 eventually fails the same way for me: Yuliyas-Air:mysql-test Valerii$ ./mtr --suite=innodb Logging: ./mtr --suite=innodb VS config: vardir: /Users/Valerii/dbs/maria10.11/mysql-test/var Removing old var directory... Creating var directory '/Users/Valerii/dbs/maria10.11/mysql-test/var'... Checking supported features... MariaDB Version 10.11.9-MariaDB - SSL connections supported Using suites: innodb Collecting tests... Installing system database...   ==============================================================================   TEST RESULT TIME (ms) or COMMENT --------------------------------------------------------------------------   worker[01] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019 innodb.alter_large_dml [ skipped ] Test needs --big-test ... innodb.alter_not_null 'INPLACE,STRICT,innodb' [ pass ] 301 innodb.alter_not_null 'INPLACE,NON-STRICT,innodb' [ pass ] 416 innodb.alter_algorithm 'INPLACE,innodb' [ pass ] 474 innodb.alter_algorithm 'INSTANT,innodb' [ pass ] 252 innodb.alter_algorithm 'NOCOPY,innodb' [ pass ] 317 innodb.innodb 'innodb' [ skipped ] Test requires InnoDB with 16k Page size. line 2024-06-19 8:21:26 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(48586) at 48586 2024-06-19 8:21:26 0 [ERROR] InnoDB: Log scan aborted at LSN 48586 2024-06-19 8:21:26 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error 2024-06-19 8:21:26 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed. ^ Found warnings in /Users/Valerii/dbs/maria10.11/mysql-test/var/log/mysqld.1.err ok   innodb.group_commit_force_recovery 'innodb' [ pass ] 47 innodb.innodb_bug53674 'innodb' [ pass ] 40 innodb.snapshot 'innodb' [ pass ] 51 ... Version details: Yuliyas-Air:maria10.11 Valerii$ bin/mysql -e "show variables like 'version%'" +-------------------------+------------------------------------------+ | Variable_name | Value | +-------------------------+------------------------------------------+ | version | 10.11.9-MariaDB | | version_comment | MariaDB Server | | version_compile_machine | x86_64 | | version_compile_os | osx10.13 | | version_malloc_library | system | | version_source_revision | f01e6503f4151a8d3d3f0495dab22c47ec794e67 | | version_ssl_library | OpenSSL 1.1.1q 5 Jul 2022 | +-------------------------+------------------------------------------+

            In MDEV-34422-innodb-failed-install.db.tar.gz, the checkpoint information looks like this:

            od -Ax -t x1 -j 0x1000 -N 0x2000 var/install.db/ib_logfile0
            

            001000 00 00 00 00 00 00 bd 94 00 00 00 00 00 00 bd 94
            001010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            001030 00 00 00 00 00 00 00 00 00 00 00 00 56 9a be 9f
            001040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            002000 00 00 00 00 00 00 bd 1c 00 00 00 00 00 00 bd 1c
            002010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            002030 00 00 00 00 00 00 00 00 00 00 00 00 98 c0 7e 40
            002040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            

            In the MDEV-14425 format, the LSN is conveniently equal to the file offset, provided that the LSN is smaller than the file size in bytes (and the first LSN, stored at byte offset 8, is 0x3000, which is the start of the log record payload area). The checkpoint block at 0x1000 is newer than the one at 0x2000, pointing to LSN=48532 (0xbd94). Let us check the end of the log to cover also the older checkpoint 0xbd1c:

            od -Ax -t x1 -j 0xbd00 var/install.db/ib_logfile0
            

            00bd00 80 d9 ae 48 00 80 73 aa 72 80 40 00 01 8e 20 38
            00bd10 77 b0 12 01 00 2e 2f 6d 79 73 71 6c 2f 69 6e 6e
            00bd20 6f 64 62 5f 74 61 62 6c 65 5f 73 74 61 74 73 2e
            00bd30 69 62 64 b0 12 02 00 2e 2f 6d 79 73 71 6c 2f 69
            00bd40 6e 6e 6f 64 62 5f 69 6e 64 65 78 5f 73 74 61 74
            00bd50 73 2e 69 62 64 b0 14 03 00 2e 2f 6d 79 73 71 6c
            00bd60 2f 74 72 61 6e 73 61 63 74 69 6f 6e 5f 72 65 67
            00bd70 69 73 74 72 79 2e 69 62 64 fa 00 00 00 00 00 00
            00bd80 00 00 bd 1c 01 49 15 1d 68 fa 00 00 00 00 00 00
            00bd90 00 00 bd 94 01 71 21 b1 62 00 39 00 30 32 ff ff
            00bda0 ff ff 00 00 b7 00 ff ff ff ff 00 00 34 00 30 31
            00bdb0 00 c3 16 04 ff b3 8f 83 0c 47 00 80 73 aa b2 04
            00bdc0 ff 34 00 00 3d 33 03 00 80 ad 35 00 00 80 d9 ae
            00bdd0 48 00 80 73 aa 72 80 40 00 01 b8 14 5a 2e 39 00
            00bde0 31 32 ff ff ff ff 00 00 b7 00 ff ff ff ff 00 00
            00bdf0 34 00 31 31 00 c3 16 04 ff b3 8f 83 0c 47 00 80
            00be00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            a00000
            

            One of the LSNs in the checkpoint blocks should point to a possibly empty sequence of FILE_MODIFY records that is terminated by a FILE_CHECKPOINT record. The FILE_CHECKPOINT record is encoded as fa 00 00 followed by the 8-byte checkpoint LSN. In the MDEV-14425 format it is followed by an end-of-mini-transaction marker (00 or in this case 01) and a CRC-32C of the mini-transaction. Above, we can see exactly two FILE_CHECKPOINT records:

            00bd70 __ __ __ __ __ __ __ __ __ fa 00 00 00 00 00 00
            00bd80 00 00 bd 1c 01 49 15 1d 68 fa 00 00 00 00 00 00
            00bd90 00 00 bd 94 01 71 21 b1 62 00 __ __ __ __ __ __
            

            The record at byte offset 0xbd79 contains the LSN 0xbd1c, which does occur in the second checkpoint block at 0x2000. The record at byte offset 0xbd89 contains the LSN 0xbd94, which occurs in the first checkpoint block at 0x1000. So, the checkpoint blocks are storing the wrong LSN.

            For performance reasons, log_t::write_buf() is not clearing the unused part of the buffer when it is writing 512 bytes (note: MDEV-33894 will reintroduce a parameter to control this size). It only writes an extra NUL byte at the end (0xbd99). The rest of the 512-block is expected to be garbage. On shutdown (of the server bootstrap in this case), the very last mini-transaction is supposed to be a single FILE_CHECKPOINT record that stores the final LSN.

            It is somewhat strange that the buffer is filled with garbage already at this early point. The default value of innodb_log_buffer_size is 16 MiB, and also its minimum value is 2 MiB. We only used 0xbd99-0x3000 = 36249 bytes of that buffer so far. I would have expected the operating system to have zero-initialized the buffer.

            In a healthy file (generated on my AMD64 based Debian GNU/Linux system), an mysql-test/var/install.db/ib_logfile0 would look like this:

            001000 00 00 00 00 00 00 e3 0c 00 00 00 00 00 00 e3 0c
            001010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            001030 00 00 00 00 00 00 00 00 00 00 00 00 70 76 77 a5
            00e300 00 00 00 00 00 e2 94 01 66 91 43 a7 fa 00 00 00
            00e310 00 00 00 00 00 e3 0c 01 5d 59 b0 b5 00 00 00 00
            00e320 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            a00000
            

            Here, the first checkpoint block is newer, and at file offset 0xe30c we really have the FILE_CHECKPOINT record for 0xe30c followed by the 0x01 marker (it would be 0x00 if the file had wrapped around an odd number of times) and the bit-reflected CRC-32C of the FILE_CHECKPOINT record.

            For some reason, on macOS, the file offsets do not match the checkpoint LSNs at this stage.

            marko Marko Mäkelä added a comment - In MDEV-34422-innodb-failed-install.db.tar.gz , the checkpoint information looks like this: od -Ax -t x1 -j 0x1000 -N 0x2000 var/install.db/ib_logfile0 001000 00 00 00 00 00 00 bd 94 00 00 00 00 00 00 bd 94 001010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 001030 00 00 00 00 00 00 00 00 00 00 00 00 56 9a be 9f 001040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 002000 00 00 00 00 00 00 bd 1c 00 00 00 00 00 00 bd 1c 002010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 002030 00 00 00 00 00 00 00 00 00 00 00 00 98 c0 7e 40 002040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * In the MDEV-14425 format, the LSN is conveniently equal to the file offset, provided that the LSN is smaller than the file size in bytes (and the first LSN, stored at byte offset 8, is 0x3000, which is the start of the log record payload area). The checkpoint block at 0x1000 is newer than the one at 0x2000, pointing to LSN=48532 (0xbd94). Let us check the end of the log to cover also the older checkpoint 0xbd1c: od -Ax -t x1 -j 0xbd00 var/install.db/ib_logfile0 00bd00 80 d9 ae 48 00 80 73 aa 72 80 40 00 01 8e 20 38 00bd10 77 b0 12 01 00 2e 2f 6d 79 73 71 6c 2f 69 6e 6e 00bd20 6f 64 62 5f 74 61 62 6c 65 5f 73 74 61 74 73 2e 00bd30 69 62 64 b0 12 02 00 2e 2f 6d 79 73 71 6c 2f 69 00bd40 6e 6e 6f 64 62 5f 69 6e 64 65 78 5f 73 74 61 74 00bd50 73 2e 69 62 64 b0 14 03 00 2e 2f 6d 79 73 71 6c 00bd60 2f 74 72 61 6e 73 61 63 74 69 6f 6e 5f 72 65 67 00bd70 69 73 74 72 79 2e 69 62 64 fa 00 00 00 00 00 00 00bd80 00 00 bd 1c 01 49 15 1d 68 fa 00 00 00 00 00 00 00bd90 00 00 bd 94 01 71 21 b1 62 00 39 00 30 32 ff ff 00bda0 ff ff 00 00 b7 00 ff ff ff ff 00 00 34 00 30 31 00bdb0 00 c3 16 04 ff b3 8f 83 0c 47 00 80 73 aa b2 04 00bdc0 ff 34 00 00 3d 33 03 00 80 ad 35 00 00 80 d9 ae 00bdd0 48 00 80 73 aa 72 80 40 00 01 b8 14 5a 2e 39 00 00bde0 31 32 ff ff ff ff 00 00 b7 00 ff ff ff ff 00 00 00bdf0 34 00 31 31 00 c3 16 04 ff b3 8f 83 0c 47 00 80 00be00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * a00000 One of the LSNs in the checkpoint blocks should point to a possibly empty sequence of FILE_MODIFY records that is terminated by a FILE_CHECKPOINT record. The FILE_CHECKPOINT record is encoded as fa 00 00 followed by the 8-byte checkpoint LSN. In the MDEV-14425 format it is followed by an end-of-mini-transaction marker ( 00 or in this case 01 ) and a CRC-32C of the mini-transaction. Above, we can see exactly two FILE_CHECKPOINT records: 00bd70 __ __ __ __ __ __ __ __ __ fa 00 00 00 00 00 00 00bd80 00 00 bd 1c 01 49 15 1d 68 fa 00 00 00 00 00 00 00bd90 00 00 bd 94 01 71 21 b1 62 00 __ __ __ __ __ __ The record at byte offset 0xbd79 contains the LSN 0xbd1c, which does occur in the second checkpoint block at 0x2000. The record at byte offset 0xbd89 contains the LSN 0xbd94, which occurs in the first checkpoint block at 0x1000. So, the checkpoint blocks are storing the wrong LSN. For performance reasons, log_t::write_buf() is not clearing the unused part of the buffer when it is writing 512 bytes (note: MDEV-33894 will reintroduce a parameter to control this size). It only writes an extra NUL byte at the end (0xbd99). The rest of the 512-block is expected to be garbage. On shutdown (of the server bootstrap in this case), the very last mini-transaction is supposed to be a single FILE_CHECKPOINT record that stores the final LSN. It is somewhat strange that the buffer is filled with garbage already at this early point. The default value of innodb_log_buffer_size is 16 MiB, and also its minimum value is 2 MiB. We only used 0xbd99-0x3000 = 36249 bytes of that buffer so far. I would have expected the operating system to have zero-initialized the buffer. In a healthy file (generated on my AMD64 based Debian GNU/Linux system), an mysql-test/var/install.db/ib_logfile0 would look like this: 001000 00 00 00 00 00 00 e3 0c 00 00 00 00 00 00 e3 0c 001010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 001030 00 00 00 00 00 00 00 00 00 00 00 00 70 76 77 a5 … 00e300 00 00 00 00 00 e2 94 01 66 91 43 a7 fa 00 00 00 00e310 00 00 00 00 00 e3 0c 01 5d 59 b0 b5 00 00 00 00 00e320 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * a00000 Here, the first checkpoint block is newer, and at file offset 0xe30c we really have the FILE_CHECKPOINT record for 0xe30c followed by the 0x01 marker (it would be 0x00 if the file had wrapped around an odd number of times) and the bit-reflected CRC-32C of the FILE_CHECKPOINT record. For some reason, on macOS, the file offsets do not match the checkpoint LSNs at this stage.

            I see similar failures on IBM AIX (which only recently started to work on MariaDB Server 10.7 or later), but I have not analyzed any ib_logfile0 contents yet.

            On FreeBSD there have not been such issues.

            marko Marko Mäkelä added a comment - I see similar failures on IBM AIX (which only recently started to work on MariaDB Server 10.7 or later), but I have not analyzed any ib_logfile0 contents yet. On FreeBSD there have not been such issues.

            I figured out a natural explanation for the garbage bytes at the end of the 512-byte block. My ”healthy file” is with the memory-mapped log interface and not log_t::write_buf(). That one would swap log_sys.buf and log_sys.flush_buf, and any previous contents of the buffer will not be cleared. We only write an extra NUL byte at the end of the used part of the block to make the log parser (crash recovery on server startup) to detect the end of the log faster.

            I can reproduce this in our IBM AIX 7.1 environment.

            10.11 c37b2a9f045f5170189e1ed18a7d177b61beab93

            innodb.innodb                            [ skipped ]  Test requires InnoDB with 16k Page size.
            line
            2024-06-19  8:47:49 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(48469) at 48469
            2024-06-19  8:47:49 0 [ERROR] InnoDB: Log scan aborted at LSN 48469
            2024-06-19  8:47:49 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
            2024-06-19  8:47:50 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
            ^ Found warnings in /home/mdborg/build-mariadb/mysql-test/var/log/mysqld.1.err
            

            For the above failure, we have the following in the file:

            001000 00 00 00 00 00 00 bd 55 00 00 00 00 00 00 bd 55
            001010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            001030 00 00 00 00 00 00 00 00 00 00 00 00 61 9a 53 ef
            001040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            002000 00 00 00 00 00 00 bc dd 00 00 00 00 00 00 bc dd
            002010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            002030 00 00 00 00 00 00 00 00 00 00 00 00 e5 3d 44 77
            002040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            003000
            00bcd0 6d 79 73 71 6c 2f 69 6e 6e 6f 64 62 5f 74 61 62
            00bce0 6c 65 5f 73 74 61 74 73 2e 69 62 64 b0 12 02 00
            00bcf0 2e 2f 6d 79 73 71 6c 2f 69 6e 6e 6f 64 62 5f 69
            00bd00 6e 64 65 78 5f 73 74 61 74 73 2e 69 62 64 b0 14
            00bd10 03 00 2e 2f 6d 79 73 71 6c 2f 74 72 61 6e 73 61
            00bd20 63 74 69 6f 6e 5f 72 65 67 69 73 74 72 79 2e 69
            00bd30 62 64 fa 00 00 00 00 00 00 00 00 bc dd 01 6b 51
            00bd40 20 d8 fa 00 00 00 00 00 00 00 00 bd 55 01 40 c7
            00bd50 14 a5 00 32 03 0e 00 0e 00 00 00 00 00 00 e0 80
            00bd60 00 00 00 00 00 00 05 6d 79 73 71 6c 0e 67 74 69
            00bd70 64 5f 73 6c 61 76 65 5f 70 6f 73 00 19 00 05 6d
            00bd80 79 73 71 6c 01 0e 67 74 69 64 5f 73 6c 61 76 65
            00bd90 5f 70 6f 73 01 20 ad 94 c9 34 01 03 7a 20 b9 1c
            00bda0 0d 06 00 00 01 2d 01 10 01 6c 87 59 09 20 61 00
            00bdb0 80 ad 03 0e 01 0f 00 00 00 00 00 00 e0 80 00 00
            00bdc0 00 00 00 00 05 6d 79 73 71 6c 0e 67 74 69 64 5f
            00bdd0 73 6c 61 76 65 5f 70 6f 73 07 50 52 49 4d 41 52
            00bde0 59 0c 6e 5f 64 69 66 66 5f 70 66 78 30 31 00 30
            00bdf0 00 05 6d 79 73 71 6c 01 0e 67 74 69 64 5f 73 6c
            00be00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
            *
            a00000
            

            The second-last checkpoint (the block at 0x2000) is pointing to the byte offset 0xbcdd. But that is in the middle of what looks like a FILE_MODIFY record. There is a FILE_CHECKPOINT record for 0xbcdd starting at 0xbd32.

            Likewise, the last checkpoint (the block at 0x1000) is pointing to the byte offset 0xbd55, while the FILE_CHECKPOINT record for that was written earlier, at 0xbd42. There is a NUL end-marker byte written at 0xbd53, and the rest of the 512-byte block is garbage.

            I think that I should be able to narrow this down with some debug printouts and assertions. I have been a bit too spoiled with https://rr-project.org; on this platform I have to resort to old-fashioned methods.

            marko Marko Mäkelä added a comment - I figured out a natural explanation for the garbage bytes at the end of the 512-byte block. My ”healthy file” is with the memory-mapped log interface and not log_t::write_buf() . That one would swap log_sys.buf and log_sys.flush_buf , and any previous contents of the buffer will not be cleared. We only write an extra NUL byte at the end of the used part of the block to make the log parser (crash recovery on server startup) to detect the end of the log faster. I can reproduce this in our IBM AIX 7.1 environment. 10.11 c37b2a9f045f5170189e1ed18a7d177b61beab93 innodb.innodb [ skipped ] Test requires InnoDB with 16k Page size. line 2024-06-19 8:47:49 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(48469) at 48469 2024-06-19 8:47:49 0 [ERROR] InnoDB: Log scan aborted at LSN 48469 2024-06-19 8:47:49 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error 2024-06-19 8:47:50 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed. ^ Found warnings in /home/mdborg/build-mariadb/mysql-test/var/log/mysqld.1.err For the above failure, we have the following in the file: 001000 00 00 00 00 00 00 bd 55 00 00 00 00 00 00 bd 55 001010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 001030 00 00 00 00 00 00 00 00 00 00 00 00 61 9a 53 ef 001040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 002000 00 00 00 00 00 00 bc dd 00 00 00 00 00 00 bc dd 002010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 002030 00 00 00 00 00 00 00 00 00 00 00 00 e5 3d 44 77 002040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 003000 … 00bcd0 6d 79 73 71 6c 2f 69 6e 6e 6f 64 62 5f 74 61 62 00bce0 6c 65 5f 73 74 61 74 73 2e 69 62 64 b0 12 02 00 00bcf0 2e 2f 6d 79 73 71 6c 2f 69 6e 6e 6f 64 62 5f 69 00bd00 6e 64 65 78 5f 73 74 61 74 73 2e 69 62 64 b0 14 00bd10 03 00 2e 2f 6d 79 73 71 6c 2f 74 72 61 6e 73 61 00bd20 63 74 69 6f 6e 5f 72 65 67 69 73 74 72 79 2e 69 00bd30 62 64 fa 00 00 00 00 00 00 00 00 bc dd 01 6b 51 00bd40 20 d8 fa 00 00 00 00 00 00 00 00 bd 55 01 40 c7 00bd50 14 a5 00 32 03 0e 00 0e 00 00 00 00 00 00 e0 80 00bd60 00 00 00 00 00 00 05 6d 79 73 71 6c 0e 67 74 69 00bd70 64 5f 73 6c 61 76 65 5f 70 6f 73 00 19 00 05 6d 00bd80 79 73 71 6c 01 0e 67 74 69 64 5f 73 6c 61 76 65 00bd90 5f 70 6f 73 01 20 ad 94 c9 34 01 03 7a 20 b9 1c 00bda0 0d 06 00 00 01 2d 01 10 01 6c 87 59 09 20 61 00 00bdb0 80 ad 03 0e 01 0f 00 00 00 00 00 00 e0 80 00 00 00bdc0 00 00 00 00 05 6d 79 73 71 6c 0e 67 74 69 64 5f 00bdd0 73 6c 61 76 65 5f 70 6f 73 07 50 52 49 4d 41 52 00bde0 59 0c 6e 5f 64 69 66 66 5f 70 66 78 30 31 00 30 00bdf0 00 05 6d 79 73 71 6c 01 0e 67 74 69 64 5f 73 6c 00be00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * a00000 The second-last checkpoint (the block at 0x2000) is pointing to the byte offset 0xbcdd. But that is in the middle of what looks like a FILE_MODIFY record. There is a FILE_CHECKPOINT record for 0xbcdd starting at 0xbd32. Likewise, the last checkpoint (the block at 0x1000) is pointing to the byte offset 0xbd55, while the FILE_CHECKPOINT record for that was written earlier, at 0xbd42. There is a NUL end-marker byte written at 0xbd53, and the rest of the 512-byte block is garbage. I think that I should be able to narrow this down with some debug printouts and assertions. I have been a bit too spoiled with https://rr-project.org ; on this platform I have to resort to old-fashioned methods.

            The compile&retest cycle on this AIX machine is really slow, and my first attempt at catching a mismatch was not successful:

            diff --git a/storage/innobase/log/log0log.cc b/storage/innobase/log/log0log.cc
            index ea717de226a..1990ae04364 100644
            --- a/storage/innobase/log/log0log.cc
            +++ b/storage/innobase/log/log0log.cc
            @@ -119,6 +119,8 @@ bool log_t::create()
                 goto alloc_fail;
               }
             
            +  memset(buf, 0, buf_size);
            +  memset(flush_buf, 0, buf_size);
               TRASH_ALLOC(buf, buf_size);
               TRASH_ALLOC(flush_buf, buf_size);
               checkpoint_buf= static_cast<byte*>(aligned_malloc(4096, 4096));
            @@ -829,6 +831,7 @@ template<bool release_latch> inline lsn_t log_t::write_buf() noexcept
                 ut_ad(write_lsn >= get_flushed_lsn());
                 const size_t block_size_1{get_block_size() - 1};
                 lsn_t offset{calc_lsn_offset(write_lsn) & ~lsn_t{block_size_1}};
            +    ut_a(!(size_t(write_lsn ^ offset) & ~block_size_1));
             
                 DBUG_PRINT("ib_log", ("write " LSN_PF " to " LSN_PF " at " LSN_PF,
                                       write_lsn, lsn, offset));
            

            For a debug build (this one was CMAKE_BUILD_TYPE=RelWithDebInfo), I think that the following invocation could be useful, to enable the above DBUG_PRINT output:

            ./mtr --mysqld=--debug-dbug=+d,ib_log innodb.innodb
            

            I think that this could be more practical to debug in an Apple environment. Hopefully a colleague who has suitable hardware will volunteer to debug and fix this.

            By the way, the failure is sporadic on the AIX environment, and it seems to be so also on macOS, because sometimes it fails straight after bootstrap, sometimes after a later restart.

            ==============================================================================
             
            TEST                                      RESULT   TIME (ms) or COMMENT
            --------------------------------------------------------------------------
             
            innodb.innodb                            [ pass ]   3929
            --------------------------------------------------------------------------
            The servers were restarted 0 times
            Spent 3.929 of 12 seconds executing testcases
             
            Completed: All 1 tests were successful.
            

            marko Marko Mäkelä added a comment - The compile&retest cycle on this AIX machine is really slow, and my first attempt at catching a mismatch was not successful: diff --git a/storage/innobase/log/log0log.cc b/storage/innobase/log/log0log.cc index ea717de226a..1990ae04364 100644 --- a/storage/innobase/log/log0log.cc +++ b/storage/innobase/log/log0log.cc @@ -119,6 +119,8 @@ bool log_t::create() goto alloc_fail; } + memset(buf, 0, buf_size); + memset(flush_buf, 0, buf_size); TRASH_ALLOC(buf, buf_size); TRASH_ALLOC(flush_buf, buf_size); checkpoint_buf= static_cast<byte*>(aligned_malloc(4096, 4096)); @@ -829,6 +831,7 @@ template<bool release_latch> inline lsn_t log_t::write_buf() noexcept ut_ad(write_lsn >= get_flushed_lsn()); const size_t block_size_1{get_block_size() - 1}; lsn_t offset{calc_lsn_offset(write_lsn) & ~lsn_t{block_size_1}}; + ut_a(!(size_t(write_lsn ^ offset) & ~block_size_1)); DBUG_PRINT("ib_log", ("write " LSN_PF " to " LSN_PF " at " LSN_PF, write_lsn, lsn, offset)); For a debug build (this one was CMAKE_BUILD_TYPE=RelWithDebInfo ), I think that the following invocation could be useful, to enable the above DBUG_PRINT output: ./mtr --mysqld=--debug-dbug=+d,ib_log innodb.innodb I think that this could be more practical to debug in an Apple environment. Hopefully a colleague who has suitable hardware will volunteer to debug and fix this. By the way, the failure is sporadic on the AIX environment, and it seems to be so also on macOS, because sometimes it fails straight after bootstrap, sometimes after a later restart. ==============================================================================   TEST RESULT TIME (ms) or COMMENT --------------------------------------------------------------------------   innodb.innodb [ pass ] 3929 -------------------------------------------------------------------------- The servers were restarted 0 times Spent 3.929 of 12 seconds executing testcases   Completed: All 1 tests were successful.
            naizhao Sam added a comment -

            tried to patch and rebuild, not working

            240619 18:23:59 mysqld_safe mysqld from pid file /Applications/ServBay/tmp/mariadb-11.5.pid ended
            240619 18:24:01 mysqld_safe Starting mariadbd daemon with databases from /Applications/ServBay/db/mariadb/11.5
            2024-06-19 18:24:01 0 [Note] Starting MariaDB 11.5.1-MariaDB-log source revision 65079ece42f383bdeb9b3f913a833a6f0123919d as process 71316
            2024-06-19 18:24:01 0 [Warning] Setting lower_case_table_names=2 because file system for /Applications/ServBay/db/mariadb/11.5/ is case insensitive
            2024-06-19 18:24:01 0 [Warning] --innodb-file-per-table is deprecated and will be removed in a future release
            2024-06-19 18:24:01 0 [Note] InnoDB: Compressed tables use zlib 1.3.1
            2024-06-19 18:24:01 0 [Note] InnoDB: Number of transaction pools: 1
            2024-06-19 18:24:01 0 [Note] InnoDB: Using generic crc32 instructions
            2024-06-19 18:24:01 0 [Note] InnoDB: Initializing buffer pool, total size = 4.000GiB, chunk size = 64.000MiB
            2024-06-19 18:24:01 0 [Note] InnoDB: Completed initialization of buffer pool
            2024-06-19 18:24:01 0 [Note] InnoDB: End of log at LSN=47678
            2024-06-19 18:24:01 0 [Note] InnoDB: Opened 3 undo tablespaces
            2024-06-19 18:24:01 0 [Note] InnoDB: 128 rollback segments in 3 undo tablespaces are active.
            2024-06-19 18:24:01 0 [Note] InnoDB: Setting file '/Applications/ServBay/db/mariadb/11.5/ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ...
            2024-06-19 18:24:01 0 [Note] InnoDB: File '/Applications/ServBay/db/mariadb/11.5/ibtmp1' size is now 12.000MiB.
            2024-06-19 18:24:01 0 [Note] InnoDB: log sequence number 47678; transaction id 14
            2024-06-19 18:24:01 0 [Note] InnoDB: Loading buffer pool(s) from /Applications/ServBay/db/mariadb/11.5/ib_buffer_pool
            2024-06-19 18:24:01 0 [Note] Plugin 'FEEDBACK' is disabled.
            2024-06-19 18:24:01 0 [Note] Plugin 'wsrep-provider' is disabled.
            2024-06-19 18:24:01 0 [Note] InnoDB: Buffer pool(s) load completed at 240619 18:24:01
            2024-06-19 18:24:02 0 [Note] Server socket created on IP: '::1'.
            2024-06-19 18:24:02 0 [Note] Server socket created on IP: '127.0.0.1'.
            2024-06-19 18:24:02 0 [Note] Server socket created on IP: '::'.
            2024-06-19 18:24:02 0 [Note] Server socket created on IP: '0.0.0.0'.
            2024-06-19 18:24:02 0 [Note] mariadbd: Event Scheduler: Loaded 0 events
            2024-06-19 18:24:02 0 [Note] /Applications/ServBay/package/mariadb/11.5/current/bin/mariadbd: ready for connections.
            Version: '11.5.1-MariaDB-log'  socket: '/Applications/ServBay/tmp/mariadb-11.5.sock'  port: 3306  ServBay.dev
             
             
             
             
             
            2024-06-19 18:25:15 0 [Note] /Applications/ServBay/package/mariadb/11.5/current/bin/mariadbd (initiated by: unknown): Normal shutdown
            2024-06-19 18:25:15 0 [Note] InnoDB: FTS optimize thread exiting.
            2024-06-19 18:25:15 0 [Note] InnoDB: Starting shutdown...
            2024-06-19 18:25:15 0 [Note] InnoDB: Dumping buffer pool(s) to /Applications/ServBay/db/mariadb/11.5/ib_buffer_pool
            2024-06-19 18:25:15 0 [Note] InnoDB: Buffer pool(s) dump completed at 240619 18:25:15
            2024-06-19 18:25:15 0 [Note] InnoDB: Removed temporary tablespace data file: "/Applications/ServBay/db/mariadb/11.5/ibtmp1"
            2024-06-19 18:25:15 0 [Note] InnoDB: Shutdown completed; log sequence number 10231495; transaction id 620
            2024-06-19 18:25:15 0 [Note] /Applications/ServBay/package/mariadb/11.5/current/bin/mariadbd: Shutdown complete
             
            240619 18:25:15 mysqld_safe mysqld from pid file /Applications/ServBay/tmp/mariadb-11.5.pid ended
            240619 18:25:16 mysqld_safe Starting mariadbd daemon with databases from /Applications/ServBay/db/mariadb/11.5
            2024-06-19 18:25:16 0 [Note] Starting MariaDB 11.5.1-MariaDB-log source revision 65079ece42f383bdeb9b3f913a833a6f0123919d as process 71590
            2024-06-19 18:25:16 0 [Warning] Setting lower_case_table_names=2 because file system for /Applications/ServBay/db/mariadb/11.5/ is case insensitive
            2024-06-19 18:25:16 0 [Warning] --innodb-file-per-table is deprecated and will be removed in a future release
            2024-06-19 18:25:16 0 [Note] InnoDB: Compressed tables use zlib 1.3.1
            2024-06-19 18:25:16 0 [Note] InnoDB: Number of transaction pools: 1
            2024-06-19 18:25:16 0 [Note] InnoDB: Using generic crc32 instructions
            2024-06-19 18:25:16 0 [Note] InnoDB: Initializing buffer pool, total size = 4.000GiB, chunk size = 64.000MiB
            2024-06-19 18:25:16 0 [Note] InnoDB: Completed initialization of buffer pool
            2024-06-19 18:25:16 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(10231479) at 10231479
            2024-06-19 18:25:16 0 [ERROR] InnoDB: Log scan aborted at LSN 10231479
            2024-06-19 18:25:16 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
            2024-06-19 18:25:16 0 [Note] InnoDB: Starting shutdown...
            2024-06-19 18:25:16 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
            2024-06-19 18:25:16 0 [Note] Plugin 'FEEDBACK' is disabled.
            2024-06-19 18:25:16 0 [Note] Plugin 'wsrep-provider' is disabled.
            2024-06-19 18:25:16 0 [ERROR] Unknown/unsupported storage engine: InnoDB
            2024-06-19 18:25:16 0 [ERROR] Aborting
            240619 18:25:16 mysqld_safe mysqld from pid file /Applications/ServBay/tmp/mariadb-11.5.pid ended
            
            

            naizhao Sam added a comment - tried to patch and rebuild, not working 240619 18:23:59 mysqld_safe mysqld from pid file /Applications/ServBay/tmp/mariadb-11 .5.pid ended 240619 18:24:01 mysqld_safe Starting mariadbd daemon with databases from /Applications/ServBay/db/mariadb/11 .5 2024-06-19 18:24:01 0 [Note] Starting MariaDB 11.5.1-MariaDB-log source revision 65079ece42f383bdeb9b3f913a833a6f0123919d as process 71316 2024-06-19 18:24:01 0 [Warning] Setting lower_case_table_names=2 because file system for /Applications/ServBay/db/mariadb/11 .5/ is case insensitive 2024-06-19 18:24:01 0 [Warning] --innodb- file -per-table is deprecated and will be removed in a future release 2024-06-19 18:24:01 0 [Note] InnoDB: Compressed tables use zlib 1.3.1 2024-06-19 18:24:01 0 [Note] InnoDB: Number of transaction pools: 1 2024-06-19 18:24:01 0 [Note] InnoDB: Using generic crc32 instructions 2024-06-19 18:24:01 0 [Note] InnoDB: Initializing buffer pool, total size = 4.000GiB, chunk size = 64.000MiB 2024-06-19 18:24:01 0 [Note] InnoDB: Completed initialization of buffer pool 2024-06-19 18:24:01 0 [Note] InnoDB: End of log at LSN=47678 2024-06-19 18:24:01 0 [Note] InnoDB: Opened 3 undo tablespaces 2024-06-19 18:24:01 0 [Note] InnoDB: 128 rollback segments in 3 undo tablespaces are active. 2024-06-19 18:24:01 0 [Note] InnoDB: Setting file '/Applications/ServBay/db/mariadb/11.5/ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ... 2024-06-19 18:24:01 0 [Note] InnoDB: File '/Applications/ServBay/db/mariadb/11.5/ibtmp1' size is now 12.000MiB. 2024-06-19 18:24:01 0 [Note] InnoDB: log sequence number 47678; transaction id 14 2024-06-19 18:24:01 0 [Note] InnoDB: Loading buffer pool(s) from /Applications/ServBay/db/mariadb/11 .5 /ib_buffer_pool 2024-06-19 18:24:01 0 [Note] Plugin 'FEEDBACK' is disabled. 2024-06-19 18:24:01 0 [Note] Plugin 'wsrep-provider' is disabled. 2024-06-19 18:24:01 0 [Note] InnoDB: Buffer pool(s) load completed at 240619 18:24:01 2024-06-19 18:24:02 0 [Note] Server socket created on IP: '::1' . 2024-06-19 18:24:02 0 [Note] Server socket created on IP: '127.0.0.1' . 2024-06-19 18:24:02 0 [Note] Server socket created on IP: '::' . 2024-06-19 18:24:02 0 [Note] Server socket created on IP: '0.0.0.0' . 2024-06-19 18:24:02 0 [Note] mariadbd: Event Scheduler: Loaded 0 events 2024-06-19 18:24:02 0 [Note] /Applications/ServBay/package/mariadb/11 .5 /current/bin/mariadbd : ready for connections. Version: '11.5.1-MariaDB-log' socket: '/Applications/ServBay/tmp/mariadb-11.5.sock' port: 3306 ServBay.dev           2024-06-19 18:25:15 0 [Note] /Applications/ServBay/package/mariadb/11 .5 /current/bin/mariadbd (initiated by: unknown): Normal shutdown 2024-06-19 18:25:15 0 [Note] InnoDB: FTS optimize thread exiting. 2024-06-19 18:25:15 0 [Note] InnoDB: Starting shutdown ... 2024-06-19 18:25:15 0 [Note] InnoDB: Dumping buffer pool(s) to /Applications/ServBay/db/mariadb/11 .5 /ib_buffer_pool 2024-06-19 18:25:15 0 [Note] InnoDB: Buffer pool(s) dump completed at 240619 18:25:15 2024-06-19 18:25:15 0 [Note] InnoDB: Removed temporary tablespace data file : "/Applications/ServBay/db/mariadb/11.5/ibtmp1" 2024-06-19 18:25:15 0 [Note] InnoDB: Shutdown completed; log sequence number 10231495; transaction id 620 2024-06-19 18:25:15 0 [Note] /Applications/ServBay/package/mariadb/11 .5 /current/bin/mariadbd : Shutdown complete   240619 18:25:15 mysqld_safe mysqld from pid file /Applications/ServBay/tmp/mariadb-11 .5.pid ended 240619 18:25:16 mysqld_safe Starting mariadbd daemon with databases from /Applications/ServBay/db/mariadb/11 .5 2024-06-19 18:25:16 0 [Note] Starting MariaDB 11.5.1-MariaDB-log source revision 65079ece42f383bdeb9b3f913a833a6f0123919d as process 71590 2024-06-19 18:25:16 0 [Warning] Setting lower_case_table_names=2 because file system for /Applications/ServBay/db/mariadb/11 .5/ is case insensitive 2024-06-19 18:25:16 0 [Warning] --innodb- file -per-table is deprecated and will be removed in a future release 2024-06-19 18:25:16 0 [Note] InnoDB: Compressed tables use zlib 1.3.1 2024-06-19 18:25:16 0 [Note] InnoDB: Number of transaction pools: 1 2024-06-19 18:25:16 0 [Note] InnoDB: Using generic crc32 instructions 2024-06-19 18:25:16 0 [Note] InnoDB: Initializing buffer pool, total size = 4.000GiB, chunk size = 64.000MiB 2024-06-19 18:25:16 0 [Note] InnoDB: Completed initialization of buffer pool 2024-06-19 18:25:16 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(10231479) at 10231479 2024-06-19 18:25:16 0 [ERROR] InnoDB: Log scan aborted at LSN 10231479 2024-06-19 18:25:16 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error 2024-06-19 18:25:16 0 [Note] InnoDB: Starting shutdown ... 2024-06-19 18:25:16 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed. 2024-06-19 18:25:16 0 [Note] Plugin 'FEEDBACK' is disabled. 2024-06-19 18:25:16 0 [Note] Plugin 'wsrep-provider' is disabled. 2024-06-19 18:25:16 0 [ERROR] Unknown /unsupported storage engine: InnoDB 2024-06-19 18:25:16 0 [ERROR] Aborting 240619 18:25:16 mysqld_safe mysqld from pid file /Applications/ServBay/tmp/mariadb-11 .5.pid ended
            naizhao Sam added a comment -

            If I use the db file created in 11.5.0, it will be normal if I close, restart and then run it on 11.5.1. However, if I use the db file created in 11.5.1, import data, close it and then run it, this bug will definitely be triggered.

            naizhao Sam added a comment - If I use the db file created in 11.5.0, it will be normal if I close, restart and then run it on 11.5.1. However, if I use the db file created in 11.5.1, import data, close it and then run it, this bug will definitely be triggered.

            naizhao, the patch that I posted was an attempt to get a crash during an incorrect write, so that the situation can be explored closer to the moment where the error was triggered. I thought that the patch would give ideas to a developer who can debug this in a macOS environment. Another idea would be to trace the system calls. The problem is detected at database startup, but it was caused during the preceding invocation of the server.

            Between the MariaDB Server 11.5.0 preview release and the 11.5.1 release, there are not that many InnoDB differences. Because this failure is nondeterministic (not every restart fails), I would expect also the 11.5.0 preview release to be affected by this, if you just tested it enough many times. If the 11.5.0 preview really is not affected by this, then the culprit candidates would include MDEV-33585 and MDEV-33515.

            marko Marko Mäkelä added a comment - naizhao , the patch that I posted was an attempt to get a crash during an incorrect write, so that the situation can be explored closer to the moment where the error was triggered. I thought that the patch would give ideas to a developer who can debug this in a macOS environment. Another idea would be to trace the system calls . The problem is detected at database startup, but it was caused during the preceding invocation of the server. Between the MariaDB Server 11.5.0 preview release and the 11.5.1 release, there are not that many InnoDB differences. Because this failure is nondeterministic (not every restart fails), I would expect also the 11.5.0 preview release to be affected by this, if you just tested it enough many times. If the 11.5.0 preview really is not affected by this, then the culprit candidates would include MDEV-33585 and MDEV-33515 .
            rec Richard Eckart de Castilho added a comment - - edited

            I have this problem quite consistently with MariaDB 11.4.2 from homebrew on macos Sonoma on Apple Silicon.

            cf MDEV-34350

            rec Richard Eckart de Castilho added a comment - - edited I have this problem quite consistently with MariaDB 11.4.2 from homebrew on macos Sonoma on Apple Silicon. cf MDEV-34350
            webtech Olivier added a comment -

            I have the same problem with mariadb: stable 11.4.2 (bottled) on macOS Sonoma 14.5 (apple silicon M2)

            webtech Olivier added a comment - I have the same problem with mariadb: stable 11.4.2 (bottled) on macOS Sonoma 14.5 (apple silicon M2)
            Gosselin Dave Gosselin added a comment - - edited

            I can easily reproduce this failure on an M1 Apple Silicon mac by running the innodb test suite in mtr. I first suspected case-insensitivity to be the problem, but provisioning a HFSx ramdisk and directing MTR to use it (with --tmpdir and --vardir) did not cause the error to stop appearing.
            I can also readily reproduce this with the customer-provided attachment.

            Gosselin Dave Gosselin added a comment - - edited I can easily reproduce this failure on an M1 Apple Silicon mac by running the innodb test suite in mtr. I first suspected case-insensitivity to be the problem, but provisioning a HFSx ramdisk and directing MTR to use it (with --tmpdir and --vardir) did not cause the error to stop appearing. I can also readily reproduce this with the customer-provided attachment.
            Gosselin Dave Gosselin added a comment -

            Using the customer-supplied attachment, I can repro on Rocky Linux 9, both x86_64 and aarch64. I built the 11.5.1 tagged release as RelWithDebInfo, then pointed --datadir to the customer-provided attachment (after fixing up the permissions as all files are read-only). This is from my x86_64 Rocky Linux 9s system:

            [dgosselin]$ ./mariadbd --datadir /Users/dgosselin/install.db
            2024-06-26 15:24:33 0 [Note] Starting MariaDB 11.5.1-MariaDB source revision 65079ece42f383bdeb9b3f913a833a6f0123919d as process 25597
            2024-06-26 15:24:33 0 [Note] InnoDB: Compressed tables use zlib 1.2.11
            2024-06-26 15:24:33 0 [Note] InnoDB: Using transactional memory
            2024-06-26 15:24:33 0 [Note] InnoDB: Number of transaction pools: 1
            2024-06-26 15:24:33 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions
            2024-06-26 15:24:33 0 [Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 2.000MiB
            2024-06-26 15:24:33 0 [Note] InnoDB: Completed initialization of buffer pool
            2024-06-26 15:24:33 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes)
            2024-06-26 15:24:33 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(48532) at 48532
            2024-06-26 15:24:33 0 [ERROR] InnoDB: Log scan aborted at LSN 48532
            2024-06-26 15:24:33 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
            2024-06-26 15:24:33 0 [Note] InnoDB: Starting shutdown...
            2024-06-26 15:24:33 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
            2024-06-26 15:24:33 0 [Note] CONNECT: Version 1.07.0002 March 22, 2021
            2024-06-26 15:24:33 0 [Note] Plugin 'FEEDBACK' is disabled.
            2024-06-26 15:24:33 0 [Note] Plugin 'wsrep-provider' is disabled.
            2024-06-26 15:24:33 0 [ERROR] Unknown/unsupported storage engine: InnoDB
            2024-06-26 15:24:33 0 [ERROR] Aborting
            [dgosselin]$
            

            Gosselin Dave Gosselin added a comment - Using the customer-supplied attachment, I can repro on Rocky Linux 9, both x86_64 and aarch64. I built the 11.5.1 tagged release as RelWithDebInfo, then pointed --datadir to the customer-provided attachment (after fixing up the permissions as all files are read-only). This is from my x86_64 Rocky Linux 9s system: [dgosselin]$ ./mariadbd --datadir /Users/dgosselin/install.db 2024-06-26 15:24:33 0 [Note] Starting MariaDB 11.5.1-MariaDB source revision 65079ece42f383bdeb9b3f913a833a6f0123919d as process 25597 2024-06-26 15:24:33 0 [Note] InnoDB: Compressed tables use zlib 1.2.11 2024-06-26 15:24:33 0 [Note] InnoDB: Using transactional memory 2024-06-26 15:24:33 0 [Note] InnoDB: Number of transaction pools: 1 2024-06-26 15:24:33 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions 2024-06-26 15:24:33 0 [Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 2.000MiB 2024-06-26 15:24:33 0 [Note] InnoDB: Completed initialization of buffer pool 2024-06-26 15:24:33 0 [Note] InnoDB: File system buffers for log disabled (block size=512 bytes) 2024-06-26 15:24:33 0 [ERROR] InnoDB: Missing FILE_CHECKPOINT(48532) at 48532 2024-06-26 15:24:33 0 [ERROR] InnoDB: Log scan aborted at LSN 48532 2024-06-26 15:24:33 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error 2024-06-26 15:24:33 0 [Note] InnoDB: Starting shutdown... 2024-06-26 15:24:33 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed. 2024-06-26 15:24:33 0 [Note] CONNECT: Version 1.07.0002 March 22, 2021 2024-06-26 15:24:33 0 [Note] Plugin 'FEEDBACK' is disabled. 2024-06-26 15:24:33 0 [Note] Plugin 'wsrep-provider' is disabled. 2024-06-26 15:24:33 0 [ERROR] Unknown/unsupported storage engine: InnoDB 2024-06-26 15:24:33 0 [ERROR] Aborting [dgosselin]$
            Gosselin Dave Gosselin added a comment -

            Are we expecting that this should recover automatically on startup?

            Gosselin Dave Gosselin added a comment - Are we expecting that this should recover automatically on startup?

            I explained in my comment on 2024-06-19 what exactly is wrong in the ib_logfile0 in MDEV-34422-innodb-failed-install.db.tar.gz. The recovery logic is working as intended, and the log file is corrupted. Apparently the log records were being written from log_sys.buf or log_sys.flush_buf to the incorrect file offset, or the contents of those buffers was incorrect.

            The challenge is to find out what was going wrong during the time the log was written. The last change to the ib_logfile0 format was MDEV-14425 in MariaDB Server 10.8. The recovery should work in the same way in any major version after that.

            marko Marko Mäkelä added a comment - I explained in my comment on 2024-06-19 what exactly is wrong in the ib_logfile0 in MDEV-34422-innodb-failed-install.db.tar.gz . The recovery logic is working as intended, and the log file is corrupted. Apparently the log records were being written from log_sys.buf or log_sys.flush_buf to the incorrect file offset, or the contents of those buffers was incorrect. The challenge is to find out what was going wrong during the time the log was written. The last change to the ib_logfile0 format was MDEV-14425 in MariaDB Server 10.8. The recovery should work in the same way in any major version after that.
            afdzal3 Afdzal Kamal added a comment -

            i face same issue .. i notice that the data directory is set to /opt/homebrew/var/ instead of /opt/homebrew/var/mysql . However after changing it it still cannot solve the issue.

            afdzal3 Afdzal Kamal added a comment - i face same issue .. i notice that the data directory is set to /opt/homebrew/var/ instead of /opt/homebrew/var/mysql . However after changing it it still cannot solve the issue.
            ooloo ooloo added a comment -

            Maybe a coincidence maybe not: After the server restarts and runs into the FILE_CHECKPOINT issue, it can be partially revived with innodb_force_recovery=6. However some data is always corrupt. I tried it multiple times, with an empty database, then importing a previously dumped database. What I noticed: it's always the same data, which gets corrupted. It's one table, the only one, which has a generated virtual column. I found it strange, that a checkpoint fails at precisely the same byte.

            When trying to export/rescue the database in this state, it aborts at the very table with "-- MySQL server has gone away"

            ooloo ooloo added a comment - Maybe a coincidence maybe not: After the server restarts and runs into the FILE_CHECKPOINT issue, it can be partially revived with innodb_force_recovery=6. However some data is always corrupt. I tried it multiple times, with an empty database, then importing a previously dumped database. What I noticed: it's always the same data, which gets corrupted. It's one table, the only one, which has a generated virtual column. I found it strange, that a checkpoint fails at precisely the same byte. When trying to export/rescue the database in this state, it aborts at the very table with "-- MySQL server has gone away"

            I recovered my data using innodb_force_recovery as well (dumped and then restored to a fresh DB) and later found that all my foreign key contraints were gone.

            rec Richard Eckart de Castilho added a comment - I recovered my data using innodb_force_recovery as well (dumped and then restored to a fresh DB) and later found that all my foreign key contraints were gone.

            @ooloo @Richard Eckart de Castilho Yes, I think I mentioned this in the original issue - level 6 make it run, but the damage varies. First time I got away with just damaged indexes, but the next two times there was actual data loss. So relying on level 6 is a Russian roulette.

            joro_abv Georgi Georgiev added a comment - @ooloo @Richard Eckart de Castilho Yes, I think I mentioned this in the original issue - level 6 make it run, but the damage varies. First time I got away with just damaged indexes, but the next two times there was actual data loss. So relying on level 6 is a Russian roulette.

            As far as I can tell, this was caused by MDEV-33515, where I had removed the member function init_lsn_lock(). I should have remembered that on some platforms, calling pthread_mutex_init() is mandatory. Unfortunately, the AIX and macOS builders on our CI farm are experimental. At that point of time, the AIX builds consistently failed due to a configuration issue, and there were massive amounts of test failures on macOS (there still are many), so this regression was not caught by then.

            A possible work-around could be to disable the use of log_sys.lsn_lock and replace it with the pure spinlock, by setting the parameter innodb_log_spin_wait_delay to a nonzero value. (As noted in MDEV-33515, depending on the system, this will improve or degrade the performance; more CPU will be burned in user space but there could be much fewer context switches.)

            Yesterday, I experimented with unconditionally enabling SUX_LOCK_GENERIC to disable the futex-based mutex and rw-lock implementations on all platforms. I really should have done this during the development of MDEV-33515. I only tested that combination locally on my Debian GNU/Linux system.

            On Linux, in my experiment I also disabled the use of memory-mapped log on /dev/shm. This did not cause any additional failures on Linux, but on Microsoft Windows I got a nice crash when an uninitialized mutex was being acquired during a log write. Gosselin independently found that the issue is a missing initialization of the mutex log_sys.lsn_lock.

            marko Marko Mäkelä added a comment - As far as I can tell, this was caused by MDEV-33515 , where I had removed the member function init_lsn_lock() . I should have remembered that on some platforms, calling pthread_mutex_init() is mandatory. Unfortunately, the AIX and macOS builders on our CI farm are experimental. At that point of time, the AIX builds consistently failed due to a configuration issue, and there were massive amounts of test failures on macOS (there still are many), so this regression was not caught by then. A possible work-around could be to disable the use of log_sys.lsn_lock and replace it with the pure spinlock, by setting the parameter innodb_log_spin_wait_delay to a nonzero value. (As noted in MDEV-33515 , depending on the system, this will improve or degrade the performance; more CPU will be burned in user space but there could be much fewer context switches.) Yesterday, I experimented with unconditionally enabling SUX_LOCK_GENERIC to disable the futex-based mutex and rw-lock implementations on all platforms. I really should have done this during the development of MDEV-33515 . I only tested that combination locally on my Debian GNU/Linux system. On Linux, in my experiment I also disabled the use of memory-mapped log on /dev/shm . This did not cause any additional failures on Linux, but on Microsoft Windows I got a nice crash when an uninitialized mutex was being acquired during a log write. Gosselin independently found that the issue is a missing initialization of the mutex log_sys.lsn_lock .

            Thanks for the patch. I have provided my comments in https://github.com/MariaDB/server/pull/3433.

            debarun Debarun Banerjee added a comment - Thanks for the patch. I have provided my comments in https://github.com/MariaDB/server/pull/3433 .

            While analyzing this issue I found another problem with similar redo corruption symptom which can happen in general case on any platform with high concurrency. I have now filed MDEV-34689 to address the issue.

            debarun Debarun Banerjee added a comment - While analyzing this issue I found another problem with similar redo corruption symptom which can happen in general case on any platform with high concurrency. I have now filed MDEV-34689 to address the issue.

            People

              marko Marko Mäkelä
              naizhao Sam
              Votes:
              5 Vote for this issue
              Watchers:
              18 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.