Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32115

Assertion `!recv_no_log_write' failed in bool log_checkpoint(bool)

Details

    Description

      The crash is sporadic and there is no reproducible test case yet.

      #11 0x00007fb13562271b in __assert_fail_base (fmt=0x7fb1357bac30 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55ec8b2db941 "!recv_no_log_write", file=0x55ec8b2db410 "/home/jan/mysql/10.4/storage/innobase/log/log0log.cc", line=1483, function=<optimized out>) at ./assert/assert.c:92
      #12 0x00007fb135633596 in __GI___assert_fail (assertion=0x55ec8b2db941 "!recv_no_log_write", file=0x55ec8b2db410 "/home/jan/mysql/10.4/storage/innobase/log/log0log.cc", line=1483, function=0x55ec8b2dbfbb "bool log_checkpoint(bool)") at ./assert/assert.c:101
      #13 0x000055ec8a9f56bd in log_checkpoint (sync=true) at /home/jan/mysql/10.4/storage/innobase/log/log0log.cc:1483
      #14 0x000055ec8a9f5a6c in log_make_checkpoint () at /home/jan/mysql/10.4/storage/innobase/log/log0log.cc:1550
      #15 0x000055ec8a9118a9 in sst_disable_innodb_writes () at /home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc:1890
      #16 0x000055ec8a9119ac in innodb_disable_internal_writes (disable=true) at /home/jan/mysql/10.4/storage/innobase/handler/ha_innodb.cc:1907
      #17 0x000055ec8a55948a in plugin_disable_internal_writes (plugin=0x55ec8e15fa38, disable=0x7fb0da7fb8bc) at /home/jan/mysql/10.4/sql/handler.cc:881
      #18 0x000055ec8a2124a3 in plugin_foreach_with_mask (thd=0x0, func=0x55ec8a559442 <plugin_disable_internal_writes(THD*, plugin_ref, void*)>, type=1, state_mask=8, arg=0x7fb0da7fb8bc) at /home/jan/mysql/10.4/sql/sql_plugin.cc:2477
      #19 0x000055ec8a5594c7 in ha_disable_internal_writes (disable=true) at /home/jan/mysql/10.4/sql/handler.cc:888
      #20 0x000055ec8a8d8ed5 in sst_flush_tables (thd=0x7fb0b8000d38) at /home/jan/mysql/10.4/sql/wsrep_sst.cc:1626
      #21 0x000055ec8a8d9551 in sst_donor_thread (a=0x7fb1346fcac0) at /home/jan/mysql/10.4/sql/wsrep_sst.cc:1724
      

      Attachments

        Issue Links

          Activity

            marko I could do a PR containing exactly that change. I can't write MariaDB server repositories directly.

            janlindstrom Jan Lindström added a comment - marko I could do a PR containing exactly that change. I can't write MariaDB server repositories directly.

            janlindstrom, I thought that you could apply 10.4-recv_no_log_write.patch (which I just created for you) into a local build and run some stress tests. I do not think that it is feasible to try to reproduce this in a CI environment. I don’t think that this can be analyzed without having a good enough execution trace, such as rr record. First it could be worthwhile to come up with a stress test that increases the failure probability, and then employ rr record.

            The patch actually triggered one test failure, unrelated to Galera SST:

            10.4 4e2594768d658194220ae5678a686ed680aa3217 RelWithDebInfo with patch

            mariabackup.innodb_xa_rollback 'innodb'  w55 [ fail ]
            2023-09-28 11:02:46 0 [Note] InnoDB: 1 transactions in prepared state after recovery
            2023-09-28 11:02:46 0x7fcfb1bdb800  InnoDB: Assertion failure in file /mariadb/10.4/storage/innobase/mtr/mtr0mtr.cc line 429
            InnoDB: Failing assertion: !m_modifications || !recv_no_log_write
            

            This is most likely because I forgot to remove the ut_d() around assignments to recv_no_log_write in extra/mariabackup/xtrabackup.cc.

            marko Marko Mäkelä added a comment - janlindstrom , I thought that you could apply 10.4-recv_no_log_write.patch (which I just created for you) into a local build and run some stress tests. I do not think that it is feasible to try to reproduce this in a CI environment. I don’t think that this can be analyzed without having a good enough execution trace, such as rr record . First it could be worthwhile to come up with a stress test that increases the failure probability, and then employ rr record . The patch actually triggered one test failure, unrelated to Galera SST: 10.4 4e2594768d658194220ae5678a686ed680aa3217 RelWithDebInfo with patch mariabackup.innodb_xa_rollback 'innodb' w55 [ fail ] … 2023-09-28 11:02:46 0 [Note] InnoDB: 1 transactions in prepared state after recovery 2023-09-28 11:02:46 0x7fcfb1bdb800 InnoDB: Assertion failure in file /mariadb/10.4/storage/innobase/mtr/mtr0mtr.cc line 429 InnoDB: Failing assertion: !m_modifications || !recv_no_log_write This is most likely because I forgot to remove the ut_d() around assignments to recv_no_log_write in extra/mariabackup/xtrabackup.cc .

            Same assertion was triggered on CI but same test does not crash locally or running full suite using --rr and stack trace is useless.

            Thread 1 (Thread 0x7fa3617fa700 (LWP 16225)):
            #0  __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57
            #1  0x0000564798628441 in my_write_core (sig=6) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/mysys/stacktrace.c:386
            #2  0x0000564797d44da0 in handle_fatal_signal (sig=6) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/signal_handler.cc:372
            #3  <signal handler called>
            #4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
            #5  0x00007fa3c2be27f1 in __GI_abort () at abort.c:79
            #6  0x00005647983bde1a in ut_dbg_assertion_failed (expr=0x5647989e306b "!recv_no_log_write", file=0x5647989e2c58 "/home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/log/log0log.cc", line=1484) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/ut/ut0dbg.cc:60
            #7  0x00005647981f8d9d in log_checkpoint (sync=true) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/log/log0log.cc:1484
            #8  0x00005647981f90ef in log_make_checkpoint () at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/log/log0log.cc:1551
            #9  0x0000564798121a4a in sst_disable_innodb_writes () at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/handler/ha_innodb.cc:1892
            #10 0x0000564798121aea in innodb_disable_internal_writes (disable=true) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/handler/ha_innodb.cc:1909
            #11 0x0000564797d49be2 in plugin_disable_internal_writes (plugin=0x7fa34c183f20, disable=0x7fa3617f990c) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/handler.cc:886
            #12 0x0000564797a20594 in plugin_foreach_with_mask (thd=0x0, func=0x564797d49b9e <plugin_disable_internal_writes(THD*, plugin_ref, void*)>, type=1, state_mask=8, arg=0x7fa3617f990c) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/sql_plugin.cc:2521
            #13 0x0000564797d49c18 in ha_disable_internal_writes (disable=true) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/handler.cc:893
            #14 0x00005647980ed588 in sst_flush_tables (thd=0x7fa338000d50) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/wsrep_sst.cc:1640
            #15 0x00005647980edbff in sst_donor_thread (a=0x7fa3bc32ca20) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/wsrep_sst.cc:1738
            #16 0x00005647980bd889 in pfs_spawn_thread (arg=0x7fa3a00b4590) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/perfschema/pfs.cc:1869
            #17 0x00007fa3c3add6db in start_thread (arg=0x7fa3617fa700) at pthread_create.c:463
            #18 0x00007fa3c2cc361f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
            

            janlindstrom Jan Lindström added a comment - Same assertion was triggered on CI but same test does not crash locally or running full suite using --rr and stack trace is useless. Thread 1 (Thread 0x7fa3617fa700 (LWP 16225)): #0 __pthread_kill (threadid=<optimized out>, signo=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:57 #1 0x0000564798628441 in my_write_core (sig=6) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/mysys/stacktrace.c:386 #2 0x0000564797d44da0 in handle_fatal_signal (sig=6) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/signal_handler.cc:372 #3 <signal handler called> #4 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51 #5 0x00007fa3c2be27f1 in __GI_abort () at abort.c:79 #6 0x00005647983bde1a in ut_dbg_assertion_failed (expr=0x5647989e306b "!recv_no_log_write", file=0x5647989e2c58 "/home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/log/log0log.cc", line=1484) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/ut/ut0dbg.cc:60 #7 0x00005647981f8d9d in log_checkpoint (sync=true) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/log/log0log.cc:1484 #8 0x00005647981f90ef in log_make_checkpoint () at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/log/log0log.cc:1551 #9 0x0000564798121a4a in sst_disable_innodb_writes () at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/handler/ha_innodb.cc:1892 #10 0x0000564798121aea in innodb_disable_internal_writes (disable=true) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/innobase/handler/ha_innodb.cc:1909 #11 0x0000564797d49be2 in plugin_disable_internal_writes (plugin=0x7fa34c183f20, disable=0x7fa3617f990c) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/handler.cc:886 #12 0x0000564797a20594 in plugin_foreach_with_mask (thd=0x0, func=0x564797d49b9e <plugin_disable_internal_writes(THD*, plugin_ref, void*)>, type=1, state_mask=8, arg=0x7fa3617f990c) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/sql_plugin.cc:2521 #13 0x0000564797d49c18 in ha_disable_internal_writes (disable=true) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/handler.cc:893 #14 0x00005647980ed588 in sst_flush_tables (thd=0x7fa338000d50) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/wsrep_sst.cc:1640 #15 0x00005647980edbff in sst_donor_thread (a=0x7fa3bc32ca20) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/sql/wsrep_sst.cc:1738 #16 0x00005647980bd889 in pfs_spawn_thread (arg=0x7fa3a00b4590) at /home/jenkins/workspace/mtr-galera-4.x-mariadb-10.4/mysql-wsrep/storage/perfschema/pfs.cc:1869 #17 0x00007fa3c3add6db in start_thread (arg=0x7fa3617fa700) at pthread_create.c:463 #18 0x00007fa3c2cc361f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

            The assertion failure catches the fact that some log records had been written before the previous log_make_checkpoint() completed. An execution trace would reveal where that offending write happened.

            Were you able to reproduce this on 10.5 or later major versions? The page flushing and checkpoint logic is much simpler there, because everything is handled by the single buf_flush_page_cleaner thread.

            marko Marko Mäkelä added a comment - The assertion failure catches the fact that some log records had been written before the previous log_make_checkpoint() completed. An execution trace would reveal where that offending write happened. Were you able to reproduce this on 10.5 or later major versions? The page flushing and checkpoint logic is much simpler there, because everything is handled by the single buf_flush_page_cleaner thread.

            marko After several rounds and history on Jenkings I can say on some confidence this problem does not reproduce 10.5 or later major versions.

            janlindstrom Jan Lindström added a comment - marko After several rounds and history on Jenkings I can say on some confidence this problem does not reproduce 10.5 or later major versions.

            People

              janlindstrom Jan Lindström
              ramesh Ramesh Sivaraman
              Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.