Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-4322

Broken XID counting during binlog rotation

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.0.1
    • 10.0.2
    • None

    Description

      It looks like rotation of binlogs involves some broken logic involving . After binlog rotation MYSQL_BIN_LOG::do_checkpoint_request() is called. It calls ha_commit_checkpoint_request(), and from there for each engine that has commit_checkpoint_request function binlog_checkpoint_callback() is called. There mysql_bin_log.mark_xids_active() is called and corresponding mark_xid_done() is not called from anywhere.

      I don't understand why this problem is not exposed anywhere in the test suite but I was able to hit it like this:

      cmake . -DCMAKE_BUILD_TYPE=Debug
      make
      cd mysql-test/
      echo "--innodb" >> suite/sys_vars/t/rpl_max_binlog_size_func-master.opt
      ./mtr sys_vars.rpl_max_binlog_size_func

      After that I get "void MYSQL_BIN_LOG::cleanup(): Assertion `b->xid_count == 0' failed" during shutdown after the test.

      Attachments

        Activity

          perl ./mtr sys_vars.rpl_max_binlog_size_func -mysqld=-innodb

          #7 0x00007fc5c1010192 in _GI__assert_fail (assertion=0xe4eaa4 "b->xid_count == 0", file=0xe4dd48 "/data/bzr/10.0/sql/log.cc", line=2979, function=0xe51d90 "void MYSQL_BIN_LOG::cleanup()") at assert.c:103
          #8 0x00000000008a1bb3 in MYSQL_BIN_LOG::cleanup (this=0x15d8220) at /data/bzr/10.0/sql/log.cc:2979
          #9 0x0000000000563f23 in clean_up (print_message=true) at /data/bzr/10.0/sql/mysqld.cc:1906
          #10 0x0000000000563d42 in unireg_end () at /data/bzr/10.0/sql/mysqld.cc:1837
          #11 0x0000000000563c66 in kill_server (sig_ptr=0x0) at /data/bzr/10.0/sql/mysqld.cc:1765
          #12 0x0000000000563c81 in kill_server_thread (arg=0x7fc5b9424e48) at /data/bzr/10.0/sql/mysqld.cc:1788
          #13 0x0000000000cd9f64 in pfs_spawn_thread (arg=0x7fc588001340) at /data/bzr/10.0/storage/perfschema/pfs.cc:1800
          #14 0x00007fc5c1de0e9a in start_thread (arg=0x7fc5b9393700) at pthread_create.c:308
          #15 0x00007fc5c10d4cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
          #16 0x0000000000000000 in ?? ()

          elenst Elena Stepanova added a comment - perl ./mtr sys_vars.rpl_max_binlog_size_func - mysqld= -innodb #7 0x00007fc5c1010192 in _ GI __assert_fail (assertion=0xe4eaa4 "b->xid_count == 0", file=0xe4dd48 "/data/bzr/10.0/sql/log.cc", line=2979, function=0xe51d90 "void MYSQL_BIN_LOG::cleanup()") at assert.c:103 #8 0x00000000008a1bb3 in MYSQL_BIN_LOG::cleanup (this=0x15d8220) at /data/bzr/10.0/sql/log.cc:2979 #9 0x0000000000563f23 in clean_up (print_message=true) at /data/bzr/10.0/sql/mysqld.cc:1906 #10 0x0000000000563d42 in unireg_end () at /data/bzr/10.0/sql/mysqld.cc:1837 #11 0x0000000000563c66 in kill_server (sig_ptr=0x0) at /data/bzr/10.0/sql/mysqld.cc:1765 #12 0x0000000000563c81 in kill_server_thread (arg=0x7fc5b9424e48) at /data/bzr/10.0/sql/mysqld.cc:1788 #13 0x0000000000cd9f64 in pfs_spawn_thread (arg=0x7fc588001340) at /data/bzr/10.0/storage/perfschema/pfs.cc:1800 #14 0x00007fc5c1de0e9a in start_thread (arg=0x7fc5b9393700) at pthread_create.c:308 #15 0x00007fc5c10d4cbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #16 0x0000000000000000 in ?? ()

          Thanks for catching this!

          > There mysql_bin_log.mark_xids_active() is called and corresponding
          > mark_xid_done() is not called from anywhere.

          It is called from binlog_background_thread(). This happens asynchroneously, in
          the background.

          I suspect the problem here is that we need to wait for the binlog background
          thread to have time to process any pending requests before we do binlog
          cleanup. At least if I add this at the end of the test case, it does not
          assert:

          --source include/wait_for_binlog_checkpoint.inc

          I'll take a closer look next week and find out what the real solution should
          be.

          knielsen Kristian Nielsen added a comment - Thanks for catching this! > There mysql_bin_log.mark_xids_active() is called and corresponding > mark_xid_done() is not called from anywhere. It is called from binlog_background_thread(). This happens asynchroneously, in the background. I suspect the problem here is that we need to wait for the binlog background thread to have time to process any pending requests before we do binlog cleanup. At least if I add this at the end of the test case, it does not assert: --source include/wait_for_binlog_checkpoint.inc I'll take a closer look next week and find out what the real solution should be.

          Fix pushed to 10.0-base (will be later merged to 10.0).

          knielsen Kristian Nielsen added a comment - Fix pushed to 10.0-base (will be later merged to 10.0).

          Here is the patch for the bug, including test case:

          http://lists.askmonty.org/pipermail/commits/2013-March/004483.html

          knielsen Kristian Nielsen added a comment - Here is the patch for the bug, including test case: http://lists.askmonty.org/pipermail/commits/2013-March/004483.html

          People

            knielsen Kristian Nielsen
            pivanof Pavel Ivanov
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.