MariaDB Server / MDEV-360

safe_mutex: Trying to destroy a mutex keycache->cache_lock that was locked

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • 5.5.27

    Description

      An RQG test crashes with an assertion in the safe_mutex code:

      http://buildbot.askmonty.org/buildbot/builders/rqg-perpush-bugfix-tests/builds/25/steps/rqg_bugfix_tests/logs/stdio

      the crash callstack points to "repartition_key_cache" function.

      mysys/thr_mutex.c:608(safe_mutex_destroy)[0xc39f1d]
      psi/mysql_thread.h:597(inline_mysql_mutex_destroy)[0xc1071e]
      mysys/mf_keycache.c:1002(end_simple_key_cache)[0xc1066d]
      mysys/mf_keycache.c:5342(end_partitioned_key_cache)[0xc17814]
      mysys/mf_keycache.c:6109(end_key_cache_internal)[0xc184ae]
      mysys/mf_keycache.c:6476(repartition_key_cache_internal)[0xc188ea]
      mysys/mf_keycache.c:6527(repartition_key_cache)[0xc1898b]
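
The stack above ends in a debug check that refuses to destroy a mutex while it is still held. The following is a minimal Python sketch of that kind of check, not MariaDB's actual safe_mutex code: the `SafeMutex` class, `SafeMutexError`, and `demo()` are hypothetical names invented for illustration, and the interleaving (one thread holding the lock while another destroys it) is only an assumed scenario, since the report does not state the root cause.

```python
import threading


class SafeMutexError(RuntimeError):
    """Raised when the wrapper detects misuse of the mutex."""


class SafeMutex:
    """Hypothetical debug wrapper around threading.Lock that tracks ownership."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self._owner = None        # thread id of the current holder, if any
        self._destroyed = False

    def lock(self):
        if self._destroyed:
            raise SafeMutexError(f"Trying to lock destroyed mutex {self.name}")
        self._lock.acquire()
        self._owner = threading.get_ident()

    def unlock(self):
        if self._owner != threading.get_ident():
            raise SafeMutexError(f"Trying to unlock mutex {self.name} not owned by this thread")
        self._owner = None
        self._lock.release()

    def destroy(self):
        # The check that mirrors the message in this report:
        # destroying a mutex that some thread still holds is an error.
        if self._owner is not None:
            raise SafeMutexError(f"Trying to destroy a mutex {self.name} that was locked")
        self._destroyed = True


def demo():
    """Return the error message produced when a held mutex is destroyed."""
    cache_lock = SafeMutex("keycache->cache_lock")
    holder_ready = threading.Event()
    release = threading.Event()

    def holder():
        # Stands in for a session that is still using the key cache.
        cache_lock.lock()
        holder_ready.set()
        release.wait()
        cache_lock.unlock()

    worker = threading.Thread(target=holder)
    worker.start()
    holder_ready.wait()           # make the interleaving deterministic
    try:
        cache_lock.destroy()      # stands in for tearing the cache down
        message = None
    except SafeMutexError as exc:
        message = str(exc)
    finally:
        release.set()
        worker.join()
    return message
```

In the real server the interleaving is timing-dependent, which would be consistent with the failure being sporadic; the events in `demo()` force the bad ordering so the check fires every time.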

        Activity

          elenst Elena Stepanova added a comment (edited)

          FYI, the RQG test in question was added as a regression test for LP:1008293. It runs the same two grammars that were provided in that bug report.

          igor Igor Babaev (Inactive) added a comment

          Elena,
          It's not clear from the above where the test was added.
          The original fix for LP:1008293 was pushed into 5.2. I don't see any failures in 5.2.

          elenst Elena Stepanova added a comment (edited)

          Igor,
          The test was added to 5.2, 5.3 and 5.5. It passed on 5.2 and 5.3 after the fix was pushed/merged into the corresponding tree, but failed on 5.5 with the failure Wlad mentioned above (safe_mutex: Trying to destroy a mutex keycache->cache_lock); this is different from the initial crash.

          Please note, however, that the new failure is sporadic. Unless you can guess its source just by looking at the stack trace, you'll probably want to assign it to me and wait until I come up with a test case for it. That might take time: in my experience, these mutex-destruction race conditions can be hard to catch.

          igor Igor Babaev (Inactive) added a comment

          I would prefer to have a test case to start working on this bug.

          elenst Elena Stepanova added a comment

          Igor,

          Please try the MTR test case below. It crashes on two of the three machines I tried (the third is a slow 32-bit box; I'm not sure whether it's the slowness or the 32 bits that stop it from crashing).
          Please run the test with --repeat=100. It usually fails for me within the first 10 repetitions.

          # MTR test case

          CREATE TABLE t1 (a INT, b DATE, KEY(a), KEY(b)) ENGINE=MyISAM;
          INSERT INTO t1 VALUES (8, '2008-10-02');
          --send SET GLOBAL key_cache_segments = 1
          --connect (con8,127.0.0.1,root,,test)
          SET GLOBAL keycache1.key_buffer_size = 1024*1024;
          --send CACHE INDEX t1 IN keycache1
          --connection default
          --reap
          SET GLOBAL key_cache_segments = 7;
          --connection con8
          --reap

          # End of MTR test case
          # If it does not work, please try to use the following RQG grammar
          # (it's one of the grammars from lp:1008293).
          # cat 3.yy

          query_init:
          SET GLOBAL keycache1.key_buffer_size = 1024*1024;

          thread1:
          SET GLOBAL key_cache_segments = _digit;

          query:
          CACHE INDEX _table IN keycache1;

          # end of RQG grammar 3.yy
          # Run it as

          perl runall.pl \
          --no-mask \
          --queries=100M \
          --duration=300 \
          --threads=2 \
          --engine=MyISAM \
          --grammar=3.yy \
          --basedir=<your basedir> --vardir=<your vardir>

          # Or, on an already started server, as

          perl gentest.pl \
          --gendata= \
          --engine=MyISAM \
          --threads=2 \
          --queries=100M \
          --duration=300 \
          --grammar=3.yy \
          --dsn=dbi:mysql:host=127.0.0.1:port=19300:user=root:database=test

          (replace 19300 with your port).

          Again, it normally fails within seconds of starting, but sometimes it does not.

          If neither of these works for you, please let me know.

          elenst Elena Stepanova added a comment

          Algorithm to start the MTR test above:

          • copy the test case into t/t1.test
          • run
            perl ./mtr --repeat=100 t1

          igor Igor Babaev (Inactive) added a comment

          The fix was applied to 5.2, then merged into 5.3 and 5.5.
          The problem has not been observed since.

          People

            Assignee: igor Igor Babaev (Inactive)
            Reporter: wlad Vladislav Vaintroub