MDEV-13333: Deadlock failure that does not occur elsewhere

Details

    Description

      MariaDB gets the following deadlock error:

      localhost-dir: sql_create.c:837-5 Fill File table Query failed: INSERT INTO File
      (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq) SELECT
      batch.FileIndex, batch.JobId, Path.PathId, Filename.FilenameId,batch.LStat,
      batch.MD5, batch.DeltaSeq FROM batch JOIN Path ON (batch.Path = Path.Path) JOIN
      Filename ON (batch.Name = Filename.Name): ERR=Deadlock found when trying to get
      lock;

      This happens when running the Bacula 9.0.1 regression script named three-pool-virtual-test.
      It does not occur on any version of MySQL, nor on the MariaDB 10.0 version shipped with Ubuntu. The code has been stable for many years.

      I am running all instances of MariaDB and MySQL out of the box. I have changed no parameters.

      This appears to be a false deadlock detection. Note, it is 100% reproducible.

      Attachments

        1. mariadb-bug
          1 kB
        2. mdev13333.test
          2 kB
        3. output_b
          6.44 MB

          Activity

            kern Kern Sibbald created issue -

            elenst Elena Stepanova added a comment -

            Please describe step by step what one needs to do with this tool in order to reproduce the problem, assuming that one has cloned it from github and never had an installation before.
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            Labels need_feedback
            kern Kern Sibbald added a comment -

            Your request sounds reasonable to me. I will prepare everything you will need, test it, and comment/document it. It will take a couple of days.

            kern Kern Sibbald made changes -
            Attachment mariadb-bug [ 43862 ]
            kern Kern Sibbald added a comment -

            The instructions for repeating the problem are simpler than I thought.

            I have created a file named mariadb-bug and uploaded it to this issue. It is a Linux shell script that runs as non-root, which will download the current Bacula (including some minor modifications I made this morning to make your task easier) into a new subdirectory named "bacula". It will then set up a config file, build Bacula, and attempt to run the test script that fails on MariaDB 10.2.7.

            If you have a new installation of MariaDB, your life will be easier if, prior to running the script, you create a MariaDB database and a user, both named regress. The regress user should have full permissions for the regress database, and it helps if you also give yourself full permissions to access and modify the regress database. Otherwise the script will tell you how to correct it.
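
            The database and user setup described above could be sketched as follows. This is only an illustration: the function name regress_setup_sql is hypothetical, and it assumes the regress user connects from localhost without a password, which may not match your installation.

```shell
#!/bin/sh
# Print the SQL that creates the "regress" database and the "regress" user.
# Assumptions (not from the report): the user connects from localhost and
# has no password. Adjust the host and add IDENTIFIED BY as needed.
regress_setup_sql() {
  printf '%s\n' \
    "CREATE DATABASE IF NOT EXISTS regress;" \
    "CREATE USER IF NOT EXISTS 'regress'@'localhost';" \
    "GRANT ALL PRIVILEGES ON regress.* TO 'regress'@'localhost';"
}

# Usage (assumes an account that is allowed to create users, e.g. root):
#   regress_setup_sql | sudo mysql
```

            Printing the statements instead of executing them directly makes it possible to review them before piping them into the client.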

            Of course, you can run everything as root and none of the minor privilege problems will occur.

            If you run the script tests/three-pool-virtual-test with the environment variable REGRESS_DEBUG=1 you will see all the normal Bacula output plus some debug information. E.g.

            REGRESS_DEBUG=1 tests/three-pool-virtual-test

            elenst Elena Stepanova made changes -
            Labels need_feedback
            elenst Elena Stepanova made changes -
            Assignee Alice Sherepa [ alice ]
            alice Alice Sherepa added a comment -

            I tried to build Bacula with MariaDB 10.2.7 on a Docker image of Ubuntu 16.04, but cannot make it work so far.
            I got an error when building:

            /bacula/regress/build/libtool --silent --tag=CXX --mode=link /usr/bin/g++    -o libbaccats.la cats_null.lo -export-dynamic -rpath /bacula/regress/bin -release 9.0.2
            mysql.c: In member function 'virtual bool BDB_MYSQL::bdb_open_database(JCR*)':
            mysql.c:261:20: error: 'MYSQL {aka struct st_mysql}' has no member named 'reconnect'
                mdb->m_instance.reconnect = 1;             /* so connection does not timeout */ 
                                ^
            

            Then I tried to apply the patch from https://bugzilla.redhat.com/show_bug.cgi?id=1467706, but without success so far. I will try again later.

            kern Kern Sibbald added a comment -

            Yes, I saw that RedHat ran into that problem. It did not happen on the version I pulled from your binary repo. It appears to be a new difference that 10.2.7 has introduced since prior MariaDB versions that were compatible with MySQL. I suggest commenting out that line, and judging from the problems RedHat had, you will need to either change the name of your library back to agree with the MySQL library name, or simply link the MySQL library name to yours. I am not sure why I did not have those problems – do you have several versions of 10.2.7?

            Comment out the "reconnect" variable:

            // mdb->m_instance.reconnect = 1; /* so connection does not timeout */
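
            For repeated rebuilds, the same edit could be applied with sed instead of by hand. This is a minimal sketch: the function name comment_out_reconnect is hypothetical, and it assumes the assignment sits on a single line exactly as shown in the compiler error.

```shell
#!/bin/sh
# Comment out the "reconnect" assignment in a copy of Bacula's mysql.c.
# $1: path to the mysql.c file to modify (a backup is kept as $1.bak).
# Assumption: the assignment is on one line, as in the compiler error output.
comment_out_reconnect() {
  sed -i.bak \
    's|^\([[:space:]]*\)\(mdb->m_instance\.reconnect = 1;.*\)$|\1// \2|' \
    "$1"
}

# Usage:
#   comment_out_reconnect mysql.c
```

            Running it a second time leaves the file unchanged, because the pattern only matches a line that is not already commented out.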

            kern Kern Sibbald added a comment -

            By the way, thanks for pointing me to the RedHat patch. I hadn't seen it. I will apply it here and if it makes both MySQL and MariaDB work, it will be a nice solution.

            alice Alice Sherepa added a comment -

            I used these instructions to install MariaDB 10.2.7 (https://downloads.mariadb.org/mariadb/repositories/#mirror=dotsrc&distro=Ubuntu&distro_release=xenial--ubuntu_xenial&version=10.2):

            sudo apt-get install software-properties-common
            sudo apt-key adv --recv-keys --keyserver hkp://keyserver.ubuntu.com:80 0xF1656F24C74CD1D8
            sudo add-apt-repository 'deb [arch=amd64,i386,ppc64el] http://mirrors.dotsrc.org/mariadb/repo/10.2/ubuntu xenial main'
             
            sudo apt update
            sudo apt install mariadb-server
            

            Then I added the packages libmariadb-dev and libacl1-dev and got that error when I tried make setup.
            When I change the file mysql.c and then run make setup, the error appears again and the file is as it was before the change, as if it were copied from somewhere else.

            kern Kern Sibbald added a comment -

            The simplest way to make a change and then test it is:

            edit the original Bacula file that was downloaded
            cd regress
            make setup
            tests/...

            However, that takes time because it rebuilds all of Bacula. The way I do it is:

            (from the regress directory)
            cd build/src/cats
            (edit mysql.c)
            make
            make install
            cd (back to regress)
            tests/...

            This is faster but the change is in the regress/build subtree and will be lost or overridden on the next "make setup".

            I think you are close to making it work.

            By the way, the commands I used to install MariaDB were the same as yours, but I only specified arch=amd64, and I also ran
            sudo apt-get install mariadb-server mariadb-client

            alice Alice Sherepa added a comment -

            I ran tests/three-pool-virtual-test and got this output, but did not find any sign of deadlocks:

            root@366e219453f5:/bacula/regress#  tests/three-pool-virtual-test
             
             
             === Start three-pool-virtual-test  at 07:28:56 ===
             
             
              !!!!! three-pool-virtual-test  failed!!! 07:29:09 00:00:12 12s !!!!! 
                 Status: zombie=0 backup=2 restore=0 diff=0 verify=0
                 !!! Bad termination status       !!! 
                 Status: backup=2 restore=0 diff=0 verify=0
                 Test owner of bacula-127.0.0.1 is my-name@domain.com
            

            MariaDB [(none)]> show processlist;
            +-----+-------------+-----------+---------+---------+------+-------------------------+------------------------------------------------------------------------------------------------------+----------+
            | Id  | User        | Host      | db      | Command | Time | State                   | Info                                                                                                 | Progress |
            +-----+-------------+-----------+---------+---------+------+-------------------------+------------------------------------------------------------------------------------------------------+----------+
            |   1 | system user |           | NULL    | Daemon  | NULL |                         | NULL                                                                                                 |    0.000 |
            |   2 | system user |           | NULL    | Daemon  | NULL |                         | NULL                                                                                                 |    0.000 |
            |   4 | system user |           | NULL    | Daemon  | NULL |                         | NULL                                                                                                 |    0.000 |
            |   3 | system user |           | NULL    | Daemon  | NULL |                         | NULL                                                                                                 |    0.000 |
            |   5 | system user |           | NULL    | Daemon  | NULL | InnoDB shutdown handler | NULL                                                                                                 |    0.000 |
            |  98 | root        | localhost | NULL    | Query   |    0 | init                    | show processlist                                                                                     |    0.000 |
            | 105 | regress     | localhost | regress | Query   |    0 | query end               | INSERT INTO Job (Job,Name,Type,Level,JobStatus,SchedTime,JobTDate,ClientId,Comment) VALUES ('threepo |    0.000 |
            +-----+-------------+-----------+---------+---------+------+-------------------------+------------------------------------------------------------------------------------------------------+----------+
            7 rows in set (0.00 sec)
             
            MariaDB [(none)]> show global status like '%wait%';
            +---------------------------------------+-------+
            | Variable_name                         | Value |
            +---------------------------------------+-------+
            | Binlog_group_commit_trigger_lock_wait | 0     |
            | Innodb_buffer_pool_wait_free          | 0     |
            | Innodb_log_waits                      | 0     |
            | Innodb_row_lock_current_waits         | 0     |
            | Innodb_row_lock_waits                 | 0     |
            | Master_gtid_wait_count                | 0     |
            | Master_gtid_wait_time                 | 0     |
            | Master_gtid_wait_timeouts             | 0     |
            | Table_locks_waited                    | 0     |
            | Tc_log_page_waits                     | 0     |
            +---------------------------------------+-------+
            10 rows in set (0.00 sec)
            

            kern Kern Sibbald added a comment -

            Your results look consistent with the problem. The test failed, and if you run the test with:

            REGRESS_DEBUG=1 tests/three-pool-virtual-test

            and capture the output, you will find in that output that a Bacula backup job failed because of what MariaDB says is a deadlock, i.e. you will see the message that I posted in the original bug submission. On all other systems the test runs and reports that it succeeded.
            If the MariaDB server is not really getting a deadlock, then there is some new error being reported that we have never seen before. Something is going wrong either in MariaDB or in our code. Since our code runs fine on prior MariaDB versions, on PostgreSQL, and on MySQL, for the moment I am assuming that the problem is on the MariaDB side. In addition, we simply print the message that MariaDB furnishes us: "Deadlock found when trying to get lock;"
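
            Checking a captured log for the message could look like the sketch below. The function name count_deadlocks and the log file name three-pool.log are hypothetical. It searches only for "Deadlock found", on the assumption that the rest of the message may be wrapped onto the next line in the log.

```shell
#!/bin/sh
# Count the lines in a captured regression log that report a deadlock.
# $1: path to the log file (any file holding the captured test output).
# Note: grep -c prints 0 and returns a non-zero status when nothing matches.
count_deadlocks() {
  grep -c 'Deadlock found' "$1"
}

# Usage (the log file name is only an example):
#   REGRESS_DEBUG=1 tests/three-pool-virtual-test > three-pool.log 2>&1
#   count_deadlocks three-pool.log
```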

            alice Alice Sherepa made changes -
            Attachment output_b [ 43865 ]
            alice Alice Sherepa made changes -
            Attachment output_b [ 43866 ]
            alice Alice Sherepa made changes -
            Attachment output_b [ 43867 ]
            alice Alice Sherepa added a comment -

            Please find the output from REGRESS_DEBUG=1 tests/three-pool-virtual-test attached; still no deadlocks there.
            output_b

            alice Alice Sherepa made changes -
            Attachment output_b [ 43867 ]
            alice Alice Sherepa made changes -
            Attachment output_b [ 43865 ]
            kern Kern Sibbald added a comment -

            The first output you showed as a non-attachment is identical to the errors I have been seeing. There seem to be 4 uploads, but only 2 of them can be accessed, and neither represents a failure.

            Can you explain the difference between your first execution of Bacula where the error shows up and the executions that correspond to the two outputs that I could examine?

            What is surprising is that your first job produces the following:
            === Start three-pool-virtual-test at 07:28:56 ===

            !!!!! three-pool-virtual-test failed!!! 07:29:09 00:00:12 12s !!!!!

            Status: zombie=0 backup=2 restore=0 diff=0 verify=0

            !!! Bad termination status !!!

            Status: backup=2 restore=0 diff=0 verify=0

            Test owner of bacula-127.0.0.1 is my-name@domain.com

            Which is Bacula failing during a backup.

            I just tried rebuilding the source and re-running the test, and surprisingly the test succeeds. I really do not understand what was going on, because previously it would always fail. Now it runs.

            I will try a few more tests to see if I can come up with something, and then get back to you.

            kern Kern Sibbald added a comment -

            To try to get back to the starting point, I removed MariaDB. Since then I have been unable to re-install it. There are multiple different errors that show up, and now, with the new upstartd, trying to resolve problems starting daemons is complicated.

            At this point, I give up.

            I will attempt to reinstall MySQL, and stay with it until some distribution has worked out the problems with Bacula running with MariaDB 10.2.7.

            alice Alice Sherepa added a comment -

            Sorry, I attached the same file 4 times by mistake, then removed 3 of them.
            I ran the same test, the first time without REGRESS_DEBUG=1 and the second time with REGRESS_DEBUG=1, and sent the output into a file; that is all.
            When I built Bacula, I needed additional packages: libmariadb-dev, libacl1-dev, openssl.
            What version of MySQL will you use? Please write if you succeed in building and running the tests with it.

            kern Kern Sibbald added a comment -

            After removing mariadb-server and mariadb-client, I was unable to re-install them. No matter what I did it failed, and I am reasonably familiar with coaching apt-get and dpkgs along when there are problems.

            I purged the mariadb-server and mariadb-client and deleted the database directory, then reinstalled MySQL, which installed and runs perfectly fine. It is the following version:

            mysql Ver 14.14 Distrib 5.7.18, for Linux (x86_64) using EditLine wrapper

            I still find it very odd that your very first job failed with what looks identical to the failure I saw. All the other jobs you posted did not fail, and much to my amazement, MariaDB stopped failing here too. Was there a change in your Ubuntu package in the past couple of days? During my testing I very likely did an upgrade, which might have pulled a newer (or different) version of MariaDB.

            In any case, I would suggest that someone other than me install MariaDB on Ubuntu 16.04, then purge it, remove the /var/lib/mysql directory, and attempt to reinstall it. Here it failed, but it could be particular to my site.

            Thanks for your quick response to my ticket.

            alice Alice Sherepa added a comment -

            It looks like I have finally got those deadlocks. I will investigate more and update later.

            kern Kern Sibbald added a comment -

            Thank you. That is good news, because it means there is a real problem. It is not so good for you, because, at least for me, intermittent problems are difficult to resolve. If you resolve it, I'll be happy to try again to re-install the fixed version.

            alice Alice Sherepa made changes -
            Attachment mdev13333.test [ 43870 ]
            alice Alice Sherepa added a comment -

            Please find the test case mdev13333.test attached to reproduce the problem (it is only for reproducing/debugging, not for the regression suite).
            On MariaDB 10.2 it returns the error "At line 75: query 'reap' failed: 1213: Deadlock found when trying to get lock; try restarting transaction", but there are no errors on 10.1.
            Note that the test is non-deterministic, so it may need the --repeat=N option.

            elenst Elena Stepanova made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]

            elenst Elena Stepanova added a comment -

            svoj,

            I'm not sure whether it's InnoDB or locking to blame. Deadlock detection is normally an InnoDB thing, but strangely it does not show up in ENGINE INNODB STATUS, and MySQL 5.7 which has InnoDB 5.7 does not exhibit this behavior. So, I think maybe it's the server locking changes that have made the difference. Could you please take the first look at it, and if it turns out to be InnoDB's fault, reassign it to jplindst?
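
            One way to check whether InnoDB itself recorded a deadlock is to look for the "LATEST DETECTED DEADLOCK" section in the status output. A minimal sketch follows, with a hypothetical function name; it reads the status text from standard input so that it does not depend on how the client is invoked.

```shell
#!/bin/sh
# Report whether InnoDB status text read from stdin has a deadlock section.
# InnoDB prints a "LATEST DETECTED DEADLOCK" header only if it has detected
# a deadlock since the server was started.
has_innodb_deadlock_section() {
  if grep -q 'LATEST DETECTED DEADLOCK'; then
    echo "deadlock section present"
  else
    echo "no deadlock section"
  fi
}

# Usage (assumes a local server and sufficient privileges):
#   mysql -e 'SHOW ENGINE INNODB STATUS\G' | has_innodb_deadlock_section
```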
            elenst Elena Stepanova made changes -
            Component/s Locking [ 10900 ]
            Fix Version/s 10.2 [ 14601 ]
            Assignee Alice Sherepa [ alice ] Sergey Vojtovich [ svoj ]
            azurit azurit added a comment -

            Kern Sibbald, were you able to find any workaround for this? See MDEV-16067.

            alice Alice Sherepa added a comment -

            The problem is reproducible on MariaDB 10.1-10.3.

            Simplified test case (please use --repeat=N):

            --source include/have_innodb.inc
             
            CREATE TABLE tr (i2 int, i1 int); 
            INSERT INTO tr VALUES(1,1);
             
            CREATE TABLE t3 (j1 int, j2 int, t2 varchar(5), n1 varchar(5));
            INSERT INTO t3 VALUES (1697,2,'/b','g');
             
            CREATE TABLE t4 (j1 int, j2 int, t2 varchar(5), n1 varchar(5));
            INSERT INTO t4 VALUES (97,3,'/b','u');
             
            CREATE TABLE t5 (j1 int, j2 int, t2 varchar(5), n1 varchar(5));
            INSERT INTO t5 VALUES (97,1,'/b','u');
             
            CREATE TABLE tt (
            	i3 int AUTO_INCREMENT PRIMARY KEY, 
            	j1 int, j2 int, i2 int, i1 int) ENGINE=InnoDB;
             
            --connect (con1,localhost,root,,test)
            --connect (con2,localhost,root,,test)
            --connect (con3,localhost,root,,test)
             
            --connection con1
            --send 
            INSERT INTO tt (j1, j2, i2, i1) SELECT t4.j1, t4.j2, tr.i2, tr.i1 FROM t4 JOIN tr;
             
            --connection con2
            --send 
            INSERT INTO tt (j1, j2, i2, i1) SELECT t5.j1, t5.j2, tr.i2, tr.i1 FROM t5 JOIN tr;
             
            --connection con3
            INSERT INTO tt (j1, j2, i2, i1) SELECT t3.j1, t3.j2, tr.i2, tr.i1 FROM t3 JOIN tr;
            --disconnect con3
             
            --connection con2
            --reap
            --disconnect con2
             
            --connection con1
            --reap
            --disconnect con1
             
            --connection default
            DROP TABLE tr,tt,t3,t4,t5;
            

            MariaDB Version 10.1.35-MariaDB-debug  (commit 36ea82617c1506532e863cb241296acc8b657243)
             
            CREATE TABLE tr (i2 int, i1 int);
            INSERT INTO tr VALUES(1,1);
            CREATE TABLE t3 (j1 int, j2 int, t2 varchar(5), n1 varchar(5));
            INSERT INTO t3 VALUES (1697,2,'/b','g');
            CREATE TABLE t4 (j1 int, j2 int, t2 varchar(5), n1 varchar(5));
            INSERT INTO t4 VALUES (97,3,'/b','u');
            CREATE TABLE t5 (j1 int, j2 int, t2 varchar(5), n1 varchar(5));
            INSERT INTO t5 VALUES (97,1,'/b','u');
            CREATE TABLE tt (
            i3 int AUTO_INCREMENT PRIMARY KEY, 
            j1 int, j2 int, i2 int, i1 int) ENGINE=InnoDB;
            INSERT INTO tt (j1, j2, i2, i1) SELECT t4.j1, t4.j2, tr.i2, tr.i1 FROM t4 JOIN tr;
            INSERT INTO tt (j1, j2, i2, i1) SELECT t5.j1, t5.j2, tr.i2, tr.i1 FROM t5 JOIN tr;
            INSERT INTO tt (j1, j2, i2, i1) SELECT t3.j1, t3.j2, tr.i2, tr.i1 FROM t3 JOIN tr;
            main.1_my 'innodb_plugin'                [ fail ]
                    Test ended at 2018-07-02 20:11:15
             
            CURRENT_TEST: main.1_my
            mysqltest: At line 36: query 'reap' failed: 1213: Deadlock found when trying to get lock; try restarting transaction
             
            ##############################
            CREATE TABLE tr (i2 int, i1 int);
            INSERT INTO tr VALUES(1,1);
            CREATE TABLE t3 (j1 int, j2 int, t2 varchar(5), n1 varchar(5));
            INSERT INTO t3 VALUES (1697,2,'/b','g');
            CREATE TABLE t4 (j1 int, j2 int, t2 varchar(5), n1 varchar(5));
            INSERT INTO t4 VALUES (97,3,'/b','u');
            CREATE TABLE t5 (j1 int, j2 int, t2 varchar(5), n1 varchar(5));
            INSERT INTO t5 VALUES (97,1,'/b','u');
            CREATE TABLE tt (
            i3 int AUTO_INCREMENT PRIMARY KEY, 
            j1 int, j2 int, i2 int, i1 int) ENGINE=InnoDB;
            INSERT INTO tt (j1, j2, i2, i1) SELECT t4.j1, t4.j2, tr.i2, tr.i1 FROM t4 JOIN tr;
            INSERT INTO tt (j1, j2, i2, i1) SELECT t5.j1, t5.j2, tr.i2, tr.i1 FROM t5 JOIN tr;
            INSERT INTO tt (j1, j2, i2, i1) SELECT t3.j1, t3.j2, tr.i2, tr.i1 FROM t3 JOIN tr;
            main.1_my 'xtradb'                       [ fail ]
                    Test ended at 2018-07-02 20:11:17
             
            CURRENT_TEST: main.1_my
            mysqltest: At line 40: query 'reap' failed: 1213: Deadlock found when trying to get lock; try restarting transaction
            

            alice Alice Sherepa made changes -
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.3 [ 22126 ]
            alice Alice Sherepa made changes -
            Assignee Sergey Vojtovich [ svoj ] Jan Lindström [ jplindst ]

            For 10.2, did you use innodb_lock_schedule_algorithm=FCFS?

            jplindst Jan Lindström (Inactive) added the comment above (edited).
            jplindst Jan Lindström (Inactive) made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]

            Repeatable using MariaDB 10.1.35, and I confirmed that it is an InnoDB deadlock. Not repeatable with MySQL 5.6.38 using either test case with --repeat=2000, both with release builds. In MariaDB, the value of innodb-lock-schedule-algorithm does not matter.

            jplindst Jan Lindström (Inactive) added the comment above (edited).

            OK, found the reason: with bad luck, we execute wsrep code when we definitely should not.

            jplindst Jan Lindström (Inactive) added the comment above.
            jplindst Jan Lindström (Inactive) made changes -
            Fix Version/s 10.1.36 [ 23117 ]
            Fix Version/s 10.2.18 [ 23112 ]
            Fix Version/s 10.3.10 [ 23140 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.1 [ 16100 ]
            Fix Version/s 10.3 [ 22126 ]
            Resolution Fixed [ 1 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            azurit azurit added a comment -

            What's the release date of 10.1.36? I can see only 10.1.35 in the road map and it was supposed to be released on 2018-07-27. Thank you.


            I've just checked, 10.1.36 is on the roadmap with the release date 2018-09-14

            serg Sergei Golubchik added the comment above.
            marko Marko Mäkelä made changes -
            Affects Version/s 10.3.6 [ 23003 ]
            Affects Version/s 10.2.14 [ 22911 ]
            Affects Version/s 10.1.32 [ 22908 ]
            Affects Version/s 10.2.7 [ 22543 ]
            marko Marko Mäkelä made changes -
            Labels regression

            I believe that this may have been caused by my code clean-up in 10.1. I got confused by the many WSREP-related predicates in the code.

            marko Marko Mäkelä added the comment above.
            marko Marko Mäkelä made changes -
            Support case ID 25849

            This bug was independently introduced in MariaDB 10.2.2 when applying changes from MySQL 5.7.9.

            In MariaDB 10.1, the logic was only broken between MariaDB 10.1.32 and 10.1.35 (inclusive).

            marko Marko Mäkelä added the comment above.
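The version ranges mentioned in this issue can be turned into a quick check. This is a sketch under stated assumptions: the lower bounds come from the comment above and the "Affects Version/s" fields (10.1.32, 10.2.2, 10.3.0), the upper bounds from the "Fix Version/s" fields (10.1.36, 10.2.18, 10.3.10), and `is_affected` and `parse_version` are hypothetical helper names. It reflects a reading of this thread, not an authoritative list.

```python
# Per release series: (first affected version, first fixed version).
# Values are read from this issue's fields and comments (an assumption,
# not an official compatibility table).
AFFECTED = {
    (10, 1): ((10, 1, 32), (10, 1, 36)),
    (10, 2): ((10, 2, 2), (10, 2, 18)),
    (10, 3): ((10, 3, 0), (10, 3, 10)),
}


def parse_version(text):
    """Turn '10.1.35-MariaDB-debug' into the tuple (10, 1, 35)."""
    numeric = text.split("-", 1)[0]
    return tuple(int(part) for part in numeric.split("."))


def is_affected(text):
    """Return True if the version falls in a range listed in AFFECTED."""
    version = parse_version(text)
    bounds = AFFECTED.get(version[:2])
    if bounds is None:
        return False  # series not mentioned in this issue
    first_bad, first_fixed = bounds
    return first_bad <= version < first_fixed
```

For example, `is_affected("10.1.35-MariaDB-debug")` is true for the build used in the test output earlier in the thread, while `is_affected("10.1.36")` is false.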
            marko Marko Mäkelä made changes -
            Affects Version/s 10.3.0 [ 22127 ]
            Affects Version/s 10.2.2 [ 22013 ]
            Affects Version/s 10.2.14 [ 22911 ]
            Affects Version/s 10.3.6 [ 23003 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 81683 ] MariaDB v4 [ 152493 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 187207

            People

              jplindst Jan Lindström (Inactive)
              kern Kern Sibbald