MariaDB Server / MDEV-31120

Duplicate entry allowed into a UNIQUE column

Details

    Description

      In a WooCommerce installation, the following table is somehow allowing duplicate entries in its uniquely indexed column on the slave, which breaks replication once the duplicate is detected on the slave side.

      Table schema:

      CREATE TABLE `wp_067d483e_woocommerce_sessions` (
        `session_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
        `session_key` char(32) NOT NULL,
        `session_value` longtext NOT NULL,
        `session_expiry` bigint(20) unsigned NOT NULL,
        PRIMARY KEY (`session_id`),
        UNIQUE KEY `session_key` (`session_key`)
      ) ENGINE=InnoDB AUTO_INCREMENT=2642 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
      

      Slave error:

      Last_SQL_Error: Could not execute Write_rows_v1 event on table grao.wp_067d483e_woocommerce_sessions; Duplicate entry '63966' for key 'session_key', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log db1-bin.000001, end_log_pos 109407704
      

      Screenshot attached showing the duplicate entry, confirmed by hex that it is the same data.
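      As a hedged illustration, this is the kind of query that can confirm the two rows carry byte-identical key data (the key value is taken from the replication error above):

      SELECT session_id, HEX(session_key)
      FROM wp_067d483e_woocommerce_sessions
      WHERE session_key = '63966';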

      We are not sure how this is happening yet. We have been offered access to the system; I will notify the affected person once this ticket is created.

      I have confirmed that `unique_checks` is on.
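      For reference, a minimal check that can be run on the slave to list the duplicated keys; it is the same aggregate the test case below uses:

      SELECT session_key, COUNT(*) AS cnt
      FROM wp_067d483e_woocommerce_sessions
      GROUP BY session_key
      HAVING cnt > 1;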

      Attachments

        1. Screenshot_188.png
          8 kB
        2. master.txt
          7 kB
        3. slave.txt
          7 kB
        4. MDEV-31120.test
          2 kB
        5. MDEV-31120.cnf
          0.7 kB

          Activity

            danblack Daniel Black added a comment -

            Retested 10.11, 11.0, and 11.1 with an InnoDB-backed gtid_slave_pos, running CHECK TABLE ... EXTENDED in a loop, and didn't see a clustered index warning (or any other error) in 50k rows.

            Leaving gtid_slave_pos duplicate as a problem for MDEV-27849.
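            For context, a hedged sketch of the looped verification described above, assuming the session table from this report is the one being checked:

            CHECK TABLE wp_067d483e_woocommerce_sessions EXTENDED;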

            elenst Elena Stepanova added a comment - edited

            The culprit is innodb_change_buffering, which is set to all both in this report and in MDEV-31713; inserts is already sufficient for duplicate keys to show up.
            Most likely the problem was introduced, or at least became visible, in 10.6.12. For some reason it is very poorly reproducible on 10.6.12, so I couldn't bisect, but I did see it happen on 10.6.12, and never on 10.6.11 (or 10.5). There were indeed a bunch of seemingly related changes in 10.6.12, so it fits.
            On 10.6.14 and the current tree it reproduces somewhat more readily, although that may be coincidental.
            It is not reproducible on 11.0, which is not surprising, since innodb_change_buffering was removed there.
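            As a side note, a natural mitigation on affected builds is to disable change buffering. A minimal sketch, assuming a server version where the variable still exists (it is a dynamic global variable, though the test comment below suggests behavior may differ when it is changed dynamically rather than at startup):

            -- check the current value ('all' in both affected reports)
            SELECT @@GLOBAL.innodb_change_buffering;
            -- stop buffering secondary-index changes as a mitigation
            SET GLOBAL innodb_change_buffering = 'none';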
            The basic test flow is as follows:

            • do a lot of inserts in concurrent threads with potentially duplicate keys;
            • restart the server;
            • do some more inserts (sometimes even a few non-concurrent ones are sufficient);
            • observe duplicate keys.

            The test below is for reproducing only; don't put it into the regression suite!
            I'm not even sure it is suitable for debugging, but hopefully InnoDB experts will be able to create a better one with this as a starting point.
            I've made the test as tunable as possible, because different builds seem to demand different tweaks: some want more concurrency but are okay with fewer queries, some vice versa, and so on.
            For some builds the parameters set in the test will be excessive.
            The test case doesn't have to be run with --repeat=N; instead, the max_trials value can be increased.
            The failure happens both on optimized and debug builds, although of course the test runs much faster on optimized ones.
            I doubt it is rr-able, but I only tried a few times (without success).

            --source include/have_innodb.inc
             
            let $restart_parameters= --innodb-change-buffering=insert --innodb-buffer-pool-size=1G --innodb-log-file-size=128M;
            let $threads= 16;
            let $queries= 640000; # Total amount, across all threads
            let $max_trials= 20;
             
            let $my_slap_command= $MYSQL_SLAP -uroot --create-schema=test --port=$MASTER_MYPORT --protocol=tcp --concurrency=$threads --query="INSERT INTO test.woocommerce_sessions (session_key, session_value, session_expiry) VALUES (ROUND(RAND()*5000), REPEAT(CHR(ROUND(@@timestamp)%26 + 97),2141), ROUND(@@timestamp)) ON DUPLICATE KEY UPDATE session_value = VALUES(session_value), session_expiry = VALUES(session_expiry)";
             
            CREATE TABLE woocommerce_sessions (
                session_id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
                session_key char(32) NOT NULL,
                session_value longtext NOT NULL,
                session_expiry bigint(20) unsigned NOT NULL,
                PRIMARY KEY (session_id),
                UNIQUE KEY session_key (session_key)
            ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci;
             
            # This restart is to pick up innodb-change-buffering; restarting seems
            # to work better than (or possibly instead of) setting it dynamically
            echo #;
            source include/restart_mysqld.inc;
             
            let $trial= 0;
             
            while ($trial < $max_trials)
            {
              inc $trial;
              echo ##################;
              echo # Trial $trial/$max_trials: running ~$queries queries via $threads parallel threads;
             
              --disable_result_log
              exec $my_slap_command --number-of-queries=$queries;
             
              echo #;
              source include/restart_mysqld.inc;
             
              echo #;
              echo # Sending some more queries;
             
              --disable_result_log
              exec $my_slap_command --number-of-queries=100;
             
              if (`SELECT COUNT(*) FROM (SELECT session_key, COUNT(*) cnt FROM test.woocommerce_sessions GROUP BY session_key HAVING cnt > 1) sq`)
              {
                enable_query_log;
                enable_result_log;
                echo #######################;
                echo # Found duplicate keys:;
                echo #######################;
                SELECT session_key, COUNT(*) cnt FROM woocommerce_sessions GROUP BY session_key HAVING cnt > 1;
                CHECK TABLE woocommerce_sessions EXTENDED;
                die # Found duplicate keys, see above;
              }
              echo #;
              echo # No duplicate keys yet;
            }
             
            # Cleanup
            connection default;
            DROP TABLE woocommerce_sessions;
            

            Example of the output:

            10.6 a03ce7b9 non-debug

            # Trial 4/20: running ~640000 queries via 16 parallel threads
            #
            # restart: --innodb-change-buffering=insert --innodb-buffer-pool-size=1G --innodb-log-file-size=128M
            #
            # Sending some more queries
            #######################
            # Found duplicate keys:
            #######################
            SELECT session_key, COUNT(*) cnt FROM woocommerce_sessions GROUP BY session_key HAVING cnt > 1;
            session_key	cnt
            1374	2
            1405	2
            <...cut...>
            603	2
            67	2
            CHECK TABLE woocommerce_sessions EXTENDED;
            Table	Op	Msg_type	Msg_text
            test.woocommerce_sessions	check	status	OK
            bug.t 'innodb'                           [ fail ]
                    Test ended at 2023-07-19 04:02:48
             
            CURRENT_TEST: bug.t
            mysqltest: At line 57: # Found duplicate keys, see above
            


            marko Marko Mäkelä added a comment -

            I can reproduce this rather easily on my system: in 57 seconds when using a RelWithDebInfo build, and some 4 to 5 times longer when using an optimized Debug build, with each build failing in the first round. The following patch seems to plug this:

            diff --git a/storage/innobase/include/ibuf0ibuf.inl b/storage/innobase/include/ibuf0ibuf.inl
            index 1e21f74ff2b..801353bb5fa 100644
            --- a/storage/innobase/include/ibuf0ibuf.inl
            +++ b/storage/innobase/include/ibuf0ibuf.inl
            @@ -103,7 +103,7 @@ ibuf_should_try(
             	       && !(index->type & (DICT_CLUSTERED | DICT_IBUF))
             	       && ibuf.max_size != 0
             	       && index->table->quiesce == QUIESCE_NONE
            -	       && (ignore_sec_unique || !dict_index_is_unique(index)));
            +	       && (/*ignore_sec_unique ||*/ !dict_index_is_unique(index)));
             }
             
             /******************************************************************//**
            

            It is yet unclear to me why we would have ignore_sec_unique=true here, because there is no SET unique_checks=0 in the test case. I will investigate this more deeply once I have better confidence that this is it. So far, 5 out of 20 trials have completed without any error.
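            For context, a minimal sketch of the only user-level way a session is supposed to request skipping the secondary-index uniqueness check; nothing in the test case above does this, so check_unique_secondary should remain true:

            -- default: unique_checks=1, duplicates in unique secondary
            -- indexes must be rejected
            SELECT @@SESSION.unique_checks;
            -- only an explicit opt-out (e.g. for bulk loads) should lead to
            -- ignore_sec_unique=true further down the call stack
            SET SESSION unique_checks = 0;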


            marko Marko Mäkelä added a comment -

            With the above patch, the test was interrupted due to a timeout after 15 minutes:

            10.6 6e484c3bd99a6709e3554f32a99167042ea88496

            # Trial 18/20: running ~640000 queries via 16 parallel threads
            worker[1] Trying to dump core for [mysqltest - pid: 107364, winpid: 107364]
            worker[1] Trying to dump core for [mysqld.1 - pid: 137456, winpid: 137456]
            innodb.mdev-31120 'innodb'               [ fail ]  timeout after 900 seconds
            


            marko Marko Mäkelä added a comment -

            There is only one caller of ibuf_should_try():

                  if (btr_op != BTR_NO_OP &&
                      ibuf_should_try(index(), btr_op != BTR_INSERT_OP))
            

            Here, we have btr_op = BTR_INSERT_IGNORE_UNIQUE_OP, so insert buffering would be allowed in a unique secondary index. It turns out that an incorrect parameter was passed earlier in the call stack:

            diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc
            index 3188c8c27ce..d9ba8bca64e 100644
            --- a/storage/innobase/row/row0ins.cc
            +++ b/storage/innobase/row/row0ins.cc
            @@ -3005,8 +3005,8 @@ row_ins_sec_index_entry_low(
             			search_mode = btr_latch_mode(
             				search_mode
             				| (thr_get_trx(thr)->check_unique_secondary
            -				   ? BTR_INSERT | BTR_IGNORE_SEC_UNIQUE
            -				   : BTR_INSERT));
            +				   ? BTR_INSERT
            +				   : BTR_INSERT | BTR_IGNORE_SEC_UNIQUE));
             		}
             
             		err = cursor.search_leaf(entry, PAGE_CUR_LE, search_mode,
            

            This is something that I broke in MDEV-29603. There used to be a negation before the thr_get_trx(thr)->check_unique_secondary check.
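            To make the inversion concrete, here is a hedged SQL-level illustration of what the bug permits. Change buffering only applies when the target secondary-index leaf page is not in the buffer pool, which is why the test needs a large data set and a restart:

            -- with unique_checks=1 (the default) the second insert must fail:
            INSERT INTO woocommerce_sessions (session_key, session_value, session_expiry)
              VALUES ('67', 'x', 0);
            INSERT INTO woocommerce_sessions (session_key, session_value, session_expiry)
              VALUES ('67', 'y', 0); -- ERROR 1062 (duplicate key) expected
            -- under the bug, if the unique index leaf page is not cached, the
            -- second insert is change-buffered without the duplicate check,
            -- and both rows materialize once the buffer is merged (e.g. after
            -- a restart), matching the duplicate keys observed above.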


            People

              marko Marko Mäkelä
              TheLinuxJedi Andrew Hutchings (Inactive)
              Votes: 2
              Watchers: 14
