Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-20939

Race condition between mysqldump import and InnoDB persistent statistics calculation

Details

    Description

      mysqldump ignores some internal tables by default, such as:

      • mysql.apply_status
      • mysql.schema
      • mysql.general_log
      • mysql.slow_log
      • mysql.transaction_registry

      See here:

      https://github.com/MariaDB/server/blob/mariadb-10.4.8/client/mysqldump.c#L1001

      I think the following tables should also be ignored by default:

      • mysql.innodb_index_stats
      • mysql.innodb_table_stats

      The reason is that if someone tries to import a dump created by mysqldump, then a race condition between the import and the InnoDB persistent statistics calculation exists, and the user can see errors like this:

      Duplicate 'PRIMARY-n_diff_pfx01' for key 'primary' for database 'xxxx98'.
      

      This was reported long ago in the following upstream bug report:

      https://bugs.mysql.com/bug.php?id=71814

      The workaround mentioned in that bug report is to disable InnoDB persistent statistics prior to the import:

      SET GLOBAL innodb_stats_auto_recalc=0;
      SET GLOBAL innodb_stats_persistent=0;
      

      However, it would be more ideal to make this workaround unnecessary by fixing the problem.

      Attachments

        Issue Links

          Activity

            I am afraid we can not just ignore the tables:
            2020-03-18 11:51:58 7fa31c4e2700 InnoDB: Error: Table "mysql"."innodb_table_stats" not found.

            sanja Oleksandr Byelkin added a comment - I am afraid we can not just ignore the tables: 2020-03-18 11:51:58 7fa31c4e2700 InnoDB: Error: Table "mysql"."innodb_table_stats" not found.

            sanja,

            What if the schema of the tables were dumped, but the data was ignored? The data will be recalculated by InnoDB anyway, so it doesn't need to be dumped.

            GeoffMontee Geoff Montee (Inactive) added a comment - sanja , What if the schema of the tables were dumped, but the data was ignored? The data will be recalculated by InnoDB anyway, so it doesn't need to be dumped.

            yes, I am moving in this direction

            sanja Oleksandr Byelkin added a comment - yes, I am moving in this direction

            As a side effect there will be a new parameter --ignore-table-data and by default both tables will be in it

            sanja Oleksandr Byelkin added a comment - As a side effect there will be a new parameter --ignore-table-data and by default both tables will be in it

            commit 5514b48673ceeefc0427834df233e2e8e381f862 (HEAD > bb-10.1MDEV-20939, origin/bb-10.1-MDEV-20939)
            Author: Oleksandr Byelkin <sanja@mariadb.com>
            Date: Thu Mar 19 15:02:09 2020 +0100

            MDEV-20939: Race condition between mysqldump import and InnoDB persistent statistics calculation

            For mysql.innodb_index_stats and mysql.innodb_table_stats by default
            backed up only structure.

            --ignore-table-data parameter added.

            sanja Oleksandr Byelkin added a comment - commit 5514b48673ceeefc0427834df233e2e8e381f862 (HEAD > bb-10.1 MDEV-20939 , origin/bb-10.1- MDEV-20939 ) Author: Oleksandr Byelkin <sanja@mariadb.com> Date: Thu Mar 19 15:02:09 2020 +0100 MDEV-20939 : Race condition between mysqldump import and InnoDB persistent statistics calculation For mysql.innodb_index_stats and mysql.innodb_table_stats by default backed up only structure. --ignore-table-data parameter added.

            we decided to make workaround possible with new parameter in mysqldump for 10.1 and above, but this MDEV goes to innodb team for fix inside innodb

            sanja Oleksandr Byelkin added a comment - we decided to make workaround possible with new parameter in mysqldump for 10.1 and above, but this MDEV goes to innodb team for fix inside innodb

            I think that this bug could be fixed by using proper MDL protection for the InnoDB statistics tables. That we can achieve by replacing the interface that is writing to the tables. MDEV-15020 has already been filed for that task.

            I am reassigning this bug to psergey so that he can provide the code that has been requested in MDEV-15020. Once we have that code, both MDEV-15020 and this ticket can be reassigned to me, for implementing the InnoDB changes.

            marko Marko Mäkelä added a comment - I think that this bug could be fixed by using proper MDL protection for the InnoDB statistics tables. That we can achieve by replacing the interface that is writing to the tables. MDEV-15020 has already been filed for that task. I am reassigning this bug to psergey so that he can provide the code that has been requested in MDEV-15020 . Once we have that code, both MDEV-15020 and this ticket can be reassigned to me, for implementing the InnoDB changes.

            That's too much of a refactoring to block such a simple issue. Let's just do what Geoff suggested and ignore the table like mysqldump already does for all other system tables. Or ignore only data, as MDEV-22037 is already pushed.

            serg Sergei Golubchik added a comment - That's too much of a refactoring to block such a simple issue. Let's just do what Geoff suggested and ignore the table like mysqldump already does for all other system tables. Or ignore only data, as MDEV-22037 is already pushed.
            rucha174 Rucha Deodhar added a comment - - edited (previous) Patch: https://github.com/MariaDB/server/commit/96d10fc1da963b78354a5f768c49d05035aaba60 Patch after Daniel Black suggested changes(final patch): https://github.com/MariaDB/server/commit/1dbe38b6f5f4e6b92f23bbf2257d8589b630ec26

            See the second comment — if tables aren't even created, the log will have annoying errors.

            Let's dump the table structure, but not the data.

            Another consideration, should dump_all_stats() still dump innodb stat tables? danblack, what do you think?

            serg Sergei Golubchik added a comment - See the second comment — if tables aren't even created, the log will have annoying errors. Let's dump the table structure, but not the data. Another consideration, should dump_all_stats() still dump innodb stat tables? danblack , what do you think?
            danblack Daniel Black added a comment - - edited

            With --insert-ignore (opt_ignore) or --replace (opt_replace_into) there isn't an issue with dumped sql.

            Because these innodb tables are meant to support a persistent stats (https://mariadb.com/kb/en/innodb-persistent-statistics/) I think it should be still possible to dump them with --system=stats (or a normal dump if --replace or --insert-ignore) is specified.

            In the proposal below I've changed the dump_all_stats, and dump_all_timezones (the two that use inserts directly), to use `REPLACE INTO` in the event --insert-ignore isn't specified in order to prevent PK/unique key conflicts. I've also changed to use the ignore_data table for the base case.

            bb-10.2-danielblack-MDEV-20939-innodb_stats_mysqldump_conflicts (https://github.com/MariaDB/server/commit/0f35033de9a)

            updated commit, includes test. Also unsure why general_log/slow_log aren't excluded in test.

            Acceptable?

            TODO man/mysqldump.1 updates

            danblack Daniel Black added a comment - - edited With --insert-ignore (opt_ignore) or --replace (opt_replace_into) there isn't an issue with dumped sql. Because these innodb tables are meant to support a persistent stats ( https://mariadb.com/kb/en/innodb-persistent-statistics/ ) I think it should be still possible to dump them with --system=stats (or a normal dump if --replace or --insert-ignore) is specified. In the proposal below I've changed the dump_all_stats, and dump_all_timezones (the two that use inserts directly), to use `REPLACE INTO` in the event --insert-ignore isn't specified in order to prevent PK/unique key conflicts. I've also changed to use the ignore_data table for the base case. bb-10.2-danielblack- MDEV-20939 -innodb_stats_mysqldump_conflicts ( https://github.com/MariaDB/server/commit/0f35033de9a ) updated commit, includes test. Also unsure why general_log/slow_log aren't excluded in test. Acceptable? TODO man/mysqldump.1 updates
            rucha174 Rucha Deodhar added a comment - - edited serg Patch: https://github.com/MariaDB/server/commit/2d43578d1ee0fe21dc1cf9ad9b930844b3baef5d
            danblack Daniel Black added a comment -

            Was there a reason the conditional on !(opt_ignore || opt_replace_into) wasn't acceptable?

            rucha174 on your patch.

            • As your adding to ignore_data unconditionally, leave the system=stats to add into ignore_table. Otherwise myqldump --system=stat mysql with have CREATE TABLES which isn't desired.
            • use --skip-comments in your test otherwise you have a mysqldump server version (and debug/release) in the results file that isn't mergeable or even passes release and debug test versoins
            • I do want the dump_all_stats and dump_all_timezones part of https://github.com/MariaDB/server/commit/0f35033de9a to resolve this issue with mysqldump --system= {all,stats,timezones}

              and the corresponding mysqldump-system test result change.

            Should be covered:

            man/mysqldump.1 update to reflect this change
            at top of client/mysqldump.c - increment DUMP_VERSION

            danblack Daniel Black added a comment - Was there a reason the conditional on !(opt_ignore || opt_replace_into) wasn't acceptable? rucha174 on your patch. As your adding to ignore_data unconditionally, leave the system=stats to add into ignore_table. Otherwise myqldump --system=stat mysql with have CREATE TABLES which isn't desired. use --skip-comments in your test otherwise you have a mysqldump server version (and debug/release) in the results file that isn't mergeable or even passes release and debug test versoins I do want the dump_all_stats and dump_all_timezones part of https://github.com/MariaDB/server/commit/0f35033de9a to resolve this issue with mysqldump --system= {all,stats,timezones} and the corresponding mysqldump-system test result change. Should be covered: man/mysqldump.1 update to reflect this change at top of client/mysqldump.c - increment DUMP_VERSION
            rucha174 Rucha Deodhar added a comment - serg Patch after latest fix: https://github.com/MariaDB/server/commit/96702de22923728b255293d4c38ab07a56deb782 Also, danblack has pushed fix about dump_all_stats() and dump_all_timezones() to 10.2 related to this issue. https://github.com/MariaDB/server/commit/c58d2c941c49550a85d610b1ad7b51cf0d4cd8db https://github.com/MariaDB/server/commit/cc2d6d1bb26cdcd649357ec23cd54dd28ddefaf3

            96702de22923728b255293d4c38ab07a56deb782 is ok to push

            serg Sergei Golubchik added a comment - 96702de22923728b255293d4c38ab07a56deb782 is ok to push

            People

              rucha174 Rucha Deodhar
              GeoffMontee Geoff Montee (Inactive)
              Votes:
              1 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.