Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32250

To benefit from MDEV-515 , please make option --no-autocommit default in mysqldump from 10.6 onwards

Details

    Description

      In order to fully take advantage of bulk insert speed improvement brought by MDEV-515 , we need to have insert statements generated by mysqldump wrapped into a single transaction per table.

      Mysqldump already has an option for that, called --no-autocommit.
      My proposal is to make it TRUE by default from 10.6 (right now it is false).

      Thanks,
      Rick

      Attachments

        Issue Links

          Activity

            Only the first insert statement in .sql file uses bulk insert method (MDEV-24621 mechanism). Remaining insert just avoids the undo log write for it and uses normal code path for insert. MDEV-34703 shouldn't affect this issue.

            thiru Thirunarayanan Balathandayuthapani added a comment - - edited Only the first insert statement in .sql file uses bulk insert method ( MDEV-24621 mechanism). Remaining insert just avoids the undo log write for it and uses normal code path for insert. MDEV-34703 shouldn't affect this issue.

            I think that disabling the merge sort for the PRIMARY KEY (and assuming that the data is already sorted) should be a separate ticket of its own. It is related to this as well as MDEV-34703. My concern was that by changing the default behavior of mariadb-dump we would expose a significant performance regression, but based on your test, the opposite would seem to be the case.

            In any case, we should provide a way to have the old behavior. I think that we must both change the default value and allow an argument for the option:

            diff --git a/client/mysqldump.cc b/client/mysqldump.cc
            index d1da59c07fb..2ad641cf708 100644
            --- a/client/mysqldump.cc
            +++ b/client/mysqldump.cc
            @@ -501,7 +501,7 @@ static struct my_option my_long_options[] =
                 1024*1024L-1025, 4096, 16*1024L*1024L, 0, 1024, 0},
               {"no-autocommit", 0,
                "Wrap tables with autocommit/commit statements.",
            -   &opt_autocommit, &opt_autocommit, 0, GET_BOOL, NO_ARG, 0, 0, 0, 0, 0, 0},
            +   &opt_autocommit, &opt_autocommit, 0, GET_BOOL, OPT_ARG, 1, 0, 0, 0, 0, 0},
               {"no-create-db", 'n',
                "Suppress the CREATE DATABASE ... IF EXISTS statement that normally is "
                "output for each dumped database if --all-databases or --databases is "
            

            I would also suggest to rename the variable to no_autocommit or opt_no_autocommit, because usually the opt_ prefix refers to "optional". The following will otherwise be rather confusing:

                if (opt_autocommit)
                {
                  fprintf(md_result_file, "set autocommit=0;\n");
            

            marko Marko Mäkelä added a comment - I think that disabling the merge sort for the PRIMARY KEY (and assuming that the data is already sorted) should be a separate ticket of its own. It is related to this as well as MDEV-34703 . My concern was that by changing the default behavior of mariadb-dump we would expose a significant performance regression, but based on your test, the opposite would seem to be the case. In any case, we should provide a way to have the old behavior. I think that we must both change the default value and allow an argument for the option: diff --git a/client/mysqldump.cc b/client/mysqldump.cc index d1da59c07fb..2ad641cf708 100644 --- a/client/mysqldump.cc +++ b/client/mysqldump.cc @@ -501,7 +501,7 @@ static struct my_option my_long_options[] = 1024*1024L-1025, 4096, 16*1024L*1024L, 0, 1024, 0}, {"no-autocommit", 0, "Wrap tables with autocommit/commit statements.", - &opt_autocommit, &opt_autocommit, 0, GET_BOOL, NO_ARG, 0, 0, 0, 0, 0, 0}, + &opt_autocommit, &opt_autocommit, 0, GET_BOOL, OPT_ARG, 1, 0, 0, 0, 0, 0}, {"no-create-db", 'n', "Suppress the CREATE DATABASE ... IF EXISTS statement that normally is " "output for each dumped database if --all-databases or --databases is " I would also suggest to rename the variable to no_autocommit or opt_no_autocommit , because usually the opt_ prefix refers to "optional". The following will otherwise be rather confusing: if (opt_autocommit) { fprintf (md_result_file, "set autocommit=0;\n" );

            2.3GB is a very small example for table size (suprisingly, a very slow one). I thought a "realistic" would be 10 or more times larger.

            wlad Vladislav Vaintroub added a comment - 2.3GB is a very small example for table size (suprisingly, a very slow one). I thought a "realistic" would be 10 or more times larger.

            wlad mysqldump produces multiple insert statements. Only the first insert statement in .sql file does bulk insert mechanism.
            Remaining inserts are just avoiding writing the undo log and uses normal codepath (similar to insert statement).

            thiru Thirunarayanan Balathandayuthapani added a comment - wlad mysqldump produces multiple insert statements. Only the first insert statement in .sql file does bulk insert mechanism. Remaining inserts are just avoiding writing the undo log and uses normal codepath (similar to insert statement).

            People

              thiru Thirunarayanan Balathandayuthapani
              rpizzi Rick Pizzi (Inactive)
              Votes:
              2 Vote for this issue
              Watchers:
              14 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.