Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17377

invalid gap in auto-increment values after LOAD DATA

Details

    Description

      probably 5.5 is affected too, I didn't check

      this is the test case:

      --source include/have_innodb.inc
      create or replace table t1 (pk int auto_increment primary key, f varchar(20)) engine=innodb;
      insert t1 (f) values ('a'), ('b'), ('c'), ('d');
      select null, f into outfile 'load.data' from t1 limit 2;
      load data infile 'load.data' into table t1;
      insert t1 (f) values ('<===');
      select * from t1;
      drop table t1;
      

      What happens here, insert tells the engine to reserve 4 auto-increment values. This "4" number is never reset, so the following load data also reserves four values, even if it needs only two. And the next insert creates a gap in the auto-increment sequence.

      Most engines do not reserve auto-increment values, and generate one value at a time. So with MyISAM, for example, there is no gap in this test.

      The fix would be to reset the list at the end of the insert statement.

      Attachments

        Issue Links

          Activity

            midenok Aleksey Midenkov added a comment - - edited

            The suggested fix fails for this (rpl.rpl_auto_increment,innodb,mix):

            CREATE TRIGGER t1_update AFTER UPDATE ON t1 FOR EACH ROW INSERT INTO slave_only VALUES(NULL);
            

            Failure details

            I'd suggest:

            Longer fix (for 10.3 and lower appropriate versions)

            Expand Sql_cmd hierarchy for INSERT command and isolate its private data such as LEX::many_values, LEX::insert_list, LEX::field_list, LEX::values_list as well as interface tvc_start(), tvc_finalize(), tvc_finalize_derived() (and whatever else).

            This would work because Sql_cmd lifetime will be for parsed trigger, not for single INSERT statement.

            Shorter fix (for versions where Longer fix is impossible)

            Check for sql_command around many_values.

            midenok Aleksey Midenkov added a comment - - edited The suggested fix fails for this ( rpl.rpl_auto_increment,innodb,mix ): CREATE TRIGGER t1_update AFTER UPDATE ON t1 FOR EACH ROW INSERT INTO slave_only VALUES ( NULL ); Failure details I'd suggest: Longer fix (for 10.3 and lower appropriate versions) Expand Sql_cmd hierarchy for INSERT command and isolate its private data such as LEX::many_values , LEX::insert_list , LEX::field_list , LEX::values_list as well as interface tvc_start() , tvc_finalize() , tvc_finalize_derived() (and whatever else). This would work because Sql_cmd lifetime will be for parsed trigger, not for single INSERT statement. Shorter fix (for versions where Longer fix is impossible) Check for sql_command around many_values .

            Checking for sql_command won't help if you have many tables and sub-statements all using INSERT but with their own lists of auto-increment values.

            The simple fix would be not to reserve many values in these complex cases.

            A more involved fix would be to do it properly, per table and per statement. E.g. not in THD or LEX, but in the handler.
            I'm not sure it's worth the troubles, though.

            serg Sergei Golubchik added a comment - Checking for sql_command won't help if you have many tables and sub-statements all using INSERT but with their own lists of auto-increment values. The simple fix would be not to reserve many values in these complex cases. A more involved fix would be to do it properly, per table and per statement. E.g. not in THD or LEX, but in the handler. I'm not sure it's worth the troubles, though.

            serg Can you give an example? Each substatement is parsed separately, so it has its own private `LEX` and therefore `many_values`. Multiple `INSERT` into different tables again parsed separately: `many_values` is reset on start of each `INSERT`.

            midenok Aleksey Midenkov added a comment - serg Can you give an example? Each substatement is parsed separately, so it has its own private `LEX` and therefore `many_values`. Multiple `INSERT` into different tables again parsed separately: `many_values` is reset on start of each `INSERT`.

            tvc_ interface is for SELECT from values-constructed table:

            select * from (values (1)) a;
            

            This is also an implicit SELECT:

            create or replace procedure p1() values (1);
            

            Thus many_values is used for 2 commands INSERT and SELECT so making it private to Sql_cmd is a larger subject which should not go with this bugfix.

            I still believe that checking for sql_command is sufficient because of reasons in previous comment (unless there are some SQL constructs I'm not aware of).

            midenok Aleksey Midenkov added a comment - tvc_ interface is for SELECT from values-constructed table: select * from ( values (1)) a; This is also an implicit SELECT: create or replace procedure p1() values (1); Thus many_values is used for 2 commands INSERT and SELECT so making it private to Sql_cmd is a larger subject which should not go with this bugfix. I still believe that checking for sql_command is sufficient because of reasons in previous comment (unless there are some SQL constructs I'm not aware of).

            what about this simple fix:

            diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy
            --- a/sql/sql_yacc.yy
            +++ b/sql/sql_yacc.yy
            @@ -13074,6 +13074,7 @@ load:
                         lex->field_list.empty();
                         lex->update_list.empty();
                         lex->value_list.empty();
            +            lex->many_values.empty();
                       }
                       opt_load_data_charset
                       { Lex->exchange->cs= $15; }
            

            serg Sergei Golubchik added a comment - what about this simple fix: diff --git a/sql/sql_yacc.yy b/sql/sql_yacc.yy --- a/sql/sql_yacc.yy +++ b/sql/sql_yacc.yy @@ -13074,6 +13074,7 @@ load: lex->field_list.empty(); lex->update_list.empty(); lex->value_list.empty(); + lex->many_values.empty(); } opt_load_data_charset { Lex->exchange->cs= $15; }

            Yes, that's better.

            midenok Aleksey Midenkov added a comment - Yes, that's better.

            People

              serg Sergei Golubchik
              serg Sergei Golubchik
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.