MariaDB Server / MDEV-29068

Cascade foreign key updates do not apply in online alter

Details

    Description

      --source include/have_debug_sync.inc
      --source include/have_innodb.inc
       
      create table t1 (a int primary key) engine=InnoDB;
      insert into t1 values (1),(2),(3);
      create table t2 (b int, foreign key (b) references t1 (a) on update cascade) engine=InnoDB;
      insert into t2 values (1),(2),(3);
      --send
        set debug_sync= 'now wait_for downgraded';
       
      --connect (con_alter,localhost,root,,test)
      set debug_sync= 'alter_table_online_downgraded signal downgraded wait_for goforit';
      --send
        alter table t2 add c int, algorithm=copy, lock=none;
       
      --connection default
      --reap
      update t1 set a = 22 where a = 2;
      set debug_sync= 'now signal goforit';
       
      --connection con_alter
      --reap
      select * from t2;
       
      # Cleanup
      drop table t2, t1;
      set debug_sync= reset;
      

      Result on bb-10.10-MDEV-16329 49ad87590:

      connection default;
      update t1 set a = 22 where a = 2;
      set debug_sync= 'now signal goforit';
      connection con_alter;
      select * from t2;
      b	c
      1	NULL
      2	NULL
      3	NULL
      

      Expected 22 instead of 2, via the ON UPDATE CASCADE action. The row with b=2 is now an orphan.
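      For reference, this is the output the final SELECT is expected to produce once the cascaded update is correctly replayed into the altered table (a sketch based on the expectation stated above; row order may differ):

      select * from t2;
      b	c
      1	NULL
      22	NULL
      3	NULL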


          Activity

            marko Marko Mäkelä added a comment

            Thank you. Here is some quick feedback before studying the changes in more detail. This must not only pass my review, but also correctness testing by mleich and performance testing by axel, once we are that far.

            Unlike THD, InnoDB always knew about the internals of TABLE and TABLE_SHARE. So, please do not add sql_get_maria_table_record().

            Frequently invoking current_thd or innodb_find_table_for_vc(current_thd, table) is something that should be avoided due to potential performance regression. (In the purge of committed transaction history we do it, but that is due to bad design, which will hopefully be addressed by MDEV-17598.)

            Can we just ensure that table->vc_templ->mysql_table is always initialized when necessary? There is a mismatch here, because dict_table_t actually corresponds to TABLE_SHARE while row_prebuilt_t corresponds to TABLE. I believe that this dict_table_t::vc_templ is some dummy shared handle that is mainly used for metadata. I did not check how virtual column values are evaluated; hopefully no concurrent calls use the same TABLE::record buffer in table->vc_templ->mysql_table->record.

            In report_row_update(), there is a conversion of a record followed by computing the offsets to the start of each field, followed by converting to the TABLE::record format. Can we please just convert directly from dtuple_t to TABLE::record?

            Please test this also for versioned tables, for both UPDATE and DELETE.

            The Boolean data member upd_node_t::in_mysql_interface could be replaced with a 3-valued enum, using a third value instead of your proposed upd_node_t::sql_is_online_alter. The change to row_upd() cleanup needs to be explained in a source code comment or the commit message.

            nikitamalyavin Nikita Malyavin added a comment (edited)

            Unlike THD, InnoDB always knew about the internals of TABLE and TABLE_SHARE. So, please do not add sql_get_maria_table_record().

            Do you suggest just including "table.h"?

            Frequently invoking current_thd or innodb_find_table_for_vc(current_thd, table) is something that should be avoided due to potential performance regression

            Regarding current_thd, do you think thread_local variables can still be an issue?

            I think I can avoid the frequent innodb_find_table_for_vc() calls by moving the check to table open (ha_innobase::open), so the default execution path won't be affected.

            Can we just ensure that table->vc_templ->mysql_table is always initialized when necessary?

            It is done in innodb_find_table_for_vc:

            if (table->vc_templ && table->vc_templ->mysql_table_query_id == thd_get_query_id(thd)) {
            	return table->vc_templ->mysql_table;
            }
            

            So I guess it does not behave quite well when the changes are issued concurrently.
            Besides, consider this code further down the function:

            	mysql_table = find_fk_open_table(thd, db_buf, db_buf_len,
            					 tbl_buf, tbl_buf_len);
            	if (table->vc_templ) {
            		table->vc_templ->mysql_table = mysql_table;
            		table->vc_templ->mysql_table_query_id = thd_get_query_id(thd);
            	}
            

            It looks like cascade updates can simply go wrong when run in parallel. Working correctly would require an exclusive lock on vc_templ while calling; I can't see any, starting from row_ins_foreign_fill_virtual(). I think there can still be an issue.

            Overall, I don't like the vc_templ design, and wouldn't like to rely on it in such a touchy place. While vcol indexes may be rarely used, ONLINE ALTER TABLE is going to be the default within the release. I don't want any problems, and I don't want to check all the places where it is recreated and pray that it works on the first try. In contrast, the default ALTER TABLE should just work, no matter what, so I'd choose some potential performance regression (but only while ONLINE ALTER is in progress!) over instability.

            There is a mismatch here, because dict_table_t actually corresponds to TABLE_SHARE while row_prebuilt_t corresponds to TABLE

            If that's so, then maybe we can modify row_prebuilt_t and add a link to the referencing ones? Then we will just pass the correct prebuilt to row_ins_foreign_check_on_constraint and have no overhead!

            Please test this also for versioned tables, for both UPDATE and DELETE.

            Right, will do.

            The Boolean data member upd_node_t::in_mysql_interface could be replaced with a 3-valued enum, using a third value instead of your proposed upd_node_t::sql_is_online_alter.

            I was reasoning above about moving the check into ha_innobase::open. If that is done, the flag will live in dict_table_t. Hopefully, if ONLINE ALTER TABLE is in progress, then all the transactions using this table are affected, so it can be done there.

            In report_row_update(), there is a conversion of a record followed by computing the offsets to the start of each field, followed by converting to the TABLE::record format. Can we please just convert directly from dtuple_t to TABLE::record?

            Can you suggest a function for that?

            The change to row_upd() cleanup needs to be explained in a source code comment or the commit message

            Well, it's there...
            > upd_node_t::sql_is_online_alter is added to reuse cascade->row and cascade->upd_row.
            I will try to clarify it, or just add a comment in the code.


            marko Marko Mäkelä added a comment

            Yes, I would just #include "table.h" in the file that needs it, or move the code to a file that already knows about TABLE. Having a non-zero-overhead abstraction layer between the SQL layer and the rest of InnoDB might have been understandable back when InnoDB was written in C and the rest of the code in C++. We have no plans to link the InnoDB core with any other code, so I would not mind having more dependencies at the low level. We already have it for things like LF_BACKOFF().

            Nice that you agree that table->vc_templ is questionable and potentially incorrect. I think that it needs to be fixed in any case. It would be good to use a solid foundation (and common code base) for everything that is related to InnoDB foreign keys (until that code is replaced in MDEV-22361). I think that when ON UPDATE…CASCADE or ON UPDATE…SET NULL affects a base column, InnoDB must evaluate the dependent virtual columns that the FOREIGN KEY constraint is attached to. If you think that there is a correctness issue with that, please file a bug for it and mark it as blocking this one.
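            (For illustration only, a minimal SQL sketch of the scenario described above, as I read it: the child's foreign key column is also a base column of an indexed virtual column, so a cascaded update has to re-evaluate that virtual column. The table and column names are made up for this example.)

            create table parent (a int primary key) engine=InnoDB;
            create table child (
              b int,
              v int as (b * 2) virtual,
              key (v),
              foreign key (b) references parent (a) on update cascade
            ) engine=InnoDB;
            insert into parent values (1);
            insert into child (b) values (1);
            # The cascade rewrites child.b, so InnoDB must recompute v to keep the index on v consistent.
            update parent set a = 10 where a = 1;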

            For converting dtuple_t to TABLE::record, there exists innobase_row_to_mysql(), which is used in the InnoDB native online ALTER TABLE for reporting duplicate key values caused by concurrent DML.

            nikitamalyavin Nikita Malyavin added a comment (edited)

            I think that when ON UPDATE…CASCADE or ON UPDATE…SET NULL affects a base column, InnoDB must evaluate the dependent virtual columns that the FOREIGN KEY constraint is attached to.

            If there are any; if not, vc_templ is not created. I would have to alter the creation rule, with all the consequences. Besides, any parallel update will rewrite vc_templ->mysql_table (or, as I suspect, another thread may access an incorrect one).

            Marko, we can go another way and just add links to the other, referencing prebuilts. Then virtual columns may use them too! That will be an improvement anyway, since vc_templ->mysql_table is only valid if it was set up by the same thread; if not, another search is done.
            Using the prebuilt will avoid any concurrency issues in this area.


            nikitamalyavin Nikita Malyavin added a comment

            Well, that was easy (MDEV-29181). Besides, I found another bug while I was setting up the test (MDEV-29182).
            And why does it compare the query id? It seems that the TABLE instance can change over time, so the stored one can become invalid. We are still lucky to be able to access it, because it stays allocated, but I believe it can also break, with some cache resize, for example.

            So just dropping the field completely should be the correct solution, and moving to prebuilts is even more performant. You are free to choose whether we do the initialization on demand or on open (I like open more).


            nikitamalyavin Nikita Malyavin added a comment

            Please review branch bb-10.10-MDEV-29181

            serg Sergei Golubchik added a comment

            https://github.com/MariaDB/server/commit/4b758fdc735b0494ed37c02b56c9c9635e96cebb is ok to push

            marko Marko Mäkelä added a comment

            I think that the commit message (as well as this ticket) needs to prominently document the limitation that was introduced. As far as I understand, the approved fix is to disable online ALTER TABLE when FOREIGN KEY constraints include ON UPDATE or ON DELETE clauses with CASCADE or SET NULL actions.

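            (For illustration only, a sketch of how the documented limitation would be expected to surface; the exact behaviour and error text are assumptions, not quoted from the patch.)

            create table t1 (a int primary key) engine=InnoDB;
            create table t2 (b int, foreign key (b) references t1 (a) on update cascade) engine=InnoDB;
            # Expected to be refused once online ALTER is disabled for CASCADE/SET NULL foreign keys:
            alter table t2 add c int, algorithm=copy, lock=none;
            # Without an explicit LOCK=NONE the same ALTER should still succeed, just not online:
            alter table t2 add c int, algorithm=copy;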

            People

              nikitamalyavin Nikita Malyavin
              elenst Elena Stepanova