MariaDB Server / MDEV-25910

Incorrect crash recovery of ALTER TABLE...ALGORITHM=COPY




      The attached copy of a data directory, data_copy.tar.bz2, is from a run where, according to rr replay, the server was executing ALTER TABLE when it was killed:

      10.6 6fbf978eec4506eb46737ac4da00ea04403ae855

      #11 0x000055b80bc9831d in ha_innobase::write_row (this=0x61d000a582b8, record=0x61a0004e4eb8 "\b\376\t\001") at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:7663
      #12 0x000055b80b3a7320 in handler::ha_write_row (this=0x61d000a582b8, buf=0x61a0004e4eb8 "\b\376\t\001") at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:7240
      #13 0x000055b80ae631e4 in copy_data_between_tables (thd=0x62b0000cb218, from=0x61900042fe98, to=0x619000178998, create=..., ignore=false, order_num=0, order=0x0, copied=0x371c790da790, deleted=0x371c790da7b0, 
          keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x371c790dc120) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:11090
      #14 0x000055b80ae5d746 in mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28, new_name=0x62b0000d0040, create_info=0x371c790dd5d0, table_list=0x62b0000d23c8, alter_info=0x371c790dd4a0, order_num=0, 
          order=0x0, ignore=false, if_exists=false) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10353
      #15 0x000055b80afe8a77 in Sql_cmd_alter_table::execute (this=0x62b0000d2bf0, thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/sql_alter.cc:550
      #16 0x000055b80abf14a0 in mysql_execute_command (thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:5983
      #17 0x000055b80abfd96d in mysql_parse (thd=0x62b0000cb218, rawbuf=0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT  /* E_R Thread1 QNO 5315 CON_ID 59 */", length=91, 
          parser_state=0x371c790deb20) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:8016

      The server was killed in the middle of this row copy, and subsequent access to the table t3 fails.

      ssh pluto
      rr replay /data/Results/1623345023/TBR-793/dev/shm/vardir/1623345023/59/1/rr/mysqld-2

      This would crash while accessing the table t3 because of an invalid BLOB pointer that was left behind by the SIGKILL.
      To reproduce, it should be sufficient to start the server on the copy of the data directory:

      sql/mariadbd --datadir /dev/shm/data_copy

      CHECK TABLE t3 will report a number of row count mismatches to the client.

      The correct recovery would have been to drop the incomplete #sql-alter- table. Instead, the valid table t3 was dropped and the invalid table was renamed to t3.
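The expected recovery rule can be illustrated with a toy model. This is only a sketch of the intended outcome, not MariaDB's recovery code: the function `recover_interrupted_copy_alter`, the flag `copy_completed`, and the concrete intermediate name `#sql-alter-1-2` are all hypothetical and chosen for illustration. Tables are modelled as a plain dictionary from table name to a description of the contents.

```python
# Toy model of the expected outcome; none of these names exist in MariaDB.

def recover_interrupted_copy_alter(tables, original, intermediate, copy_completed):
    """Return the set of tables that should exist after crash recovery.

    tables         -- dict mapping table name to a description of its contents
    original       -- name of the table being altered (t3 in this report)
    intermediate   -- name of the #sql-alter- copy (hypothetical concrete name)
    copy_completed -- assumption: True only if the row copy had fully finished
    """
    recovered = dict(tables)  # do not modify the caller's dictionary
    if intermediate not in recovered:
        return recovered  # nothing was left behind, nothing to do
    if copy_completed:
        # The copy is complete, so it may replace the original table.
        recovered[original] = recovered.pop(intermediate)
    else:
        # The copy is incomplete: drop it and keep the original untouched.
        del recovered[intermediate]
    return recovered


before = {"t3": "valid rows", "#sql-alter-1-2": "partially copied rows"}
after = recover_interrupted_copy_alter(
    before, "t3", "#sql-alter-1-2", copy_completed=False
)
print(after)  # {'t3': 'valid rows'}
```

In terms of this model, the behavior reported here corresponds to taking the `copy_completed` branch even though the copy had been interrupted, so that t3 ends up with the partially copied rows.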





              marko Marko Mäkelä


