MDEV-25910 (MariaDB Server)

Incorrect crash recovery of ALTER TABLE...ALGORITHM=COPY

      Description

      The attached copy of a data directory, data_copy.tar.bz2, is from a run where, according to rr replay, the server was executing ALTER TABLE when it was killed:

      10.6 6fbf978eec4506eb46737ac4da00ea04403ae855

      #11 0x000055b80bc9831d in ha_innobase::write_row (this=0x61d000a582b8, record=0x61a0004e4eb8 "\b\376\t\001") at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:7663
      #12 0x000055b80b3a7320 in handler::ha_write_row (this=0x61d000a582b8, buf=0x61a0004e4eb8 "\b\376\t\001") at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:7240
      #13 0x000055b80ae631e4 in copy_data_between_tables (thd=0x62b0000cb218, from=0x61900042fe98, to=0x619000178998, create=..., ignore=false, order_num=0, order=0x0, copied=0x371c790da790, deleted=0x371c790da7b0, 
          keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x371c790dc120) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:11090
      #14 0x000055b80ae5d746 in mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28, new_name=0x62b0000d0040, create_info=0x371c790dd5d0, table_list=0x62b0000d23c8, alter_info=0x371c790dd4a0, order_num=0, 
          order=0x0, ignore=false, if_exists=false) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10353
      #15 0x000055b80afe8a77 in Sql_cmd_alter_table::execute (this=0x62b0000d2bf0, thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/sql_alter.cc:550
      #16 0x000055b80abf14a0 in mysql_execute_command (thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:5983
      #17 0x000055b80abfd96d in mysql_parse (thd=0x62b0000cb218, rawbuf=0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT  /* E_R Thread1 QNO 5315 CON_ID 59 */", length=91, 
          parser_state=0x371c790deb20) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:8016
      

      The server was killed in the middle of this row copy, and after restart any access to the table t3 runs into trouble.

      ssh pluto
      rr replay /data/Results/1623345023/TBR-793/dev/shm/vardir/1623345023/59/1/rr/mysqld-2
      

      The restarted server would crash while accessing the table t3, because of an invalid BLOB pointer that was left behind by the SIGKILL.
      It should be sufficient to start up the server on the copy of the data directory:

      sql/mariadbd --datadir /dev/shm/data_copy
      

      CHECK TABLE t3 will report a number of row count mismatches to the client.

      The correct recovery would have been to drop the incomplete #sql-alter- table. What happened instead is that the valid table t3 was dropped and the invalid table was renamed to t3.
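
      To make the expected behaviour concrete, here is a minimal toy model of the recovery decision, not MariaDB's actual DDL recovery code. All names in it (AlterCopyState, copy_committed, recover, and the placeholder table name "#sql-alter-x") are hypothetical and exist only for illustration. It assumes recovery can tell whether the row copy into the intermediate table was fully committed before the kill.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Set


class Action(Enum):
    DROP_INTERMEDIATE = auto()  # roll back: the original table stays in place
    FINISH_RENAME = auto()      # roll forward: the new table replaces the original


@dataclass
class AlterCopyState:
    """Hypothetical snapshot of an ALTER TABLE...ALGORITHM=COPY at kill time."""
    original: str         # e.g. "t3"
    intermediate: str     # placeholder for the "#sql-alter-" table
    copy_committed: bool  # assumed flag: True only if all rows were copied and committed


def recover(state: AlterCopyState, tables: Set[str]) -> Action:
    """Apply the recovery decision to a set of table names and report it."""
    if not state.copy_committed:
        # The copy was interrupted, so the intermediate table is incomplete.
        # Drop it and leave the valid original untouched.
        tables.discard(state.intermediate)
        return Action.DROP_INTERMEDIATE
    # Only a fully committed copy may replace the original table.
    tables.discard(state.original)
    tables.discard(state.intermediate)
    tables.add(state.original)  # the intermediate table takes over the name
    return Action.FINISH_RENAME


if __name__ == "__main__":
    tables = {"t3", "#sql-alter-x"}
    state = AlterCopyState("t3", "#sql-alter-x", copy_committed=False)
    print(recover(state, tables), sorted(tables))
```

      In this model, the situation in this report corresponds to copy_committed=False, where the only safe outcome is DROP_INTERMEDIATE. The reported bug behaves as if the FINISH_RENAME branch had been taken for an incomplete copy.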

              People

              Assignee:
              marko Marko Mäkelä
              Reporter:
              marko Marko Mäkelä