Details
-
Bug
-
Status: Closed (View Workflow)
-
Blocker
-
Resolution: Fixed
-
10.6.1
Description
The attached copy of a data directory data_copy.tar.bz2 is from a run where according to rr replay the server had been executing ALTER TABLE when it was killed:
10.6 6fbf978eec4506eb46737ac4da00ea04403ae855 |
#11 0x000055b80bc9831d in ha_innobase::write_row (this=0x61d000a582b8, record=0x61a0004e4eb8 "\b\376\t\001") at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:7663
|
#12 0x000055b80b3a7320 in handler::ha_write_row (this=0x61d000a582b8, buf=0x61a0004e4eb8 "\b\376\t\001") at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:7240
|
#13 0x000055b80ae631e4 in copy_data_between_tables (thd=0x62b0000cb218, from=0x61900042fe98, to=0x619000178998, create=..., ignore=false, order_num=0, order=0x0, copied=0x371c790da790, deleted=0x371c790da7b0,
|
keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x371c790dc120) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:11090
|
#14 0x000055b80ae5d746 in mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28, new_name=0x62b0000d0040, create_info=0x371c790dd5d0, table_list=0x62b0000d23c8, alter_info=0x371c790dd4a0, order_num=0,
|
order=0x0, ignore=false, if_exists=false) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10353
|
#15 0x000055b80afe8a77 in Sql_cmd_alter_table::execute (this=0x62b0000d2bf0, thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/sql_alter.cc:550
|
#16 0x000055b80abf14a0 in mysql_execute_command (thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:5983
|
#17 0x000055b80abfd96d in mysql_parse (thd=0x62b0000cb218, rawbuf=0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */", length=91,
|
parser_state=0x371c790deb20) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:8016
|
The server was killed in the middle of this, and on accessing the table t3 we would get into trouble.
ssh pluto
|
rr replay /data/Results/1623345023/TBR-793/dev/shm/vardir/1623345023/59/1/rr/mysqld-2
|
This would crash while accessing the table t3 due to an invalid BLOB pointer that had been left behind due to the SIGKILL.
It should be sufficient to start up the server on the copy of the data directory:
sql/mariadbd --datadir /dev/shm/data_copy
|
CHECK TABLE t3 will report a number of row count mismatch to the client.
The correct recovery would have been to drop the incomplete #sql-alter- table. What happens here instead is that the valid table t3 was dropped and the invalid table was renamed to t3.
Attachments
- data_copy.tar.bz2
- 2.64 MB
Issue Links
- causes
-
MDEV-35813 Performance regression in INSERT…SELECT due to unnecessarily making InnoDB log durable
-
- Closed
-
- duplicates
-
MDEV-25751 Assertion `!rec_offs_any_null_extern(rec, offsets)' in row_merge_read_clustered_index
-
- Closed
-
- relates to
-
MDEV-25180 Atomic ALTER TABLE
-
- Closed
-
Activity
monty, I extracted some more data for you:
ssh pluto
|
rr replay /data/Results/1623345023/TBR-793/dev/shm/vardir/1623345023/59/1/rr/mysqld-1
|
Thread 13 hit Breakpoint 5, mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28,
|
new_name=0x62b0000d0040, create_info=0x371c790dd5d0, table_list=
|
0x55b80c2d845e <dict_index_set_merge_threshold(dict_index_t*, unsigned long)+1412>,
|
alter_info=0x371c790dd4a0, order_num=0, order=0x0, ignore=false, if_exists=false)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:9284
|
9284 TABLE_LIST *table_list,
|
1: thd.query_string = {string = {
|
str = 0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */", length = 91}, cs = 0x55b80e6487c0 <my_charset_latin1>}
|
(rr) when
|
Current event: 914137
|
(rr) enable
|
(rr) continue
|
Continuing.
|
|
Thread 13 hit Breakpoint 3, write_ddl_log_file_entry (entry_pos=21944)
|
at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:307
|
307 {
|
(rr) bt
|
#0 write_ddl_log_file_entry (entry_pos=21944) at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:307
|
#1 0x000055b80aeb241e in ddl_log_write_entry (ddl_log_entry=0x371c790da490,
|
active_entry=0x371c790da390) at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:2479
|
#2 0x000055b80aeb41a5 in ddl_log_write (ddl_state=0x371c790da970, ddl_log_entry=0x371c790da490)
|
at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:3014
|
#3 0x000055b80aeb73eb in ddl_log_alter_table (thd=0x62b0000cb218, ddl_state=0x371c790da970,
|
org_hton=0x615000002118, db=0x371c790dc130, table=0x371c790dc140, new_hton=0x615000002118,
|
partition_underlying_hton=0x615000002118, new_db=0x371c790dc170, new_table=0x371c790dc1a0,
|
frm_path=0x371c790da8b0, backup_name=0x371c790da8d0, version=0x371c790dd5d8, table_version=5918,
|
is_renamed=false) at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:3416
|
#4 0x000055b80ae5b3d5 in mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28,
|
new_name=0x62b0000d0040, create_info=0x371c790dd5d0, table_list=0x62b0000d23c8,
|
alter_info=0x371c790dd4a0, order_num=0, order=0x0, ignore=false, if_exists=false)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10021
|
#5 0x000055b80afe8a77 in Sql_cmd_alter_table::execute (this=0x62b0000d2bf0, thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_alter.cc:550
|
#6 0x000055b80abf14a0 in mysql_execute_command (thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:5983
|
#7 0x000055b80abfd96d in mysql_parse (thd=0x62b0000cb218,
|
rawbuf=0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */", length=91, parser_state=0x371c790deb20)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:8016
|
#8 0x000055b80abd5cb0 in dispatch_command (command=COM_QUERY, thd=0x62b0000cb218,
|
packet=0x62900121b219 " ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */ ", packet_length=93, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1897
|
#9 0x000055b80abd3091 in do_command (thd=0x62b0000cb218, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1406
|
#10 0x000055b80afcef54 in do_handle_one_connection (connect=0x608000035138, put_in_cache=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1410
|
#11 0x000055b80afce8b6 in handle_one_connection (arg=0x608000002cb8)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1312
|
#12 0x00002de67550c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#13 0x00002f6239149293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
(rr) when
|
Current event: 914137
|
(rr) c
|
Continuing.
|
|
Thread 13 hit Breakpoint 3, write_ddl_log_file_entry (entry_pos=14108)
|
at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:307
|
307 {
|
(rr) bt
|
#0 write_ddl_log_file_entry (entry_pos=14108) at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:307
|
#1 0x000055b80aeb2724 in ddl_log_write_execute_entry (first_entry=1, active_entry=0x371c790da978)
|
at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:2540
|
#2 0x000055b80aeb4215 in ddl_log_write (ddl_state=0x371c790da970, ddl_log_entry=0x371c790da490)
|
at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:3015
|
#3 0x000055b80aeb73eb in ddl_log_alter_table (thd=0x62b0000cb218, ddl_state=0x371c790da970,
|
org_hton=0x615000002118, db=0x371c790dc130, table=0x371c790dc140, new_hton=0x615000002118,
|
partition_underlying_hton=0x615000002118, new_db=0x371c790dc170, new_table=0x371c790dc1a0,
|
frm_path=0x371c790da8b0, backup_name=0x371c790da8d0, version=0x371c790dd5d8, table_version=5918,
|
is_renamed=false) at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:3416
|
#4 0x000055b80ae5b3d5 in mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28,
|
new_name=0x62b0000d0040, create_info=0x371c790dd5d0, table_list=0x62b0000d23c8,
|
alter_info=0x371c790dd4a0, order_num=0, order=0x0, ignore=false, if_exists=false)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10021
|
#5 0x000055b80afe8a77 in Sql_cmd_alter_table::execute (this=0x62b0000d2bf0, thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_alter.cc:550
|
#6 0x000055b80abf14a0 in mysql_execute_command (thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:5983
|
#7 0x000055b80abfd96d in mysql_parse (thd=0x62b0000cb218,
|
rawbuf=0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */", length=91, parser_state=0x371c790deb20)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:8016
|
#8 0x000055b80abd5cb0 in dispatch_command (command=COM_QUERY, thd=0x62b0000cb218,
|
packet=0x62900121b219 " ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */ ", packet_length=93, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1897
|
#9 0x000055b80abd3091 in do_command (thd=0x62b0000cb218, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1406
|
#10 0x000055b80afcef54 in do_handle_one_connection (connect=0x608000035138, put_in_cache=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1410
|
#11 0x000055b80afce8b6 in handle_one_connection (arg=0x608000002cb8)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1312
|
#12 0x00002de67550c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#13 0x00002f6239149293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
(rr) c
|
Continuing.
|
|
Thread 13 hit Breakpoint 3, write_ddl_log_file_entry (entry_pos=25104)
|
at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:307
|
307 {
|
(rr) bt
|
#0 write_ddl_log_file_entry (entry_pos=25104) at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:307
|
#1 0x000055b80aeb7c4e in ddl_log_store_query (thd=0x62b0000cb218, ddl_state=0x371c790da970,
|
query=0x62b0000d2293 "", length=0) at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:3488
|
#2 0x000055b80ae5b41a in mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28,
|
new_name=0x62b0000d0040, create_info=0x371c790dd5d0, table_list=0x62b0000d23c8,
|
alter_info=0x371c790dd4a0, order_num=0, order=0x0, ignore=false, if_exists=false)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10032
|
#3 0x000055b80afe8a77 in Sql_cmd_alter_table::execute (this=0x62b0000d2bf0, thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_alter.cc:550
|
#4 0x000055b80abf14a0 in mysql_execute_command (thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:5983
|
#5 0x000055b80abfd96d in mysql_parse (thd=0x62b0000cb218,
|
rawbuf=0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */", length=91, parser_state=0x371c790deb20)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:8016
|
#6 0x000055b80abd5cb0 in dispatch_command (command=COM_QUERY, thd=0x62b0000cb218,
|
packet=0x62900121b219 " ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */ ", packet_length=93, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1897
|
#7 0x000055b80abd3091 in do_command (thd=0x62b0000cb218, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1406
|
#8 0x000055b80afcef54 in do_handle_one_connection (connect=0x608000035138, put_in_cache=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1410
|
#9 0x000055b80afce8b6 in handle_one_connection (arg=0x608000002cb8)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1312
|
#10 0x00002de67550c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#11 0x00002f6239149293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
(rr) c
|
Continuing.
|
|
Thread 13 hit Breakpoint 3, write_ddl_log_file_entry (entry_pos=14108)
|
at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:307
|
307 {
|
(rr) bt
|
#0 write_ddl_log_file_entry (entry_pos=14108) at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:307
|
#1 0x000055b80aeb2724 in ddl_log_write_execute_entry (first_entry=4, active_entry=0x371c790da978)
|
at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:2540
|
#2 0x000055b80aeb7d9a in ddl_log_store_query (thd=0x62b0000cb218, ddl_state=0x371c790da970,
|
query=0x62b0000d2293 "", length=0) at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:3496
|
#3 0x000055b80ae5b41a in mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28,
|
new_name=0x62b0000d0040, create_info=0x371c790dd5d0, table_list=0x62b0000d23c8,
|
alter_info=0x371c790dd4a0, order_num=0, order=0x0, ignore=false, if_exists=false)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10032
|
#4 0x000055b80afe8a77 in Sql_cmd_alter_table::execute (this=0x62b0000d2bf0, thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_alter.cc:550
|
#5 0x000055b80abf14a0 in mysql_execute_command (thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:5983
|
#6 0x000055b80abfd96d in mysql_parse (thd=0x62b0000cb218,
|
rawbuf=0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */", length=91, parser_state=0x371c790deb20)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:8016
|
#7 0x000055b80abd5cb0 in dispatch_command (command=COM_QUERY, thd=0x62b0000cb218,
|
packet=0x62900121b219 " ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */ ", packet_length=93, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1897
|
#8 0x000055b80abd3091 in do_command (thd=0x62b0000cb218, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1406
|
#9 0x000055b80afcef54 in do_handle_one_connection (connect=0x608000035138, put_in_cache=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1410
|
#10 0x000055b80afce8b6 in handle_one_connection (arg=0x608000002cb8)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1312
|
#11 0x00002de67550c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#12 0x00002f6239149293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
(rr) c
|
Continuing.
|
|
Thread 13 hit Breakpoint 2, ha_innobase::create (this=0x61d000116cb8,
|
name=0x371c790dcb5e "./test/#sql-alter-3116aa-3b", form=0x371c790d9900, create_info=0x371c790dd5d0)
|
at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:13051
|
13051 return create(name, form, create_info, srv_file_per_table);
|
(rr) bt
|
#0 ha_innobase::create (this=0x61d000116cb8, name=0x371c790dcb5e "./test/#sql-alter-3116aa-3b",
|
form=0x371c790d9900, create_info=0x371c790dd5d0)
|
at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:13051
|
#1 0x000055b80b397c50 in handler::ha_create (this=0x61d000116cb8,
|
name=0x371c790dcb5e "./test/#sql-alter-3116aa-3b", form=0x371c790d9900, info_arg=0x371c790dd5d0)
|
at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:5145
|
#2 0x000055b80b39bec7 in ha_create_table (thd=0x62b0000cb218,
|
path=0x371c790dcb5e "./test/#sql-alter-3116aa-3b", db=0x62b0000d2ab0 "test",
|
table_name=0x62b0000d2378 "t3", create_info=0x371c790dd5d0, frm=0x371c790da810, skip_frm_file=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:5610
|
#3 0x000055b80ae5cccf in mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28,
|
new_name=0x62b0000d0040, create_info=0x371c790dd5d0, table_list=0x62b0000d23c8,
|
alter_info=0x371c790dd4a0, order_num=0, order=0x0, ignore=false, if_exists=false)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10268
|
#4 0x000055b80afe8a77 in Sql_cmd_alter_table::execute (this=0x62b0000d2bf0, thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_alter.cc:550
|
#5 0x000055b80abf14a0 in mysql_execute_command (thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:5983
|
#6 0x000055b80abfd96d in mysql_parse (thd=0x62b0000cb218,
|
rawbuf=0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */", length=91, parser_state=0x371c790deb20)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:8016
|
#7 0x000055b80abd5cb0 in dispatch_command (command=COM_QUERY, thd=0x62b0000cb218,
|
packet=0x62900121b219 " ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */ ", packet_length=93, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1897
|
#8 0x000055b80abd3091 in do_command (thd=0x62b0000cb218, blocking=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1406
|
#9 0x000055b80afcef54 in do_handle_one_connection (connect=0x608000035138, put_in_cache=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1410
|
#10 0x000055b80afce8b6 in handle_one_connection (arg=0x608000002cb8)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1312
|
#11 0x00002de67550c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#12 0x00002f6239149293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
(rr) when
|
Current event: 914167
|
(rr) c
|
Continuing.
|
|
Thread 21 received signal SIGKILL, Killed.
|
[Switching to Thread 3217066.3229113]
|
0x0000000070000002 in ?? ()
|
(rr) when
|
Current event: 918191
|
I am not familiar with the DDL logging code or the file format, so I do not know how to dump more relevant information for you.
Hopefully the event counters (reported by when) will be useful when you access the trace. You can let rr replay to start at the interesting point by invoking run 914137. Note that the parameters shown for write_ddl_log_file_entry() are likely garbage, due to the way how the binary was built. (I should have created the breakpoint some steps after the function preamble.)
The relevant entry to watch is the call to:
ddl_log_update_phase(ddl_log_state, DDL_ALTER_TABLE_PHASE_INPLACE_COPIED).
This is the only way that I know of that can generate the ddl_recovery.log file you gave me.
I run the program under rr until it ended with SIGKILL
Then I did
break
ddl_log_update_phase
reverse-cont
Thread 4 hit Breakpoint 1, ddl_log_update_phase (state=0x55b80cb890e0, phase=176 '\260') at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:2931
|
|
(rr) bt
|
#0 ddl_log_update_phase (state=0x55b80cb890e0, phase=176 '\260') at /data/Server/bb-10.6-MDEV-25062/sql/ddl_log.cc:2931
|
#1 0x000055b80ae5e372 in mysql_alter_table (thd=0x62b0000cb218, new_db=0x62b0000cfc28, new_name=0x62b0000d0040, create_info=0x371c790dd5d0,
|
table_list=0x62b0000d23c8, alter_info=0x371c790dd4a0, order_num=0, order=0x0, ignore=false, if_exists=false)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10493
|
#2 0x000055b80afe8a77 in Sql_cmd_alter_table::execute (this=0x62b0000d2bf0, thd=0x62b0000cb218)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_alter.cc:550
|
#3 0x000055b80abf14a0 in mysql_execute_command (thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:5983
|
#4 0x000055b80abfd96d in mysql_parse (thd=0x62b0000cb218,
|
rawbuf=0x62b0000d2238 "ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */", length=91,
|
parser_state=0x371c790deb20) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:8016
|
#5 0x000055b80abd5cb0 in dispatch_command (command=COM_QUERY, thd=0x62b0000cb218,
|
packet=0x62900121b219 " ALTER TABLE t3 ADD COLUMN IF NOT EXISTS col2_copy INT /* E_R Thread1 QNO 5315 CON_ID 59 */ ", packet_length=93,
|
blocking=true) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1897
|
#6 0x000055b80abd3091 in do_command (thd=0x62b0000cb218, blocking=true) at /data/Server/bb-10.6-MDEV-25062/sql/sql_parse.cc:1406
|
#7 0x000055b80afcef54 in do_handle_one_connection (connect=0x608000035138, put_in_cache=true)
|
at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1410
|
#8 0x000055b80afce8b6 in handle_one_connection (arg=0x608000002cb8) at /data/Server/bb-10.6-MDEV-25062/sql/sql_connect.cc:1312
|
#9 0x00002de67550c609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#10 0x00002f6239149293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
{str = 0x371c790dce50 "#sql-backup-3116aa-3b", length = 21}
This is the right ALTER and the right backup file.
This is after copy_between_tables !
Now trying to run forward to see if there is another alter:
break mysql_alter_table
cont
Thread 3 received signal SIGKILL, Killed.
|
[Switching to Thread 3217066.3229113]
|
0x0000000070000002 in ?? ()
|
No other alter table run after the above back trace.
It is correct of ddl_recovery to use the new table as the table was created,
all data was copied and the copy was committed.
I tried to check what the server was doing when it was killed, but could not
find it (thread apply all bt only showed one thread).
How to show that it was killed in copy_data_between_tables() ?
Side note: A little earlier, there was an asynchronous commit of a CREATE transaction. I think that we would want to wait for it to be persisted. wlad, can you check when we would wait for that write to go through? This was changed by you in MDEV-24341.
#0 log_write_up_to (lsn=22999253, flush_to_disk=true, rotate_key=false, callback=0x371c790d86e0) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/log/log0log.cc:792
|
#1 0x000055b80c15458a in trx_flush_log_if_needed_low (lsn=22999253, trx_state=TRX_STATE_COMMITTED_IN_MEMORY) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1179
|
#2 0x000055b80c154671 in trx_flush_log_if_needed (lsn=22999253, trx=0x653c2bb34c88) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1200
|
#3 0x000055b80c15e3df in trx_t::commit_in_memory (this=0x653c2bb34c88, mtr=0x371c790d8ae0) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1394
|
#4 0x000055b80c15562d in trx_t::commit_low (this=0x653c2bb34c88, mtr=0x371c790d8ae0) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1490
|
#5 0x000055b80c155763 in trx_t::commit_persist (this=0x653c2bb34c88) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1504
|
#6 0x000055b80c339125 in trx_t::commit (this=0x653c2bb34c88, deleted=std::vector of length 0, capacity 0) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/dict/drop.cc:229
|
#7 0x000055b80bcea8de in ha_innobase::create (this=0x61d000116cb8, name=0x371c790dcb5e "./test/#sql-alter-3116aa-3b", form=0x371c790d9900, create_info=0x371c790dd5d0, file_per_table=true, trx=0x653c2bb34c88)
|
at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:13027
|
monty, the last persistent change in InnoDB was well before the DDL log call that you identified. For the critical transaction, InnoDB might be explicitly forbidden to persist the change here:
0x000055b80bc85d34 in innobase_commit_ordered_2 (trx=0x653c2bb33b68, thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:4236
|
4236 trx->flush_log_later = true;
|
(rr) bt
|
#0 0x000055b80bc85d34 in innobase_commit_ordered_2 (trx=0x653c2bb33b68, thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:4236
|
#1 0x000055b80bc86483 in innobase_commit (hton=0x615000002118, thd=0x62b0000cb218, commit_trx=true) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:4351
|
#2 0x000055b80b382bf6 in commit_one_phase_2 (thd=0x62b0000cb218, all=true, trans=0x62b0000ceb60, is_real_trans=true) at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:1998
|
#3 0x000055b80b382ad1 in ha_commit_one_phase (thd=0x62b0000cb218, all=true) at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:1977
|
#4 0x000055b80b381109 in ha_commit_trans (thd=0x62b0000cb218, all=true) at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:1771
|
#5 0x000055b80b00a443 in trans_commit_implicit (thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/transaction.cc:329
|
#6 0x000055b80b398645 in ha_enable_transaction (thd=0x62b0000cb218, on=true) at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:5263
|
#7 0x000055b80ae60bbb in mysql_trans_commit_alter_copy_data (thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10818
|
#8 0x000055b80ae63e8e in copy_data_between_tables (thd=0x62b0000cb218, from=0x61900042fe98, to=0x619000178998, create=..., ignore=false, order_num=0, order=0x0, copied=0x371c790da790, deleted=0x371c790da7b0,
|
keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x371c790dc120) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:11184
|
Not only that: It turns out that the transaction that is being committed had not written any undo log at all!
#0 trx_t::commit_in_memory (this=0x653c2bb33b68, mtr=0x0) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1262
|
#1 0x000055b80c15562d in trx_t::commit_low (this=0x653c2bb33b68, mtr=0x0) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1490
|
#2 0x000055b80c155763 in trx_t::commit_persist (this=0x653c2bb33b68) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1504
|
#3 0x000055b80c1558bc in trx_t::commit (this=0x653c2bb33b68) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1510
|
#4 0x000055b80c1560ce in trx_commit_for_mysql (trx=0x653c2bb33b68) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/trx/trx0trx.cc:1626
|
#5 0x000055b80bc858e1 in innobase_commit_low (trx=0x653c2bb33b68) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:4141
|
#6 0x000055b80bc85d98 in innobase_commit_ordered_2 (trx=0x653c2bb33b68, thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:4247
|
#7 0x000055b80bc86483 in innobase_commit (hton=0x615000002118, thd=0x62b0000cb218, commit_trx=true) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:4351
|
#8 0x000055b80b382bf6 in commit_one_phase_2 (thd=0x62b0000cb218, all=true, trans=0x62b0000ceb60, is_real_trans=true) at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:1998
|
#9 0x000055b80b382ad1 in ha_commit_one_phase (thd=0x62b0000cb218, all=true) at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:1977
|
#10 0x000055b80b381109 in ha_commit_trans (thd=0x62b0000cb218, all=true) at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:1771
|
#11 0x000055b80b00a443 in trans_commit_implicit (thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/transaction.cc:329
|
#12 0x000055b80b398645 in ha_enable_transaction (thd=0x62b0000cb218, on=true) at /data/Server/bb-10.6-MDEV-25062/sql/handler.cc:5263
|
#13 0x000055b80ae60bbb in mysql_trans_commit_alter_copy_data (thd=0x62b0000cb218) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:10818
|
#14 0x000055b80ae63e8e in copy_data_between_tables (thd=0x62b0000cb218, from=0x61900042fe98, to=0x619000178998, create=..., ignore=false, order_num=0, order=0x0, copied=0x371c790da790, deleted=0x371c790da7b0,
|
keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x371c790dc120) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:11184
|
That is why we did not even initiate a log write.
The reason why no undo log was written must be the performance optimization in MDEV-11415 for ALGORITHM=COPY.
The easiest fix should be to persist the log a little earlier:
#0 ha_innobase::extra (this=0x61d000a582b8, operation=HA_EXTRA_END_ALTER_COPY) at /data/Server/bb-10.6-MDEV-25062/storage/innobase/handler/ha_innodb.cc:15242
|
#1 0x000055b80ae63e62 in copy_data_between_tables (thd=0x62b0000cb218, from=0x61900042fe98, to=0x619000178998, create=..., ignore=false, order_num=0, order=0x0, copied=0x371c790da790, deleted=0x371c790da7b0,
|
keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x371c790dc120) at /data/Server/bb-10.6-MDEV-25062/sql/sql_table.cc:11179
|
mleich, can you try the following patch:
diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
|
index 2c0a5f8f114..5dabab139b1 100644
|
--- a/storage/innobase/handler/ha_innodb.cc
|
+++ b/storage/innobase/handler/ha_innodb.cc
|
@@ -15308,6 +15308,7 @@ ha_innobase::extra(
|
break;
|
case HA_EXTRA_END_ALTER_COPY:
|
m_prebuilt->table->skip_alter_undo = 0;
|
+ log_write_up_to(LSN_MAX, true);
|
break;
|
default:/* Do nothing */
|
; |
Note: This should fix ALGORITHM=COPY only. Based on my debugging above, it is not clear to me whether we have necessary log writes for other cases. Especially if innodb_flush_log_at_trx_commit=0, I suspect that we may be missing necessary log writes.
marko, async operations such as this group commit are waited for before we send anything to the client, to keep the promise that we do not reply OK to the client before the data is persistent.
Thus, the wait, if any, is inside net_real_write(). It is thd->async_state.wait_for_pending_ops() there. unless you use threadpool, which adds even more async, this would be called at the end of the dispatch_command.
But You would notice that XA (iirc, this includes binlog) is not asynchronous. You can move thd->async_state.wait_for_pending_ops() to another place if you have something XA-like, and want transaction to become persistent fr example , before writing to ddl log, if desired. Or you can disable async in log_write_up_to, if you can detect DDL earlier.
wlad, thank you for clarifying it.
I decided that the easiest way to ensure the commits of DDL operations are durable is to add explicit calls like the following:
log_write_up_to(trx->commit_lsn, true); |
I hope that all DDL is safe now. Before these fixes, the easiest way to get into trouble should have been with innodb_file_per_table=0 and innodb_flush_log_at_trx_commit=0.
I have looked at the binary log.
In the log, the phase of alter table is (from ddl_log.cc, the comment says it all):
case DDL_ALTER_TABLE_PHASE_COPIED:
{
char *from_end;
/*
New table is created and we have the query for the binary log.
We should remove the original table and in the next stage replace
it with the new one.
*/
This means that copy_between_tables() had executed fully and the data have been copied.
This tag is added here in sql_table.cc
ddl_log_update_phase(&ddl_log_state, DDL_ALTER_TABLE_PHASE_COPIED);
debug_crash_here("ddl_log_alter_after_log"); // Use new table
THD_STAGE_INFO(thd, stage_rename_result_table);
if (wait_while_table_is_used(thd, table, HA_EXTRA_PREPARE_FOR_RENAME))
goto err_new_table_cleanup;
Are you sure that the ddl log is for the above trace?
I don't see any way that the phase could be wrong if it crashed in copy_data_between tables().
Just before copy_between_tables, the phase is set to DDL_ALTER_TABLE_PHASE_CREATED, which is what it should be in the above case if the crash happened there.