Details
- Type: Bug
- Status: Closed
- Priority: Critical
- Resolution: Fixed
- Versions: 10.2(EOL), 10.3(EOL), 10.4(EOL), 10.5, 10.6
Description
This came up in MDEV-23842. The scenario involves logging multiple RENAME operations without any log checkpoint in between.
First of all, at least starting with MariaDB 10.5 (and the MDEV-12353 changes), we are writing a duplicate record:
diff --git a/storage/innobase/fil/fil0fil.cc b/storage/innobase/fil/fil0fil.cc
index 5e6bd9575b1..d279e77a105 100644
--- a/storage/innobase/fil/fil0fil.cc
+++ b/storage/innobase/fil/fil0fil.cc
@@ -2530,7 +2530,6 @@ fil_rename_tablespace(
     ut_ad(strchr(new_file_name, OS_PATH_SEPARATOR) != NULL);

     if (!recv_recovery_is_on()) {
-        fil_name_write_rename(id, old_file_name, new_file_name);
         log_mutex_enter();
     }

With that fix applied, recovery starting from checkpoint_lsn=77739 would fail on the third rename:
Thread 1 hit Breakpoint 2, fil_op_replay_rename (space_id=6, name=name@entry=0x7f51bd09f1f1 "./test/t2.ibd", new_name=new_name@entry=0x7f51bd09f1ff "./test/t1.ibd")
    at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1873
1873 {
Current event: 6248
Thread 1 hit Breakpoint 2, fil_op_replay_rename (space_id=5, name=name@entry=0x7f51bd09f41d "./test/t5.ibd", new_name=new_name@entry=0x7f51bd09f42b "./test/t2.ibd")
    at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1873
1873 {
Current event: 6308
Thread 1 hit Breakpoint 2, fil_op_replay_rename (space_id=5, name=name@entry=0x7f51bd09f60e "./test/t2.ibd", new_name=new_name@entry=0x7f51bd09f61c "./test/t5.ibd")
    at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1873
We are renaming tablespace 6 from t2 to t1, then renaming tablespace 5 from t5 to t2 and back. Tablespace 6 was already named t1, and tablespace 5 was already named t2, so only the third operation results in an attempted rename. However, a t5 already existed as tablespace 8. It had been successfully renamed before the server was killed:
(rr) break mtr_t::log_file_op
Breakpoint 1 at 0x55b02cabd18a: file /mariadb/10.6/storage/innobase/fil/fil0fil.cc, line 1760.
(rr) cond 1 type==FILE_RENAME
(rr) display log_sys.lsn._M_i
1: log_sys.lsn._M_i = 0
(rr) command
Type commands for breakpoint(s) 1, one per line.
End with a line saying just "end".
>when
>c
>end
(rr) c
Continuing.
2020-11-13 14:32:44 0 [Note] /dev/shm/10.6/sql/mariadbd (mysqld 10.6.0-MariaDB-debug-log) starting as process 523956 ...
2020-11-13 14:32:44 0 [Warning] Could not increase number of max_open_files to more than 1024 (request: 32183)
2020-11-13 14:32:44 0 [Warning] Changed limits: max_open_files: 1024 max_connections: 151 (was 151) table_cache: 421 (was 2000)
[New Thread 523956.523961]
[New Thread 523956.523957]
[New Thread 523956.523958]
[New Thread 523956.523960]
[New Thread 523956.523962]
[New Thread 523956.523963]
[New Thread 523956.523964]
[New Thread 523956.523966]
[New Thread 523956.523967]
Thread 1 hit Breakpoint 1, mtr_t::log_file_op (this=this@entry=0x7fff47c42950, type=type@entry=FILE_RENAME, space_id=space_id@entry=5, path=path@entry=0x55b02f10c5a8 "./test/t2.ibd",
    new_path=new_path@entry=0x55b02f500de8 "./test/t5.ibd") at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1760
1760 {
1: log_sys.lsn._M_i = 79333
Current event: 47152
Thread 1 hit Breakpoint 1, mtr_t::log_file_op (this=this@entry=0x7fff47c42950, type=type@entry=FILE_RENAME, space_id=space_id@entry=6, path=path@entry=0x55b02f0f8568 "./test/t1.ibd",
    new_path=new_path@entry=0x55b02f5c24f8 "./test/t2.ibd") at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1760
1760 {
1: log_sys.lsn._M_i = 79786
Current event: 49203
Thread 1 hit Breakpoint 1, mtr_t::log_file_op (this=this@entry=0x7fff47c42950, type=type@entry=FILE_RENAME, space_id=space_id@entry=5, path=path@entry=0x55b02f4b6248 "./test/t5.ibd",
    new_path=new_path@entry=0x55b02f5c2528 "./test/t1.ibd") at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1760
1760 {
1: log_sys.lsn._M_i = 80239
Current event: 52863
[New Thread 523956.523988]
[New Thread 523956.523986]
[New Thread 523956.523987]
[Switching to Thread 523956.523988]
Thread 11 hit Breakpoint 1, mtr_t::log_file_op (this=this@entry=0x7fdc6c0a61e0, type=type@entry=FILE_RENAME, space_id=space_id@entry=5, path=path@entry=0x55b02f11ebd8 "./test/t1.ibd",
    new_path=new_path@entry=0x55b02f495d18 "./test/t5.ibd") at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1760
1760 {
1: log_sys.lsn._M_i = 82059
Current event: 86144
Thread 11 hit Breakpoint 1, mtr_t::log_file_op (this=this@entry=0x7fdc6c0a61e0, type=type@entry=FILE_RENAME, space_id=space_id@entry=6, path=path@entry=0x55b02f10c5a8 "./test/t2.ibd",
    new_path=new_path@entry=0x7fdc5003b648 "./test/t1.ibd") at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1760
1760 {
1: log_sys.lsn._M_i = 82530
Current event: 87797
Thread 11 hit Breakpoint 1, mtr_t::log_file_op (this=this@entry=0x7fdc6c0a61e0, type=type@entry=FILE_RENAME, space_id=space_id@entry=5, path=path@entry=0x7fdc5005a168 "./test/t5.ibd",
    new_path=new_path@entry=0x7fdc50041ee8 "./test/t2.ibd") at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1760
1760 {
1: log_sys.lsn._M_i = 83111
Current event: 89390
Thread 11 hit Breakpoint 1, mtr_t::log_file_op (this=this@entry=0x7fdc6c0a61e0, type=type@entry=FILE_RENAME, space_id=space_id@entry=8, path=path@entry=0x55b02f0e4558 "./test/t4.ibd",
    new_path=new_path@entry=0x7fdc50041f18 "./test/t5.ibd") at /mariadb/10.6/storage/innobase/fil/fil0fil.cc:1760
1760 {
1: log_sys.lsn._M_i = 83688
Current event: 90881
Thread 13 received signal SIGKILL, Killed.
[Switching to Thread 523956.523987]
0x0000000070000002 in ?? ()
1: log_sys.lsn._M_i = 83764
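The failing replay can be modelled in a few lines. This is only a sketch: `RenameRecord`, `Replay` and `replay_rename` are hypothetical names invented for illustration, and the logic is a simplified model of record-by-record replay, not the actual body of fil_op_replay_rename(). It assumes that a record is skipped when the tablespace does not currently carry the record's old name, and that a rename fails when another tablespace already owns the target name.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for one recovered FILE_RENAME record.
struct RenameRecord {
  uint32_t space_id;
  std::string old_name;
  std::string new_name;
};

enum class Replay { SKIPPED, RENAMED, CONFLICT };

// Replay one record against a model of the data directory
// (tablespace id -> current file name).
Replay replay_rename(std::map<uint32_t, std::string>& files,
                     const RenameRecord& r) {
  auto it = files.find(r.space_id);
  // The tablespace is unknown or no longer has the old name: nothing to do.
  if (it == files.end() || it->second != r.old_name) return Replay::SKIPPED;
  // The target name is already taken by some tablespace: cannot rename.
  for (const auto& f : files) {
    if (f.second == r.new_name) return Replay::CONFLICT;
  }
  it->second = r.new_name;
  return Replay::RENAMED;
}
```

Fed with the state at the time of the kill (5 is t2, 6 is t1, 8 is t5) and the three records from the recovery trace, this model skips the first two records and reports a conflict on the third, because t5 belongs to tablespace 8.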
I think that we should fix the redo log apply as follows:
- Merge all recovered FILE_RENAME or MLOG_FILE_RENAME2 records, preserving the tablespace identifier and the final desired name.
- Implement a debug check that there are no duplicates of the final desired name.
- Remove those records where the fil_space_t::chain.start.name already is correct.
- Implement a debug check that at most one rename needs to be performed.
- Finally, perform any outstanding renames.
I believe that by design, we will have to perform at most one rename operation. This should be guaranteed by the log_write_up_to() call in fil_name_write_rename().
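The steps proposed above could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual patch: `RenameRecord`, `NameMap`, `merge_renames` and `outstanding_renames` are hypothetical names, and a plain map from tablespace id to file name stands in for fil_space_t::chain.start.name.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a recovered FILE_RENAME record; not an InnoDB type.
struct RenameRecord {
  uint32_t space_id;
  std::string old_name;
  std::string new_name;
};

using NameMap = std::map<uint32_t, std::string>;

// Merge the records in log order, keeping only the tablespace identifier
// and the final desired name.
NameMap merge_renames(const std::vector<RenameRecord>& log) {
  NameMap final_name;
  for (const RenameRecord& r : log) final_name[r.space_id] = r.new_name;
  // Debug check: no two tablespaces may share a final desired name.
  std::set<std::string> seen;
  for (const auto& e : final_name) {
    const bool unique = seen.insert(e.second).second;
    assert(unique);
    (void)unique;  // silence the unused warning when NDEBUG is defined
  }
  return final_name;
}

// Drop the entries whose current name is already correct and return the
// renames that still have to be performed. Tablespaces missing from
// current_name are ignored here, which is a simplification.
std::vector<std::pair<uint32_t, std::string>> outstanding_renames(
    const NameMap& final_name, const NameMap& current_name) {
  std::vector<std::pair<uint32_t, std::string>> todo;
  for (const auto& e : final_name) {
    auto it = current_name.find(e.first);
    if (it != current_name.end() && it->second != e.second) {
      todo.emplace_back(e.first, e.second);
    }
  }
  // Debug check: by design at most one rename should be outstanding.
  assert(todo.size() <= 1);
  return todo;
}
```

With the seven FILE_RENAME records from the rr session above, the merged result is 5 to t2, 6 to t1 and 8 to t5. All three names are already in place at the time of the kill, so in this model no rename is left to perform and the conflict on t5 never arises.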
Issue Links
- blocks
  - MDEV-23842 Atomic RENAME TABLE (Closed)
- is caused by
  - MDEV-14717 RENAME TABLE in InnoDB is not crash-safe (Closed)
- relates to
  - MDEV-16519 mariabackup --backup fails with concurrent RENAME TABLE (Closed)
  - MDEV-18336 Remove backup_fix_ddl() during backup (Open)
  - MDEV-25854 Restoring a backup may result in garbage intermediate tables from DDL (Closed)
  - MDEV-24189 Rename Table twice raise error "Tablespace is missing for a table" (Closed)