Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Incomplete
Affects Version/s: 10.1.11, 10.1.18, 10.2.31, 10.4.14
Description
A Windows user upgraded from MariaDB 10.0.21 to MariaDB 10.1.11. After running mysql_upgrade, the user saw errors like this in the error log:
InnoDB: tried to purge sec index entry not marked for deletion in
InnoDB: index "column" of table "db1"."tab1"
InnoDB: tuple DATA TUPLE: 3 fields;
0: len 15; hex XXXX; asc field1;;
1: len 3; hex XXXX; asc sum;;
2: len 33; hex XXXX; asc field3;;

InnoDB: record PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 15; hex XXXX; asc field1;;
1: len 3; hex XXXX; asc sum;;
2: len 30; hex XXXX; asc field3; (total 33 bytes);
Some time after that, the user tried inserting more records into the database, and this caused an assertion failure:
2016-02-12 14:58:34 4428 InnoDB: Assertion failure in thread 17448 in file row0ins.cc line 283
InnoDB: Failing assertion: *cursor->index->name == TEMP_INDEX_PREFIX
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
Issue Links
- duplicates:
  - MDEV-9530 Galera Cluster crashed after some hours of usage (Closed)
- is duplicated by:
  - MDEV-11756 Delete-marked records unexpectedly exist in a secondary index (Closed)
  - MDEV-12463 Unable to find a record to delete-mark just after upgrading from 10.2.4 to 10.2.5 (Closed)
  - MDEV-26917 InnoDB: Clustered record for sec rec not found index (Closed)
- relates to:
  - MDEV-13206 INSERT ON DUPLICATE KEY UPDATE foreign key fail (Closed)
  - MDEV-13542 Crashing on a corrupted page is unhelpful (Closed)
  - MDEV-13899 IMPORT TABLESPACE may corrupt ROW_FORMAT=REDUNDANT tables (Closed)
  - MDEV-13980 InnoDB fails to discard record lock when discarding an index page (Closed)
  - MDEV-14481 Execute InnoDB crash recovery in the background (Closed)
  - MDEV-15199 Triangular FKs - Cascade delete causes broken referential integrity (Closed)
  - MDEV-15326 InnoDB: Failing assertion: !other_lock || wsrep_thd_is_BF(lock->trx->mysql_thd, FALSE) || wsrep_thd_is_BF(other_lock->trx->mysql_thd, FALSE) (Closed)
  - MDEV-16515 InnoDB: Failing assertion: ++retries < 10000 in file dict0dict.cc line 2737 (Closed)
  - MDEV-16759 InnoDB: Assertion failure in thread 139946191502080 in file row0ins.cc line 285 (Closed)
  - MDEV-18272 InnoDB fails to rollback after exceeding FOREIGN KEY recursion depth (Closed)
  - MDEV-20066 Wrong value on instantly added column after DELETE and UPDATE (Closed)
  - MDEV-21726 Index corruption after 10.4 upgrade (Closed)
  - MDEV-22373 Unable to find a record to delete-mark ends up crashing mysqld process after upgrading from 10.1.43 to 10.4 (Closed)
  - MDEV-22759 Failing assertion: !cursor->index->is_committed() upon update on table with HASH index (Closed)
  - MDEV-23565 Corruption of PAGE_BTR_IBUF_FREE_LIST (Open)
  - MDEV-24402 CHECK TABLE may miss some cases of index inconsistencies (Closed)
  - MDEV-24449 Corruption of system tablespace or last recovered page (Closed)
  - MDEV-24709 Assertion !recv_no_ibuf_operations failed in ibuf_page_low() (Closed)
  - MDEV-27734 Set innodb_change_buffering=none by default (Closed)
  - MDEV-30009 InnoDB shutdown hangs when the change buffer is corrupted (Closed)
  - MDEV-13082 Upgrade Instructions 10.1 to 10.2 Result In Indefinite Shutdown Duration (Closed)
  - MDEV-14066 [Draft] Assertion failed: rec_get_deleted_flag(rec, dict_table_is_comp(cursor->index->table)) (Closed)
  - MDEV-14643 InnoDB: Failing assertion: !cursor->index->is_committed() (Closed)
  - MDEV-15826 Purge attempts to free BLOB page after BEGIN;INSERT;UPDATE;ROLLBACK (Closed)
  - MDEV-15916 Change buffer crash during TRUNCATE or DROP TABLE (Closed)
  - MDEV-22986 [ERROR] InnoDB: Record in index was not found on update in mysqld.log on slave (Closed)
  - MDEV-31245 Assertion Failure results in crash (Closed)
Activity
I found this: https://bugs.launchpad.net/percona-server/+bug/1195614 and https://bugs.mysql.com/bug.php?id=59845
Not sure if it is directly related, but the assertion does look the same. The issue does not look familiar to me.
Backtrace
3 mysqld.exe!abort Line 81
*4 mysqld.exe!row_ins_sec_index_entry_by_modify Line 285
5 mysqld.exe!row_ins_sec_index_entry_low Line 2868
6 mysqld.exe!row_ins_sec_index_entry Line 3062
7 mysqld.exe!row_ins Line 3318
8 mysqld.exe!row_ins_step Line 3444
9 mysqld.exe!row_insert_for_mysql Line 1377
10 mysqld.exe!ha_innobase::write_row Line 8505
11 mysqld.exe!handler::ha_write_row Line 5875
12 mysqld.exe!write_record Line 1881
13 mysqld.exe!mysql_insert Line 991
14 mysqld.exe!mysql_execute_command Line 3898
15 mysqld.exe!mysql_parse Line 7308
16 mysqld.exe!dispatch_command Line 1491
17 mysqld.exe!do_command Line 1109
18 mysqld.exe!threadpool_process_request Line 239
19 mysqld.exe!io_completion_callback Line 568
The user upgraded to 10.1.18, and is now seeing the following:
InnoDB: tried to purge sec index entry not marked for deletion in
InnoDB: index "column" of table "db1"."tab1"
InnoDB: tuple DATA TUPLE: 3 fields;
0: len 15; hex XXXX; asc field1;
1: len 3; hex XXXX; asc sum;;
2: len 33; hex XXXX; asc field3;;

InnoDB: record PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 15; hex XXXX; asc field1;
1: len 3; hex XXXX; asc sum;;
2: len 30; hex XXXX; asc field3; (total 33 bytes);

InnoDB: tried to purge sec index entry not marked for deletion in
InnoDB: index "column" of table "db1"."tab1"
InnoDB: tuple DATA TUPLE: 3 fields;
0: len 15; hex XXXX; asc field1;;
1: len 13; hex XXXX; asc notemptycount;;
2: len 33; hex XXXX; asc field3;;

InnoDB: record PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 15; hex XXXX; asc field1;;
1: len 13; hex XXXX; asc notemptycount;;
2: len 30; hex XXXX; asc field3; (total 33 bytes);

InnoDB: tried to purge sec index entry not marked for deletion in
InnoDB: index "column" of table "db1"."tab1"
InnoDB: tuple DATA TUPLE: 3 fields;
0: len 32; hex XXXX; asc field1s1;;
1: len 3; hex XXXX; asc min;;
2: len 33; hex XXXX; asc field3;;

InnoDB: record PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 30; hex XXXX; asc field1; (total 32 bytes);
1: len 3; hex XXXX; asc min;;
2: len 30; hex XXXX; asc field3; (total 33 bytes);

2016-10-12 14:51:44 47a8 InnoDB: Assertion failure in thread 18344 in file row0ins.cc line 283
InnoDB: Failing assertion: *cursor->index->name == TEMP_INDEX_PREFIX
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.
161012 14:51:44 [ERROR] mysqld got exception 0x80000003 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.1.18-MariaDB
key_buffer_size=0
read_buffer_size=2097152
max_used_connections=10
max_threads=1001
thread_count=9
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 544537 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0x7ebdc0f88
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
mysqld.exe!my_parameter_handler()[my_init.c:259]
mysqld.exe!raise()[winsig.c:587]
mysqld.exe!abort()[abort.c:82]
mysqld.exe!row_ins_sec_index_entry_by_modify()[row0ins.cc:283]
mysqld.exe!row_ins_sec_index_entry_low()[row0ins.cc:2868]
mysqld.exe!row_ins_sec_index_entry()[row0ins.cc:3062]
mysqld.exe!row_ins()[row0ins.cc:3318]
mysqld.exe!row_ins_step()[row0ins.cc:3444]
mysqld.exe!row_insert_for_mysql()[row0mysql.cc:1377]
mysqld.exe!ha_innobase::write_row()[ha_innodb.cc:8748]
mysqld.exe!handler::ha_write_row()[handler.cc:5916]
mysqld.exe!write_record()[sql_insert.cc:1882]
mysqld.exe!mysql_insert()[sql_insert.cc:991]
mysqld.exe!mysql_execute_command()[sql_parse.cc:3897]
mysqld.exe!mysql_parse()[sql_parse.cc:7323]
mysqld.exe!dispatch_command()[sql_parse.cc:1490]
mysqld.exe!do_command()[sql_parse.cc:1108]
mysqld.exe!threadpool_process_request()[threadpool_common.cc:238]
mysqld.exe!io_completion_callback()[threadpool_win.cc:568]
kernel32.dll!BaseFormatTimeOut()
ntdll.dll!RtlEqualDomainName()
ntdll.dll!TpIsTimerSet()
kernel32.dll!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()
|
I have probably hit the same bug on 10.1.18 on CentOS 6:
InnoDB: tried to purge sec index entry not marked for deletion in
InnoDB: index "ts_db_tbl" of table "percona"."checksum"
InnoDB: tuple DATA TUPLE: 4 fields;
0: len 4; hex 5822da8f; asc X" ;;
1: len 64; hex 494d4720202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020; asc IMG ;;
2: len 64; hex 70686f746f2020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020; asc photo ;;
3: len 4; hex 80000001; asc ;;

InnoDB: record PHYSICAL RECORD: n_fields 4; compact format; info bits 0
0: len 4; hex 5822da8f; asc X" ;;
1: len 30; hex 494d47202020202020202020202020202020202020202020202020202020; asc IMG ; (total 64 bytes);
2: len 30; hex 70686f746f20202020202020202020202020202020202020202020202020; asc photo ; (total 64 bytes);
3: len 4; hex 80000001; asc ;;
No stack trace. This MariaDB instance runs as a replication slave. The replication master runs 10.1.17 and was upgraded from 5.5.xx. The master has not hit this issue so far.
Please see the comments that I made in MDEV-11756.
Also, note that these assertion failures are just the ‘messenger’ that we should not shoot. The logical corruption of the dataset may have been introduced much earlier. I suspect that the InnoDB change buffering could be to blame.
It is also possible that the corruption was originally introduced to the data files by an old version of MariaDB or MySQL that did not have the fixes for the bugs that I have listed at the end of the MySQL Bug #61104 page.
We seem to have hit the same bug using 10.1.20 on CentOS 5:
InnoDB: tried to purge sec index entry not marked for deletion in
InnoDB: index "REF_IND" of table "SYS_FOREIGN"
InnoDB: tuple DATA TUPLE: 2 fields;
0: len 17; hex 763464657670622f646f63756d656e7473; asc v4devpb/documents;;
1: len 26; hex 763464657670622f464b41353134353838303634364233343331; asc v4devpb/FKA5145880646B3431;;

InnoDB: record PHYSICAL RECORD: n_fields 2; 1-byte offsets; info bits 0
0: len 17; hex 763464657670622f646f63756d656e7473; asc v4devpb/documents;;
1: len 26; hex 763464657670622f464b41353134353838303634364233343331; asc v4devpb/FKA5145880646B3431;;
This happened after upgrading the database from MySQL 5.6 to MariaDB 10.1.20 earlier in the day.
After the upgrade, I ran mysql_upgrade which reported OK for all the tables in this database.
Later in the day, when attempting to drop some foreign keys and indexes, the above error occurred in the DB log, along with the following in our application log:
SQL state [HY000]; error code [1025]; Error on rename of './v4devpb/#sql-1534_2c5' to './v4devpb/documents'
The following day, MariaDB crashed with the following message:
2017-01-05 11:59:08 2b72eec48d40 InnoDB: Assertion failure in thread 47772632124736 in file row0ins.cc line 283
InnoDB: Failing assertion: *cursor->index->name == TEMP_INDEX_PREFIX
170105 11:59:08 [ERROR] mysqld got signal 6 ;
...
stack_bottom = 0x2b72eec48468 thread_stack 0x48400
InnoDB: Warning: a long semaphore wait:
--Thread 47772428119360 has waited at dict0dict.cc line 1172 for 241.00 seconds the semaphore:
Mutex at 0x2b72070112e8 '&dict_sys->mutex', lock var 1
Last time reserved by thread 47772632124736 in file not yet reserved line 0, waiters flag 1
InnoDB: Warning: semaphore wait:
--Thread 47772428119360 has waited at dict0dict.cc line 1172 for 241.00 seconds the semaphore:
Mutex at 0x2b72070112e8 '&dict_sys->mutex', lock var 1
Last time reserved by thread 47772632124736 in file not yet reserved line 0, waiters flag 1
InnoDB: Warning: semaphore wait:
...
--Thread 47772616559936 has waited at dict0dict.cc line 1172 for 33.000 seconds the semaphore:
Mutex at 0x2b72070112e8 '&dict_sys->mutex', lock var 1
Last time reserved by thread 47772632124736 in file not yet reserved line 0, waiters flag 1
InnoDB: ###### Starts InnoDB Monitor for 30 secs to print diagnostic info:
InnoDB: Pending preads 0, pwrites 0
Subsequently, MariaDB hung or crashed whenever I tried to drop and recreate a database with the same name.
About two weeks later, when attempting to alter the schema of another database on the same server, we saw a very similar issue.
It seems MySQL 5.7.5 fixed a similar-looking bug (introduced in MySQL 5.7.4) for the cursor->index->name == TEMP_INDEX_PREFIX assertion failure (Bug #18723872): https://forums.mysql.com/read.php?3,621060,621060
murrayw24, I do not think that the one-line change that I reviewed or the fix that introduced the regression are related to this. MySQL 5.7.9 (which contains the fix) was merged to MariaDB Server 10.2.2, while MariaDB Server 10.1 is based on the InnoDB of MySQL 5.6.
While there have been bugs that easily cause inconsistency between indexes (MDEV-11927 could be one of them), I believe that there exists a hard-to-reproduce bug in all InnoDB versions. We never got a reasonably small reproducible test case internally at Oracle either. If only we could reproduce it more easily, we could test whether it goes away with innodb_change_buffering=none.
murrayw24, given that you are using 10.1, you cannot be hitting MDEV-11927, which affects 10.2.3 and 10.2.4 only.
Do you have a backup of the files right before they were updated to MariaDB 10.1.20? I would like to repeat this in a controlled fashion, so that I can start work on a reduced test case and a fix.
marko, given that it was several weeks between the upgrade and this problem appearing, I'm not sure we still have any backups, but I will take a look
Hi marko,
I have the following artifacts still available:
- The results of running mysql_upgrade after the MariaDB installation
- The full .err log file since installing MariaDB
- Backups of the mysql DB and the actual DB schemas post-installation but before any corruption in the schema became apparent
- The server in its current state, with 2 DB schemas corrupted
- A backup of the entire database from December 2015, which was probably taken prior to an upgrade from MySQL 5.5 to 5.6
Please let me know if any of this is of any use to you in diagnosing what has gone wrong.
Thanks,
Murray
murrayw24, could you create a new instance using the backup from December 2015? Issue CHECK TABLE on every InnoDB table.
If no corruption is reported, we should perhaps look at later query logs or binlogs to figure out what kind of a combination of SQL operations could have caused the corruption.
If the backup fails these CHECK TABLE checks, then it should have been corrupted earlier. Theoretically, there could also be a bug in MariaDB 10.1 that corrupts the backup straight away. To rule this out, you could try the CHECK TABLE using MySQL 5.5 or 5.6, after restoring the backup again, of course.
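A minimal sketch for generating those CHECK TABLE statements in bulk (the list of excluded system schemas is an assumption and may need adjusting to the instance):
SELECT CONCAT('CHECK TABLE `', TABLE_SCHEMA, '`.`', TABLE_NAME, '`;')
FROM INFORMATION_SCHEMA.TABLES
WHERE ENGINE = 'InnoDB'
AND TABLE_SCHEMA NOT IN ('mysql', 'information_schema', 'performance_schema');
-- Feed the generated statements back to the server, e.g. with the mysql client.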
Hi marko,
Here's what I think you're asking me to do:
- Create a new instance of MariaDB 10.1.20 (on the server where we have seen this problem) using MySQL Sandbox
- Import the December 2015 backups for the mysql schema and the schemas that have become corrupted in the current instance
- Run CHECK TABLE against all tables in all schemas
- If any of the CHECK TABLE statements fail:
- Create a new instance of MySQL 5.6 (on the server where we have seen this problem) using MySQL Sandbox
- Import the December 2015 backups for the mysql schema and the schemas that have become corrupted in the current instance
- Run CHECK TABLE against all tables in all schemas
Could you please let me know, either way, whether I have understood correctly?
Hi marko,
I decided to follow a more thorough path as follows:
- Created a new instance of MySQL 5.6.34 (on the server where we have seen this problem) using MySQL Sandbox
- Configured it as per our typical config settings
- Imported the December 2015 mysqldump of the mysql schema and one of the schemas ("v4stgrelnest") that had become corrupted in the current instance
- Ran mysql_check against mysql and v4stgrelnest schemas
- All reported OK
- Stopped the new MySQL 5.6.34 instance
- Created a new instance of MariaDB 10.1.20 (on the server where we have seen this problem) using MySQL Sandbox
- Configured it as per our typical config settings
- Pointed it at the data folder from the MySQL 5.6.34 instance
- Started this new MariaDB 10.1.20 instance. Got the following errors (which I assume are minor):
- Missing system table mysql.roles_mapping; please run mysql_upgrade to create it
- Incorrect definition of table mysql.event: expected column 'sql_mode' at position 14 to have type...
- mysqld: Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler
- Failed to load slave replication state from table mysql.gtid_slave_pos: 1146: Table 'mysql.gtid_slave_pos' doesn't exist
- Ran mysql_upgrade
- There were a few errors for views but all tables reported OK
- The err file had the following error: Incorrect definition of table mysql.proc: expected column 'sql_mode' at position 14 to have type...
- Ran mysql_check against mysql and v4stgrelnest schemas
- All reported OK
- Deleted the incorrectly defined views
- Restarted the database
- No errors in logs
- Ran mysql_upgrade --force
- Everything reported OK, no errors in logs
- Ran the alter table scripts that caused/revealed the schema corruption last time
- Everything reported OK, no errors in logs
- Ran mysql_check against mysql and v4stgrelnest schemas
- All reported OK
I have kept a detailed record of the commands that I ran and the output from the checks, etc. Please let me know if you'd like me to email these to you.
Is there anything else you would like me to try?
Thanks,
Murray
Hi Murray, thank you very much for your efforts.
I am afraid that this corruption is very difficult to trigger. If it is related to change buffering like I suspect, it would only trigger in specific workloads.
The InnoDB change buffer is used for buffering changes to the leaf pages of non-unique secondary indexes when those leaf pages do not happen to be already in the buffer pool. If there are index scans or lookups, the pages would be read into the buffer pool and any previously buffered changes would be merged, and any future changes would be applied directly to the pages as long as they reside in the buffer pool.
To add insult to injury, one source of input to the change buffer (starting with MySQL 5.5) is the purge of transaction history. This purge process is nondeterministic as well. So, we would have two background processes that are designed to be invisible to the SQL connections but whose scheduling is dependent on the SQL workload.
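One way to observe whether change buffering is happening at all on a given server is to inspect the counters that InnoDB exposes. A minimal sketch, assuming the ibuf% counter names of INFORMATION_SCHEMA.INNODB_METRICS in 10.0+ (some counters may need enabling via innodb_monitor_enable); SHOW ENGINE INNODB STATUS prints an "INSERT BUFFER AND ADAPTIVE HASH INDEX" section as a fallback:
SELECT NAME, COUNT FROM INFORMATION_SCHEMA.INNODB_METRICS WHERE NAME LIKE 'ibuf%';
SHOW ENGINE INNODB STATUS\G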
I have no evidence that my hypothesis is correct. I can only hope that someone who hits this bug reasonably often could
SET GLOBAL innodb_change_buffering=none;
and report if this makes the problem go away. Note: if the dataset is already corrupted, this will not heal it (DROP INDEX followed by ALTER TABLE…ADD INDEX… will). It might only prevent future corruption.
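As a concrete sketch of that repair, reusing the placeholder names from the original report (the column list here is hypothetical and must match the real index definition):
ALTER TABLE db1.tab1 DROP INDEX `column`;
ALTER TABLE db1.tab1 ADD INDEX `column` (c1, c2);  -- recreate with the original definition
-- Alternatively, rebuild the whole table, which recreates all of its indexes:
ALTER TABLE db1.tab1 FORCE;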
Hi marko,
Thanks for getting back to me.
The corruption happened when we were running a script which dropped a lot of (duplicate) indexes and foreign keys in quick succession, which sounds like it would fit with your theory of a problem with change buffering. I am pretty sure the corruption happened on different tables for different databases, even though the schemas and the schema change scripts would have been pretty much the same, which again would appear to fit with a concurrency issue.
Thanks,
Murray
Hi again marko,
Just one other thought: I believe that some of the schemas that got corrupted did so when foreign keys (rather than indexes) were being dropped (although the drop foreign key statements were interleaved with drop index statements). Does this still fit with your theory?
Thanks,
Murray
Hi murrayw24, sorry, I missed your updates.
DROP FOREIGN KEY should be a fast schema-only change.
Also, dropping secondary indexes should actually be the correct course of action. You would want to drop the secondary indexes that are inconsistent with the clustered index. In the originally reported message, it would be
ALTER TABLE db1.tab1 DROP INDEX `column`;
The clustered index is where the actual data is stored.
One idea why the issues might only have been reported after the ALTER TABLE operations would be if one of the ALTER TABLE statements was executed with ALGORITHM=COPY instead of ALGORITHM=INPLACE. This could for example be because of old_alter_table=1 being in effect (if MariaDB supports that option). Another idea would be that the operations caused a ‘stuck’ purge (MDEV-11802) to become ‘unstuck’.
FYI: I am still seeing a lot of these crashes in 10.1.26-MariaDB-1~stretch. I am on a Debian 9 system where I have installed MariaDB from the official repository to try to fix the same problem seen in the native Debian version. My understanding is that it should be fixed by now, but obviously not?
thuben, there has been no fix to address this issue, either in MySQL or MariaDB.
My comment from 2017-01-23 is still the latest information on this.
The corruption is very hard to repeat.
My hypothesis is that setting innodb_change_buffering=none could prevent these problems. My thinking is that because the change buffer only kicks in when the dataset is too big to fit in the buffer pool, triggering this bug would require very suitable access patterns and timing.
Of course, setting innodb_change_buffering=none would not help against any hidden corruption that previous use of the change buffer may have caused.
Even if we got a test case contribution for repeating this problem, it would likely take considerable effort to track down the problem, because considerable time might pass between the original writes to the change buffer and the merge followed by the detection of the corruption.
Thank you for your quick answer!
In my case it seems to have happened when I did the Debian 8 to Debian 9 upgrade, where MariaDB is introduced in Debian. My experience is in line with what was described by @Murray Williams above.
Neither setting innodb_change_buffering=none nor any of the other workarounds mentioned worked for me. I am more inclined to think that the tables are being corrupted during the transition from MySQL to MariaDB...
I have now wiped the failing database completely and re-installed a backup. Since then I have not seen the crashes. I will continue to monitor the logs more actively to see what is happening.
thuben, I believe that you experienced a corruption that had been hidden for a long time before upgrading.
My history of InnoDB internals development dates back to 2003, and I was developing InnoDB under MySQL until I joined MariaDB in late 2016. I implemented the delete buffering in MySQL 5.5.
There are a few internal Oracle bugs (inaccessible to the public) on this same issue, and to my knowledge, no progress has been made, because the corruption is so hard to repeat and narrow down.
While there are some file format differences between MariaDB and MySQL (mostly related to encryption and page_compressed, which were introduced in MariaDB 10.1), I do not believe that the problems could be caused by that. As far as I can tell, the last bug fix for the change buffer (ignoring the two changes in MySQL 5.7 that were mentioned earlier) is a comment clarification in MySQL 5.5.44, 5.6.25, 5.7.8, approved by me in March 2015.
At the end of MySQL Bug #61104 there is a list of fixed bugs that have caused change buffer corruption. If you were running a server version with one of these bugs present, it is of course possible that some data got corrupted. In addition to these fixed bugs, I believe that there may still be some open bug that causes such corruption.
Once more, setting innodb_change_buffering=none does not help if the database was already corrupted. It only prevents future use of the change buffer.
I would very much appreciate a copy of a database before the failed upgrade, so that I could at least confirm my educated guess that the change buffer is playing a role in the corruption.
Thanks for your interesting insights!
My reason for believing that it was related to the upgrade is that I have now seen the same kind of problem on another of the databases on the same server, running the same instance of MySQL/MariaDB. I think I solved the second one by dumping the database using mysqldump, deleting the database, creating it again, and reading the mysqldump back in.
I don't have a copy of the database from before the "failing" MySQL to MariaDB upgrade. I do, however, have an image of the virtual machine where this database resides, as it was before last week's crash. I assume you need the raw database files? If so I will try to dig them out!
thuben, yes, I would want the raw database files as they were after shutdown. The minimum would be the system tablespace (ibdata* files) and the *.ibd of one table for which corruption would be reported by shutdown. Ideally, I would want a set of files with which the problem can be reproduced.
By the way, mysqldump avoids accessing the secondary indexes. Change buffering only applies to non-unique secondary indexes.
(Side note: if you SET unique_checks=0 and then insert duplicate keys into unique secondary indexes, the duplicates will not be detected if the inserts were buffered.
mysqldump generates this risky statement: by using unique_checks=0 and inserting duplicates into unique secondary indexes, you could get corrupted unique indexes.
But I believe that this form of corruption has been reported for non-unique indexes too, so unique_checks=0 cannot explain all of this.)
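To make that risk concrete, a hypothetical sketch (whether the duplicate actually slips through depends on the insert being buffered, i.e. on the secondary index leaf page not being in the buffer pool at the time):
CREATE TABLE t (id INT PRIMARY KEY, u INT, UNIQUE KEY uk (u)) ENGINE=InnoDB;
SET unique_checks = 0;         -- mysqldump emits this around the data load
INSERT INTO t VALUES (1, 42);
INSERT INTO t VALUES (2, 42);  -- duplicate in uk; may go undetected if the insert is buffered
SET unique_checks = 1;
CHECK TABLE t;                 -- the inconsistency might only surface much later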
I will send you a private mail in order to discuss how to send the files to you.
In the dataset that thuben submitted, CHECK TABLE in MariaDB 10.1 reported 3 corrupted tables, each with corrupted non-unique secondary indexes, so the change buffer could possibly be to blame.
For table 1, "Index 'A' contains 1693 entries, should be 1698."
For table 2, "Index 'B' contains 60577 entries, should be 60576."
For table 3: "Index 'C' contains 28437 entries, should be 28438.", "Index 'D' contains 28437 entries, should be 28438."
I obfuscated the table and index names to protect any confidential data.
Next, I will try to see if the CHECK TABLE triggered any change buffer merge for these tables. If not, it will be very challenging to guess where the corruption could come from. I can also try to dump and match the 1693+1698 records for the first table, because it is so small.
In the submitted dataset, the change buffer was already empty.
I dumped the contents of both indexes (PRIMARY and the secondary index) of the first table, using CHECK TABLE and this patch:
diff --git a/storage/xtradb/btr/btr0btr.cc b/storage/xtradb/btr/btr0btr.cc
index 85a083aaee0..be737de336c 100644
--- a/storage/xtradb/btr/btr0btr.cc
+++ b/storage/xtradb/btr/btr0btr.cc
@@ -4698,6 +4698,9 @@ btr_index_page_validate(
 
 	page_cur_move_to_next(&cur);
 
+	fprintf(stderr, "index %s page %u\n",
+		index->name, cur.block->page.offset);
+
 	for (;;) {
 		if (page_cur_is_after_last(&cur)) {
@@ -4709,6 +4712,9 @@ btr_index_page_validate(
 			return(FALSE);
 		}
 
+		rec_print(stderr, cur.rec, index);
+		putc('\n', stderr);
+
 		/* Verify that page_rec_get_nth_const() is correctly
 		retrieving each record. */
 		DBUG_EXECUTE_IF("check_table_rec_next",
Next, I will try to find the non-matching records (ones that only exist in the PRIMARY index, but not in the secondary index, or vice versa).
I copied the error log output for the CHECK TABLE with the above patch into two files.
In the PRIMARY KEY of the first table, only one value (0) has been assigned to the INT NOT NULL DEFAULT 0 column of the corrupted secondary index:
grep '^ 4:' mdev-9663-primary.txt|uniq -c
1698 4: len 4; hex 80000000; asc ;;
In the secondary index, there is also one value for this column:
grep '^ 0:' mdev-9663-secondary.txt|uniq -c
1693 0: len 4; hex 80000000; asc ;;
None of the records are delete-marked (info_bits is always 0, never 32). Most records in the PRIMARY key were inserted; 10 were updated (or inserted on top of purgeable delete-marked records):
grep '^ 2:.*hex [0-7]' mdev-9663-primary.txt|wc -l
10
To match the records, it suffices to extract the single PRIMARY KEY column (varchar(255) NOT NULL) from both indexes, and match them with each other:
diff -u <(sed -ne 's/^ 0: .*; hex \([0-9a-f]*\);.*/\1/p' mdev-9663-primary.txt) <(sed -ne 's/^ 1: .*; hex \([0-9a-f]*\);.*/\1/p' mdev-9663-secondary.txt)
Because all secondary index key values are the same, these keys must be in the same order in both indexes (no sorting needed).
There are 5 records in the PRIMARY index that are missing from the secondary index. There are no ‘extra’ records in the secondary index.
For the first such key, I made a very alarming finding: there are 10 binary-equal copies of the same PRIMARY KEY in the clustered index. This should be simply impossible. Why was a duplicate key error not reported? An attempt to insert a record when the key already exists should have one of two outcomes (assuming that there is no locking conflict): If the existing record was not delete-marked, report a duplicate key error. Else, it must be a purgeable delete-marked record that was not yet purged, and we would "update in place" this delete-marked record to insert our own record.
All the 10 copies of the first key were fresh inserts (the most significant bit of DB_ROLL_PTR is set).
2 of these copies were missing from the secondary index.
For the second problematic key, there are 6 copies, again fresh inserts.
1 copy is missing from the secondary index.
For the third problematic key, there are 15 copies in the PRIMARY index, 14 in the secondary index. Again, fresh inserts.
For the fourth problematic key, there are 5 copies in the PRIMARY index, 4 in the secondary index.
We must find out why the PRIMARY KEY was corrupted (duplicates were inserted). And we must add a check for these duplicates to CHECK TABLE and possibly other B-tree code.
In logic, anything can be inferred from a contradiction. In computer science, this is often formulated as ‘garbage in, garbage out’. Once something gets corrupted and we carry on, the corruption can get worse from that point on. For this corruption, I would not blame the change buffer; this must be caused by something else.
Sorry, my above analysis is flawed. I missed that the diagnostic function rec_print() is only displaying the first 30 bytes of the data. I must redump all of the data, and not just compare the 30-byte prefixes of the VARCHAR(255) column.
With this additional patch, I see 5 unique primary key records that are only present in the PRIMARY index, and missing from the secondary index:
diff --git a/storage/xtradb/rem/rem0rec.cc b/storage/xtradb/rem/rem0rec.cc
index c62e8c90434..1875fac0cd6 100644
--- a/storage/xtradb/rem/rem0rec.cc
+++ b/storage/xtradb/rem/rem0rec.cc
@@ -1849,7 +1849,7 @@ rec_print_comp(
 		fprintf(file, " %lu:", (ulong) i);
 
 		if (len != UNIV_SQL_NULL) {
-			if (len <= 30) {
+			if (len <= 768) {
 
 				ut_print_buf(file, data, len);
 			} else if (rec_offs_nth_extern(offsets, i)) {
All 5 records were fresh inserts in the PRIMARY index (the high-order bit of DB_ROLL_PTR is set, that is, no updates to these records were ever committed, and the original inserts did not replace any purgeable delete-marked record).
How could the records end up missing from the secondary index? InnoDB would implement INSERT operations like this:
- Write an insert_undo record (so that the record can be removed in case of ROLLBACK)
- Insert into the PRIMARY index
- Insert into each secondary index (one mini-transaction per index), possibly via the change buffer.
- On commit, discard the insert_undo records.
If the transaction were rolled back, InnoDB would process the undo log, look up the index records corresponding to each undo record, and undo the changes. Finally, after the undo log was emptied, the ‘empty’ transaction would be committed.
In the dataset, there were no recovered (incomplete) transactions, so the inserts must have completed. We could lose buffered inserts if innodb_force_recovery was ever set to the value 4 or more, but I do not think that this would be the case. Perhaps a buffered insert was lost due to some other reason (perhaps it was never written to the change buffer tree)?
We are a little wiser now, but we still need a way to reproduce this corruption. My primary suspect is the change buffer.
One thing that I forgot: While the inserts are normally only buffered for non-unique secondary indexes (SET unique_checks=0 overrides that and risks corrupting unique indexes if duplicates are inserted), delete-marks and purges can be buffered for any secondary index, including unique ones.
As discussed separately in a private mail thread the following was run as a part of daily maintenance on the mysql instance where I originally saw the crash:
mysqladmin --defaults-file=/etc/mysql/debian.cnf --local flush-error-log flush-engine-log flush-general-log flush-slow-log
(I assume that flush and purge are equal in this context)
thuben, the InnoDB change buffer merge and the InnoDB purge of old transaction history are background operations that are mostly invisible to the user. The mysqladmin command that you quoted would appear to control some informational or diagnostic logs that have no direct connection with the InnoDB data or log files.
A possible (certainly not the only) cause of this bug is MDEV-13899 IMPORT TABLESPACE may corrupt ROW_FORMAT=REDUNDANT tables
It is definitely not the only cause, because in the bug description, the diagnostic output says 'compact format'.
thuben, I might have found an explanation to the corruption that you experienced. It is a bug that is present in all InnoDB versions:
MDEV-13980 InnoDB fails to discard record lock when discarding an index page
Unfortunately I have no test case. As far as I can tell, it requires a very improbable sequence of events. It is already very difficult to come up with an insertion and deletion sequence that causes a record deletion to lead into btr_discard_page() instead of a page merge. And you would need something (I do not know what) to be executed in the same transaction after the btr_discard_page() that would exploit the wrongly granted record lock.
marko thanks for the info. Unfortunately I have not seen anything in my test installation yet. If you find a test case, please let me know and I will give it a try!
thuben, I wonder if your corruption matches the comments in this corruption bug fix in MySQL 5.7.4 which is part of MariaDB 10.2 (and which is related to MDEV-13206).
1. The table has one primary index, one unique secondary index and one non-unique secondary index.
2. The INSERT ... ON DUPLICATE UPDATE ... is executed on the table.
3. Insert into the clustered index reported DB_DUPLICATE_KEY. This error information is saved. We proceed to take gap locks in all unique secondary indexes.
4. Insert into the unique secondary index reported DB_LOCK_WAIT.
5. Step 4 is repeated from a higher layer row_ins(). When this is done, the earlier error information saved in step 3 is lost.
6. Next instead of taking just gap locks or skipping non-unique secondary indexes, because of loss of information regarding the error already saved, an actual insert is performed on the non-unique secondary index. This triggers the assert.
I plan to run the test case on 5.5, 10.0, 10.1 to see if this scenario is possible in those versions. So far, we did not seem to get reports of this bug for 10.2, but maybe earlier versions are being used more widely.
commit c44ece7342f14498630e4ab403ae125971137457
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Nov 16 12:56:54 2017 +0200
MDEV-9663: InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX
MariaDB adjustments to test case innodb-replace-debug.
commit f7b110bdc1fb0eb6b383f9647caa299de9d64aba
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Nov 16 12:39:41 2017 +0200
MDEV-9663: InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX
Imported missing test case from MySQL 5.7 for
commit 25781c154396dbbc21023786aa3be070057d6999
Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
Date: Mon Feb 24 14:00:03 2014 +0530
Bug #17604730 ASSERTION: *CURSOR->INDEX->NAME == TEMP_INDEX_PREFIX
MariaDB 5.5 does not seem to be affected.
commit d8ccc61f76d56b761e52564701814739abc190d1
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Nov 16 14:03:02 2017 +0200
MDEV-9663: InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX
Add missing instrumentation to row0ins.cc.
commit 93326ef051350787a3b289f68137365224a5e77a
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Nov 16 13:21:07 2017 +0200
MDEV-9663: InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX
MariaDB adjustments to test case innodb-replace-debug.
MariaDB 10.0 does not seem to be affected.
commit 923ea5dbf6644fab088e35122523b2b8ef03b7ea
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Nov 16 13:18:22 2017 +0200
MDEV-9663: InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX
Imported missing test case from MySQL 5.7 for
commit 25781c154396dbbc21023786aa3be070057d6999
Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
Date: Mon Feb 24 14:00:03 2014 +0530
Bug #17604730 ASSERTION: *CURSOR->INDEX->NAME == TEMP_INDEX_PREFIX
commit d7349e204b873cd882666c2598b1ec1bf1490563
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Nov 16 13:21:07 2017 +0200
MDEV-9663: InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX
MariaDB adjustments to test case innodb-replace-debug and
add missing instrumentation to row0ins.cc.
MariaDB 10.1 does not seem to be affected.
commit eeec64d75eb39155e0080f464c9f2103897dd809
Author: Jan Lindström <jan.lindstrom@mariadb.com>
Date: Thu Nov 16 13:18:22 2017 +0200
MDEV-9663: InnoDB assertion failure: *cursor->index->name == TEMP_INDEX_PREFIX
Imported missing test case from MySQL 5.7 for
commit 25781c154396dbbc21023786aa3be070057d6999
Author: Annamalai Gurusami <annamalai.gurusami@oracle.com>
Date: Mon Feb 24 14:00:03 2014 +0530
Bug #17604730 ASSERTION: *CURSOR->INDEX->NAME == TEMP_INDEX_PREFIX
Thank you, jplindst.
So, MySQL 5.7.4 (and now MariaDB 10.2) fixed one source of secondary index corruption bugs.
But there definitely are others, because all the corruptions reported here in MDEV-9663 seem to have originated earlier than 10.2.
Please continue by doing
git bisect
in MySQL 5.7 to see when Bug #17604730 was originally introduced. Maybe it will put us on the right track toward fixing the remaining issue.
The bisect script will have to apply a source code patch. The last revision to bisect is 25781c154396dbbc21023786aa3be070057d6999~ (the parent of the bug fix in MySQL 5.7.4). Assuming that MySQL 5.6 is not affected, the first one is commit 8c603abc635d63e69ea55aefd15318160af89d8a (the very first commit to 5.7 that is not in 5.6).
For the record, as noted in MDEV-15199, secondary indexes can also become corrupted in MariaDB 10.2.2 to 10.2.13 (and MySQL 5.7.2 to 5.7.21) because of FOREIGN KEY constraints with CASCADE or SET NULL options. Furthermore, MariaDB 10.2.8 introduced another related FOREIGN KEY bug that was also fixed in 10.2.13 by MDEV-15219.
That said, we still have some secondary index corruption bug that exists at least since MySQL 5.5 and MariaDB 5.5.
It seems to me that the InnoDB change buffer in its current form is fundamentally broken (not crash-safe, unsafe with hot backup).
InnoDB crash recovery implements a shortcut: In the last batch of applying redo log records to pages, it will merge any buffered changes to the pages, in the function recv_apply_hashed_log_recs().
InnoDB would first merge any buffered changes to the page in buf_page_get_gen(), then apply redo log records by invoking recv_recover_page().
This is fine, as long as the very last persisted changes to the secondary index leaf page were made directly via the redo log, and not via the change buffer (or redo log records for the change buffer).
There should be a problem in the following scenario:
- The leaf page is modified directly.
- The leaf page is evicted from the buffer pool.
- Some change to the page is made via the change buffer.
- All buffered redo log records are written to the redo log up to this point.
- The server is killed, without any redo log checkpoint occurring in this time frame. (Alternatively, a hot backup completes after this point, but before redo log records for change buffer merge for this leaf page are written.)
- InnoDB would now apply changes to the leaf page in the wrong order (first, the newer changes from the change buffer, then the older ones from the redo log). This can corrupt the page!
The above is a very unlikely sequence of events. Normally, redo log checkpoints are performed periodically. The page eviction, the redo log flush and the server kill would have to be timed very carefully.
With hot backup (xtrabackup or mariabackup), the scenario is much more likely. The log would be applied from an older checkpoint that was the latest when the backup process was started.
Because change buffer merge is writing redo log records, it should always be safe to first apply the log records, and then merge the change buffer.
It is cleanest to defer the change buffer merge until all log records have been applied. In this way, the redo log apply will not generate any redo log records and cannot possibly trigger a redo log checkpoint (which could lead to a deadlock or infinite loop when the checkpoint needs to complete the last batch of redo log apply).
Before allowing "normal user access" to buf_page_get_gen() the buffered changes must be merged, or pages with buffered changes must be evicted from the buffer pool. This constraint will make it harder to implement crash recovery in the background (MDEV-14481).
As yinfeng-zwx pointed out to me privately, normally the change buffer is merged in buf_page_io_complete(). That function first invokes recv_recover_page() to apply any redo log records, and then invokes ibuf_merge_or_delete_for_page() to merge any buffered changes. That is the correct order.
The only other caller that invokes ibuf_merge_or_delete_for_page() with a non-NULL block (that is, merging changes, not discarding them) is buf_page_get_gen(). That code path is only invoked for ROW_FORMAT=COMPRESSED tables. So, my above analysis of the change buffer not being crash-safe should only apply to ROW_FORMAT=COMPRESSED tables.
We have another bug report about the same assertion failure: MDEV-16759
The reporter provided everything he could. It is not enough to reproduce the issue, but maybe looking at these two reports together will help.
MDEV-16759 was explained by the corruption of InnoDB adaptive hash index (AHI) during a TRUNCATE TABLE operation. It turned out to have been fixed by MDEV-16515 (MariaDB Server 10.1.35, 10.2.9, 10.3.8), which fixed a regression in an earlier fix of MDEV-16283 (which also fixes MDEV-14727 and MDEV-14491) in MariaDB Server 10.1.34, 10.2.16, 10.3.8.
So, we have proof that the use of DDL operations and the InnoDB adaptive hash index could cause permanent corruption of data, which might go undetected for a long time (possibly detected after upgrading).
I believe that the MDEV-16759 case of TRUNCATE TABLE should trigger similar corruption in 5.5 and 10.0 (and also MySQL 5.6). Some forms of ALTER TABLE could be affected as well, maybe even DROP TABLE (but that would more likely result in a server crash when dereferencing a stale block->index pointer if the memory has been reused). In MariaDB 10.2 (and MySQL 5.7), TRUNCATE TABLE works in a different way, so the same mechanism for corruption might not apply.
For older versions, a work-around to prevent future corruption (not to heal any existing corruption) would be
SET GLOBAL innodb_adaptive_hash_index=OFF;
The corruption was repeated even with innodb_change_buffering=none, so it looks like I was wrong to suspect the InnoDB change buffer.
I remember seeing this form of corruption starting from MySQL 5.5, not sure of the year, but something between 2010 and 2012. Now we have finally found a regression that was introduced by a MySQL bug fix in 2010. This could explain corruption for users of FOREIGN KEY constraints that include CASCADE or SET NULL rules:
MDEV-18272 InnoDB fails to rollback after exceeding FOREIGN KEY recursion depth
I do not have enough information to tell if this is the only remaining cause of such corruption. There have also been later corruption bugs introduced and fixed in later versions, as noted in earlier comments in this ticket. And the corruption would not necessarily be noticed until much later.
MDEV-19338 is a newer regression causing !cursor->index->is_committed() to happen.
That one is specific to VIRTUAL columns that are indexed and replicated over Galera, with no FOREIGN KEYs.
Starting with MariaDB 10.2.2 (based on MySQL 5.7.9), this class of corruption might be a consequence of MDEV-15326, which is a race condition at transaction commit that causes corruption of the transactional locks. It is hard to reproduce that corruption, and I cannot say for sure whether it could lead to indexes becoming inconsistent with each other.
Starting with MariaDB 10.3.2 (and MDEV-11369 instant ADD COLUMN), instantly added columns can get corrupted if a delete-marked record is replaced by another one (MDEV-20066). If the instantly added column is indexed, the value in the secondary index would not match the wrong value in the clustered index.
Corruption can be caused by various things, and I am afraid that some cases (such as MDEV-23565) may always remain a mystery.
Currently, MDEV-22924 looks most promising for this class of problems. It has been claimed there that we reproduced some corruption of secondary indexes in our internal testing. However, the initial rr replay trace that I analyzed there was a false start, because it involved a scenario similar to MDEV-15532 where an incorrect error handling for XA transactions would allow ALTER ONLINE TABLE to corrupt the table.
It is hard to reproduce this failure, but I am confident that it will eventually happen. With an rr replay trace, it will be possible to follow the exact sequence of events.
Note: I am not claiming that MDEV-22924 will fix all cases of this kind of secondary index corruption. And the database will not be ‘uncorrupted’ automatically. It would have to be repaired by dropping and creating the secondary indexes, or by rebuilding the affected tables.
On MDEV-22373 (which might be related to this) we did recreate the indexes (https://jira.mariadb.org/browse/MDEV-22373?focusedCommentId=161049&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-161049)
We, in fact, rebuilt the whole DB from a mysqldump file, and the crashes showed up as well.
marostegui, yes, MDEV-22373 belongs to this family of bugs. I am basically looking for an SQL-only way of reproducing this problem. Ideally, we would have an rr replay trace leading from an empty database (or something that was just initialized from a SQL dump) to the corruption. https://rr-project.org is a very powerful tool for analyzing any nondeterministic failures. But, obviously users or customers cannot share their confidential data, and it is not easy to shrink the dataset either. As far as I understand, your data is not confidential, but still, it probably is practically impossible to shrink the input to something that would reproduce the problem within a reasonable number of rr replay steps.
I have speculated earlier that innodb_change_buffering=none or innodb_adaptive_hash_index=OFF could make this corruption disappear. That is only a gut feeling, and the performance impact could prevent deploying those settings in practice.
Yes, unfortunately the hosts where we are seeing the crashes are 9TB, so hard to provide a dataset for that. But as you say, it is indeed public data
Btw, the backtrace in this comment looks very close/similar to the backtrace in MDEV-22255 (debug only), and also check MDEV-22855 (debug only).
MDEV-22924 was filed for a bug that MDEV-20301 introduced to the secondary index MVCC code in MariaDB Server 10.2.27, 10.3.18, 10.4.8, 10.5.0. As far as I understand, that bug should only cause ‘transient corruption’, that is, some secondary index records may be wrongly treated as non-existing. If such reads were used as input for an UPDATE or DELETE transaction, then the database could become corrupted. But UPDATE and DELETE themselves always perform locking reads, which should be unaffected by the bug. Also the purge of transactions, which performs non-locking reads, uses a different code path (row_search_on_row_ref()) that looks unaffected by the bug.
We are aware of ‘permanent corruption’ bugs that predate MDEV-22924. The chase for those continues.
rr:/home/mleich/RQG/storage/1603383112/MDEV-9663-TBR-1/dev/shm/vardir/1603383112/37/1/rr
_RR_TRACE_DIR="." rr replay --mark-stdio

(rr) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1 0x000029ab0b252859 in __GI_abort () at abort.c:79
#2 0x000055bf8d5353b8 in ut_dbg_assertion_failed (expr=expr@entry=0x55bf8e47e7c0 "!cursor->index->is_committed()", file=file@entry=0x55bf8e47ab60 "/home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/row/row0ins.cc", line=line@entry=218)
    at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/ut/ut0dbg.cc:60
#3 0x000055bf8d25f9ce in row_ins_sec_index_entry_by_modify (flags=flags@entry=0, mode=mode@entry=2, cursor=cursor@entry=0x1c9318e18bd0, offsets=offsets@entry=0x1c9318e18a40, offsets_heap=<optimized out>, heap=heap@entry=0x61a000363c98,
    entry=<optimized out>, thr=<optimized out>, mtr=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/row/row0ins.cc:218
#4 0x000055bf8d26ea2c in row_ins_sec_index_entry_low (flags=flags@entry=0, mode=<optimized out>, mode@entry=2, index=index@entry=0x61700020b1a0, offsets_heap=offsets_heap@entry=0x61a000678698, heap=heap@entry=0x61a000363c98,
    entry=entry@entry=0x6170002d46a0, trx_id=<optimized out>, thr=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/row/row0ins.cc:3062
#5 0x000055bf8d27d99f in row_ins_sec_index_entry (index=index@entry=0x61700020b1a0, entry=entry@entry=0x6170002d46a0, thr=thr@entry=0x62100004a8a8, check_foreign=check_foreign@entry=true)
    at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/row/row0ins.cc:3276
#6 0x000055bf8d27e563 in row_ins_index_entry (index=0x61700020b1a0, entry=0x6170002d46a0, thr=thr@entry=0x62100004a8a8) at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/row/row0ins.cc:3323
#7 0x000055bf8d27e982 in row_ins_index_entry_step (node=node@entry=0x62100004a3c8, thr=thr@entry=0x62100004a8a8) at /usr/include/c++/9/bits/stl_iterator.h:819
#8 0x000055bf8d2801c1 in row_ins (node=node@entry=0x62100004a3c8, thr=thr@entry=0x62100004a8a8) at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/row/row0ins.cc:3627
#9 0x000055bf8d280eac in row_ins_step (thr=thr@entry=0x62100004a8a8) at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/row/row0ins.cc:3766
#10 0x000055bf8d2d74c7 in row_insert_for_mysql (mysql_rec=mysql_rec@entry=0x61a0004266b8 "\350\001", prebuilt=0x621000049da0, ins_mode=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/row/row0mysql.cc:1421
#11 0x000055bf8cea698b in ha_innobase::write_row (this=0x61d000f6feb8, record=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/innobase/handler/ha_innodb.cc:7572
#12 0x000055bf8c2a138f in handler::ha_write_row (this=0x61d000f6feb8, buf=<optimized out>) at /home/mleich/Server/bb-10.5-MDEV-23855D/sql/handler.cc:7146
#13 0x000055bf8b8a30bc in write_record (thd=thd@entry=0x62b0000d9218, table=table@entry=0x61900001c798, info=info@entry=0x1c9318e1b630, sink=sink@entry=0x0) at /home/mleich/Server/bb-10.5-MDEV-23855D/sql/sql_insert.cc:2104
#14 0x000055bf8b8cb423 in mysql_insert (thd=thd@entry=0x62b0000d9218, table_list=0x62b0000a8590, fields=..., values_list=..., update_fields=..., update_values=..., duplic=<optimized out>, ignore=<optimized out>, result=<optimized out>)
    at /home/mleich/Server/bb-10.5-MDEV-23855D/sql/sql_insert.cc:1099
#15 0x000055bf8b9bb917 in mysql_execute_command (thd=thd@entry=0x62b0000d9218) at /home/mleich/Server/bb-10.5-MDEV-23855D/sql/sql_parse.cc:4550
#16 0x000055bf8b972c7c in mysql_parse (thd=thd@entry=0x62b0000d9218, rawbuf=<optimized out>, length=<optimized out>, parser_state=parser_state@entry=0x1c9318e1ce50, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false)
    at /home/mleich/Server/bb-10.5-MDEV-23855D/sql/sql_parse.cc:8010
#17 0x000055bf8b9a42c6 in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x62b0000d9218,
    packet=packet@entry=0x629000ba9219 " INSERT INTO t4 (col1,col2, col_int, col_string, col_text) VALUES /* 1 */ (1,1,1,REPEAT(SUBSTR(CAST( 1 AS CHAR),1,1), 10),REPEAT(SUBSTR(CAST( 1 AS CHAR),1,1), @fill_amount) ), (1,1,1,REPEAT(SUBSTR(CAS"...,
    packet_length=packet_length@entry=317, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /home/mleich/Server/bb-10.5-MDEV-23855D/sql/sql_class.h:1252
#18 0x000055bf8b9af76d in do_command (thd=0x62b0000d9218) at /home/mleich/Server/bb-10.5-MDEV-23855D/sql/sql_parse.cc:1352
#19 0x000055bf8be44a02 in do_handle_one_connection (connect=<optimized out>, connect@entry=0x608000008a38, put_in_cache=put_in_cache@entry=true) at /home/mleich/Server/bb-10.5-MDEV-23855D/sql/sql_connect.cc:1410
#20 0x000055bf8be45c38 in handle_one_connection (arg=arg@entry=0x608000008a38) at /home/mleich/Server/bb-10.5-MDEV-23855D/sql/sql_connect.cc:1312
#21 0x000055bf8cc65fc5 in pfs_spawn_thread (arg=0x61500000c398) at /home/mleich/Server/bb-10.5-MDEV-23855D/storage/perfschema/pfs.cc:2201
#22 0x00007b591d963609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#23 0x000029ab0b34f103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(rr)

The corresponding archive is
/home/mleich/RQG/storage/1603383112/008920.tgz
--source include/have_innodb.inc

CREATE TABLE t1 (f1 int not null primary key,
                 f2 int) engine=innodb;
INSERT INTO t1 VALUES(1, 1);
INSERT INTO t1 VALUES(2, 2);
INSERT INTO t1 VALUES(3, 3);
connect(con1,localhost,root,,,);
XA START 'p1';
DELETE FROM t1 where f1 = 1;
XA end 'p1';
XA prepare 'p1';
--error 1399
commit;

connection default;
set DEBUG_SYNC='innodb_commit_inplace_alter_table_wait SIGNAL xa_commit WAIT_FOR alter_';
--send
ALTER TABLE t1 ADD INDEX idx(f2, f1), algorithm=nocopy;

connection con1;
set DEBUG_SYNC='now WAIT_FOR xa_commit';
XA COMMIT 'p1';
set DEBUG_SYNC="now SIGNAL alter_";

connection default;
reap;
INSERT INTO t1 VALUES(1, 1);
The above test case repeats matthias's reported stack trace. It is the same issue as MDEV-15532.
Has anyone experienced this corruption on a fresh MariaDB 10.5 installation that was not upgraded from an earlier version, or on any MariaDB version where innodb_change_buffering=none was always set? I suspect that MDEV-24449 could explain this bug (as well as MDEV-20934). If my hypothesis is correct, then this corruption should not be possible in MariaDB 10.5. (But the database may have been corrupted before upgrade.)
As noted in MDEV-24449, I was able to reproduce one type of corruption, but not exactly this type. I am rather convinced that the bug that was fixed by the one-line fix in MDEV-24449 can explain various types of corruption, including this MDEV-9663 type that I am aware of since the MySQL 5.1 or 5.5 days (possibly since about 2008, and for sure since 2010). That race condition is present in the very first InnoDB commit. The probability of encountering the race condition was significantly increased by the widespread use of hot backup tools (at least Percona XtraBackup or Mariabackup).
I am afraid that we will have to wait for feedback for several months to confirm whether MDEV-24449 (or MDEV-19514 in MariaDB 10.5) actually was the last bug that can cause corruption of secondary index pages.
Two months have passed. I think we can close this. If anything pops up, it can be reopened anytime.
As much as I would like to believe that MDEV-24449 and MDEV-24709 fixed all remaining causes of this, we recently had a support customer who encountered corruption of a secondary index (of non-virtual columns) after starting from a logical dump, without restoring any backup or invoking crash recovery in between. So, I am afraid that some bug may still be out there. But, it might be best filed as a new ticket.
The CHECK TABLE…EXTENDED that was implemented in MDEV-24402 will flag secondary indexes corrupted if they contain entries that should not exist.
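Usage is along these lines (a sketch; the EXTENDED keyword requires a server version that includes MDEV-24402):
CHECK TABLE db1.tab1 EXTENDED;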
Crashes due to this corruption should have been (mostly) fixed in MDEV-13542. Because we were not able to reproduce this corruption, I cannot fully guarantee it.
While analyzing a failure from a stress test of MDEV-30009, I may have found a possible explanation of this. The scenario is as follows.
- Some changes were buffered to a secondary index leaf page that was not located in the buffer pool.
- The page was freed (possibly as part of DROP INDEX).
- During ibuf_read_merge_pages(), we will reset the change buffer bitmap bits but will not remove the change buffer records.
- The same page is allocated and reused for something else.
- The page is evicted from the buffer pool.
- Something is added to the change buffer for the page.
- On a change buffer merge, we will apply both old (bogus) and new entries to the page.
The extra delete-unmarked records could simply originate from previously buffered inserts that were not discarded as they were supposed to, in the above scenario.
As far as I can tell, all MySQL and MariaDB versions are affected by this. The code changes that were applied in MDEV-20934 did not fix this, because that code would only be executed on shutdown with innodb_fast_shutdown=0.
The InnoDB change buffer was disabled by default in MDEV-27734.
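A quick way to verify the effective setting on a running server (a sketch; 'none' is the default only on versions that include MDEV-27734, and the variable itself exists only on versions that still have the change buffer):
SELECT @@GLOBAL.innodb_change_buffering;
SET GLOBAL innodb_change_buffering = 'none';  -- explicit opt-out on older versions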
Note: We still have many open bugs related to the corruption of indexes that include virtual columns. Unlike this corruption, those corruptions are much easier to reproduce. Implementing some file format changes such as MDEV-17598 could help a lot with those bugs.
jplindst, could you please look into it?