[MDEV-13512] buf_flush_update_zip_checksum() corrupts SPATIAL INDEX in ROW_FORMAT=COMPRESSED tables - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.1(EOL), 10.2(EOL)
Fix Version/s: 10.1.29, 10.2.10
Component/s: Storage Engine - InnoDB
Labels:
None

Sprint:
10.2.10

Description

10.1 1ff65c271ff3edef5ccba72141047fda19170163
2017-08-13 20:28:43 7f970a88fb00 InnoDB: Assertion failure in thread 140286693538560 in file page0zip.cc line 3035
InnoDB: Failing assertion: !memcmp(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE + page_zip->data, PAGE_HEADER - FIL_PAGE_TYPE)

#5 0x00007f97085b53fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x00007f9701baa253 in page_zip_decompress (page_zip=0x7f96ff3c0eb0, page=0x7f96ff6f4000 "\f%", all=0) at /data/src/10.1/storage/innobase/page/page0zip.cc:3033
#7 0x00007f9701b945de in page_cur_insert_rec_zip (cursor=0x7f970a88ca68, index=0x7f96f24dc878, rec=0x7f96f240fa00 "\200", offsets=0x7f96f240fa38, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/page/page0cur.cc:1427
#8 0x00007f97019f88a3 in page_cur_tuple_insert (cursor=0x7f970a88ca68, tuple=0x7f96f24dc5f8, index=0x7f96f24dc878, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, n_ext=0, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/include/page0cur.ic:288
#9 0x00007f97019fc7e0 in btr_cur_optimistic_insert (flags=0, cursor=0x7f970a88ca60, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, entry=0x7f96f24dc5f8, rec=0x7f970a88ca38, big_rec=0x7f970a88ca28, n_ext=0, thr=0x7f96f2457b68, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/btr/btr0cur.cc:1478
#10 0x00007f9701be9a52 in row_ins_clust_index_entry_low (flags=0, mode=2, index=0x7f96f24dc878, n_uniq=1, entry=0x7f96f24dc5f8, n_ext=0, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:2518
#11 0x00007f9701beaac7 in row_ins_clust_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68, n_ext=0) at /data/src/10.1/storage/innobase/row/row0ins.cc:2929
#12 0x00007f9701beadd5 in row_ins_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3027
#13 0x00007f9701beb098 in row_ins_index_entry_step (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3104
#14 0x00007f9701beb3ab in row_ins (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3244
#15 0x00007f9701beb712 in row_ins_step (thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3369
#16 0x00007f9701c02b99 in row_insert_for_mysql (mysql_rec=0x7f96f2459688 "\242", prebuilt=0x7f96f2457078) at /data/src/10.1/storage/innobase/row/row0mysql.cc:1424
#17 0x00007f9701af9ad1 in ha_innodb::write_row (this=0x7f96f2499088, record=0x7f96f2459688 "\242") at /data/src/10.1/storage/innobase/handler/ha_innodb.cc:8257
#18 0x000055aea6b69831 in handler::ha_write_row (this=0x7f96f2499088, buf=0x7f96f2459688 "\242") at /data/src/10.1/sql/handler.cc:5945
#19 0x000055aea6902690 in write_record (thd=0x7f96fefc9070, table=0x7f96f24b9e70, info=0x7f970a88d910) at /data/src/10.1/sql/sql_insert.cc:1914
#20 0x000055aea6900244 in mysql_insert (thd=0x7f96fefc9070, table_list=0x7f96f24431a0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false) at /data/src/10.1/sql/sql_insert.cc:1039
#21 0x000055aea6924579 in mysql_execute_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:3926
#22 0x000055aea692ef2b in mysql_parse (thd=0x7f96fefc9070, rawbuf=0x7f96f2443088 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", length=50, parser_state=0x7f970a88e5e0) at /data/src/10.1/sql/sql_parse.cc:7344
#23 0x000055aea691de8a in dispatch_command (command=COM_QUERY, thd=0x7f96fefc9070, packet=0x7f9700f1c071 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", packet_length=50) at /data/src/10.1/sql/sql_parse.cc:1484
#24 0x000055aea691cc27 in do_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:1106
#25 0x000055aea6a54fa1 in do_handle_one_connection (thd_arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1349
#26 0x000055aea6a54d05 in handle_one_connection (arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1261
#27 0x000055aea6e0ccf2 in pfs_spawn_thread (arg=0x7f9700ee36f0) at /data/src/10.1/storage/perfschema/pfs.cc:1860
#28 0x00007f970a51e494 in start_thread (arg=0x7f970a88fb00) at pthread_create.c:333
#29 0x00007f970866993f in clone () from /lib/x86_64-linux-gnu/libc.so.6

The test case is attached. If it doesn't fail right away, try running with --repeat=N.

The test is still rather long, but reducing it further seems to decrease the probability of failure.
Please note there is a server restart in the middle of the test, after initial creation/population of the table. I couldn't reproduce the failure without it.

Also reproducible on 10.3.

Attachments

- Sort By Name
- Sort By Date
- Ascending
- Descending
- Thumbnails
- List
- Download All

t3.test
236 kB
2017-08-13 17:32

Issue Links

relates to

MDEV-14227 innodb_gis.rtree_compress2 '4k,16k,innodb' test failure - assert fail

Closed

Activity

Ascending order - Click to sort in descending order

Elena Stepanova created issue - 2017-08-13 17:32

Elena Stepanova made changes - 2017-08-13 17:33

Field	Original Value	New Value
Description	{noformat:title=10.1 1ff65c271ff3edef5ccba72141047fda19170163} 2017-08-13 20:28:43 7f970a88fb00 InnoDB: Assertion failure in thread 140286693538560 in file page0zip.cc line 3035 InnoDB: Failing assertion: !memcmp(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE + page_zip->data, PAGE_HEADER - FIL_PAGE_TYPE) #5 0x00007f97085b53fa in abort () from /lib/x86_64-linux-gnu/libc.so.6 #6 0x00007f9701baa253 in page_zip_decompress (page_zip=0x7f96ff3c0eb0, page=0x7f96ff6f4000 "\f%", all=0) at /data/src/10.1/storage/innobase/page/page0zip.cc:3033 #7 0x00007f9701b945de in page_cur_insert_rec_zip (cursor=0x7f970a88ca68, index=0x7f96f24dc878, rec=0x7f96f240fa00 "\200", offsets=0x7f96f240fa38, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/page/page0cur.cc:1427 #8 0x00007f97019f88a3 in page_cur_tuple_insert (cursor=0x7f970a88ca68, tuple=0x7f96f24dc5f8, index=0x7f96f24dc878, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, n_ext=0, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/include/page0cur.ic:288 #9 0x00007f97019fc7e0 in btr_cur_optimistic_insert (flags=0, cursor=0x7f970a88ca60, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, entry=0x7f96f24dc5f8, rec=0x7f970a88ca38, big_rec=0x7f970a88ca28, n_ext=0, thr=0x7f96f2457b68, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/btr/btr0cur.cc:1478 #10 0x00007f9701be9a52 in row_ins_clust_index_entry_low (flags=0, mode=2, index=0x7f96f24dc878, n_uniq=1, entry=0x7f96f24dc5f8, n_ext=0, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:2518 #11 0x00007f9701beaac7 in row_ins_clust_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68, n_ext=0) at /data/src/10.1/storage/innobase/row/row0ins.cc:2929 #12 0x00007f9701beadd5 in row_ins_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3027 #13 0x00007f9701beb098 in row_ins_index_entry_step (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3104 #14 0x00007f9701beb3ab in row_ins (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3244 #15 0x00007f9701beb712 in row_ins_step (thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3369 #16 0x00007f9701c02b99 in row_insert_for_mysql (mysql_rec=0x7f96f2459688 "\242", prebuilt=0x7f96f2457078) at /data/src/10.1/storage/innobase/row/row0mysql.cc:1424 #17 0x00007f9701af9ad1 in ha_innodb::write_row (this=0x7f96f2499088, record=0x7f96f2459688 "\242") at /data/src/10.1/storage/innobase/handler/ha_innodb.cc:8257 #18 0x000055aea6b69831 in handler::ha_write_row (this=0x7f96f2499088, buf=0x7f96f2459688 "\242") at /data/src/10.1/sql/handler.cc:5945 #19 0x000055aea6902690 in write_record (thd=0x7f96fefc9070, table=0x7f96f24b9e70, info=0x7f970a88d910) at /data/src/10.1/sql/sql_insert.cc:1914 #20 0x000055aea6900244 in mysql_insert (thd=0x7f96fefc9070, table_list=0x7f96f24431a0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false) at /data/src/10.1/sql/sql_insert.cc:1039 #21 0x000055aea6924579 in mysql_execute_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:3926 #22 0x000055aea692ef2b in mysql_parse (thd=0x7f96fefc9070, rawbuf=0x7f96f2443088 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", length=50, parser_state=0x7f970a88e5e0) at /data/src/10.1/sql/sql_parse.cc:7344 #23 0x000055aea691de8a in dispatch_command (command=COM_QUERY, thd=0x7f96fefc9070, packet=0x7f9700f1c071 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", packet_length=50) at /data/src/10.1/sql/sql_parse.cc:1484 #24 0x000055aea691cc27 in do_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:1106 #25 0x000055aea6a54fa1 in do_handle_one_connection (thd_arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1349 #26 0x000055aea6a54d05 in handle_one_connection (arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1261 #27 0x000055aea6e0ccf2 in pfs_spawn_thread (arg=0x7f9700ee36f0) at /data/src/10.1/storage/perfschema/pfs.cc:1860 #28 0x00007f970a51e494 in start_thread (arg=0x7f970a88fb00) at pthread_create.c:333 #29 0x00007f970866993f in clone () from /lib/x86_64-linux-gnu/libc.so.6 {noformat} The test case is attached. If it doesn't fail right away, try running with {{--repeat=N}}. The test is still rather long, but reducing it further seems to decrease the probability of failure. Please note there is a server restart in the middle of the test, after initial creation/population of the table. I couldn't reproduce the failure without it.	{noformat:title=10.1 1ff65c271ff3edef5ccba72141047fda19170163} 2017-08-13 20:28:43 7f970a88fb00 InnoDB: Assertion failure in thread 140286693538560 in file page0zip.cc line 3035 InnoDB: Failing assertion: !memcmp(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE + page_zip->data, PAGE_HEADER - FIL_PAGE_TYPE) #5 0x00007f97085b53fa in abort () from /lib/x86_64-linux-gnu/libc.so.6 #6 0x00007f9701baa253 in page_zip_decompress (page_zip=0x7f96ff3c0eb0, page=0x7f96ff6f4000 "\f%", all=0) at /data/src/10.1/storage/innobase/page/page0zip.cc:3033 #7 0x00007f9701b945de in page_cur_insert_rec_zip (cursor=0x7f970a88ca68, index=0x7f96f24dc878, rec=0x7f96f240fa00 "\200", offsets=0x7f96f240fa38, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/page/page0cur.cc:1427 #8 0x00007f97019f88a3 in page_cur_tuple_insert (cursor=0x7f970a88ca68, tuple=0x7f96f24dc5f8, index=0x7f96f24dc878, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, n_ext=0, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/include/page0cur.ic:288 #9 0x00007f97019fc7e0 in btr_cur_optimistic_insert (flags=0, cursor=0x7f970a88ca60, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, entry=0x7f96f24dc5f8, rec=0x7f970a88ca38, big_rec=0x7f970a88ca28, n_ext=0, thr=0x7f96f2457b68, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/btr/btr0cur.cc:1478 #10 0x00007f9701be9a52 in row_ins_clust_index_entry_low (flags=0, mode=2, index=0x7f96f24dc878, n_uniq=1, entry=0x7f96f24dc5f8, n_ext=0, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:2518 #11 0x00007f9701beaac7 in row_ins_clust_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68, n_ext=0) at /data/src/10.1/storage/innobase/row/row0ins.cc:2929 #12 0x00007f9701beadd5 in row_ins_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3027 #13 0x00007f9701beb098 in row_ins_index_entry_step (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3104 #14 0x00007f9701beb3ab in row_ins (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3244 #15 0x00007f9701beb712 in row_ins_step (thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3369 #16 0x00007f9701c02b99 in row_insert_for_mysql (mysql_rec=0x7f96f2459688 "\242", prebuilt=0x7f96f2457078) at /data/src/10.1/storage/innobase/row/row0mysql.cc:1424 #17 0x00007f9701af9ad1 in ha_innodb::write_row (this=0x7f96f2499088, record=0x7f96f2459688 "\242") at /data/src/10.1/storage/innobase/handler/ha_innodb.cc:8257 #18 0x000055aea6b69831 in handler::ha_write_row (this=0x7f96f2499088, buf=0x7f96f2459688 "\242") at /data/src/10.1/sql/handler.cc:5945 #19 0x000055aea6902690 in write_record (thd=0x7f96fefc9070, table=0x7f96f24b9e70, info=0x7f970a88d910) at /data/src/10.1/sql/sql_insert.cc:1914 #20 0x000055aea6900244 in mysql_insert (thd=0x7f96fefc9070, table_list=0x7f96f24431a0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false) at /data/src/10.1/sql/sql_insert.cc:1039 #21 0x000055aea6924579 in mysql_execute_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:3926 #22 0x000055aea692ef2b in mysql_parse (thd=0x7f96fefc9070, rawbuf=0x7f96f2443088 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", length=50, parser_state=0x7f970a88e5e0) at /data/src/10.1/sql/sql_parse.cc:7344 #23 0x000055aea691de8a in dispatch_command (command=COM_QUERY, thd=0x7f96fefc9070, packet=0x7f9700f1c071 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", packet_length=50) at /data/src/10.1/sql/sql_parse.cc:1484 #24 0x000055aea691cc27 in do_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:1106 #25 0x000055aea6a54fa1 in do_handle_one_connection (thd_arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1349 #26 0x000055aea6a54d05 in handle_one_connection (arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1261 #27 0x000055aea6e0ccf2 in pfs_spawn_thread (arg=0x7f9700ee36f0) at /data/src/10.1/storage/perfschema/pfs.cc:1860 #28 0x00007f970a51e494 in start_thread (arg=0x7f970a88fb00) at pthread_create.c:333 #29 0x00007f970866993f in clone () from /lib/x86_64-linux-gnu/libc.so.6 {noformat} The test case is attached. If it doesn't fail right away, try running with {{--repeat=N}}. The test is still rather long, but reducing it further seems to decrease the probability of failure. Please note there is a server restart in the middle of the test, after initial creation/population of the table. I couldn't reproduce the failure without it. Also reproducible on 10.3.

Jan Lindström (Inactive) added a comment - 2017-08-14 04:31

Is that ENCRYPTED=YES necessary for crash ?

Jan Lindström (Inactive) added a comment - 2017-08-14 04:31 Is that ENCRYPTED=YES necessary for crash ?

Elena Stepanova added a comment - 2017-08-14 09:39

It appears that the table must be encrypted in order to hit the failure, at least I never got the failure otherwise. Whether it's encrypted because it has ENCRYPTED=YES or because the tables are set to be encrypted by default, wasn't important in the test initially, but on some reason the latest version of the test failed for me with the explicit clause and didn't fail otherwise, so I kept the clause.

Elena Stepanova added a comment - 2017-08-14 09:39 It appears that the table must be encrypted in order to hit the failure, at least I never got the failure otherwise. Whether it's encrypted because it has ENCRYPTED=YES or because the tables are set to be encrypted by default, wasn't important in the test initially, but on some reason the latest version of the test failed for me with the explicit clause and didn't fail otherwise, so I kept the clause.

Jan Lindström (Inactive) made changes - 2017-09-19 09:13

Status

Open [ 1 ]

In Progress [ 3 ]

Jan Lindström (Inactive) added a comment - 2017-09-19 10:23

https://github.com/MariaDB/server/commit/26301165ffa30cee5d237560d961834736d9b9ed

Jan Lindström (Inactive) added a comment - 2017-09-19 10:23 https://github.com/MariaDB/server/commit/26301165ffa30cee5d237560d961834736d9b9ed

Jan Lindström (Inactive) made changes - 2017-09-19 10:23

Assignee	Jan Lindström [ jplindst ]	Marko Mäkelä [ marko ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Marko Mäkelä added a comment - 2017-09-19 11:49

I think that this needs another round of review.
There are not many encryption tests for row_format=compressed. Perhaps there should be a test combination for encryption that would be used in some innodb_zip tests.

Marko Mäkelä added a comment - 2017-09-19 11:49 I think that this needs another round of review. There are not many encryption tests for row_format=compressed. Perhaps there should be a test combination for encryption that would be used in some innodb_zip tests.

Marko Mäkelä made changes - 2017-09-19 11:49

Assignee	Marko Mäkelä [ marko ]	Jan Lindström [ jplindst ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Jan Lindström (Inactive) added a comment - 2017-09-20 08:23

Following existing test also test row_format=compressed:

innodb-compressed-blob

innodb-first-page-read

innodb-force-corrupt

innodb-key-rotation-disable

innodb_onlinealter_encryption

innodb_page_encryption_key_change

innodb-page_encryption_log_encryption

innodb-page_encryption

innodb-redo-badkey

innodb-redo-nokeys

Using some of innodb_zip tests is possible but they seem to create quite a few pages i.e. would need to wait until key rotation really has time to encrypt something before table is gone.

Jan Lindström (Inactive) added a comment - 2017-09-20 08:23 Following existing test also test row_format=compressed: innodb-compressed-blob innodb-first-page-read innodb-force-corrupt innodb-key-rotation-disable innodb_onlinealter_encryption innodb_page_encryption_key_change innodb-page_encryption_log_encryption innodb-page_encryption innodb-redo-badkey innodb-redo-nokeys Using some of innodb_zip tests is possible but they seem to create quite a few pages i.e. would need to wait until key rotation really has time to encrypt something before table is gone.

Marko Mäkelä added a comment - 2017-09-20 09:26

At the very least, I would like to see a test that reproduces the problem when the fix is not present.

Marko Mäkelä added a comment - 2017-09-20 09:26 At the very least, I would like to see a test that reproduces the problem when the fix is not present.

Jan Lindström (Inactive) made changes - 2017-09-20 10:57

Status

Stalled [ 10000 ]

In Progress [ 3 ]

Jan Lindström (Inactive) added a comment - 2017-09-20 10:57

https://github.com/MariaDB/server/commit/9b998c18283c8acd4ed3d8e6df503fe861af04f0

Jan Lindström (Inactive) added a comment - 2017-09-20 10:57 https://github.com/MariaDB/server/commit/9b998c18283c8acd4ed3d8e6df503fe861af04f0

Jan Lindström (Inactive) made changes - 2017-09-20 10:57

Assignee	Jan Lindström [ jplindst ]	Marko Mäkelä [ marko ]
Status	In Progress [ 3 ]	In Review [ 10002 ]

Sergei Golubchik made changes - 2017-10-04 14:30

Sprint

10.2.10 [ 183 ]

Marko Mäkelä added a comment - 2017-10-06 08:50

I do not understand what the fix is supposed to achieve, apart from skipping the comparison that is failing.

A design constraint of ROW_FORMAT=COMPRESSED which I designed and implemented in the InnoDB Plugin for MySQL 5.1 is that the compressed and uncompressed index page headers must match at all times. That is, if one of the headers is updated, then the other one must be update as well, while holding X-latch or SX-latch on block->lock. Once the latch is released, the two copies must be equivalent. Something is apparently breaking this constraint, and we must find and fix the cause of this. I do not think that the check should be relaxed without fully understanding the problem.

I cannot repeat any problem with the included test case (the commit features an slightly extended version of t3.test), either in 10.1 or 10.2. I did verify that page_zip_decompress() is being successfully called after restart and that the data file is encrypted.

If I could repeat the problem, I could better suggest how to fix this.

Marko Mäkelä added a comment - 2017-10-06 08:50 I do not understand what the fix is supposed to achieve, apart from skipping the comparison that is failing. A design constraint of ROW_FORMAT=COMPRESSED which I designed and implemented in the InnoDB Plugin for MySQL 5.1 is that the compressed and uncompressed index page headers must match at all times. That is, if one of the headers is updated, then the other one must be update as well, while holding X-latch or SX-latch on block->lock. Once the latch is released, the two copies must be equivalent. Something is apparently breaking this constraint, and we must find and fix the cause of this. I do not think that the check should be relaxed without fully understanding the problem. I cannot repeat any problem with the included test case (the commit features an slightly extended version of t3.test ), either in 10.1 or 10.2. I did verify that page_zip_decompress() is being successfully called after restart and that the data file is encrypted. If I could repeat the problem, I could better suggest how to fix this.

Marko Mäkelä made changes - 2017-10-06 08:50

Assignee	Marko Mäkelä [ marko ]	Jan Lindström [ jplindst ]
Status	In Review [ 10002 ]	Stalled [ 10000 ]

Marko Mäkelä added a comment - 2017-10-06 09:41

If the mismatch is caused by encryption updating only one of block->frame or block->page.zip.data for the buf_block_t* block of a ROW_FORMAT=COMPRESSED table, then the root cause should be fixed by updating both copies at the same time.

Note that during buffer pool flushing, both the uncompressed and compressed page must exist in the buffer pool. This is because flushing is protected by buf_block_t::lock. At other times, the page might exist in compressed format only. If it contains changes that must be written to the *.ibd file, then the buf_page_t::state will be BUF_BLOCK_ZIP_DIRTY.

Marko Mäkelä added a comment - 2017-10-06 09:41 If the mismatch is caused by encryption updating only one of block->frame or block->page.zip.data for the buf_block_t* block of a ROW_FORMAT=COMPRESSED table, then the root cause should be fixed by updating both copies at the same time. Note that during buffer pool flushing, both the uncompressed and compressed page must exist in the buffer pool. This is because flushing is protected by buf_block_t::lock. At other times, the page might exist in compressed format only. If it contains changes that must be written to the *.ibd file, then the buf_page_t::state will be BUF_BLOCK_ZIP_DIRTY.

Marko Mäkelä added a comment - 2017-10-06 14:48

I found the problem by running the test on perro. On two subsequent runs, the problem occurred on page 4:10, giving some hope of determinism.
So, I decided to run it inside --manual-gdb. Until the shutdown, there was no problem.
Before the restart, I set a breakpoint:

break buf_page_get_gen

cond 1 space==4&&offset==10

run

finish

disable

display/x $1->page[24]@14

display/x $1->page.zip.data[24]@14

At this point, I noticed that both the uncompressed frame and the compressed page contain non-zero bytes. The mismatch that I had observed in the core dumps was that the block->page.zip.data contained zeroes while block->frame contained nonzero bytes.

So, I started to suspect that the problem is that block->page.zip.data is zeroed out, and not that block->frame is being written with nonzero bytes without updating block->page.zip.data accordingly.

I set a watchpoint:

p (unsigned*)&$1->page.zip.data[28]

watch *$2

continue

This watchpoint was triggered in a page flush:

#1  0x000055555616dda7 in buf_flush_update_zip_checksum (

    page=0x7fffed4f8000 "\f%", zip_size=8192, lsn=2195866)

    at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:760

#2  0x000055555616df0c in buf_flush_init_for_writing (

    page=0x7fffed4f4000 "\f%", page_zip_=0x7fffecfbfaf8, newest_lsn=2195866)

    at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:804

#3  0x000055555616e3e3 in buf_flush_write_block_low (bpage=0x7fffecfbfae8,

    flush_type=BUF_FLUSH_LIST, sync=false)

    at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:950

#4  0x000055555616ea0e in buf_flush_page (buf_pool=0x7ffff0fb0478,

    bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, sync=false)

    at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1130

#5  0x000055555616efe8 in buf_flush_try_neighbors (space=4, offset=10,

    flush_type=BUF_FLUSH_LIST, n_flushed=1, n_to_flush=100)

    at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1344

#6  0x000055555616f308 in buf_flush_page_and_try_neighbors (

    bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, n_to_flush=100,

    count=0x7fffe33feae8)

    at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1433

#7  0x000055555616fe48 in buf_do_flush_list_batch (buf_pool=0x7ffff0fb0478,

    min_n=100, lsn_limit=18446744073709551615)

buf_flush_update_zip_checksum() zeroes out the FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION field, which it perhaps should not do?

	ib_uint32_t	checksum = static_cast<ib_uint32_t>(

		page_zip_calc_checksum(

			page, zip_size,

			static_cast<srv_checksum_algorithm_t>(

				srv_checksum_algorithm)));

	mach_write_to_8(page + FIL_PAGE_LSN, lsn);

	memset(page + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8);

	mach_write_to_4(page + FIL_PAGE_SPACE_OR_CHKSUM, checksum);

Note that the page here is the compressed page frame (in this case, zip_size is half the srv_page_size aka innodb_page_size).

Without encryption, this field would always be zero in both the compressed and uncompressed page header.
Perhaps we should remove this memset() altogether. The field is excluded from the checksum calculation.
This memset() became redundant with my MySQL Bug #53306 fix which added memset() to fsp_init_file_page_low(). So, it is safe to remove the memset().

It looks like the memset() must be breaking SPATIAL INDEX in 10.2 for ROW_FORMAT=COMPRESSED, by zeroing out the SSN (split sequence number) field. In MySQL 5.7.5, two memset() were removed from buf0flu.cc in WL#6968 InnoDB GIS: R-tree index support.

The proper fix is to remove the 2 offending memset() from both InnoDB and XtraDB, rather than to shoot or cripple the messenger (the consistency check).

diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc

index 31feb322826..dacddfca385 100644

--- a/storage/innobase/buf/buf0flu.cc

+++ b/storage/innobase/buf/buf0flu.cc

@@ -715,7 +715,6 @@ buf_flush_update_zip_checksum(

 				srv_checksum_algorithm)));

 	mach_write_to_8(page + FIL_PAGE_LSN, lsn);

-	memset(page + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8);

 	mach_write_to_4(page + FIL_PAGE_SPACE_OR_CHKSUM, checksum);

@@ -894,8 +893,6 @@ buf_flush_write_block_low(

 				bpage->newest_modification);

 		ut_a(page_zip_verify_checksum(frame, zip_size));

-		memset(frame + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8);

 		break;

 	case BUF_BLOCK_FILE_PAGE:

 		frame = bpage->zip.data;

I believe that the reason why I was unable to repeat the bug on my laptop is that there was not enough concurrency. On perro, there are more processor cores and more true concurrency between the page flush and the SQL execution thread.

I tried to prove my claim that SPATIAL INDEX is broken on ROW_FORMAT=COMPRESSED tables, but we do not have the larger --suite=innodb_gis tests from 5.7 (until ~~MDEV-13626~~ addresses that), and a modification of innodb.instant_alter in 10.3 to use ROW_FORMAT=COMPRESSED did not trigger any assertion failure. I suppose that the SSN only matters in certain SPATIAL INDEX operations.

Marko Mäkelä added a comment - 2017-10-06 14:48 I found the problem by running the test on perro. On two subsequent runs, the problem occurred on page 4:10, giving some hope of determinism. So, I decided to run it inside --manual-gdb. Until the shutdown, there was no problem. Before the restart, I set a breakpoint: break buf_page_get_gen cond 1 space==4&&offset==10 run finish disable display/x $1->page[24]@14 display/x $1->page.zip.data[24]@14 At this point, I noticed that both the uncompressed frame and the compressed page contain non-zero bytes. The mismatch that I had observed in the core dumps was that the block->page.zip.data contained zeroes while block->frame contained nonzero bytes. So, I started to suspect that the problem is that block->page.zip.data is zeroed out, and not that block->frame is being written with nonzero bytes without updating block->page.zip.data accordingly. I set a watchpoint: p (unsigned*)&$1->page.zip.data[28] watch *$2 continue This watchpoint was triggered in a page flush: #1 0x000055555616dda7 in buf_flush_update_zip_checksum ( page=0x7fffed4f8000 "\f%", zip_size=8192, lsn=2195866) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:760 #2 0x000055555616df0c in buf_flush_init_for_writing ( page=0x7fffed4f4000 "\f%", page_zip_=0x7fffecfbfaf8, newest_lsn=2195866) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:804 #3 0x000055555616e3e3 in buf_flush_write_block_low (bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, sync=false) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:950 #4 0x000055555616ea0e in buf_flush_page (buf_pool=0x7ffff0fb0478, bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, sync=false) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1130 #5 0x000055555616efe8 in buf_flush_try_neighbors (space=4, offset=10, flush_type=BUF_FLUSH_LIST, n_flushed=1, n_to_flush=100) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1344 #6 0x000055555616f308 in buf_flush_page_and_try_neighbors ( bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, n_to_flush=100, count=0x7fffe33feae8) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1433 #7 0x000055555616fe48 in buf_do_flush_list_batch (buf_pool=0x7ffff0fb0478, min_n=100, lsn_limit=18446744073709551615) buf_flush_update_zip_checksum() zeroes out the FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION field, which it perhaps should not do? ib_uint32_t checksum = static_cast <ib_uint32_t>( page_zip_calc_checksum( page, zip_size, static_cast <srv_checksum_algorithm_t>( srv_checksum_algorithm))); mach_write_to_8(page + FIL_PAGE_LSN, lsn); memset (page + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8); mach_write_to_4(page + FIL_PAGE_SPACE_OR_CHKSUM, checksum); Note that the page here is the compressed page frame (in this case, zip_size is half the srv_page_size aka innodb_page_size). Without encryption, this field would always be zero in both the compressed and uncompressed page header. Perhaps we should remove this memset() altogether. The field is excluded from the checksum calculation. This memset() became redundant with my MySQL Bug #53306 fix which added memset() to fsp_init_file_page_low() . So, it is safe to remove the memset(). It looks like the memset() must be breaking SPATIAL INDEX in 10.2 for ROW_FORMAT=COMPRESSED, by zeroing out the SSN (split sequence number) field. In MySQL 5.7.5, two memset() were removed from buf0flu.cc in WL#6968 InnoDB GIS: R-tree index support . The proper fix is to remove the 2 offending memset() from both InnoDB and XtraDB, rather than to shoot or cripple the messenger (the consistency check). diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc index 31feb322826..dacddfca385 100644 --- a/storage/innobase/buf/buf0flu.cc +++ b/storage/innobase/buf/buf0flu.cc @@ -715,7 +715,6 @@ buf_flush_update_zip_checksum( srv_checksum_algorithm))); mach_write_to_8(page + FIL_PAGE_LSN, lsn); - memset(page + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8); mach_write_to_4(page + FIL_PAGE_SPACE_OR_CHKSUM, checksum); } @@ -894,8 +893,6 @@ buf_flush_write_block_low( bpage->newest_modification); ut_a(page_zip_verify_checksum(frame, zip_size)); - - memset(frame + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8); break; case BUF_BLOCK_FILE_PAGE: frame = bpage->zip.data; I believe that the reason why I was unable to repeat the bug on my laptop is that there was not enough concurrency. On perro, there are more processor cores and more true concurrency between the page flush and the SQL execution thread. I tried to prove my claim that SPATIAL INDEX is broken on ROW_FORMAT=COMPRESSED tables, but we do not have the larger --suite=innodb_gis tests from 5.7 (until MDEV-13626 addresses that), and a modification of innodb.instant_alter in 10.3 to use ROW_FORMAT=COMPRESSED did not trigger any assertion failure. I suppose that the SSN only matters in certain SPATIAL INDEX operations.

Marko Mäkelä made changes - 2017-10-06 14:49

Summary

InnoDB: Failing assertion: !memcmp(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE + page_zip->data, PAGE_HEADER - FIL_PAGE_TYPE)

buf_flush_update_zip_checksum() corrupts SPATIAL INDEX in ROW_FORMAT=COMPRESSED tables

Marko Mäkelä made changes - 2017-10-06 15:00

Assignee

Jan Lindström [ jplindst ]

Marko Mäkelä [ marko ]

Marko Mäkelä added a comment - 2017-10-06 15:01

I removed the offending memset().

Marko Mäkelä added a comment - 2017-10-06 15:01 I removed the offending memset() .

Marko Mäkelä made changes - 2017-10-06 15:01

issue.field.resolutiondate

2017-10-06 15:01:35.0

2017-10-06 15:01:35.216

Marko Mäkelä made changes - 2017-10-06 15:01

Fix Version/s		10.1.29 [ 22636 ]
Fix Version/s		10.2.10 [ 22615 ]
Fix Version/s	10.2 [ 14601 ]
Fix Version/s	10.1 [ 16100 ]
Resolution		Fixed [ 1 ]
Status	Stalled [ 10000 ]	Closed [ 6 ]

Marko Mäkelä made changes - 2018-01-04 16:19

Link

This issue relates to ~~MDEV-14227~~ [ ~~MDEV-14227~~ ]

Sergei Golubchik made changes - 2021-12-06 21:45

Workflow

MariaDB v3 [ 82059 ]

MariaDB v4 [ 152606 ]

People

Assignee:: Marko Mäkelä

Reporter:: Elena Stepanova

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Dates

Created:: 2017-08-13 17:32

Updated:: 2018-01-04 16:19

Resolved:: 2017-10-06 15:01

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server

Details

Description

Attachments

Attachments

Issue Links

Activity

People

Dates

Git Integration