Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13512

buf_flush_update_zip_checksum() corrupts SPATIAL INDEX in ROW_FORMAT=COMPRESSED tables

Details

    • 10.2.10

    Description

      10.1 1ff65c271ff3edef5ccba72141047fda19170163

      2017-08-13 20:28:43 7f970a88fb00  InnoDB: Assertion failure in thread 140286693538560 in file page0zip.cc line 3035
      InnoDB: Failing assertion: !memcmp(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE + page_zip->data, PAGE_HEADER - FIL_PAGE_TYPE)
       
      #5  0x00007f97085b53fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
      #6  0x00007f9701baa253 in page_zip_decompress (page_zip=0x7f96ff3c0eb0, page=0x7f96ff6f4000 "\f%", all=0) at /data/src/10.1/storage/innobase/page/page0zip.cc:3033
      #7  0x00007f9701b945de in page_cur_insert_rec_zip (cursor=0x7f970a88ca68, index=0x7f96f24dc878, rec=0x7f96f240fa00 "\200", offsets=0x7f96f240fa38, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/page/page0cur.cc:1427
      #8  0x00007f97019f88a3 in page_cur_tuple_insert (cursor=0x7f970a88ca68, tuple=0x7f96f24dc5f8, index=0x7f96f24dc878, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, n_ext=0, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/include/page0cur.ic:288
      #9  0x00007f97019fc7e0 in btr_cur_optimistic_insert (flags=0, cursor=0x7f970a88ca60, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, entry=0x7f96f24dc5f8, rec=0x7f970a88ca38, big_rec=0x7f970a88ca28, n_ext=0, thr=0x7f96f2457b68, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/btr/btr0cur.cc:1478
      #10 0x00007f9701be9a52 in row_ins_clust_index_entry_low (flags=0, mode=2, index=0x7f96f24dc878, n_uniq=1, entry=0x7f96f24dc5f8, n_ext=0, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:2518
      #11 0x00007f9701beaac7 in row_ins_clust_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68, n_ext=0) at /data/src/10.1/storage/innobase/row/row0ins.cc:2929
      #12 0x00007f9701beadd5 in row_ins_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3027
      #13 0x00007f9701beb098 in row_ins_index_entry_step (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3104
      #14 0x00007f9701beb3ab in row_ins (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3244
      #15 0x00007f9701beb712 in row_ins_step (thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3369
      #16 0x00007f9701c02b99 in row_insert_for_mysql (mysql_rec=0x7f96f2459688 "\242", prebuilt=0x7f96f2457078) at /data/src/10.1/storage/innobase/row/row0mysql.cc:1424
      #17 0x00007f9701af9ad1 in ha_innodb::write_row (this=0x7f96f2499088, record=0x7f96f2459688 "\242") at /data/src/10.1/storage/innobase/handler/ha_innodb.cc:8257
      #18 0x000055aea6b69831 in handler::ha_write_row (this=0x7f96f2499088, buf=0x7f96f2459688 "\242") at /data/src/10.1/sql/handler.cc:5945
      #19 0x000055aea6902690 in write_record (thd=0x7f96fefc9070, table=0x7f96f24b9e70, info=0x7f970a88d910) at /data/src/10.1/sql/sql_insert.cc:1914
      #20 0x000055aea6900244 in mysql_insert (thd=0x7f96fefc9070, table_list=0x7f96f24431a0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false) at /data/src/10.1/sql/sql_insert.cc:1039
      #21 0x000055aea6924579 in mysql_execute_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:3926
      #22 0x000055aea692ef2b in mysql_parse (thd=0x7f96fefc9070, rawbuf=0x7f96f2443088 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", length=50, parser_state=0x7f970a88e5e0) at /data/src/10.1/sql/sql_parse.cc:7344
      #23 0x000055aea691de8a in dispatch_command (command=COM_QUERY, thd=0x7f96fefc9070, packet=0x7f9700f1c071 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", packet_length=50) at /data/src/10.1/sql/sql_parse.cc:1484
      #24 0x000055aea691cc27 in do_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:1106
      #25 0x000055aea6a54fa1 in do_handle_one_connection (thd_arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1349
      #26 0x000055aea6a54d05 in handle_one_connection (arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1261
      #27 0x000055aea6e0ccf2 in pfs_spawn_thread (arg=0x7f9700ee36f0) at /data/src/10.1/storage/perfschema/pfs.cc:1860
      #28 0x00007f970a51e494 in start_thread (arg=0x7f970a88fb00) at pthread_create.c:333
      #29 0x00007f970866993f in clone () from /lib/x86_64-linux-gnu/libc.so.6
      

      The test case is attached. If it doesn't fail right away, try running with --repeat=N.

      The test is still rather long, but reducing it further seems to decrease the probability of failure.
      Please note there is a server restart in the middle of the test, after initial creation/population of the table. I couldn't reproduce the failure without it.

      Also reproducible on 10.3.

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova created issue -
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            Description {noformat:title=10.1 1ff65c271ff3edef5ccba72141047fda19170163}
            2017-08-13 20:28:43 7f970a88fb00 InnoDB: Assertion failure in thread 140286693538560 in file page0zip.cc line 3035
            InnoDB: Failing assertion: !memcmp(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE + page_zip->data, PAGE_HEADER - FIL_PAGE_TYPE)

            #5 0x00007f97085b53fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
            #6 0x00007f9701baa253 in page_zip_decompress (page_zip=0x7f96ff3c0eb0, page=0x7f96ff6f4000 "\f%", all=0) at /data/src/10.1/storage/innobase/page/page0zip.cc:3033
            #7 0x00007f9701b945de in page_cur_insert_rec_zip (cursor=0x7f970a88ca68, index=0x7f96f24dc878, rec=0x7f96f240fa00 "\200", offsets=0x7f96f240fa38, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/page/page0cur.cc:1427
            #8 0x00007f97019f88a3 in page_cur_tuple_insert (cursor=0x7f970a88ca68, tuple=0x7f96f24dc5f8, index=0x7f96f24dc878, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, n_ext=0, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/include/page0cur.ic:288
            #9 0x00007f97019fc7e0 in btr_cur_optimistic_insert (flags=0, cursor=0x7f970a88ca60, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, entry=0x7f96f24dc5f8, rec=0x7f970a88ca38, big_rec=0x7f970a88ca28, n_ext=0, thr=0x7f96f2457b68, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/btr/btr0cur.cc:1478
            #10 0x00007f9701be9a52 in row_ins_clust_index_entry_low (flags=0, mode=2, index=0x7f96f24dc878, n_uniq=1, entry=0x7f96f24dc5f8, n_ext=0, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:2518
            #11 0x00007f9701beaac7 in row_ins_clust_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68, n_ext=0) at /data/src/10.1/storage/innobase/row/row0ins.cc:2929
            #12 0x00007f9701beadd5 in row_ins_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3027
            #13 0x00007f9701beb098 in row_ins_index_entry_step (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3104
            #14 0x00007f9701beb3ab in row_ins (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3244
            #15 0x00007f9701beb712 in row_ins_step (thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3369
            #16 0x00007f9701c02b99 in row_insert_for_mysql (mysql_rec=0x7f96f2459688 "\242", prebuilt=0x7f96f2457078) at /data/src/10.1/storage/innobase/row/row0mysql.cc:1424
            #17 0x00007f9701af9ad1 in ha_innodb::write_row (this=0x7f96f2499088, record=0x7f96f2459688 "\242") at /data/src/10.1/storage/innobase/handler/ha_innodb.cc:8257
            #18 0x000055aea6b69831 in handler::ha_write_row (this=0x7f96f2499088, buf=0x7f96f2459688 "\242") at /data/src/10.1/sql/handler.cc:5945
            #19 0x000055aea6902690 in write_record (thd=0x7f96fefc9070, table=0x7f96f24b9e70, info=0x7f970a88d910) at /data/src/10.1/sql/sql_insert.cc:1914
            #20 0x000055aea6900244 in mysql_insert (thd=0x7f96fefc9070, table_list=0x7f96f24431a0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false) at /data/src/10.1/sql/sql_insert.cc:1039
            #21 0x000055aea6924579 in mysql_execute_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:3926
            #22 0x000055aea692ef2b in mysql_parse (thd=0x7f96fefc9070, rawbuf=0x7f96f2443088 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", length=50, parser_state=0x7f970a88e5e0) at /data/src/10.1/sql/sql_parse.cc:7344
            #23 0x000055aea691de8a in dispatch_command (command=COM_QUERY, thd=0x7f96fefc9070, packet=0x7f9700f1c071 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", packet_length=50) at /data/src/10.1/sql/sql_parse.cc:1484
            #24 0x000055aea691cc27 in do_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:1106
            #25 0x000055aea6a54fa1 in do_handle_one_connection (thd_arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1349
            #26 0x000055aea6a54d05 in handle_one_connection (arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1261
            #27 0x000055aea6e0ccf2 in pfs_spawn_thread (arg=0x7f9700ee36f0) at /data/src/10.1/storage/perfschema/pfs.cc:1860
            #28 0x00007f970a51e494 in start_thread (arg=0x7f970a88fb00) at pthread_create.c:333
            #29 0x00007f970866993f in clone () from /lib/x86_64-linux-gnu/libc.so.6
            {noformat}

            The test case is attached. If it doesn't fail right away, try running with {{--repeat=N}}.

            The test is still rather long, but reducing it further seems to decrease the probability of failure.
            Please note there is a server restart in the middle of the test, after initial creation/population of the table. I couldn't reproduce the failure without it.
            {noformat:title=10.1 1ff65c271ff3edef5ccba72141047fda19170163}
            2017-08-13 20:28:43 7f970a88fb00 InnoDB: Assertion failure in thread 140286693538560 in file page0zip.cc line 3035
            InnoDB: Failing assertion: !memcmp(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE + page_zip->data, PAGE_HEADER - FIL_PAGE_TYPE)

            #5 0x00007f97085b53fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
            #6 0x00007f9701baa253 in page_zip_decompress (page_zip=0x7f96ff3c0eb0, page=0x7f96ff6f4000 "\f%", all=0) at /data/src/10.1/storage/innobase/page/page0zip.cc:3033
            #7 0x00007f9701b945de in page_cur_insert_rec_zip (cursor=0x7f970a88ca68, index=0x7f96f24dc878, rec=0x7f96f240fa00 "\200", offsets=0x7f96f240fa38, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/page/page0cur.cc:1427
            #8 0x00007f97019f88a3 in page_cur_tuple_insert (cursor=0x7f970a88ca68, tuple=0x7f96f24dc5f8, index=0x7f96f24dc878, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, n_ext=0, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/include/page0cur.ic:288
            #9 0x00007f97019fc7e0 in btr_cur_optimistic_insert (flags=0, cursor=0x7f970a88ca60, offsets=0x7f970a88ca20, heap=0x7f970a88ca30, entry=0x7f96f24dc5f8, rec=0x7f970a88ca38, big_rec=0x7f970a88ca28, n_ext=0, thr=0x7f96f2457b68, mtr=0x7f970a88cae0) at /data/src/10.1/storage/innobase/btr/btr0cur.cc:1478
            #10 0x00007f9701be9a52 in row_ins_clust_index_entry_low (flags=0, mode=2, index=0x7f96f24dc878, n_uniq=1, entry=0x7f96f24dc5f8, n_ext=0, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:2518
            #11 0x00007f9701beaac7 in row_ins_clust_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68, n_ext=0) at /data/src/10.1/storage/innobase/row/row0ins.cc:2929
            #12 0x00007f9701beadd5 in row_ins_index_entry (index=0x7f96f24dc878, entry=0x7f96f24dc5f8, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3027
            #13 0x00007f9701beb098 in row_ins_index_entry_step (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3104
            #14 0x00007f9701beb3ab in row_ins (node=0x7f96f2457780, thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3244
            #15 0x00007f9701beb712 in row_ins_step (thr=0x7f96f2457b68) at /data/src/10.1/storage/innobase/row/row0ins.cc:3369
            #16 0x00007f9701c02b99 in row_insert_for_mysql (mysql_rec=0x7f96f2459688 "\242", prebuilt=0x7f96f2457078) at /data/src/10.1/storage/innobase/row/row0mysql.cc:1424
            #17 0x00007f9701af9ad1 in ha_innodb::write_row (this=0x7f96f2499088, record=0x7f96f2459688 "\242") at /data/src/10.1/storage/innobase/handler/ha_innodb.cc:8257
            #18 0x000055aea6b69831 in handler::ha_write_row (this=0x7f96f2499088, buf=0x7f96f2459688 "\242") at /data/src/10.1/sql/handler.cc:5945
            #19 0x000055aea6902690 in write_record (thd=0x7f96fefc9070, table=0x7f96f24b9e70, info=0x7f970a88d910) at /data/src/10.1/sql/sql_insert.cc:1914
            #20 0x000055aea6900244 in mysql_insert (thd=0x7f96fefc9070, table_list=0x7f96f24431a0, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false) at /data/src/10.1/sql/sql_insert.cc:1039
            #21 0x000055aea6924579 in mysql_execute_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:3926
            #22 0x000055aea692ef2b in mysql_parse (thd=0x7f96fefc9070, rawbuf=0x7f96f2443088 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", length=50, parser_state=0x7f970a88e5e0) at /data/src/10.1/sql/sql_parse.cc:7344
            #23 0x000055aea691de8a in dispatch_command (command=COM_QUERY, thd=0x7f96fefc9070, packet=0x7f9700f1c071 "INSERT INTO t1 (pk, i2) VALUES (NULL, -1013317632)", packet_length=50) at /data/src/10.1/sql/sql_parse.cc:1484
            #24 0x000055aea691cc27 in do_command (thd=0x7f96fefc9070) at /data/src/10.1/sql/sql_parse.cc:1106
            #25 0x000055aea6a54fa1 in do_handle_one_connection (thd_arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1349
            #26 0x000055aea6a54d05 in handle_one_connection (arg=0x7f96fefc9070) at /data/src/10.1/sql/sql_connect.cc:1261
            #27 0x000055aea6e0ccf2 in pfs_spawn_thread (arg=0x7f9700ee36f0) at /data/src/10.1/storage/perfschema/pfs.cc:1860
            #28 0x00007f970a51e494 in start_thread (arg=0x7f970a88fb00) at pthread_create.c:333
            #29 0x00007f970866993f in clone () from /lib/x86_64-linux-gnu/libc.so.6
            {noformat}

            The test case is attached. If it doesn't fail right away, try running with {{--repeat=N}}.

            The test is still rather long, but reducing it further seems to decrease the probability of failure.
            Please note there is a server restart in the middle of the test, after initial creation/population of the table. I couldn't reproduce the failure without it.

            Also reproducible on 10.3.

            Is that ENCRYPTED=YES necessary for crash ?

            jplindst Jan Lindström (Inactive) added a comment - Is that ENCRYPTED=YES necessary for crash ?

            It appears that the table must be encrypted in order to hit the failure, at least I never got the failure otherwise. Whether it's encrypted because it has ENCRYPTED=YES or because the tables are set to be encrypted by default, wasn't important in the test initially, but on some reason the latest version of the test failed for me with the explicit clause and didn't fail otherwise, so I kept the clause.

            elenst Elena Stepanova added a comment - It appears that the table must be encrypted in order to hit the failure, at least I never got the failure otherwise. Whether it's encrypted because it has ENCRYPTED=YES or because the tables are set to be encrypted by default, wasn't important in the test initially, but on some reason the latest version of the test failed for me with the explicit clause and didn't fail otherwise, so I kept the clause.
            jplindst Jan Lindström (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            jplindst Jan Lindström (Inactive) added a comment - https://github.com/MariaDB/server/commit/26301165ffa30cee5d237560d961834736d9b9ed
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Jan Lindström [ jplindst ] Marko Mäkelä [ marko ]
            Status In Progress [ 3 ] In Review [ 10002 ]

            I think that this needs another round of review.
            There are not many encryption tests for row_format=compressed. Perhaps there should be a test combination for encryption that would be used in some innodb_zip tests.

            marko Marko Mäkelä added a comment - I think that this needs another round of review. There are not many encryption tests for row_format=compressed. Perhaps there should be a test combination for encryption that would be used in some innodb_zip tests.
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Jan Lindström [ jplindst ]
            Status In Review [ 10002 ] Stalled [ 10000 ]

            Following existing test also test row_format=compressed:

            innodb-compressed-blob
            innodb-first-page-read
            innodb-force-corrupt
            innodb-key-rotation-disable
            innodb_onlinealter_encryption
            innodb_page_encryption_key_change
            innodb-page_encryption_log_encryption
            innodb-page_encryption
            innodb-redo-badkey
            innodb-redo-nokeys
            

            Using some of innodb_zip tests is possible but they seem to create quite a few pages i.e. would need to wait until key rotation really has time to encrypt something before table is gone.

            jplindst Jan Lindström (Inactive) added a comment - Following existing test also test row_format=compressed: innodb-compressed-blob innodb-first-page-read innodb-force-corrupt innodb-key-rotation-disable innodb_onlinealter_encryption innodb_page_encryption_key_change innodb-page_encryption_log_encryption innodb-page_encryption innodb-redo-badkey innodb-redo-nokeys Using some of innodb_zip tests is possible but they seem to create quite a few pages i.e. would need to wait until key rotation really has time to encrypt something before table is gone.

            At the very least, I would like to see a test that reproduces the problem when the fix is not present.

            marko Marko Mäkelä added a comment - At the very least, I would like to see a test that reproduces the problem when the fix is not present.
            jplindst Jan Lindström (Inactive) made changes -
            Status Stalled [ 10000 ] In Progress [ 3 ]
            jplindst Jan Lindström (Inactive) added a comment - https://github.com/MariaDB/server/commit/9b998c18283c8acd4ed3d8e6df503fe861af04f0
            jplindst Jan Lindström (Inactive) made changes -
            Assignee Jan Lindström [ jplindst ] Marko Mäkelä [ marko ]
            Status In Progress [ 3 ] In Review [ 10002 ]
            serg Sergei Golubchik made changes -
            Sprint 10.2.10 [ 183 ]

            I do not understand what the fix is supposed to achieve, apart from skipping the comparison that is failing.

            A design constraint of ROW_FORMAT=COMPRESSED which I designed and implemented in the InnoDB Plugin for MySQL 5.1 is that the compressed and uncompressed index page headers must match at all times. That is, if one of the headers is updated, then the other one must be update as well, while holding X-latch or SX-latch on block->lock. Once the latch is released, the two copies must be equivalent. Something is apparently breaking this constraint, and we must find and fix the cause of this. I do not think that the check should be relaxed without fully understanding the problem.

            I cannot repeat any problem with the included test case (the commit features an slightly extended version of t3.test), either in 10.1 or 10.2. I did verify that page_zip_decompress() is being successfully called after restart and that the data file is encrypted.

            If I could repeat the problem, I could better suggest how to fix this.

            marko Marko Mäkelä added a comment - I do not understand what the fix is supposed to achieve, apart from skipping the comparison that is failing. A design constraint of ROW_FORMAT=COMPRESSED which I designed and implemented in the InnoDB Plugin for MySQL 5.1 is that the compressed and uncompressed index page headers must match at all times. That is, if one of the headers is updated, then the other one must be update as well, while holding X-latch or SX-latch on block->lock. Once the latch is released, the two copies must be equivalent. Something is apparently breaking this constraint, and we must find and fix the cause of this. I do not think that the check should be relaxed without fully understanding the problem. I cannot repeat any problem with the included test case (the commit features an slightly extended version of t3.test ), either in 10.1 or 10.2. I did verify that page_zip_decompress() is being successfully called after restart and that the data file is encrypted. If I could repeat the problem, I could better suggest how to fix this.
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Jan Lindström [ jplindst ]
            Status In Review [ 10002 ] Stalled [ 10000 ]

            If the mismatch is caused by encryption updating only one of block->frame or block->page.zip.data for the buf_block_t* block of a ROW_FORMAT=COMPRESSED table, then the root cause should be fixed by updating both copies at the same time.

            Note that during buffer pool flushing, both the uncompressed and compressed page must exist in the buffer pool. This is because flushing is protected by buf_block_t::lock. At other times, the page might exist in compressed format only. If it contains changes that must be written to the *.ibd file, then the buf_page_t::state will be BUF_BLOCK_ZIP_DIRTY.

            marko Marko Mäkelä added a comment - If the mismatch is caused by encryption updating only one of block->frame or block->page.zip.data for the buf_block_t* block of a ROW_FORMAT=COMPRESSED table, then the root cause should be fixed by updating both copies at the same time. Note that during buffer pool flushing, both the uncompressed and compressed page must exist in the buffer pool. This is because flushing is protected by buf_block_t::lock. At other times, the page might exist in compressed format only. If it contains changes that must be written to the *.ibd file, then the buf_page_t::state will be BUF_BLOCK_ZIP_DIRTY.

            I found the problem by running the test on perro. On two subsequent runs, the problem occurred on page 4:10, giving some hope of determinism.
            So, I decided to run it inside --manual-gdb. Until the shutdown, there was no problem.
            Before the restart, I set a breakpoint:

            break buf_page_get_gen
            cond 1 space==4&&offset==10
            run
            finish
            disable
            display/x $1->page[24]@14
            display/x $1->page.zip.data[24]@14
            

            At this point, I noticed that both the uncompressed frame and the compressed page contain non-zero bytes. The mismatch that I had observed in the core dumps was that the block->page.zip.data contained zeroes while block->frame contained nonzero bytes.

            So, I started to suspect that the problem is that block->page.zip.data is zeroed out, and not that block->frame is being written with nonzero bytes without updating block->page.zip.data accordingly.

            I set a watchpoint:

            p (unsigned*)&$1->page.zip.data[28]
            watch *$2
            continue
            

            This watchpoint was triggered in a page flush:

            #1  0x000055555616dda7 in buf_flush_update_zip_checksum (
                page=0x7fffed4f8000 "\f%", zip_size=8192, lsn=2195866)
                at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:760
            #2  0x000055555616df0c in buf_flush_init_for_writing (
                page=0x7fffed4f4000 "\f%", page_zip_=0x7fffecfbfaf8, newest_lsn=2195866)
                at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:804
            #3  0x000055555616e3e3 in buf_flush_write_block_low (bpage=0x7fffecfbfae8, 
                flush_type=BUF_FLUSH_LIST, sync=false)
                at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:950
            #4  0x000055555616ea0e in buf_flush_page (buf_pool=0x7ffff0fb0478, 
                bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, sync=false)
                at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1130
            #5  0x000055555616efe8 in buf_flush_try_neighbors (space=4, offset=10, 
                flush_type=BUF_FLUSH_LIST, n_flushed=1, n_to_flush=100)
                at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1344
            #6  0x000055555616f308 in buf_flush_page_and_try_neighbors (
                bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, n_to_flush=100, 
                count=0x7fffe33feae8)
                at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1433
            #7  0x000055555616fe48 in buf_do_flush_list_batch (buf_pool=0x7ffff0fb0478, 
                min_n=100, lsn_limit=18446744073709551615)
            

            buf_flush_update_zip_checksum() zeroes out the FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION field, which it perhaps should not do?

            	ib_uint32_t	checksum = static_cast<ib_uint32_t>(
            		page_zip_calc_checksum(
            			page, zip_size,
            			static_cast<srv_checksum_algorithm_t>(
            				srv_checksum_algorithm)));
             
            	mach_write_to_8(page + FIL_PAGE_LSN, lsn);
            	memset(page + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8);
            	mach_write_to_4(page + FIL_PAGE_SPACE_OR_CHKSUM, checksum);
            

            Note that the page here is the compressed page frame (in this case, zip_size is half the srv_page_size aka innodb_page_size).

            Without encryption, this field would always be zero in both the compressed and uncompressed page header.
            Perhaps we should remove this memset() altogether. The field is excluded from the checksum calculation.
            This memset() became redundant with my MySQL Bug #53306 fix which added memset() to fsp_init_file_page_low(). So, it is safe to remove the memset().

            It looks like the memset() must be breaking SPATIAL INDEX in 10.2 for ROW_FORMAT=COMPRESSED, by zeroing out the SSN (split sequence number) field. In MySQL 5.7.5, two memset() were removed from buf0flu.cc in WL#6968 InnoDB GIS: R-tree index support.

            The proper fix is to remove the 2 offending memset() from both InnoDB and XtraDB, rather than to shoot or cripple the messenger (the consistency check).

            diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc
            index 31feb322826..dacddfca385 100644
            --- a/storage/innobase/buf/buf0flu.cc
            +++ b/storage/innobase/buf/buf0flu.cc
            @@ -715,7 +715,6 @@ buf_flush_update_zip_checksum(
             				srv_checksum_algorithm)));
             
             	mach_write_to_8(page + FIL_PAGE_LSN, lsn);
            -	memset(page + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8);
             	mach_write_to_4(page + FIL_PAGE_SPACE_OR_CHKSUM, checksum);
             }
             
            @@ -894,8 +893,6 @@ buf_flush_write_block_low(
             				bpage->newest_modification);
             
             		ut_a(page_zip_verify_checksum(frame, zip_size));
            -
            -		memset(frame + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8);
             		break;
             	case BUF_BLOCK_FILE_PAGE:
             		frame = bpage->zip.data;
            

            I believe that the reason why I was unable to repeat the bug on my laptop is that there was not enough concurrency. On perro, there are more processor cores and more true concurrency between the page flush and the SQL execution thread.

            I tried to prove my claim that SPATIAL INDEX is broken on ROW_FORMAT=COMPRESSED tables, but we do not have the larger --suite=innodb_gis tests from 5.7 (until MDEV-13626 addresses that), and a modification of innodb.instant_alter in 10.3 to use ROW_FORMAT=COMPRESSED did not trigger any assertion failure. I suppose that the SSN only matters in certain SPATIAL INDEX operations.

            marko Marko Mäkelä added a comment - I found the problem by running the test on perro. On two subsequent runs, the problem occurred on page 4:10, giving some hope of determinism. So, I decided to run it inside --manual-gdb. Until the shutdown, there was no problem. Before the restart, I set a breakpoint: break buf_page_get_gen cond 1 space==4&&offset==10 run finish disable display/x $1->page[24]@14 display/x $1->page.zip.data[24]@14 At this point, I noticed that both the uncompressed frame and the compressed page contain non-zero bytes. The mismatch that I had observed in the core dumps was that the block->page.zip.data contained zeroes while block->frame contained nonzero bytes. So, I started to suspect that the problem is that block->page.zip.data is zeroed out, and not that block->frame is being written with nonzero bytes without updating block->page.zip.data accordingly. I set a watchpoint: p (unsigned*)&$1->page.zip.data[28] watch *$2 continue This watchpoint was triggered in a page flush: #1 0x000055555616dda7 in buf_flush_update_zip_checksum ( page=0x7fffed4f8000 "\f%", zip_size=8192, lsn=2195866) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:760 #2 0x000055555616df0c in buf_flush_init_for_writing ( page=0x7fffed4f4000 "\f%", page_zip_=0x7fffecfbfaf8, newest_lsn=2195866) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:804 #3 0x000055555616e3e3 in buf_flush_write_block_low (bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, sync=false) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:950 #4 0x000055555616ea0e in buf_flush_page (buf_pool=0x7ffff0fb0478, bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, sync=false) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1130 #5 0x000055555616efe8 in buf_flush_try_neighbors (space=4, offset=10, flush_type=BUF_FLUSH_LIST, n_flushed=1, n_to_flush=100) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1344 #6 0x000055555616f308 in buf_flush_page_and_try_neighbors ( bpage=0x7fffecfbfae8, flush_type=BUF_FLUSH_LIST, n_to_flush=100, count=0x7fffe33feae8) at /home/mariadb/git/10.1-mdev13512/storage/xtradb/buf/buf0flu.cc:1433 #7 0x000055555616fe48 in buf_do_flush_list_batch (buf_pool=0x7ffff0fb0478, min_n=100, lsn_limit=18446744073709551615) buf_flush_update_zip_checksum() zeroes out the FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION field, which it perhaps should not do? ib_uint32_t checksum = static_cast <ib_uint32_t>( page_zip_calc_checksum( page, zip_size, static_cast <srv_checksum_algorithm_t>( srv_checksum_algorithm)));   mach_write_to_8(page + FIL_PAGE_LSN, lsn); memset (page + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8); mach_write_to_4(page + FIL_PAGE_SPACE_OR_CHKSUM, checksum); Note that the page here is the compressed page frame (in this case, zip_size is half the srv_page_size aka innodb_page_size). Without encryption, this field would always be zero in both the compressed and uncompressed page header. Perhaps we should remove this memset() altogether. The field is excluded from the checksum calculation. This memset() became redundant with my MySQL Bug #53306 fix which added memset() to fsp_init_file_page_low() . So, it is safe to remove the memset(). It looks like the memset() must be breaking SPATIAL INDEX in 10.2 for ROW_FORMAT=COMPRESSED, by zeroing out the SSN (split sequence number) field. In MySQL 5.7.5, two memset() were removed from buf0flu.cc in WL#6968 InnoDB GIS: R-tree index support . The proper fix is to remove the 2 offending memset() from both InnoDB and XtraDB, rather than to shoot or cripple the messenger (the consistency check). diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc index 31feb322826..dacddfca385 100644 --- a/storage/innobase/buf/buf0flu.cc +++ b/storage/innobase/buf/buf0flu.cc @@ -715,7 +715,6 @@ buf_flush_update_zip_checksum( srv_checksum_algorithm))); mach_write_to_8(page + FIL_PAGE_LSN, lsn); - memset(page + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8); mach_write_to_4(page + FIL_PAGE_SPACE_OR_CHKSUM, checksum); } @@ -894,8 +893,6 @@ buf_flush_write_block_low( bpage->newest_modification); ut_a(page_zip_verify_checksum(frame, zip_size)); - - memset(frame + FIL_PAGE_FILE_FLUSH_LSN_OR_KEY_VERSION, 0, 8); break; case BUF_BLOCK_FILE_PAGE: frame = bpage->zip.data; I believe that the reason why I was unable to repeat the bug on my laptop is that there was not enough concurrency. On perro, there are more processor cores and more true concurrency between the page flush and the SQL execution thread. I tried to prove my claim that SPATIAL INDEX is broken on ROW_FORMAT=COMPRESSED tables, but we do not have the larger --suite=innodb_gis tests from 5.7 (until MDEV-13626 addresses that), and a modification of innodb.instant_alter in 10.3 to use ROW_FORMAT=COMPRESSED did not trigger any assertion failure. I suppose that the SSN only matters in certain SPATIAL INDEX operations.
            marko Marko Mäkelä made changes -
            Summary InnoDB: Failing assertion: !memcmp(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE + page_zip->data, PAGE_HEADER - FIL_PAGE_TYPE) buf_flush_update_zip_checksum() corrupts SPATIAL INDEX in ROW_FORMAT=COMPRESSED tables
            marko Marko Mäkelä made changes -
            Assignee Jan Lindström [ jplindst ] Marko Mäkelä [ marko ]
            marko Marko Mäkelä added a comment - I removed the offending memset() .
            marko Marko Mäkelä made changes -
            issue.field.resolutiondate 2017-10-06 15:01:35.0 2017-10-06 15:01:35.216
            marko Marko Mäkelä made changes -
            Fix Version/s 10.1.29 [ 22636 ]
            Fix Version/s 10.2.10 [ 22615 ]
            Fix Version/s 10.2 [ 14601 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Fixed [ 1 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            marko Marko Mäkelä made changes -
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 82059 ] MariaDB v4 [ 152606 ]

            People

              marko Marko Mäkelä
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.