Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.2(EOL), 10.3(EOL), 10.4(EOL)
Description
In debug mode running a test with --repeat results in an every second run failed like this:
innodb.innodb-change-buffer-recovery 'innodb' [ fail ]
|
Test ended at 2019-04-26 12:23:01
|
|
CURRENT_TEST: innodb.innodb-change-buffer-recovery
|
--- /home/kevg/work/mariadb/mysql-test/suite/innodb/r/innodb-change-buffer-recovery.result 2019-04-26 12:14:05.804197310 +0300
|
+++ /home/kevg/work/mariadb/mysql-test/suite/innodb/r/innodb-change-buffer-recovery.reject 2019-04-26 12:23:01.364749083 +0300
|
@@ -38,7 +38,7 @@
|
ERROR HY000: Lost connection to MySQL server during query
|
disconnect con1;
|
connection default;
|
-FOUND 1 /Wrote log record for ibuf update in place operation/ in my_restart.err
|
+FOUND 2 /Wrote log record for ibuf update in place operation/ in my_restart.err
|
CHECK TABLE t1;
|
Table Op Msg_type Msg_text
|
test.t1 check status OK
|
|
mysqltest: Result content mismatch
|
|
- saving '/home/kevg/work/debug/mysql-test/var/log/innodb.innodb-change-buffer-recovery-innodb/' to '/home/kevg/work/debug/mysql-test/var/log/innodb.innodb-change-buffer-recovery-innodb/'
|
In debug mode with -O2 it sometime crashes like this:
#6 0x00007f4016d55012 in __GI___assert_fail (assertion=0xfadc93 "field != 16 || val", file=0xf9048b "/work/mariadb/storage/innobase/include/page0page.ic", line=176, function=0xfadc3f "void page_header_set_field(ib_page_t *, page_zip_des_t *, ulint, ulint)") at assert.c:101
|
#7 0x0000000000a1bbcb in page_header_set_field (page=<optimized out>, page_zip=0x0, field=16, val=<optimized out>) at page0page.ic:176
|
#8 page_mem_free (page=<optimized out>, page_zip=0x0, rec=<optimized out>, index=<optimized out>, offsets=<optimized out>) at page0page.ic:1047
|
#9 page_cur_delete_rec (cursor=0x7f40106e8d01, index=<optimized out>, offsets=0x7ffe97abb990, mtr=0x7ffe97abc230) at page0cur.cc:2417
|
#10 0x0000000000b79fab in btr_cur_optimistic_delete_func (cursor=0x7ffe97abc760, flags=<optimized out>, mtr=0x7ffe97abc230) at btr0cur.cc:5163
|
#11 0x00000000009c4387 in ibuf_delete_rec (space=4, page_no=27, pcur=0x7ffe97abc760, search_tuple=0x26faf98, mtr=0x7ffe97abc230) at ibuf0ibuf.cc:4261
|
#12 0x00000000009c11de in ibuf_merge_or_delete_for_page (block=0x7f40102526a8, page_id=..., page_size=0x7f40102526b8, update_ibuf_bitmap=1) at ibuf0ibuf.cc:4645
|
#13 0x00000000009fcd6f in mlog_init_t::ibuf_merge (this=<optimized out>, mtr=...) at log0recv.cc:330
|
#14 0x00000000009f44c5 in recv_apply_hashed_log_recs (last_batch=<optimized out>) at log0recv.cc:2516
|
#15 0x0000000000ae989b in innobase_start_or_create_for_mysql () at srv0start.cc:2224
|
#16 0x000000000097d77f in innobase_init (p=<optimized out>) at ha_innodb.cc:4377
|
#17 0x00000000007af955 in ha_initialize_handlerton (plugin=0x2367ca8) at handler.cc:521
|
#18 0x000000000060cc59 in plugin_initialize (tmp_root=0x7ffe97ac26b8, plugin=0x2367ca8, argc=0x15d1f68 <remaining_argc>, argv=0x2225428, options_only=<optimized out>) at sql_plugin.cc:1416
|
#19 0x000000000060c77e in plugin_init (argc=0x15d1f68 <remaining_argc>, argv=0x2225428, flags=2) at sql_plugin.cc:1697
|
#20 0x00000000005525b5 in init_server_components () at mysqld.cc:5311
|
#21 mysqld_main (argc=<optimized out>, argv=0x2225428) at mysqld.cc:5907
|
Adding -mysqld=-debug=d,ib_log_checkpoint_avoid make debug version fail with the same backtrace.
Attachments
Issue Links
- relates to
-
MDEV-22497 [ERROR] InnoDB: Unable to purge a record
-
- Closed
-
- links to
The ‘extra’ call to ibuf_insert_low() is explained by the following:
err = ibuf_insert_low(BTR_MODIFY_PREV, op, no_counter,
entry, entry_size,
index, page_id, zip_size, thr);
err = ibuf_insert_low(BTR_MODIFY_TREE | BTR_LATCH_FOR_INSERT,
op, no_counter, entry, entry_size,
index, page_id, zip_size, thr);
}
That is, no duplicate record was inserted into the change buffer for either the DELETE (@r, 36) or the DELETE_MARK (NULL, 521). Only 14 records were present when the DELETE (NULL, 520) was wrongly buffered.
In ibuf_get_volume_buffered() we check also the preceding change buffer page. This worries me a little, because normally, a backward B-tree scan in InnoDB could lead to a deadlock of threads, and we normally save the cursor position and search for the preceding record to move to the preceding page. Only if the caller ibuf_insert_low() was invoked with mode = BTR_MODIFY_TREE | BTR_LATCH_FOR_INSERT would there likely be no concern, because we would be holding ibuf_mutex and ibuf_pessimistic_insert_mutex. However, that call has been there since the inclusion of InnoDB in MySQL. If I remember correctly the discussions between 2005 and 2007, going back one more page could lead to deadlocks. So, I guess that we are fine with regard to deadlocks here. (Also, I am not aware of any actual deadlocks in this code.)
The following hash function ought to work. False positives are not a correctness problem. For example, ('fu','bar'), (NULL,'fubar'),('fub','ar') would be treated as equal, and we would increment min_n_rec by at most 1 for those 3 distinct keys.
/** Determine if a change buffer record has been encountered already.
@param rec change buffer record in the MySQL 5.5 format
@param hash hash table of encountered records
@param size number of elements in hash
@retval true if a distinct record
@retval false if this may be duplicating an earlier record */
ulint size)
{
ut_ad(rec_get_n_fields_old(rec) > IBUF_REC_FIELD_USER);
*hash|= bitmask;
}
The comparison min_n_recs < 2 in ibuf_insert_low() actually does check for underflow, because the variable is a signed type (lint, not ulint).
With the corrected hash function, the estimate of min_n_recs looks conservative to me, even if there were multiple change buffer pages of buffered changes and the estimate did not consider them all. In the worst case, we missed many preceding pages filled with IBUF_DELETE_OP records that had been buffered for the user page earlier. But, it would have been an error to buffer those operations if the records did not exist in the first place. So, it is OK to ignore any preceding pages of buffered operations and start with a conservative estimate that there exist at least min_n_recs=0 records in the page.
Update: I tried 10.2 with the following patch:
diff --git a/storage/innobase/include/page0page.ic b/storage/innobase/include/page0page.ic
index 4efb92f7182..276f85df69a 100644
--- a/storage/innobase/include/page0page.ic
+++ b/storage/innobase/include/page0page.ic
@@ -173,9 +173,7 @@ page_header_set_field(
{
ut_ad(page);
ut_ad(field <= PAGE_N_RECS);
-#if 0 /* FIXME: MDEV-19344 hits this */
ut_ad(field != PAGE_N_RECS || val);
-#endif
ut_ad(field == PAGE_N_HEAP || val < srv_page_size);
ut_ad(field != PAGE_N_HEAP || (val & 0x7fff) < srv_page_size);
but the test innodb.innodb-change-buffer-recovery did not fail in 70 runs even without my fix. I will try harder, possibly with an older version, so that I can test my fix.