Details
-
Bug
-
Status: Closed (View Workflow)
-
Major
-
Resolution: Fixed
-
10.5.4
Description
origin/bb-10.5-release 8394979aae274a19689102420ebb86580e26f92e 2020-06-22T12:10:06+02:00
|
compiled without debug.
|
|
|
The RQG log shows
|
In short:
|
The connection Thread1 hits the problem, reports it and exits.
|
Than the connection Thread2 hits the problem too, reports it and exits. This might be important for understanding the mysql.err better.
|
The main process in RQG becomes aware of the issue and starts to kill processes for aborting the run.
|
In detail
|
# 2020-06-23T12:12:52 [17433] ERROR: The query 'CHECK TABLE t1 /* E_R Thread1 QNO 2363 CON_ID 13 */ ' passed but has a result set line
|
# 2020-06-23T12:12:52 [17433] ERROR: ->test.t1 check Warning InnoDB: The B-tree of index GEN_CLUST_INDEX is corrupted.<-.
|
# 2020-06-23T12:12:52 [17433] ERROR: Will set status STATUS_DATABASE_CORRUPTION.
|
# 2020-06-23T12:12:52 [17433] ERROR: The query 'CHECK TABLE t1 /* E_R Thread1 QNO 2363 CON_ID 13 */ ' passed but has a result set line
|
# 2020-06-23T12:12:52 [17433] ERROR: ->test.t1 check Warning InnoDB: Index 'idx' contains 5 entries, should be 18446744073709551615.<-.
|
# 2020-06-23T12:12:52 [17433] ERROR: Will set status STATUS_DATABASE_CORRUPTION.
|
# 2020-06-23T12:12:52 [17433] Thread1 in Mixer : Critical failure STATUS_DATABASE_CORRUPTION (106), Error 0 reported at dsn dbi:mysql:host=127.0.0.1:port=27840:user=root:database=test:mysql_local_infile=1
|
# 2020-06-23T12:12:52 [17433] GenTest: Server crash or critical failure (STATUS_DATABASE_CORRUPTION) was reported.
|
# 2020-06-23T12:12:52 [17433] The child process for Thread1 will be stopped.
|
# 2020-06-23T12:12:52 [17433] GenTest: child 17433 is being stopped with status STATUS_DATABASE_CORRUPTION
|
# 2020-06-23T12:12:52 [17436] ERROR: The query 'CHECK TABLE t1 /* E_R Thread2 QNO 2444 CON_ID 12 */ ' passed but has a result set line
|
# 2020-06-23T12:12:52 [17436] ERROR: ->test.t1 check Warning InnoDB: The B-tree of index GEN_CLUST_INDEX is corrupted.<-.
|
# 2020-06-23T12:12:52 [17436] ERROR: Will set status STATUS_DATABASE_CORRUPTION.
|
# 2020-06-23T12:12:52 [17436] ERROR: The query 'CHECK TABLE t1 /* E_R Thread2 QNO 2444 CON_ID 12 */ ' passed but has a result set line
|
# 2020-06-23T12:12:52 [17436] ERROR: ->test.t1 check Warning InnoDB: Index idx is marked as corrupted<-.
|
# 2020-06-23T12:12:52 [17436] ERROR: Will set status STATUS_DATABASE_CORRUPTION.
|
# 2020-06-23T12:12:52 [17436] Thread2 in Mixer : Critical failure STATUS_DATABASE_CORRUPTION (106), Error 0 reported at dsn dbi:mysql:host=127.0.0.1:port=27840:user=root:database=test:mysql_local_infile=1
|
# 2020-06-23T12:12:52 [17436] GenTest: Server crash or critical failure (STATUS_DATABASE_CORRUPTION) was reported.
|
# 2020-06-23T12:12:52 [17436] The child process for Thread2 will be stopped.
|
# 2020-06-23T12:12:52 [17436] GenTest: child 17436 is being stopped with status STATUS_DATABASE_CORRUPTION
|
# 2020-06-23T12:12:53 [10133] Process with pid 17433 for Thread1 ended with status STATUS_DATABASE_CORRUPTION
|
# 2020-06-23T12:12:53 [10133] Killing (TERM) remaining worker process with pid 17436...
|
# 2020-06-23T12:12:53 [10133] Process with pid 17436 for Thread2 ended with status STATUS_DATABASE_CORRUPTION
|
# 2020-06-23T12:12:55 [17432] DEBUG: GenTest::ReporterManager::monitor: Will return the (maximum) status 0
|
# 2020-06-23T12:13:05 [17432] DEBUG: GenTest::ReporterManager::monitor: Will return the (maximum) status 0
|
# 2020-06-23T12:13:09 [10133] Killing periodic reporting process with pid 17432...
|
|
mysql.err shows
|
[rr 11138 509260]2020-06-23 12:12:51 12 [Note] InnoDB: Online DDL : Applying log to index
|
[rr 11138 510517]2020-06-23 12:12:52 13 [Note] InnoDB: Online DDL : Start reading clustered index of the table and create temporary files
|
[rr 11138 510565]2020-06-23 12:12:52 13 [Note] InnoDB: Online DDL : End of reading clustered index of the table and create temporary files
|
[rr 11138 510569]2020-06-23 12:12:52 13 [Note] InnoDB: Online DDL : Applying log to index
|
[rr 11138 510869]2020-06-23 12:12:52 13 [Warning] InnoDB: Wrong page type 17855
|
[rr 11138 510873]2020-06-23 12:12:52 13 [ERROR] InnoDB: Apparent corruption in space 30 page 3 of index `GEN_CLUST_INDEX` of table `test`.`t1`
|
[rr 11138 510877]2020-06-23 12:12:52 13 [ERROR] InnoDB: In page 3 of index `GEN_CLUST_INDEX` of table `test`.`t1`
|
[rr 11138 510885]2020-06-23 12:12:52 13 [ERROR] InnoDB: Flagged corruption of `idx` in table `test`.`t1` in CHECK TABLE; Wrong count
|
[rr 11138 511000]2020-06-23 12:12:52 13 [Warning] Aborted connection 13 to db: 'test' user: 'root' host: 'localhost' (Got an error reading communication packets)
|
[rr 11138 511004]2020-06-23 12:12:52 13 [Warning] Aborted connection 13 to db: 'test' user: 'root' host: 'localhost' (This connection closed normally)
|
[rr 11138 511021]2020-06-23 12:12:52 12 [Warning] InnoDB: Wrong page type 17855
|
[rr 11138 511025]2020-06-23 12:12:52 12 [ERROR] InnoDB: Apparent corruption in space 30 page 3 of index `GEN_CLUST_INDEX` of table `test`.`t1`
|
[rr 11138 511029]2020-06-23 12:12:52 12 [ERROR] InnoDB: In page 3 of index `GEN_CLUST_INDEX` of table `test`.`t1`
|
|
RQG
|
git clone https://github.com/mleich1/rqg --branch experimental RQG
|
origin/experimental a4fb38dacc00ea901dbad55d72ec31d4514b8830 2020-06-19T17:19:27+02:00
|
|
perl rqg.pl \
|
--duration=300 \
|
--queries=10000000 \
|
--threads=2 \
|
--no_mask \
|
--seed=random \
|
--rpl_mode=none \
|
--gendata=conf/runtime/alter_online.zz \
|
--engine=InnoDB \
|
--mysqld=--loose-debug_assert_on_not_freed_memory=0 \
|
--mysqld=--innodb-lock-wait-timeout=50 \
|
--mysqld=--loose_innodb_lock_schedule_algorithm=fcfs \
|
--mysqld=--connect_timeout=60 \
|
--mysqld=--net_read_timeout=30 \
|
--mysqld=--innodb_stats_persistent=on \
|
--mysqld=--innodb_page_size=32K \
|
--mysqld=--log_bin_trust_function_creators=1 \
|
--mysqld=--net_write_timeout=60 \
|
--mysqld=--loose-max-statement-time=30 \
|
--mysqld=--loose-idle_write_transaction_timeout=0 \
|
--mysqld=--log-output=none \
|
--mysqld=--lock-wait-timeout=86400 \
|
--mysqld=--log-bin \
|
--mysqld=--interactive_timeout=28800 \
|
--mysqld=--loose-idle_readonly_transaction_timeout=0 \
|
--mysqld=--loose-innodb_fatal_semaphore_wait_threshold=300 \
|
--mysqld=--innodb-buffer-pool-size=256M \
|
--mysqld=--slave_net_timeout=60 \
|
--mysqld=--loose-table_lock_wait_timeout=50 \
|
--mysqld=--wait_timeout=28800 \
|
--mysqld=--loose-idle_transaction_timeout=0 \
|
--reporters=Backtrace,Deadlock1,ErrorLog \
|
--validators=None \
|
--grammar=p00025.yy
|
--workdir=<local settings> \
|
--vardir=<local settings> \
|
--mtr-build-thread=<local settings> \
|
--basedir1=<local settings>
|
|
Replaying the problem with rr enabled, two concurrent sessions and the uploaded simplified grammar
|
- a non debug build is rather fast Just a few RQG runs are required.
|
- a debug build is extreme slow 500 runs had already a "pass" when the first replay happened.
|
|
This seems to explain why I have not observed that problem yesterday even though making > 10000 RQG runs.
|
debug builds, ~ 30% of tests do not use a grammar which is capable to reveal the problem
|
|
Features which are sometimes critical but not used in the test at all
|
- Recovery after crash
|
- FOREIGN KEYs, TRUNCATE, FLUSH, FULLTEXT
|
|
No replay of the problem on
|
origin/10.4 b80b52394d41a4c334642ae8b3af16f76f6fac57 2020-06-22T13:25:25+03:00
|
build without debug with 2500 RQG runs.
|
So it seems to be a 10.5 problem only.
|
|
Attachments
Issue Links
- is caused by
-
MDEV-22867 Assertion `instant.n_core_fields == n_core_fields' failed in dict_index_t::instant_add_field
-
- Closed
-
The problem occurs at the following code:
10.5 8394979aae274a19689102420ebb86580e26f92e
#4 0x000055d8e9197070 in page_create_empty (block=0x3bd456b4faa8,
index=0x7f54b81621b8, mtr=0x2035af685f0)
at /home/mleich/bb-10.5-release_B/storage/innobase/page/page0page.cc:428
#5 0x000055d8e918b46c in page_cur_delete_rec (cursor=0x2035af684e8,
index=0x7f54b81621b8, offsets=0x4ca360468838, mtr=0x2035af685f0)
at /home/mleich/bb-10.5-release_B/storage/innobase/page/page0cur.cc:2154
#6 0x000055d8e9351f87 in btr_cur_pessimistic_update (flags=10,
cursor=0x2035af684e0, offsets=0x2035af68400, offsets_heap=0x2035af68408,
entry_heap=0x4ca360529588, big_rec=0x2035af68410, update=0x4ca36052b840,
cmpl_info=1, thr=0x4ca36052b798, trx_id=7366, mtr=0x2035af685f0)
at /home/mleich/bb-10.5-release_B/storage/innobase/btr/btr0cur.cc:5060
#7 0x000055d8e90bc2e0 in innobase_instant_try (ha_alter_info=0x2035af69e80,
ctx=0x4ca360013518, altered_table=0x2035af69f20, table=0x4ca360525818,
trx=0x502e419ca710)
at /home/mleich/bb-10.5-release_B/storage/innobase/handler/handler0alter.cc:5879
We are wrongly resetting the page type to FIL_PAGE_INDEX here, when ALTER TABLE t1 DROP COLUMN col2 is being executed on an empty table.
Also when re-inserting the metadata record into the empty page, we will fail to update the page type:
10.5 8394979aae274a19689102420ebb86580e26f92e
#2 0x000055d8e933c7ff in page_cur_tuple_insert (cursor=0x2035af684e8,
tuple=0x4ca36052ba40, index=0x7f54b81621b8, offsets=0x2035af68400,
heap=0x2035af68408, n_ext=1, mtr=0x2035af685f0)
at /home/mleich/bb-10.5-release_B/storage/innobase/include/page0cur.ic:284
#3 0x000055d8e93492f1 in btr_cur_insert_if_possible (cursor=0x2035af684e0,
tuple=0x4ca36052ba40, offsets=0x2035af68400, heap=0x2035af68408, n_ext=1,
mtr=0x2035af685f0)
at /home/mleich/bb-10.5-release_B/storage/innobase/btr/btr0cur.cc:3185
#4 0x000055d8e9351fc2 in btr_cur_pessimistic_update (flags=10,
cursor=0x2035af684e0, offsets=0x2035af68400, offsets_heap=0x2035af68408,
entry_heap=0x4ca360529588, big_rec=0x2035af68410, update=0x4ca36052b840,
cmpl_info=1, thr=0x4ca36052b798, trx_id=7366, mtr=0x2035af685f0)
at /home/mleich/bb-10.5-release_B/storage/innobase/btr/btr0cur.cc:5064
#5 0x000055d8e90bc2e0 in innobase_instant_try (ha_alter_info=0x2035af69e80,
ctx=0x4ca360013518, altered_table=0x2035af69f20, table=0x4ca360525818,
trx=0x502e419ca710)
at /home/mleich/bb-10.5-release_B/storage/innobase/handler/handler0alter.cc:5879
A few INSERT statements later, CHECK TABLE will report corruption, because the first record is a metadata record, but the page type is wrong.