|
while trying to reproduce the first assertion I'm getting another one, probably related:
Assertion `page_align(ptr) == undo_block->frame' failed in trx_undo_page_set_next_prev_and_add
mysqld: /10.5/storage/innobase/trx/trx0rec.cc:81: uint16_t trx_undo_page_set_next_prev_and_add(buf_block_t*, byte*, mtr_t*): Assertion `page_align(ptr) == undo_block->frame' failed.
|
201103 12:07:17 [ERROR] mysqld got signal 6 ;
|
|
Server version: 10.5.7-MariaDB-debug-log
|
|
stdlib/abort.c:81(__GI_abort)[0x7f7b9b6eb859]
|
:0(__GI___assert_fail)[0x7f7b9b6fcf36]
|
trx/trx0rec.cc:83(trx_undo_page_set_next_prev_and_add(buf_block_t*, unsigned char*, mtr_t*))[0x56448963652a]
|
trx/trx0rec.cc:458(trx_undo_page_report_insert(buf_block_t*, trx_t*, dict_index_t*, dtuple_t const*, mtr_t*))[0x564489638685]
|
trx/trx0rec.cc:2016(trx_undo_report_row_operation(que_thr_t*, dict_index_t*, dtuple_t const*, upd_t const*, unsigned long, unsigned char const*, unsigned short const*, unsigned long*))[0x564489643156]
|
btr/btr0cur.cc:3288(btr_cur_ins_lock_and_undo(unsigned long, btr_cur_t*, dtuple_t*, que_thr_t*, mtr_t*, bool*))[0x564489703077]
|
btr/btr0cur.cc:3502(btr_cur_optimistic_insert(unsigned long, btr_cur_t*, unsigned short**, mem_block_info_t**, dtuple_t*, unsigned char**, big_rec_t**, unsigned long, que_thr_t*, mtr_t*))[0x56448970467a]
|
row/row0ins.cc:2724(row_ins_clust_index_entry_low(unsigned long, unsigned long, dict_index_t*, unsigned long, dtuple_t*, unsigned long, que_thr_t*))[0x56448947d842]
|
row/row0ins.cc:3196(row_ins_clust_index_entry(dict_index_t*, dtuple_t*, que_thr_t*, unsigned long))[0x5644894802da]
|
row/row0ins.cc:3321(row_ins_index_entry(dict_index_t*, dtuple_t*, que_thr_t*))[0x564489480ce4]
|
row/row0ins.cc:3490(row_ins_index_entry_step(ins_node_t*, que_thr_t*))[0x564489481e55]
|
row/row0ins.cc:3627(row_ins(ins_node_t*, que_thr_t*))[0x56448948287d]
|
row/row0ins.cc:3766(row_ins_step(que_thr_t*))[0x5644894838e2]
|
row/row0mysql.cc:1423(row_insert_for_mysql(unsigned char const*, row_prebuilt_t*, ins_mode_t))[0x5644894cd1bc]
|
handler/ha_innodb.cc:7592(ha_innobase::write_row(unsigned char const*))[0x56448912d9e4]
|
sql/handler.cc:7143(handler::ha_write_row(unsigned char const*))[0x56448862dd47]
|
sql/sql_insert.cc:1736(vers_insert_history_row(TABLE*))[0x564487d3a523]
|
sql/sql_update.cc:1093(mysql_update(THD*, TABLE_LIST*, List<Item>&, List<Item>&, Item*, unsigned int, st_order*, unsigned long long, bool, unsigned long long*, unsigned long long*))[0x5644880fc1e6]
|
sql/sql_parse.cc:4430(mysql_execute_command(THD*))[0x564487dfc0d8]
|
sql/sql_parse.cc:8044(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x564487e153d8]
|
sql/sql_parse.cc:1875(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x564487deb9ea]
|
sql/sql_parse.cc:1353(do_command(THD*))[0x564487de8313]
|
sql/sql_connect.cc:1410(do_handle_one_connection(CONNECT*, bool))[0x56448822887a]
|
sql/sql_connect.cc:1314(handle_one_connection)[0x5644882281de]
|
perfschema/pfs.cc:2203(pfs_spawn_thread)[0x564488f30bfb]
|
nptl/pthread_create.c:478(start_thread)[0x7f7b9bc14609]
|
|
|
The failure appeared in 10.4 after this commit:
Author: Aleksey Midenkov <midenok@gmail.com>
|
Date: Tue Oct 20 10:57:57 2020 +0300
|
|
MDEV-23672 Assertion `v.v_indexes.empty()' failed in dict_table_t::instant_column
|
Non-debug build also crashes, but it happens after a post-test check (but before the shutdown), and it fails to produce a coredump. It looks like this:
|
10.4 add67826 RelWithDebInfo
|
Version: '10.4.17-MariaDB-log' socket: '/data/bld/10.4-rel-nightly/mysql-test/var/tmp/mysqld.1.sock' port: 16000 MariaDB Server
|
201106 11:04:12 [ERROR] mysqld got signal 11 ;
|
This could be because you hit a bug. It is also possible that this binary
|
or one of the libraries it was linked against is corrupt, improperly built,
|
or misconfigured. This error can also be caused by malfunctioning hardware.
|
|
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
|
|
We will try our best to scrape up some info that will hopefully help
|
diagnose the problem, but since we have already crashed,
|
something is definitely wrong and this may fail.
|
|
Server version: 10.4.17-MariaDB-log
|
key_buffer_size=1048576
|
read_buffer_size=131072
|
max_used_connections=1
|
max_threads=153
|
thread_count=7
|
It is possible that mysqld could use up to
|
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63534 K bytes of memory
|
Hope that's ok; if not, decrease some variables in the equation.
|
|
Thread pointer: 0x7fcf40000c48
|
Attempting backtrace. You can use the following information to find out
|
where mysqld died. If you see no messages after this, something went
|
terribly wrong...
|
stack_bottom = 0x7fcf4dffaca8 thread_stack 0x49000
|
Non-debug ASAN build doesn't help much, but it produces an extra error line before crashing:
Version: '10.4.17-MariaDB-log' socket: '/data/bld/10.4-rel-asan-nightly/mysql-test/var/tmp/mysqld.1.sock' port: 16000 Source distribution
|
2020-11-06 11:28:38 1 [ERROR] [FATAL] InnoDB: Trying to read page number 538976288 in space 0, space name innodb_system, which is outside the tablespace bounds. Byte offset 0, len 16384
|
201106 11:28:38 [ERROR] mysqld got signal 6 ;
|
This could be because you hit a bug. It is also possible that this binary
|
or one of the libraries it was linked against is corrupt, improperly built,
|
or misconfigured. This error can also be caused by malfunctioning hardware.
|
|
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
|
|
We will try our best to scrape up some info that will hopefully help
|
diagnose the problem, but since we have already crashed,
|
something is definitely wrong and this may fail.
|
|
Server version: 10.4.17-MariaDB-log
|
key_buffer_size=1048576
|
read_buffer_size=131072
|
max_used_connections=1
|
max_threads=153
|
thread_count=7
|
It is possible that mysqld could use up to
|
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 63534 K bytes of memory
|
Hope that's ok; if not, decrease some variables in the equation.
|
|
Thread pointer: 0x62b000000208
|
Attempting backtrace. You can use the following information to find out
|
where mysqld died. If you see no messages after this, something went
|
terribly wrong...
|
stack_bottom = 0x7f78b2ce4b50 thread_stack 0x5fc00
|
/lib/x86_64-linux-gnu/libasan.so.5(+0x6cd30)[0x7f78c47a1d30]
|
mysys/stacktrace.c:175(my_print_stacktrace)[0x563eb94941a6]
|
sql/signal_handler.cc:206(handle_fatal_signal)[0x563eb83088a8]
|
It also has a variation of the assertion failure:
|
10.4 add67826 debug
|
mysqld: /data/src/10.4/storage/innobase/trx/trx0rec.cc:66: void trx_undof_page_add_undo_rec_log(const buf_block_t*, ulint, ulint, mtr_t*): Assertion `new_free < srv_page_size' failed.
|
201106 11:36:23 [ERROR] mysqld got signal 6 ;
|
|
#7 0x00007ff92e542f36 in __GI___assert_fail (assertion=0x55b29a7c4c94 "new_free < srv_page_size", file=0x55b29a7c4c00 "/data/src/10.4/storage/innobase/trx/trx0rec.cc", line=66, function=0x55b29a7c4c30 "void trx_undof_page_add_undo_rec_log(const buf_block_t*, ulint, ulint, mtr_t*)") at assert.c:101
|
#8 0x000055b29a037d34 in trx_undof_page_add_undo_rec_log (undo_block=0x7ff92382d0e0, old_free=16371, new_free=16919, mtr=0x7ff9236fc290) at /data/src/10.4/storage/innobase/trx/trx0rec.cc:66
|
#9 0x000055b29a03c3f7 in trx_undo_page_report_modify (undo_block=0x7ff92382d0e0, trx=0x7ff9281aa140, index=0x7ff8d82003e0, rec=0x7ff923b10578 "\200", offsets=0x7ff9236fcd00, update=0x7ff8d8081770, cmpl_info=0, row=0x0, mtr=0x7ff9236fc290) at /data/src/10.4/storage/innobase/trx/trx0rec.cc:1504
|
#10 0x000055b29a03eb0c in trx_undo_report_row_operation (thr=0x7ff8d8081d40, index=0x7ff8d82003e0, clust_entry=0x0, update=0x7ff8d8081770, cmpl_info=0, rec=0x7ff923b10578 "\200", offsets=0x7ff9236fcd00, roll_ptr=0x7ff9236fc838) at /data/src/10.4/storage/innobase/trx/trx0rec.cc:2136
|
#11 0x000055b29a0a56ac in btr_cur_upd_lock_and_undo (flags=2, cursor=0x7ff8d8048490, offsets=0x7ff9236fcd00, update=0x7ff8d8081770, cmpl_info=0, thr=0x7ff8d8081d40, mtr=0x7ff9236fcf60, roll_ptr=0x7ff9236fc838) at /data/src/10.4/storage/innobase/btr/btr0cur.cc:3920
|
#12 0x000055b29a0a92db in btr_cur_optimistic_update (flags=2, cursor=0x7ff8d8048490, offsets=0x7ff9236fcc10, heap=0x7ff9236fccb0, update=0x7ff8d8081770, cmpl_info=0, thr=0x7ff8d8081d40, trx_id=317, mtr=0x7ff9236fcf60) at /data/src/10.4/storage/innobase/btr/btr0cur.cc:4707
|
#13 0x000055b299fe841b in row_upd_clust_rec (flags=0, node=0x7ff8d8081648, index=0x7ff8d82003e0, offsets=0x7ff9236fcd00, offsets_heap=0x7ff9236fccb0, thr=0x7ff8d8081d40, mtr=0x7ff9236fcf60) at /data/src/10.4/storage/innobase/row/row0upd.cc:2857
|
#14 0x000055b299fe9701 in row_upd_clust_step (node=0x7ff8d8081648, thr=0x7ff8d8081d40) at /data/src/10.4/storage/innobase/row/row0upd.cc:3230
|
#15 0x000055b299fe9a84 in row_upd (node=0x7ff8d8081648, thr=0x7ff8d8081d40) at /data/src/10.4/storage/innobase/row/row0upd.cc:3296
|
#16 0x000055b299fea04c in row_upd_step (thr=0x7ff8d8081d40) at /data/src/10.4/storage/innobase/row/row0upd.cc:3440
|
#17 0x000055b299f88ac9 in row_update_for_mysql (prebuilt=0x7ff8d8080aa0) at /data/src/10.4/storage/innobase/row/row0mysql.cc:1848
|
#18 0x000055b299dddb21 in ha_innobase::update_row (this=0x7ff8d81f7818, old_row=0x7ff8d822af28 "\020", new_row=0x7ff8d8220dd8 "\020") at /data/src/10.4/storage/innobase/handler/ha_innodb.cc:8802
|
#19 0x000055b299b8f708 in handler::ha_update_row (this=0x7ff8d81f7818, old_data=0x7ff8d822af28 "\020", new_data=0x7ff8d8220dd8 "\020") at /data/src/10.4/sql/handler.cc:6792
|
#20 0x000055b299928bdd in mysql_update (thd=0x7ff8d8000d90, table_list=0x7ff8d80135b0, fields=..., values=..., conds=0x0, order_num=0, order=0x0, limit=18446744073709551442, ignore=false, found_return=0x7ff9236fde20, updated_return=0x7ff9236fdee0) at /data/src/10.4/sql/sql_update.cc:1073
|
#21 0x000055b2998145ba in mysql_execute_command (thd=0x7ff8d8000d90) at /data/src/10.4/sql/sql_parse.cc:4396
|
#22 0x000055b2998207cb in mysql_parse (thd=0x7ff8d8000d90, rawbuf=0x7ff8d8013448 "UPDATE table100_innodb_int_autoinc SET col_longtext_utf8_key = 'foo'", length=68, parser_state=0x7ff9236fe550, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:7938
|
#23 0x000055b29980cad1 in dispatch_command (command=COM_QUERY, thd=0x7ff8d8000d90, packet=0x7ff8d80087a1 "UPDATE table100_innodb_int_autoinc SET col_longtext_utf8_key = 'foo'", packet_length=68, is_com_multi=false, is_next_command=false) at /data/src/10.4/sql/sql_parse.cc:1839
|
#24 0x000055b29980b339 in do_command (thd=0x7ff8d8000d90) at /data/src/10.4/sql/sql_parse.cc:1357
|
#25 0x000055b29999a0b9 in do_handle_one_connection (connect=0x55b29c95e220) at /data/src/10.4/sql/sql_connect.cc:1412
|
#26 0x000055b299999e02 in handle_one_connection (arg=0x55b29c95e220) at /data/src/10.4/sql/sql_connect.cc:1316
|
#27 0x000055b29a3b8d22 in pfs_spawn_thread (arg=0x55b29c950af0) at /data/src/10.4/storage/perfschema/pfs.cc:1869
|
#28 0x00007ff92edc3609 in start_thread (arg=<optimized out>) at pthread_create.c:477
|
#29 0x00007ff92e62e293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
|
The dirty test case is attached as srv2.test .
It is fast and not big, but contains a very ugly data set of 100 lines and a chain of DDL which most likely can be further simplified; but it seems to be very sensitive to both the data contents and the sequence of events.
Currently on my machine it fails reliably every time, with all described failures – first_free... assertion o 10.5, new_free... assertion on 10.4, crashes without coredumps on 10.4-10.5 non-debug. However, due to the fragility of the test case i am not sure whether it will fail the same way or other machines, it is to be tried.
|
|
I've run the tests which I had for this issue – the first RQG test used to produce the rr profile, and the second RQG test which was the source of the provided MTR test, and the MTR test itself, each repeatedly.
The MTR test didn't fail either on the patched 10.4 nor on the patched 10.5, while without the patch it was failing on both (more or less deterministically on my machine).
The second RQG test didn't fail either on the patched 10.4 nor on the patched 10.5. Without the patch, it fails non-deterministically on 10.5 but doesn't fail on 10.4.
The first RQG test fails on the patched 10.4 and 10.5 non-deterministically with symptoms similar to MDEV-22564 and other versioning-related issues on debug builds or with diagnostics area assertion failures, and with MDEV-21987 on non-debug builds. It did it before the patch as well, intermittently with the problem at hand, which was one of the reasons why it took so long to narrow down the test case.
Thus, results of the MTR test and the second RQG test are somewhat reassuring, while the result of the first RQG test is inconclusive. I think it's acceptable for an emergency patch, but more testing is definitely needed.
|