[MDEV-16797] Node keep on IST every a few hours (InnoDB: Failing assertion: !cursor->index->is_committed()) Created: 2018-07-21  Updated: 2019-01-16  Resolved: 2019-01-06

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: None
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Syamiera Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: need_feedback
Environment:

Centos 7 , Galera Cluster 25.3.23 , MariaDB 10.2.13


Issue Links:
Relates
relates to MDEV-14643 InnoDB: Failing assertion: !cursor->... Closed
relates to MDEV-18272 InnoDB fails to rollback after exceed... Closed

 Description   

Please help me. What does this error mean?
How to fix this?
It keep on IST every few hours

this is the log file

2018-07-21 05:18:04 0x7f34bc1e6700 InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.2.13/storage/innobase/row/row0ins.cc line 269
InnoDB: Failing assertion: !cursor->index->is_committed()
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: https://mariadb.com/kb/en/library/xtradbinnodb-recovery-modes/
InnoDB: about forcing recovery.
180721 5:18:04 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.

To report this bug, see https://mariadb.com/kb/en/reporting-bugs

We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

Server version: 10.2.13-MariaDB-log
key_buffer_size=131072
read_buffer_size=262144
max_used_connections=3
max_threads=514
thread_count=45
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 67513673 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f340c000b58
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f34bc1e5c9f thread_stack 0x49000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x563b6d8e66ce]
/usr/sbin/mysqld(handle_fatal_signal+0x355)[0x563b6d36e085]
/lib64/libpthread.so.0(+0xf5e0)[0x7f44037275e0]
/lib64/libc.so.6(gsignal+0x37)[0x7f4401c341f7]
:0(__GI_raise)[0x7f4401c358e8]
/usr/sbin/mysqld(+0x4277c0)[0x563b6d1247c0]
/usr/sbin/mysqld(+0x90359f)[0x563b6d60059f]
ut/ut0rbt.cc:221(rbt_tree_add_child(ib_rbt_t const*, ib_rbt_bound_t*, ib_rbt_node_t*) [clone .isra.4] [clone .part.5])[0x563b6d6032a2]
row/row0ins.cc:268(row_ins_sec_index_entry_low(unsigned long, unsigned long, dict_index_t*, mem_block_info_t*, mem_block_info_t*, dtuple_t*, unsigned long, que_thr_t*, bool))[0x563b6d605658]
row/row0ins.cc:3248(row_ins_sec_index_entry(dict_index_t*, dtuple_t*, que_thr_t*, bool))[0x563b6d61512a]
row/row0ins.cc:3293(row_ins_step(que_thr_t*))[0x563b6d55a283]
row/row0mysql.cc:1411(row_insert_for_mysql(unsigned char const*, row_prebuilt_t*))[0x563b6d378cdf]
handler/ha_innodb.cc:8455(ha_innobase::write_row(unsigned char*))[0x563b6d1c055f]
sql/handler.cc:6001(handler::ha_write_row(unsigned char*))[0x563b6d1c0ecc]
sql/sql_insert.cc:1929(write_record(THD*, TABLE*, st_copy_info*))[0x563b6d22048f]
sql/sql_insert.cc:3757(select_insert::send_data(List<Item>&))[0x563b6d208cc3]
sql/sql_select.cc:19859(end_send(JOIN*, st_join_table*, bool))[0x563b6d2108d4]
sql/sql_class.h:3602(THD::get_stmt_da())[0x563b6d22fcdf]
sql/sql_select.cc:18236(JOIN::exec_inner())[0x563b6d22fef3]
sql/sql_select.cc:3362(JOIN::exec())[0x563b6d23004a]
sql/sql_select.cc:3763(mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x563b6d230ba4]
sql/sql_select.cc:376(handle_select(THD*, LEX*, select_result*, unsigned long))[0x563b6d1e167a]
sql/sql_parse.cc:4525(mysql_execute_command(THD*))[0x563b6d499246]
sql/sp_head.cc:3247(sp_instr_stmt::exec_core(THD*, unsigned int*))[0x563b6d49f45d]
sql/sp_head.cc:3009(sp_lex_keeper::reset_lex_and_exec_core(THD*, unsigned int*, bool, sp_instr*))[0x563b6d49fa35]
sql/sp_head.cc:3163(sp_instr_stmt::execute(THD*, unsigned int*))[0x563b6d49c280]
sql/sp_head.cc:1329(sp_head::execute(THD*, bool))[0x563b6d49da0f]
sql/sp_head.cc:2117(sp_head::execute_procedure(THD*, List<Item>*))[0x563b6d295f91]
sql/event_data_objects.cc:1433(Event_job_data::execute(THD*, bool))[0x563b6d4d5473]
sql/event_scheduler.cc:315(Event_worker_thread::run(THD*, Event_queue_element_for_exec*))[0x563b6d4d553d]
/lib64/libpthread.so.0(+0x7e25)[0x7f440371fe25]
/lib64/libc.so.6(clone+0x6d)[0x7f4401cf734d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f328c0167b0): INSERT INTO xx SELECT * FROM xx WHERE xx
Connection ID (thread ID): 118
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.



 Comments   
Comment by Elena Stepanova [ 2018-07-21 ]

Have you upgraded from some previous major version recently? You say 5.5.60-galera in Affected versions, but the crash is from 10.2.13.
Does it always happen on the same table, same INSERT .. SELECT statement?
Did you try to run CHECK TABLE on the table?
Is it production environment? How confidential the data is, what can you provide?
Data dump would be wonderful, but I guess it's unlikely.
Can you at least provide the description of the procedure which executes this statement, and the output of

SHOW CREATE TABLE gensms_archive.sms_messages;
SHOW INDEX IN gensms_archive.sms_messages;
SHOW CREATE TABLE gensms.sms_messages;
SHOW INDEX IN gensms.sms_messages;

And also your config file(s) or the output of SHOW VARIABLES.

Comment by Elena Stepanova [ 2018-07-22 ]

Thanks for the info. Could you please also attach the error log(s) that you have for the crashed instance – starting from as far back as you can, and up until now?

Comment by Syamiera [ 2018-07-22 ]

should be the same as the log I send at the top. same error

Comment by Elena Stepanova [ 2018-07-22 ]

This is just the occurrence of the crash itself. I mean the whole error log – server startups, shutdowns, etc. – for as long period of time as possible.

The working theory for this failure is that at some point, possibly long time before the crash, some other problem causes a hidden data corruption, which remains dormant until some set of circumstances brings it up. We have several user reports about it, but so far haven't been able to reproduce it on our side (and MySQL and Percona also had such reports, with equally unsuccessful results). So, we are trying to collect all information that we can get. In particular, the error might show that some time in the past, possibly even before the upgrade, there was some strange crash, or obscure errors/warnings on startups or recovery, which can give us a clue how the corruption occurs.

Comment by Syamiera [ 2018-07-28 ]

Do u think if I upgrade to 10.3 will help? If yes I will proceed to upgrade as I saw got many changes in thr newer version.
Is it consider as some kind of bug?
Actually the event is not just created. We've been running that event for the past few years.

Comment by Syamiera [ 2018-07-28 ]

I think i havent do this one step. That is to restart new cluster. Do u think it will help?

Let me share with you a few of the actions taken by me:

  • fresh install mariadb(version 10.2)
  • format that server
  • myisamcheck that mysql.event ----> it managed to survived for 11h without ist after I did that.but yet it still happened again
Comment by Syamiera [ 2018-08-01 ]

is it related?

  1. dmesg
    [138189.814327] mysqld[23280]: segfault at 0 ip 00007f07c740e690 sp 00007f07b883c688 error 4 in libc-2.17.so[7f07c72a4000+1c3000]
    [141789.574045] mysqld[28753]: segfault at 0 ip 00007fc847058690 sp 00007fc838486688 error 4 in libc-2.17.so[7fc846eee000+1c3000]
    [148986.765635] mysqld[37785]: segfault at 0 ip 00007faf8cfb96c0 sp 00007faf803eb708 error 4 in libc-2.17.so[7faf8ce4f000+1c3000]
    [156187.359027] mysqld[49835]: segfault at 0 ip 00007fb85ccbc6c0 sp 00007fb8500ee708 error 4 in libc-2.17.so[7fb85cb52000+1c3000]
    [213652.780564] Process accounting resumed
    [213780.228586] Process accounting resumed
    [213783.627900] mysqld[129339]: segfault at 0 ip 00007f7d9b5bf6c0 sp 00007f6f7eff7708 error 4 in libc-2.17.so[7f7d9b455000+1c3000]
    [220984.528118] mysqld[142811]: segfault at 0 ip 00007fbd7e9d26c0 sp 00007faf784b9708 error 4 in libc-2.17.so[7fbd7e868000+1c3000]
    [224586.192172] mysqld[147700]: segfault at 0 ip 00007f3315fa56c0 sp 00007f24f721c708 error 4 in libc-2.17.so[7f3315e3b000+1c3000]
    [231786.641688] mysqld[156699]: segfault at 0 ip 00007f11944906c0 sp 00007f0390180708 error 4 in libc-2.17.so[7f1194326000+1c3000]
    [235387.881364] mysqld[161355]: segfault at 0 ip 00007f480b4ef6c0 sp 00007f39fc180708 error 4 in libc-2.17.so[7f480b385000+1c3000]
    [266697.099222] Process accounting resumed
    [300184.982049] mysqld[47901]: segfault at 0 ip 00007f7e3fc746c0 sp 00007f70170f0708 error 4 in libc-2.17.so[7f7e3fb0a000+1c3000]
    [303789.451026] mysqld[52341]: segfault at 0 ip 00007f02649366c0 sp 00007f0258569708 error 4 in libc-2.17.so[7f02647cc000+1c3000]
    [307382.998797] mysqld[56793]: segfault at 0 ip 00007fa4c00646c0 sp 00007f96b82f7708 error 4 in libc-2.17.so[7fa4bfefa000+1c3000]
    [310987.160261] mysqld[61201]: segfault at 0 ip 00007f52ff2b76c0 sp 00007f44f859a708 error 4 in libc-2.17.so[7f52ff14d000+1c3000]
    [314587.684121] mysqld[65769]: segfault at 0 ip 00007fdb4c9456c0 sp 00007fdb40578708 error 4 in libc-2.17.so[7fdb4c7db000+1c3000]
    [318191.102170] mysqld[70190]: segfault at 0 ip 00007f049d8f76c0 sp 00007f0490528708 error 4 in libc-2.17.so[7f049d78d000+1c3000]
    [321787.573505] mysqld[74612]: segfault at 0 ip 00007f3f560b36c0 sp 00007f3f484e3708 error 4 in libc-2.17.so[7f3f55f49000+1c3000]
    [325393.704040] mysqld[78997]: segfault at 0 ip 00007f60762046c0 sp 00007f6068634708 error 4 in libc-2.17.so[7f607609a000+1c3000]
    [328988.891767] mysqld[83420]: segfault at 0 ip 00007f2f5dc1e6c0 sp 00007f215809f708 error 4 in libc-2.17.so[7f2f5dab4000+1c3000]
    [353092.356646] Process accounting resumed
    [354181.823324] mysqld[115375]: segfault at 0 ip 00007f18d9e056c0 sp 00007f0ad4054708 error 4 in libc-2.17.so[7f18d9c9b000+1c3000]
    [379382.163299] mysqld[146725]: segfault at 0 ip 00007fa489efa6c0 sp 00007f9684180708 error 4 in libc-2.17.so[7fa489d90000+1c3000]
    [393778.124726] mysqld[164415]: segfault at 0 ip 00007fd273cc76c0 sp 00007fd2680fb708 error 4 in libc-2.17.so[7fd273b5d000+1c3000]
    [404580.527102] mysqld[179047]: segfault at 0 ip 00007f8b594df6c0 sp 00007f7d5310e708 error 4 in libc-2.17.so[7f8b59375000+1c3000]
    [418985.263075] mysqld[17806]: segfault at 0 ip 00007fb5445bc6c0 sp 00007fa740261708 error 4 in libc-2.17.so[7fb544452000+1c3000]
    [422602.693969] mysqld[22353]: segfault at 0 ip 00007fd26aafa690 sp 00007fc46454f688 error 4 in libc-2.17.so[7fd26a990000+1c3000]
Comment by Elena Stepanova [ 2018-08-29 ]

Are you using any temporary tables in your workflow?

Comment by Syamiera [ 2018-08-30 ]

Yes I did. but is it related?
because the error shows its related with event

Comment by Elena Stepanova [ 2018-08-30 ]

We have recently discovered a test case, however complicated, which causes the same failure when temporary tables are used. I suppose it could be made to use events and still cause the trouble.
So, your event runs a stored procedure, which, in turn, runs an INSERT statement. Can it be that the event/procedure creates, and INSERT statement uses any of the temporary tables?

Comment by Syamiera [ 2018-09-28 ]

my event not related with store procedure.
basically the task is to move from table a to table b, delete from from a once done.
there's no temp table. the event insert directly to the table.

Comment by Elena Stepanova [ 2018-12-08 ]

We have got another test case, which doesn't involve temporary tables, but involves virtual columns. Unfortunately SHOW CREATE TABLE is not here anymore, so I can't check if your table had any virtual columns. Did it?

If it did, then your variation of the problem might be fixed in 10.2.16, in the scope of MDEV-15114.

Generated at Thu Feb 08 08:31:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.