[MDEV-20367] SIGSEGV with "Found wrong key definition..." warning Created: 2019-08-16  Updated: 2019-09-06  Resolved: 2019-08-21

Status: Closed
Project: MariaDB Server
Component/s: Views
Affects Version/s: 10.2.17
Fix Version/s: 10.0.37, 10.1.36, 10.2.18, 10.3.10

Type: Bug Priority: Major
Reporter: Jonathan Monahan Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Environment:

Ubuntu Xenial and Precise


Attachments: File issue_28287_bad.sql     File issue_28287_ok.sql    
Issue Links:
Duplicate
duplicates MDEV-17021 Server crash or assertion `length <= ... Closed

 Description   

The attached issue_28287_bad.sql script kills MariaDB 10.2.17. The mysql-error.log contains:

2019-08-16 23:14:02 139832664938240 [ERROR] Found wrong key definition in summary_temporary; Please do "ALTER TABLE 'summary_temporary' FORCE " to fix it!
2019-08-16 23:14:02 139832664938240 [ERROR] Found wrong key definition in summary_temporary; Please do "ALTER TABLE 'summary_temporary' FORCE " to fix it!
190816 23:14:02 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.2.17-MariaDB-10.2.17+maria~precise-log
key_buffer_size=104857600
read_buffer_size=131072
max_used_connections=5
max_threads=202
thread_count=9
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 546245 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f2ca80009a8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f2d54524e50 thread_stack 0x49000
*** buffer overflow detected ***: /usr/sbin/mysqld terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f2d52511817]
/lib/x86_64-linux-gnu/libc.so.6(+0x109710)[0x7f2d52510710]
/lib/x86_64-linux-gnu/libc.so.6(+0x10a7ce)[0x7f2d525117ce]
/usr/sbin/mysqld(my_addr_resolve+0xd8)[0xd7c8b8]
/usr/sbin/mysqld(my_print_stacktrace+0x1bd)[0xd66b6d]
/usr/sbin/mysqld(handle_fatal_signal+0x4c2)[0x7f3852]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f2d52dd1cb0]
/lib/x86_64-linux-gnu/libc.so.6(+0x148059)[0x7f2d5254f059]
/usr/sbin/mysqld[0xca3a21]
/usr/sbin/mysqld[0xca4ca1]
/usr/sbin/mysqld[0xca60fd]
/usr/sbin/mysqld[0xcb18d4]
/usr/sbin/mysqld(_ZN7handler12ha_write_rowEPh+0x33f)[0x7feb6f]
/usr/sbin/mysqld(_Z12write_recordP3THDP5TABLEP12st_copy_info+0x73)[0x60f083]
/usr/sbin/mysqld(_ZN13select_insert9send_dataER4ListI4ItemE+0xd2)[0x60fa32]
/usr/sbin/mysqld[0x65daa7]
/usr/sbin/mysqld[0x66198f]
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x176)[0x669d46]
/usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x90a)[0x68d36a]
/usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_jP8st_orderS9_S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x14d)[0x68d84d]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x24c)[0x68e66c]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x7b44)[0x6320f4]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x2ae)[0x6333fe]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x270a)[0x63652a]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x151)[0x636b31]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x38b)[0x71ef2b]
/usr/sbin/mysqld(handle_one_connection+0x3f)[0x71f00f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f2d52dc9e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f2d524faccd]
======= Memory map: ========
00400000-01371000 r-xp 00000000 fc:01 932705                             /usr/sbin/mysqld
01571000-0157b000 r--p 00f71000 fc:01 932705                             /usr/sbin/mysqld
0157b000-01631000 rw-p 00f7b000 fc:01 932705                             /usr/sbin/mysqld
01631000-01ec6000 rw-p 00000000 00:00 0 
036ee000-07b00000 rw-p 00000000 00:00 0                                  [heap]
7f2c84000000-7f2c84021000 rw-p 00000000 00:00 0 
...
7f2d546ae000-7f2d546b3000 rw-p 00000000 00:00 0 
7f2d546b3000-7f2d546b4000 r--p 00022000 fc:01 393415                     /lib/x86_64-linux-gnu/ld-2.15.so
7f2d546b4000-7f2d546b6000 rw-p 00023000 fc:01 393415                     /lib/x86_64-linux-gnu/ld-2.15.so
7fffe47d8000-7fffe47f9000 rw-p 00000000 00:00 0                          [stack]
7fffe47ff000-7fffe4800000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
Fatal signal 6 while backtracing

The attached issue_28287_ok.sql script is virtually identical, but it does not kill MariaDB 10.2.17. Instead, the final statement produces warnings with the same message as in the error log:

MariaDB [test]> show warnings;
+---------+------+---------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                       |
+---------+------+---------------------------------------------------------------------------------------------------------------+
| Warning | 1194 | Found wrong key definition in summary_temporary; Please do "ALTER TABLE 'summary_temporary' FORCE" to fix it! |
| Warning | 1194 | Found wrong key definition in summary_temporary; Please do "ALTER TABLE 'summary_temporary' FORCE" to fix it! |
+---------+------+---------------------------------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)



 Comments   
Comment by Elena Stepanova [ 2019-08-17 ]

I am not getting the crash, but the warnings are indeed present in 10.2.17, and they no longer appear starting from 10.2.18 (also 10.0.37 and 10.1.36).
10.2.17 is a year old and 10 releases behind; please try upgrading to the latest release and see if it resolves the problem.
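
For anyone checking several servers against the fix versions listed on this issue (10.0.37, 10.1.36, 10.2.18, 10.3.10), a minimal Python sketch follows. The helper name `has_fix` is hypothetical, and returning `None` for release series not listed on this issue is an assumption made for illustration; it is not part of any MariaDB tooling.

```python
import re

# First fixed patch release per series, taken from this issue's "Fix Version/s" field.
FIX_VERSIONS = {(10, 0): 37, (10, 1): 36, (10, 2): 18, (10, 3): 10}


def has_fix(version_string):
    """Return True or False for a listed release series, None when it cannot be decided.

    version_string is a server version such as the "Server version:" value
    in the error log, e.g. "10.2.17-MariaDB-10.2.17+maria~precise-log".
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version_string)
    if match is None:
        return None  # not a recognisable version string
    major, minor, patch = (int(group) for group in match.groups())
    first_fixed = FIX_VERSIONS.get((major, minor))
    if first_fixed is None:
        return None  # series not listed on this issue (assumption: treat as unknown)
    return patch >= first_fixed
```

With the build from the original report, `has_fix("10.2.17-MariaDB-10.2.17+maria~precise-log")` returns `False`, while `has_fix("10.2.18-MariaDB")` returns `True`.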

Comment by Jonathan Monahan [ 2019-08-19 ]

I can see the "Found wrong key definition" warning in table.cc in all branches from 10.2 to 10.5. Are you able to point to a bug fix that will fix this crash?

Comment by Jonathan Monahan [ 2019-08-19 ]

issue_28287_bad.sql
Apologies, I attached the wrong bad SQL - I have replaced it.

After further investigation, I have found that ROUND(...,0) is the probable original cause of the crash. Changing it to ROUND(...,2) also avoids the crash.

Comment by Alice Sherepa [ 2019-08-21 ]

Reproduced on 10.2.17; fixed by MDEV-17021:

2019-08-21 10:20:17 140262090344192 [ERROR] Found wrong key definition in summary_temporary; Please do "ALTER TABLE 'summary_temporary' FORCE " to fix it!
2019-08-21 10:20:17 140262090344192 [ERROR] Found wrong key definition in summary_temporary; Please do "ALTER TABLE 'summary_temporary' FORCE " to fix it!
190821 10:20:17 [ERROR] mysqld got signal 11 ;
 
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f91523be390]
multiarch/memcpy-avx-unaligned.S:90(__nss_passwd_lookup)[0x7f9151aa5f75]
maria/ma_blockrec.c:1998(write_tail)[0x564f7b49845c]
maria/ma_blockrec.c:2930(write_block_record)[0x564f7b498d5b]
maria/ma_blockrec.c:3565(allocate_and_write_block_record)[0x564f7b49a97d]
maria/ma_write.c:157(maria_write)[0x564f7b4a7ba1]
sql/handler.cc:5959(handler::ha_write_row(unsigned char*))[0x564f7b0204af]
sql/sql_insert.cc:1930(write_record(THD*, TABLE*, st_copy_info*))[0x564f7ae7b4c1]
sql/sql_insert.cc:3758(select_insert::send_data(List<Item>&))[0x564f7ae7be22]
sql/sql_select.cc:19903(end_send(JOIN*, st_join_table*, bool))[0x564f7aed355f]
sql/sql_class.h:3588(THD::get_stmt_da())[0x564f7aec0391]
sql/sql_select.cc:18743(sub_select(JOIN*, st_join_table*, bool))[0x564f7aec880e]
sql/sql_select.cc:18280(do_select)[0x564f7aee6db3]
sql/sql_select.cc:3398(JOIN::exec())[0x564f7aee6fdc]
sql/sql_select.cc:3799(mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x564f7aee5efa]
sql/sql_select.cc:376(handle_select(THD*, LEX*, select_result*, unsigned long))[0x564f7aee71f4]
sql/sql_parse.cc:3953(mysql_execute_command(THD*))[0x564f7ae9a408]
sql/sql_parse.cc:8009(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x564f7ae9ab8a]
sql/sql_parse.cc:1824(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x564f7ae9cfd1]
sql/sql_parse.cc:1380(do_command(THD*))[0x564f7ae9d5bd]
sql/sql_connect.cc:1335(do_handle_one_connection(CONNECT*))[0x564f7af5cf1f]
sql/sql_connect.cc:1243(handle_one_connection)[0x564f7af5d044]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f91523b46ba]
x86_64/clone.S:111(clone)[0x7f9151a5f41d]
 
 
Query (0x7f90dc00f000): CREATE TEMPORARY TABLE summary_temporary (   INDEX index_on_person_name (person_name(10)),   INDEX index_on_total_amount (total_amount),   INDEX index_on_total_weighted_amount (total_weighted_amount)   ) ENGINE=Aria   SELECT * FROM summary

Comment by Jonathan Monahan [ 2019-08-21 ]

I am surprised that this is a duplicate of MDEV-17021 - that issue is only fixed in 10.0 and 10.1. Apparently it isn't reproducible in 10.2 or later.

My script kills MariaDB 10.2.17 with SIGSEGV (signal 11), whilst in MDEV-17021 the server dies with signal 6. They seem like quite different issues to me.

Comment by Alice Sherepa [ 2019-08-21 ]

The fix for the bug has been merged up; please see https://github.com/mariadb/server/commit/f195286a3eae6328a1f90948205e90201c0479c5.
Signal 6 occurs on the debug version; the stack trace is quite similar.

Comment by Elena Stepanova [ 2019-08-21 ]

jonathan.monahan@workbooks.com, MDEV-17021 didn't list all fix versions as it should have; it happens. I've updated it now.
As for the duplicate, I agree that it's not the most accurate way to put it. The bug that you reported was apparently fixed by the same patch as MDEV-17021 (as Alice found out). It often happens that bugs which look completely different have the same root cause and get fixed by a single change.

Comment by Steve [ 2019-09-06 ]

Is there any way of avoiding this issue? It may be easier for us to modify our application code than to upgrade a suite of production servers. For example, if it is triggered by a specific style of query, we could avoid it. I also noticed that the test case that was fixed in MDEV-17021 related to an Aria table. In our failures, it was also an Aria table being inserted into at the time of the crash. Is the problem specific to Aria? If we switched our table types to MyISAM or even InnoDB, do you think this might mitigate the issue?

Generated at Thu Feb 08 08:58:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.