[MDEV-3955] mysqld signal 6 when killing ALTER TABLE ADD FULLTEXT Created: 2012-12-21  Updated: 2013-06-03  Resolved: 2013-06-03

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.5.28a
Fix Version/s: 5.5.32

Type: Bug Priority: Major
Reporter: Sean Pringle Assignee: Sergei Golubchik
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment:

Ubuntu Server 10.04 LTS i686 glibc 2.11.1



 Description   

I ran an ALTER TABLE ... ADD FULLTEXT (...) statement on a ~900M MyISAM table in MariaDB 5.5.28a.

Issuing a "KILL <id>" via the MySQL client resulted in a "*** glibc detected *** mysqld free() invalid pointer" trace in the syslog and mysqld got signal 6.

Additionally, mysqld hung trying to gather a stack trace of its own and needed a sig9, which made it a bit nasty. Nothing except the generic sig6 message in the error log.

I havn't been able to reproduce it yet. I'm filing this bug in case someone more informed can suggest how to repeat it other than using brute force and luck...

*** glibc detected *** /home/db/sandboxes/5.5.28/bin/mysqld: free(): invalid pointer: 0x0ae82e20 ***
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6(+0x6b591)[0xb7504591]
/lib/tls/i686/cmov/libc.so.6(+0x6cde8)[0xb7505de8]
/lib/tls/i686/cmov/libc.so.6(cfree+0x6d)[0xb7508ecd]
/home/db/sandboxes/5.5.28/bin/mysqld(my_free+0x1d)[0x875872d]
/home/db/sandboxes/5.5.28/bin/mysqld[0x86843fb]
/home/db/sandboxes/5.5.28/bin/mysqld[0x8675e51]
/home/db/sandboxes/5.5.28/bin/mysqld[0x8675ff2]
/home/db/sandboxes/5.5.28/bin/mysqld[0x867400e]
/home/db/sandboxes/5.5.28/bin/mysqld[0x823563c]
/home/db/sandboxes/5.5.28/bin/mysqld(_Z17mysql_alter_tableP3THDPcS1_P24st_ha_create_informationP10TABLE_LISTP10Alter_infojP8st_orderbb+0x237b)[0x823e3fb]
/home/db/sandboxes/5.5.28/bin/mysqld(_ZN21Alter_table_statement7executeEP3THD+0x56f)[0x847a1ef]
/home/db/sandboxes/5.5.28/bin/mysqld(_Z21mysql_execute_commandP3THD+0x727)[0x81bd8e7]
/home/db/sandboxes/5.5.28/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x1bb)[0x81c58fb]
/home/db/sandboxes/5.5.28/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1beb)[0x81c74eb]
/home/db/sandboxes/5.5.28/bin/mysqld(_Z10do_commandP3THD+0xd8)[0x81c7a18]
/home/db/sandboxes/5.5.28/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x154)[0x8287f44]
/home/db/sandboxes/5.5.28/bin/mysqld(handle_one_connection+0x3c)[0x828808c]
/lib/tls/i686/cmov/libpthread.so.0(+0x596e)[0xb775496e]
/lib/tls/i686/cmov/libc.so.6(clone+0x5e)[0xb7566a4e]

Using objdump on the mysqld binary and the return addresses in the trace, I think the problem is mi_repair_by_sort() blindly freeing sort_info pointers after some lower layer detected the kill, specifically here in the 5.5.28a source:

mi_check.c:2557 my_free(sort_info.key_block);

Objdump excerpt:

 86843cf:	0f 84 a8 00 00 00    	je     868447d <mi_repair_by_sort+0x57d>
 86843d5:	8b 85 68 f9 ff ff    	mov    -0x698(%ebp),%eax
 86843db:	31 c9                	xor    %ecx,%ecx
 86843dd:	8d 50 e8             	lea    -0x18(%eax),%edx
 86843e0:	85 c0                	test   %eax,%eax
 86843e2:	0f 45 ca             	cmovne %edx,%ecx
 86843e5:	89 0c 24             	mov    %ecx,(%esp)
 86843e8:	e8 23 43 0d 00       	call   8758710 <my_free>
 86843ed:	8b 85 7c ff ff ff    	mov    -0x84(%ebp),%eax
 86843f3:	89 04 24             	mov    %eax,(%esp)
 86843f6:	e8 15 43 0d 00       	call   8758710 <my_free>   <<< HERE
 86843fb:	8b 45 84             	mov    -0x7c(%ebp),%eax
 86843fe:	89 04 24             	mov    %eax,(%esp)
 8684401:	e8 0a 43 0d 00       	call   8758710 <my_free>
 8684406:	8b 85 78 ff ff ff    	mov    -0x88(%ebp),%eax
 868440c:	89 04 24             	mov    %eax,(%esp)
 868440f:	e8 fc 42 0d 00       	call   8758710 <my_free>
 8684414:	8b 8d 54 f6 ff ff    	mov    -0x9ac(%ebp),%ecx
 868441a:	89 0c 24             	mov    %ecx,(%esp)
 868441d:	e8 3e 05 0c 00       	call   8744960 <end_io_cache>
 8684422:	8b 9d 60 f6 ff ff    	mov    -0x9a0(%ebp),%ebx

Or rather, some lower layer is leaving sort_info in a bad state after a kill.

Obviously this is MyISAM and the FULLTEXT focus may be on Aria now, but fwiw the code in ma_check.c and maria_repair_by_sort() looks very similar.



 Comments   
Comment by Sergei Golubchik [ 2012-12-21 ]

what mariadb binaries were you using? to make use of the addresses in the backtrace I need to have exactly the same binary.

Comment by Sean Pringle [ 2012-12-21 ]

mariadb-5.5.28a-linux-i686.tar.gz from downloads.mariadb.org.

$ md5sum mariadb-5.5.28a-linux-i686.tar.gz
34f744dc10504f85008bf4b060291234

$ md5sum 5.5.28/bin/mysqld
46b386736bd620e049956a145fad047b

Comment by Sergei Golubchik [ 2012-12-21 ]

Thanks, that helped. Just for the reference (and as a note to myself), the backtrace with addr2line information:

0xb7504591 libc.so.6[+0x6b591]
0xb7505de8 libc.so.6[+0x6cde8]
0xb7508ecd libc.so.6[cfree+0x6d]
0x0875872d mysqld[my_free+0x1d] => my_malloc.c:117
0x086843fb mysqld[mi_repair_by_sort] => mi_check.c:2558
0x08675e51 mysqld[ha_myisam::repair(THD*,st_handler_check_param&,bool)] => ha_myisam.cc:1133
0x08675ff2 mysqld[ha_myisam::enable_indexes(uint)] => ha_myisam.cc:1446
0x0867400e mysqld[ha_myisam::end_bulk_insert()] => ha_myisam.cc:1592
0x0823563c mysqld[copy_data_between_tables] => sql_table.cc:7509
0x0823e3fb mysqld[mysql_alter_table(THD*,char*,char*,st_ha_create_information*,TABLE_LIST*,Alter_info*,uint,st_order*,bool,bool)+0x237b] => sql_table.cc:6752
0x0847a1ef mysqld[Alter_table_statement::execute(THD*)+0x56f] => sql_alter.cc:108
0x081bd8e7 mysqld[mysql_execute_command(THD*)+0x727] => sql_parse.cc:4453
0x081c58fb mysqld[mysql_parse(THD*,char*,uint,Parser_state*)+0x1bb] => sql_parse.cc:5730
0x081c74eb mysqld[dispatch_command(enum_server_command,THD*,char*,uint)+0x1beb] => sql_parse.cc:1057
0x081c7a18 mysqld[do_command(THD*)+0xd8] => sql_parse.cc:794
0x08287f44 mysqld[do_handle_one_connection(THD*)+0x154] => sql_connect.cc:1253
0x0828808c mysqld[handle_one_connection+0x3c] => sql_connect.cc:1170
0xb775496e libpthread.so.0[+0x596e]
0xb7566a4e libc.so.6[clone+0x5e]

Comment by Sergei Golubchik [ 2013-06-03 ]

I've never was able to reproduce it.

And adding lots of debug checks in the code where the 'killed' flag is checked did not reveal anything.

Generated at Thu Feb 08 06:52:37 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.