[MDEV-15184] mysqld crashes with Signal 11 (Is it because of semijoin_nests ??) Created: 2018-02-02  Updated: 2018-03-07  Resolved: 2018-03-07

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.0.17
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Vijay Chenji Assignee: Alice Sherepa
Resolution: Incomplete Votes: 0
Labels: crash
Environment:

Database server – CentOS release 6.8 (Final)
Server version: 10.0.17-MariaDB-wsrep-log MariaDB Server, wsrep_25.10.r4144


Attachments: File log.error    

 Description   

mysqld is crashing repeatedly with signal 11. The log.error file did not capture any SQL statement at the time of the crash. The following entries are seen for every crash:

Thread pointer: 0x0x7fe6f8f93008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fe6dc7fedd8 thread_stack 0x40000
mysys/stacktrace.c:247(my_print_stacktrace)[0xbe8ede]
sql/signal_handler.cc:153(handle_fatal_signal)[0x738a7c]
/lib64/libpthread.so.0(+0xf7e0)[0x7fe92ad2e7e0]
sql/opt_subselect.cc:2203(optimize_semijoin_nests(JOIN*, unsigned long long))[0x6b54ad]
sql/sql_select.cc:4030(make_join_statistics)[0x5f06d5]
sql/sql_select.cc:1339(JOIN::optimize_inner())[0x5f160a]
sql/sql_select.cc:1037(JOIN::optimize())[0x5f4592]
sql/sql_select.cc:373(handle_select(THD*, LEX*, select_result*, unsigned long))[0x5f835d]
sql/sql_parse.cc:5740(execute_sqlcom_select)[0x59b501]
sql/sql_parse.cc:2851(mysql_execute_command(THD*))[0x59e47f]
sql/sql_parse.cc:7122(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x5a63ba]
sql/sql_parse.cc:6946(wsrep_mysql_parse)[0x5a642a]
sql/sql_parse.cc:1479(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x5a8db7]
sql/sql_parse.cc:1077(do_command(THD*))[0x5a9737]
sql/sql_connect.cc:1392(do_handle_one_connection(THD*))[0x673b50]
sql/sql_connect.cc:1305(handle_one_connection)[0x673d72]
perfschema/pfs.cc:1863(pfs_spawn_thread)[0xaa2ab9]
/lib64/libpthread.so.0(+0x7aa1)[0x7fe92ad26aa1]
/lib64/libc.so.6(clone+0x6d)[0x7fe929ca5bcd]



 Comments   
Comment by Daniel Black [ 2018-02-03 ]

I had a look and, despite the age of 10.0.17, as far as I can see very little has changed in the vicinity of https://github.com/MariaDB/server/blob/mariadb-galera-10.0.17/sql/opt_subselect.cc#L2203.

As you indicated, a query would be really helpful, as would the output of 'SHOW CREATE TABLE {tablenames}' and 'SHOW INDEXES FROM {tablenames}', and a my.cnf file.

If you get a core dump, can you use gdb to extract the following:

p *join
p *join->thd
p tableno
p tm_it
p *join->map2table[tableno]
thread apply all bt full
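
A minimal sketch of collecting the all-threads backtrace non-interactively, assuming a core file has already been produced. The binary and core file paths are placeholders, not taken from this report. The `p ...` expressions are left out on purpose: they only resolve once the frame for `optimize_semijoin_nests` has been selected by hand in an interactive gdb session.

```shell
#!/bin/sh
# Placeholder paths (assumptions): adjust to the actual installation.
MYSQLD_BIN="${MYSQLD_BIN:-/usr/sbin/mysqld}"
CORE_FILE="${CORE_FILE:-/var/lib/mysql/core}"
GDB_CMDS="${TMPDIR:-/tmp}/mdev15184.gdb"

# Write the gdb commands to a file so the same file can be reused per crash.
cat > "$GDB_CMDS" <<'EOF'
set pagination off
thread apply all bt full
EOF

# Only invoke gdb when there is something to analyse.
if [ -r "$CORE_FILE" ] && command -v gdb >/dev/null 2>&1; then
    gdb -batch -x "$GDB_CMDS" "$MYSQLD_BIN" "$CORE_FILE" > gdb-output.txt 2>&1
    echo "backtrace written to gdb-output.txt"
else
    echo "no readable core file at $CORE_FILE (or gdb is missing); nothing to do"
fi
```

The binary passed to gdb has to be the exact mysqld that produced the core, otherwise the symbols will not line up.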

Comment by Vijay Chenji [ 2018-02-05 ]

Thank you Daniel for your quick response.

Is there an already compiled debug version of mysqld available that we can use straight away?

Comment by Daniel Black [ 2018-02-05 ]

It seems there are no debuginfo packages (https://jira.mariadb.org/browse/MDEV-12508?focusedCommentId=97056) or debug packages (MDEV-13027) yet.

For how to enable core dumps at runtime, see MDEV-13126.
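
As a rough companion to that pointer, the sketch below only inspects the generic Linux settings that usually decide whether a core file gets written. It does not reproduce whatever MDEV-13126 describes, and it changes nothing on the system. Beyond these, mysqld itself normally also needs the `core-file` option in the `[mysqld]` section of my.cnf.

```shell
#!/bin/sh
# Read-only checks; nothing here modifies system state.

# Core size limit of the current shell (0 means no core files are written).
core_limit=$(ulimit -c)
echo "core size limit for this shell: $core_limit"

# Kernel settings: where cores go, and whether processes that changed
# user id (as mysqld does when started as root) may dump core at all.
for f in /proc/sys/kernel/core_pattern /proc/sys/fs/suid_dumpable; do
    if [ -r "$f" ]; then
        echo "$f = $(cat "$f")"
    else
        echo "$f is not readable on this system"
    fi
done

# Limit actually in effect for any running mysqld process.
for pid in $(pidof mysqld 2>/dev/null); do
    if [ -r "/proc/$pid/limits" ]; then
        echo "mysqld pid $pid: $(grep 'Max core file size' "/proc/$pid/limits")"
    fi
done
```

A limit of 0 for the running mysqld, or `suid_dumpable` set to 0, would explain a crash that leaves no core file behind.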

Maybe it's worth trying the latest 10.0-galera version on at least one node to see if it's reproducible there; I might have missed something looking at it statically. Remember to save your galera rpm package if you have it, as binary rpms seem to be missing from the archiving scripts (MDEV-15188).

Comment by Vijay Chenji [ 2018-02-09 ]

Hi Daniel,

I was trying to compile MariaDB for debugging. These are the steps I followed; cmake failed to configure.
Please advise how to get cmake to work properly.

The steps:

Installed MariaDB 10.0.17 in a VirtualBox CentOS 6 VM.
Prepared the system by installing the required tools such as cmake, git, g++, etc.

Created a build directory beside my source directory in /usr/local/:
mkdir build-mariadb
cd build-mariadb

GENERIC BUILD INSTRUCTIONS:
cmake . -DBUILD_CONFIG=mysql_release -DRPM=centos6

[root@vb-nwe build-mariadb]# cmake . -DBUILD_CONFIG=mysql_release -DRPM=centos6
CMake Error: The source directory "/usr/local/build-mariadb" does not appear to contain CMakeLists.txt.
Specify --help for usage, or press the help button on the CMake GUI.
[root@vb-nwe build-mariadb]#
[root@vb-nwe build-mariadb]# cp /usr/share/cmake/Modules/FortranCInterface/Verify/CMakeLists.txt /usr/local/build-mariadb
[root@vb-nwe build-mariadb]# ll
total 4
-rw-r--r-- 1 root root 1205 Feb 9 21:21 CMakeLists.txt
[root@vb-nwe build-mariadb]#
[root@vb-nwe build-mariadb]#
[root@vb-nwe build-mariadb]# cmake . -DBUILD_CONFIG=mysql_release
-- The C compiler identification is GNU 4.4.7
-- The Fortran compiler identification is unknown
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
CMake Error: your Fortran compiler: "CMAKE_Fortran_COMPILER-NOTFOUND" was not found. Please set CMAKE_Fortran_COMPILER to a valid compiler path or name.
-- Detecting Fortran/C Interface
CMake Error at /usr/share/cmake/Modules/CMakeFortranInformation.cmake:27 (get_filename_component):
get_filename_component called with incorrect number of arguments
Call Stack (most recent call first):
CMakeLists.txt:13 (project)

CMake Error: CMAKE_Fortran_COMPILER not set, after EnableLanguage
CMake Error: Internal CMake error, TryCompile configure of cmake failed
-- Detecting Fortran/C Interface - Failed to compile
CMake Warning (dev) at /usr/share/cmake/Modules/FortranCInterface.cmake:215 (message):
No FortranCInterface mangling known for VerifyFortran
Call Stack (most recent call first):
CMakeLists.txt:24 (FortranCInterface_HEADER)
This warning is for project developers. Use -Wno-dev to suppress it.

-- Configuring incomplete, errors occurred!

Comment by Daniel Black [ 2018-02-10 ]

cmake has to be run with the directory of the MariaDB source as its argument. The source can be obtained with {{git clone --single-branch --branch 10.0-galera https://github.com/MariaDB/server.git}} or by downloading a mariadb-galera source tarball from http://archive.mariadb.org/.

For more instructions, see:
https://mariadb.com/kb/en/library/debugging-mariadb/
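
A minimal sketch of an out-of-source configure step that guards against the "does not appear to contain CMakeLists.txt" error seen earlier. The function name `configure_debug_build` is hypothetical; `-DCMAKE_BUILD_TYPE=Debug` is the standard CMake switch for a debug build, and any further MariaDB-specific options are left to the page linked above.

```shell
#!/bin/sh
# configure_debug_build SRC_DIR BUILD_DIR
# SRC_DIR must be the top of the MariaDB source tree; BUILD_DIR is a
# separate, initially empty directory. Pass absolute paths for both.
configure_debug_build() {
    src_dir=$1
    build_dir=$2

    # cmake needs the directory that holds the project's own CMakeLists.txt.
    if [ ! -f "$src_dir/CMakeLists.txt" ]; then
        echo "error: $src_dir has no CMakeLists.txt; point this at the MariaDB source tree" >&2
        return 1
    fi

    mkdir -p "$build_dir" || return 1

    # Run cmake from the build directory, in a subshell so the caller's
    # working directory is left unchanged.
    ( cd "$build_dir" && cmake "$src_dir" -DCMAKE_BUILD_TYPE=Debug )
}
```

With the source cloned to, say, /usr/local/server, the call would be `configure_debug_build /usr/local/server /usr/local/build-mariadb`, followed by `make` inside the build directory. Copying an unrelated CMakeLists.txt into the build directory does not help, because cmake then configures that unrelated project instead.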

Comment by Vijay Chenji [ 2018-02-14 ]

Hi Daniel,

– Installed the source code of mariadb-10.0.17
– Set up the build environment on CentOS 6.8
– Compiled using cmake. The command used was:
cmake ../mysql3 -DCMAKE_INSTALL_PREFIX=/usr/local/mysql3 -DIGNORE_AIO_CHECK=ON
– Then ran make
– During make test, 54 of 55 tests passed and the last one, "dbug", failed.

55/55 Test #55: dbug .............................***Failed 0.00 sec

98% tests passed, 1 tests failed out of 55

Total Test time (real) = 138.14 sec

The following tests FAILED:
55 - dbug (Failed)
Errors while running CTest
make: *** [test] Error 8

=========
I upgraded gcc from 4.4 to 4.7, rebooted the server and reran make test, but the dbug test failed again. Any suggestions?

By the way, I was able to run ./scripts/mysql_install_db --defaults-file=./my.cnf, which created the system databases 'mysql' and 'performance_schema', and I also started mysqld using ./bin/mysqld_safe --defaults-file=./my.cnf.

Please suggest what I can do to get the dbug test to pass.

Comment by Alice Sherepa [ 2018-03-07 ]

Is it possible for you to find out which query is causing the crash? Is it reproducible, and how often do you get it?

Comment by Vijay Chenji [ 2018-03-07 ]

As no particular query was known to have caused the crashes, I tried to reproduce the crash using a number of queries gathered from the file "log.slowquery". These queries have multiple joins, create temp tables, and can put considerable stress on the database. I ran them simultaneously from multiple sessions, but none of them brought down the database; on the contrary, all of them completed successfully, though they took longer than their average execution time in production. I did all of this on a VirtualBox VM with no real application connections, since I am not supposed to test anything on the production server. In short, I could not reproduce the crash with any of the stress queries.

As an alternative, we edited the values of these variables in my.cnf, and since then we have not had any more crashes. We are not sure whether these changes or a change in application behavior resolved it:

Changed from:
join_cache_level=8
optimizer_switch='mrr=on'
optimizer_switch='mrr_sort_keys=on'

to this:
join_cache_level=2

  1. optimizer_switch='mrr=on'
  2. optimizer_switch='mrr_sort_keys=on'

You may go ahead and close the ticket. Thanks to you and Daniel for all your suggestions and time.

Comment by Vijay Chenji [ 2018-03-07 ]

In the above post, the comment sign # got changed to the numbers 1 and 2. These lines were actually commented out in my.cnf; both optimizer_switch='mrr=on' and optimizer_switch='mrr_sort_keys=on' were commented out:

#optimizer_switch='mrr=on'
#optimizer_switch='mrr_sort_keys=on'
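
A small sketch for double-checking which of these settings are actually active in an option file, since a stray or missing # is easy to overlook. The helper name `active_settings` is made up for illustration, and the sample file is a throwaway copy of the fragment above, not the real my.cnf. The pattern only covers the underscore spelling of the option names.

```shell
#!/bin/sh
# Print the lines of an option file that actively set join_cache_level or
# optimizer_switch. Lines starting with '#' are comments and do not match.
active_settings() {
    grep -E '^[[:space:]]*(join_cache_level|optimizer_switch)[[:space:]]*=' "$1"
}

# Throwaway sample reproducing the edited fragment.
sample=$(mktemp)
cat > "$sample" <<'EOF'
join_cache_level=2
#optimizer_switch='mrr=on'
#optimizer_switch='mrr_sort_keys=on'
EOF

active_settings "$sample"   # prints: join_cache_level=2
```

Run against the real option file, the same function shows at a glance whether any optimizer_switch line is still in effect.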

Comment by Alice Sherepa [ 2018-03-07 ]

I am closing it, but if you find out what caused the crash or why, please write.
Happy to hear that there were no more crashes.

Generated at Thu Feb 08 08:19:19 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.