[MDEV-12760] CONNECT Engine crashes with signal 7 with JSON table type
Created: 2017-05-09  Updated: 2020-08-25  Resolved: 2017-06-01

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Connect
Affects Version/s: 10.1.20
Fix Version/s: 10.1.25, 10.0.32, 10.2.7

Type: Bug
Priority: Major
Reporter: Geoff Montee (Inactive)
Assignee: Olivier Bertrand
Resolution: Fixed
Votes: 0
Labels: connect-engine, crash, json


 Description   

A user using the CONNECT engine with JSON table types is seeing the following crash:

170508 21:25:20 [ERROR] mysqld got signal 7 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.1.20-MariaDB
key_buffer_size=1048576
read_buffer_size=262144
max_used_connections=280
max_threads=502
thread_count=223
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 653874 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x7fd5151af008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7fd18509bd40 thread_stack 0x3c000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0x7fd7bdb2e5cb]
/usr/sbin/mysqld(handle_fatal_signal+0x4d5)[0x7fd7bd687375]
/lib64/libpthread.so.0(+0xf7e0)[0x7fd7bcc887e0]
/usr/lib64/mysql/plugin/ha_connect.so(_Z11ParseStringP7_globalRiR4STRG+0x67)[0x7fd7b7531947]
/usr/lib64/mysql/plugin/ha_connect.so(_Z10ParseValueP7_globalRiR4STRGPb+0x226)[0x7fd7b7532d96]
/usr/lib64/mysql/plugin/ha_connect.so(_Z11ParseObjectP7_globalRiR4STRGPb+0x287)[0x7fd7b7533127]
/usr/lib64/mysql/plugin/ha_connect.so(_Z10ParseValueP7_globalRiR4STRGPb+0xdc)[0x7fd7b7532c4c]
/usr/lib64/mysql/plugin/ha_connect.so(_Z10ParseArrayP7_globalRiR4STRGPb+0xc9)[0x7fd7b7532929]
/usr/lib64/mysql/plugin/ha_connect.so(_Z9ParseJsonP7_globalPciPiPb+0x280)[0x7fd7b7533480]
/usr/lib64/mysql/plugin/ha_connect.so(_ZN7TDBJSON12MakeDocumentEP7_global+0xa6)[0x7fd7b755aaa6]
/usr/lib64/mysql/plugin/ha_connect.so(_ZN7TDBJSON11CardinalityEP7_global+0x36)[0x7fd7b755b196]
/usr/lib64/mysql/plugin/ha_connect.so(_Z7CntInfoP7_globalP3TDBP6_xinfo+0xfc)[0x7fd7b750d03c]
/usr/lib64/mysql/plugin/ha_connect.so(_ZN10ha_connect4infoEj+0x2c4)[0x7fd7b7501154]
/usr/sbin/mysqld(+0x4c1c69)[0x7fd7bd57ac69]
/usr/sbin/mysqld(+0x4b5147)[0x7fd7bd56e147]
/usr/sbin/mysqld(_Z14get_all_tablesP3THDP10TABLE_LISTP4Item+0x81e)[0x7fd7bd58357e]
/usr/sbin/mysqld(_Z24get_schema_tables_resultP4JOIN23enum_schema_table_state+0x28e)[0x7fd7bd5810ce]
/usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x6b5)[0x7fd7bd568d85]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x5d)[0x7fd7bd56b19d]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x12a)[0x7fd7bd567b6a]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x25d)[0x7fd7bd56b47d]
/usr/sbin/mysqld(+0x451e32)[0x7fd7bd50ae32]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x5d8c)[0x7fd7bd5170fc]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x334)[0x7fd7bd51a6b4]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x2293)[0x7fd7bd51d183]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x16b)[0x7fd7bd51d71b]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x17f)[0x7fd7bd5dbfff]
/usr/sbin/mysqld(handle_one_connection+0x47)[0x7fd7bd5dc157]
/lib64/libpthread.so.0(+0x7aa1)[0x7fd7bcc80aa1]
/lib64/libc.so.6(clone+0x6d)[0x7fd7bb164bcd]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7fd513f83020): is an invalid pointer
Connection ID (thread ID): 62694
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off



 Comments   
Comment by Olivier Bertrand [ 2017-05-12 ]

Is this the same problem as the one reported in MDEV-12667?

If not, I cannot do anything without some basic information:

The scenario causing the crash: the table definition, the command executed and, if possible, the data file(s) accessed by the table.

Thanks.

Comment by Geoff Montee (Inactive) [ 2017-05-12 ]

Hi bertrandop,

I emailed you that information earlier this week. Please let me know if you need any other information.

Comment by Geoff Montee (Inactive) [ 2017-05-12 ]

No, this problem is separate from MDEV-12667, but both problems were encountered on the same server.

Comment by Olivier Bertrand [ 2017-05-13 ]

Hello Geoff,

With the table definitions and the json files I was able to reproduce the problem.

It is a memory problem. For the largest file app_change_log.json the default size of the connect work area (64M) is not enough. Increasing it by:

set connect_work_size=100000000;

was enough to suppress the problem.

Note that on my machine I just got an error message. No crash by signal 7.

I don't know why, but I suspect that your version of CONNECT was still using longjmp for error handling. There has been something weird with longjmp in this specific function, such as crashes when the jumps were set. This is why I replaced all of them with try/catch.
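
To illustrate the difference, here is a minimal C++ sketch of a fixed-size work area that reports exhaustion with an exception caught by the caller, rather than with longjmp. The names WorkArea and load_records are hypothetical and are not taken from the CONNECT sources; the sketch only shows the general pattern of turning an out-of-space condition into an ordinary error.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a fixed-size work area (not CONNECT's real code).
class WorkArea {
public:
    explicit WorkArea(std::size_t size) : buf_(size), used_(0) {}

    // Hand out n bytes from the area, or throw when not enough space is left.
    // No alignment handling: this sketch only stores raw characters.
    char* alloc(std::size_t n) {
        const std::size_t free_bytes = buf_.size() - used_;
        if (n > free_bytes)
            throw std::runtime_error("work area exhausted: need " +
                                     std::to_string(n) + " bytes, " +
                                     std::to_string(free_bytes) + " free");
        char* p = buf_.data() + used_;
        used_ += n;
        return p;
    }

    std::size_t used() const { return used_; }

private:
    std::vector<char> buf_;
    std::size_t used_;
};

// Copy every record into the work area, as a parser keeping its nodes would.
// Returns true on success; on exhaustion it fills err and returns false,
// so the caller sees an error message instead of a crash.
bool load_records(WorkArea& area, const std::vector<std::string>& records,
                  std::string& err) {
    try {
        for (const std::string& r : records) {
            char* dst = area.alloc(r.size());
            std::copy(r.begin(), r.end(), dst);
        }
        return true;
    } catch (const std::runtime_error& e) {
        err = e.what();
        return false;
    }
}
```

With an 8-byte area, loading the records "12345" and "6789" fails with an error message; with a 64-byte area it succeeds. In the same way, raising connect_work_size makes a given file fit, while a bigger file can still hit the limit.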

Therefore, what should be done next is:

  1. Checking that this can be avoided by increasing the work memory size. This will be just a workaround because the same can still occur with bigger files.
  2. Using a newer version of MariaDB and CONNECT to avoid crashing.
  3. If possible, using JSON files in the pretty=0 format, because each record is then parsed when it is used instead of the whole file being parsed at once.
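
The memory argument behind item 3 can be sketched as follows, assuming a pretty=0 file holds one complete JSON record per line (check the CONNECT documentation for the exact layout). scan_records and ScanStats are hypothetical names, and the handler stands in for whatever parses a single record; the point is only that the data held at any time is bounded by the longest record, not by the file size.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>

// Counters collected while scanning (hypothetical helper type).
struct ScanStats {
    std::size_t records = 0;     // number of non-empty lines handled
    std::size_t peak_bytes = 0;  // size of the longest single record seen
};

// Read the input one line at a time and pass each non-empty line to handle.
// Only one record is held in memory at any moment, so the peak footprint
// follows the longest line rather than the total size of the input.
ScanStats scan_records(std::istream& in,
                       const std::function<void(const std::string&)>& handle) {
    ScanStats stats;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty())
            continue;  // skip blank lines
        handle(line);
        ++stats.records;
        if (line.size() > stats.peak_bytes)
            stats.peak_bytes = line.size();
    }
    return stats;
}
```

The function can be fed from a std::ifstream for a real file or from a std::istringstream for a quick check. A pretty-printed file parsed as a whole needs room for the entire document tree at once, which is what exhausted the work area in this report.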

I hope this helps,

Olivier

Comment by Geoff Montee (Inactive) [ 2017-05-25 ]

Hi bertrandop,

"Using a newer version of MariaDB and CONNECT to avoid crashing."

The user saw a very similar crash in MariaDB 10.1.23, which is the latest MariaDB 10.1 release:

170525 10:38:04 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.1.23-MariaDB
key_buffer_size=1048576
read_buffer_size=262144
max_used_connections=197
max_threads=502
thread_count=93
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 653874 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f54b70a0008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f56768b9d40 thread_stack 0x3c000
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0x7f5769dd1c8b]
/usr/sbin/mysqld(handle_fatal_signal+0x4d5)[0x7f576992cbf5]
/lib64/libpthread.so.0(+0xf7e0)[0x7f5768f2c7e0]
/usr/lib64/mysql/plugin/ha_connect.so(_Z10PlgGetUserP7_global+0x0)[0x7f576371f870]
/usr/lib64/mysql/plugin/ha_connect.so(_ZN10ha_connect4infoEj+0x248)[0x7f57636e1bc8]
/usr/sbin/mysqld(+0x4c2561)[0x7f576981f561]
/usr/sbin/mysqld(+0x4b664d)[0x7f576981364d]
/usr/sbin/mysqld(_Z14get_all_tablesP3THDP10TABLE_LISTP4Item+0x81d)[0x7f5769827fbd]
/usr/sbin/mysqld(_Z24get_schema_tables_resultP4JOIN23enum_schema_table_state+0x28e)[0x7f5769825b0e]
/usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x6b5)[0x7f576980d455]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x5d)[0x7f576980f86d]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x12a)[0x7f576980c22a]
/usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x25d)[0x7f576980fb4d]
/usr/sbin/mysqld(+0x452de2)[0x7f57697afde2]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x5b3a)[0x7f57697bc0da]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x324)[0x7f57697bf3c4]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x22a3)[0x7f57697c1e43]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x151)[0x7f57697c23c1]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x17f)[0x7f57698812cf]
/usr/sbin/mysqld(handle_one_connection+0x47)[0x7f5769881407]
/lib64/libpthread.so.0(+0x7aa1)[0x7f5768f24aa1]
/lib64/libc.so.6(clone+0x6d)[0x7f5767408bcd]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f54bb13e020): is an invalid pointer
Connection ID (thread ID): 15314
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off

Is this the same crash or a slightly different case?

Comment by Olivier Bertrand [ 2017-05-26 ]

MariaDB 10.1.23 did not include the latest CONNECT version. I hope 10.1.24 will.

Was it possible to avoid the crash by increasing memory?

Comment by Geoff Montee (Inactive) [ 2017-05-26 ]

Ah, it's good to know that that fix might be in the next release.

I'm still trying to find out if increasing connect_work_size prevents the crash. They haven't answered that question yet.

Comment by Geoff Montee (Inactive) [ 2017-05-30 ]

The user said that they have converted all JSON tables in the database to use pretty=0, and they have been setting connect_work_size, but they are still seeing the above signal 11 crash in MariaDB 10.1.23.

Comment by Olivier Bertrand [ 2017-05-31 ]

Indeed, this crash is different and probably not related to JSON at all. The first one described was a signal 7 occurring when parsing a JSON file, most probably a memory problem (bus error or bad address).

The new one is a signal 11 (segmentation fault) occurring in PlgGetUser called by Info. This is a very small function using its pointer argument g to return another pointer. In theory, it cannot fail unless it is given a NULL or wrong pointer g. Until now, this has never happened on any machine, but who knows. Murphy's law is there to tell us the worst is always possible.

Therefore I shall fix that and it will be available in future releases.
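
A defensive guard of the kind described might look like the following sketch. It is not the actual CONNECT fix: Context, User, get_user and table_info are hypothetical stand-ins for the pointer argument g, the returned pointer, PlgGetUser and Info. It only covers the NULL case; a pointer that is wrong but non-NULL cannot be detected this way.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the structures involved (not CONNECT's types).
struct User {
    std::string name;
};

struct Context {
    User* user = nullptr;
};

// Return the user attached to the context, or nullptr when no context
// was supplied, instead of dereferencing a NULL pointer.
User* get_user(const Context* g) {
    if (g == nullptr)
        return nullptr;
    return g->user;
}

// Caller in the role of Info: turn a missing user into an error message
// rather than letting the server crash with a segmentation fault.
bool table_info(const Context* g, std::string& err) {
    User* u = get_user(g);
    if (u == nullptr) {
        err = "no user context available";
        return false;
    }
    return true;
}
```

Calling table_info with a NULL context then returns false with a message, and the statement can fail cleanly.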

Comment by Geoff Montee (Inactive) [ 2017-05-31 ]

Ah, I see. Thanks for fixing it!

Comment by Olivier Bertrand [ 2017-06-01 ]

I hope this will fix it. However, not being able to reproduce it, I cannot be sure.

This will hopefully be included in the next MariaDB versions released from today on.
