[MDEV-18539] WITH RECURSIVE queries crash server (Galera cluster) Created: 2019-02-11  Updated: 2019-05-27  Resolved: 2019-05-27

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.3.12
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Konstantin Vasserman Assignee: Alice Sherepa
Resolution: Incomplete Votes: 0
Labels: need_feedback
Environment:

Ubuntu 16.04 and 18.04


Attachments: Text File WITH-RECURSIVE-CREATE-TABLES.txt    

 Description   

Ever since we've upgraded to 10.3 (from 10.2) we are experiencing server crashes on WITH RECURSIVE statements. We've seen it with various WITH RECURSIVE queries and it seems to be data dependent. Unfortunately, we can't provide the data for your testing.

This is super critical for us, because it's affecting our production environment (unfortunately we didn't detect this in testing, since it's data specific) with Galera nodes randomly crashing. Yesterday, it took down the entire cluster.

Please help.

Here is an example of logs from the latest crash:

-------------------------------------
Feb 11 14:54:27 maria2 mysqld[25623]: 190211 14:54:27 [ERROR] mysqld got signal 11 ;
Feb 11 14:54:27 maria2 mysqld[25623]: This could be because you hit a bug. It is also possible that this binary
Feb 11 14:54:27 maria2 mysqld[25623]: or one of the libraries it was linked against is corrupt, improperly built,
Feb 11 14:54:27 maria2 mysqld[25623]: or misconfigured. This error can also be caused by malfunctioning hardware.
Feb 11 14:54:27 maria2 mysqld[25623]: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
Feb 11 14:54:27 maria2 mysqld[25623]: We will try our best to scrape up some info that will hopefully help
Feb 11 14:54:27 maria2 mysqld[25623]: diagnose the problem, but since we have already crashed,
Feb 11 14:54:27 maria2 mysqld[25623]: something is definitely wrong and this may fail.
Feb 11 14:54:27 maria2 mysqld[25623]: Server version: 10.3.12-MariaDB-1:10.3.12+maria~bionic-log
Feb 11 14:54:27 maria2 mysqld[25623]: key_buffer_size=134217728
Feb 11 14:54:27 maria2 mysqld[25623]: read_buffer_size=2097152
Feb 11 14:54:27 maria2 mysqld[25623]: max_used_connections=4310
Feb 11 14:54:27 maria2 mysqld[25623]: max_threads=10002
Feb 11 14:54:27 maria2 mysqld[25623]: thread_count=3874
Feb 11 14:54:27 maria2 mysqld[25623]: It is possible that mysqld could use up to
Feb 11 14:54:27 maria2 mysqld[25623]: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61806351 K bytes of memory
Feb 11 14:54:27 maria2 mysqld[25623]: Hope that's ok; if not, decrease some variables in the equation.
Feb 11 14:54:27 maria2 mysqld[25623]: Thread pointer: 0x7f9808cfb0a8
Feb 11 14:54:27 maria2 mysqld[25623]: Attempting backtrace. You can use the following information to find out
Feb 11 14:54:27 maria2 mysqld[25623]: where mysqld died. If you see no messages after this, something went
Feb 11 14:54:27 maria2 mysqld[25623]: terribly wrong...
Feb 11 14:54:27 maria2 mysqld[25623]: stack_bottom = 0x7f973a702dd8 thread_stack 0x49000
Feb 11 14:54:27 maria2 mysqld[25623]: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x563c179ceffe]
Feb 11 14:54:27 maria2 mysqld[25623]: /usr/sbin/mysqld(handle_fatal_signal+0x5a5)[0x563c1746f8e5]
Feb 11 14:54:28 maria2 mysqld[25623]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f9f01d75890]
Feb 11 14:54:28 maria2 mysqld[25623]: /usr/sbin/mysqld(+0x9aa7a7)[0x563c176d47a7]
Feb 11 14:54:28 maria2 mysqld[25623]: /usr/sbin/mysqld(+0x8d1ac4)[0x563c175fbac4]
Feb 11 14:54:28 maria2 mysqld[25623]: /usr/sbin/mysqld(_ZN7handler17ha_index_read_mapEPhPKhm16ha_rkey_function+0x118)[0x563c17474b08]
Feb 11 14:54:28 maria2 mysqld[25623]: /usr/sbin/mysqld(+0x5b0f68)[0x563c172daf68]
Feb 11 14:54:29 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x151)[0x563c172cd8e1]
Feb 11 14:54:29 maria2 mysqld[25623]: /usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x946)[0x563c172ec3d6]
Feb 11 14:54:29 maria2 mysqld[25623]: /usr/sbin/mysqld(_ZN4JOIN4execEv+0x33)[0x563c172ec723]
Feb 11 14:54:29 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_jP8st_orderS9_S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xeb)[0x563c172ec86b]
Feb 11 14:54:29 maria2 mysqld[25623]: 2019-02-11 14:54:29 44 [Warning] Aborted connection 44 to db: 'unconnected' user: 'maxscale-svc' host: 'maxscale1.c.e2shopcloud.internal' (Got an error reading communication packets)
Feb 11 14:54:30 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z18mysql_derived_fillP3THDP3LEXP10TABLE_LIST+0x1cf)[0x563c1725ff7f]
Feb 11 14:54:30 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z27mysql_handle_single_derivedP3LEXP10TABLE_LISTj+0x9a)[0x563c1725fb8a]
Feb 11 14:54:30 maria2 mysqld[25623]: /usr/sbin/mysqld(_ZN13st_join_table12preread_initEv+0x68)[0x563c172cd6b8]
Feb 11 14:54:30 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x460)[0x563c172cdbf0]
Feb 11 14:54:30 maria2 mysqld[25623]: /usr/sbin/mysqld(+0x599308)[0x563c172c3308]
Feb 11 14:54:31 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x170)[0x563c172cd900]
Feb 11 14:54:31 maria2 mysqld[25623]: /usr/sbin/mysqld(+0x599308)[0x563c172c3308]
Feb 11 14:54:31 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x170)[0x563c172cd900]
Feb 11 14:54:31 maria2 mysqld[25623]: /usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0x946)[0x563c172ec3d6]
Feb 11 14:54:31 maria2 mysqld[25623]: /usr/sbin/mysqld(_ZN4JOIN4execEv+0x33)[0x563c172ec723]
Feb 11 14:54:32 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_jP8st_orderS9_S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xeb)[0x563c172ec86b]
Feb 11 14:54:32 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x14d)[0x563c172ed1dd]
Feb 11 14:54:32 maria2 mysqld[25623]: /usr/sbin/mysqld(+0x564541)[0x563c1728e541]
Feb 11 14:54:32 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x5ec2)[0x563c1729ad62]
Feb 11 14:54:32 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x20a)[0x563c1729d92a]
Feb 11 14:54:32 maria2 mysqld[25623]: 2019-02-11 14:54:32 48990 [Warning] Aborted connection 48990 to db: 'unconnected' user: 'maxscale-svc' host: 'maxscale1.c.e2shopcloud.internal' (Got an error reading communication packets)
Feb 11 14:54:33 maria2 mysqld[25623]: 2019-02-11 14:54:33 45 [Warning] Aborted connection 45 to db: 'unconnected' user: 'maxscale-svc' host: 'maxscale2.c.e2shopcloud.internal' (Got an error reading communication packets)
Feb 11 14:54:33 maria2 mysqld[25623]: /usr/sbin/mysqld(+0x5741c6)[0x563c1729e1c6]
Feb 11 14:54:33 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x130c)[0x563c1729f96c]
Feb 11 14:54:33 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z10do_commandP3THD+0x178)[0x563c172a1098]
Feb 11 14:54:33 maria2 mysqld[25623]: /usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x212)[0x563c173708f2]
Feb 11 14:54:34 maria2 mysqld[25623]: /usr/sbin/mysqld(handle_one_connection+0x3d)[0x563c17370acd]
Feb 11 14:54:34 maria2 mysqld[25623]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f9f01d6a6db]
Feb 11 14:54:34 maria2 mysqld[25623]: /lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f9f0136c88f]
Feb 11 14:54:34 maria2 mysqld[25623]: Trying to get some variables.
Feb 11 14:54:34 maria2 mysqld[25623]: Some pointers may be invalid and cause the dump to abort.
Feb 11 14:54:34 maria2 mysqld[25623]: Query (0x7f9808ca7960): WITH RECURSIVE allJobsWithSubs AS ( SELECT OrderDet.JobNo, OrderDet.MasterJobNo, OrderDet.OrderDet_ID FROM OrderDet WHERE (OrderDet.MasterJobNo IS NULL OR OrderDet.MasterJobNo = '' OR OrderDet.JobNo IN (SELECT ReWorkJobNo FROM CustReturnDet) OR OrderDet.MasterJobNo IN (SELECT ReworkJobNo FROM CustReturnDet)) AND OrderDet.Status = 'OPEN' AND OrderDet.ScheduleLocked = 'N' AND OrderDet.JobNo <> '' AND OrderDet.JobNo IS NOT NULL UNION ALL SELECT sub.JobNo, sub.MasterJobNo, sub.OrderDet_ID FROM OrderDet AS sub, allJobsWithSubs AS jobs WHERE sub.MasterJobNo = jobs.JobNo ), distinctJobsWithParts AS ( SELECT DISTINCT OrderDet.JobNo, OrderDet.PartNo FROM OrderDet INNER JOIN allJobsWithSubs ON OrderDet.JobNo = allJobsWithSubs.JobNo ) SELECT OrderDet.*, IFNULL(qtyFromStock.TotQtyPosted1, 0) AS QtyPostedFromStock, IFNULL(qtyFromOtherJob.TotQty, 0) AS QtyMadeOnAnotherJob FROM OrderDet INNER JOIN distinctJobsWithParts ON OrderDet.JobNo = distinctJobsWithParts.JobNo LEFT JOIN ( SELECT JobMaterials.JobNo, JobMaterials.PartNo, SUM(IFNULL(QtyPosted1,0)) AS TotQtyPosted1 FROM JobMaterials INNER JOIN distinctJobsWithParts ON distinctJobsWithParts.JobNo = JobMaterials.JobNo AND distinctJobsWithParts.PartNo = JobMaterials.PartNo WHERE PostedFromStock = 'Y' GROUP BY JobMaterials.JobNo, JobMaterials.PartNo ) AS qtyFromStock ON distinctJobsWithParts.JobNo = qtyFromStock.JobNo LEFT JOIN ( SELECT JobNo, SUM(IFNULL(Qty, 0)) AS TotQty FROM Releases WHERE MfgJobNo IS NOT NULL AND MfgJobNo <> '' AND DelType <> 2 AND DateComplete IS NULL GROUP BY JobNo ) AS qtyFromOtherJob ON distinctJobsWithParts.JobNo = qtyFromOtherJob.JobNo ORDER BY IFNULL(OrderDet.MasterJobNo, ''), OrderDet.Priority DESC, OrderDet.DueDate ASC
Feb 11 14:54:34 maria2 mysqld[25623]: Connection ID (thread ID): 34065
Feb 11 14:54:34 maria2 mysqld[25623]: Status: NOT_KILLED
Feb 11 14:54:34 maria2 mysqld[25623]: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
Feb 11 14:54:34 maria2 mysqld[25623]: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
Feb 11 14:54:34 maria2 mysqld[25623]: information that should help you find out what is causing the crash.
Feb 11 14:54:34 maria2 kernel: mysqld[12769]: segfault at fffffffffffffffb ip 0000563c176d47a7 sp 00007f973a6ff1c0 error 5
Feb 11 14:54:36 maria2 systemd[1]: mariadb.service: Main process exited, code=dumped, status=11/SEGV
Feb 11 14:54:36 maria2 systemd[1]: mariadb.service: Failed with result 'core-dump'.

-------------------------------------



 Comments   
Comment by Alice Sherepa [ 2019-02-13 ]

Could you please provide the output of SHOW CREATE TABLE for all involved tables(OrderDet, CustReturnDet, JobMaterials, Releases) and your .cnf file(s)

Comment by Konstantin Vasserman [ 2019-02-14 ]

Alice,

I've attached the file with table definitions. We do not modify default my.cnf, instead we override settings in a file we drop into /etc/mysql/conf.d/. Here are the settings we override:
------------------------------------------------
[mysqld]
bind-address = 0.0.0.0
server-id = 201
max_connections = 10000
datadir = /data/maria
tmpdir = /data/temp
slow_query_log_file = /data/logs/mariadb-slow.log
log_bin = /data/logs/mariadb-bin
log_bin_index = /data/logs/mariadb-bin.index
expire_logs_days = 14
log_slave_updates
innodb_buffer_pool_size = 18G
innodb_buffer_pool_instances = 32
innodb_strict_mode = 1
binlog_format = ROW
default-storage-engine = innodb
log_bin_trust_function_creators = 1
wait_timeout=28800
max_allowed_packet=33554432

max-statement-time = 300
collation-server = 'utf8mb4_general_ci'
character-set-server = 'utf8mb4'
---------------------------------------------------------

We had major cluster disaster again yesterday due to this bug. We had to restart the cluster. Today we've noticed the following reports in system logs. They seem to be related to the issue, because they also deal with WITH RECURSIVE statements, but these particular queries do not crash server (yet). Here is an example of what we are seeing today (I redacted database name and user name for security reasons):

-------------------------------------------------
Feb 14 14:11:08 maria2 mysqld[1124]: 2019-02-14 14:11:08 136577 [ERROR] InnoDB: Clustered record for sec rec not found index `IX_JobMaterials_JobNo_PartNo` of table `MYDATABASE`.`JobMaterials`
Feb 14 14:11:08 maria2 mysqld[1124]: InnoDB: sec index record PHYSICAL RECORD: n_fields 3; compact format; info bits 0
Feb 14 14:11:08 maria2 mysqld[1124]: 0: len 30; hex 800a5aee00007fb46f70a0000144860164343235343834323534382d3031; asc Z op D d4254842548-01; (total 255 bytes);
Feb 14 14:11:08 maria2 mysqld[1124]: 1: len 0; hex ; asc ;;
Feb 14 14:11:08 maria2 mysqld[1124]: 2: len 4; hex 10000138; asc 8;;
Feb 14 14:11:08 maria2 mysqld[1124]: InnoDB: clust index record PHYSICAL RECORD: n_fields 1; compact format; info bits 0
Feb 14 14:11:08 maria2 mysqld[1124]: 0: len 8; hex 696e66696d756d00; asc infimum ;;
Feb 14 14:11:08 maria2 mysqld[1124]: TRANSACTION 2181418199, ACTIVE 1 sec starting index read
Feb 14 14:11:08 maria2 mysqld[1124]: mysql tables in use 7, locked 0
Feb 14 14:11:08 maria2 mysqld[1124]: 11 lock struct(s), heap size 1136, 5 row lock(s), undo log entries 10
Feb 14 14:11:08 maria2 mysqld[1124]: MySQL thread id 136577, OS thread handle 140568439531264, query id 80857204 maxscale.c.mycloud.internal 10.128.0.14 MYUSER Sending data
Feb 14 14:11:08 maria2 mysqld[1124]: WITH RECURSIVE
Feb 14 14:11:08 maria2 mysqld[1124]: allJobsWithSubs AS (
Feb 14 14:11:08 maria2 mysqld[1124]: #015
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011SELECT
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011OrderDet.JobNo,
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011OrderDet.MasterJobNo,
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011OrderDet.OrderDet_ID
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011FROM OrderDet
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011WHERE OrderDet.JobNo = '42789-10'
Feb 14 14:11:08 maria2 mysqld[1124]: UNION ALL
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011SELECT
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011sub.JobNo,
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011sub.MasterJobNo,
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011sub.OrderDet_ID
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011FROM OrderDet AS sub, allJobsWithSubs AS jobs
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011WHERE sub.MasterJobNo = jobs.JobNo
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011),
Feb 14 14:11:08 maria2 mysqld[1124]: #015
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011distinctJobsWithParts AS (
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011SELECT DISTINCT OrderDet.JobNo, OrderDet.PartNo
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011FROM OrderDet
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011INNER JOIN allJobsWithSubs ON OrderDet.JobNo = allJobsWithSubs.JobNo
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011)
Feb 14 14:11:08 maria2 mysqld[1124]: #015
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011SELECT
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011OrderDet.*,
Feb 14 14:11:08 maria2 mysqld[1124]: #011#011#011#011#011IFNULL(qt
Feb 14 14:11:08 maria2 mysqld[1124]: InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
-------------------------------------------------

Comment by Alice Sherepa [ 2019-04-25 ]

kvasserman, sorry, I can not reproduce the crash, maybe it depends on data. Is it still reproducible in your environment?

Generated at Thu Feb 08 08:44:53 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.