[MDEV-28556] segfault at 78 ip 0000556abe28aee5 sp 00007f42740812f0 error 4 in mysqld[556abdc91000+1121000] Created: 2022-05-13  Updated: 2022-05-16  Resolved: 2022-05-13

Status: Closed
Project: MariaDB Server
Component/s: Optimizer - CTE
Affects Version/s: 10.2.10
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Luke Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Environment:

CentOS Linux release 7.9.2009 (Core)


Issue Links:
Duplicate

 Description   

One of our production MariaDB instances restarted frequently when running some queries and I found these logs in /var/log/messages:

May 12 15:39:01 prod-looker-db-lv-101 kernel: [11020058.169004] mysqld[10431]: segfault at 78 ip 0000556abe28aee5 sp 00007f42740812f0 error 4 in mysqld[556abdc91000+1121000]
May 12 15:40:32 prod-looker-db-lv-101 kernel: [11020148.639296] mysqld[21778]: segfault at 78 ip 000055e1baa31ee5 sp 00007fe0b8d172f0 error 4 in mysqld[55e1ba438000+1121000]
May 12 15:54:59 prod-looker-db-lv-101 kernel: [11021015.856639] mysqld[25739]: segfault at 78 ip 000055d9c4d3aee5 sp 00007f60633652f0 error 4 in mysqld[55d9c4741000+1121000]
May 13 09:26:50 prod-looker-db-lv-101 kernel: [11084127.967062] mysqld[28518]: segfault at 78 ip 000055b3655b8ee5 sp 00007f66f85622f0 error 4 in mysqld[55b364fbf000+1121000]
May 13 09:43:27 prod-looker-db-lv-101 kernel: [11085125.330421] mysqld[10395]: segfault at 78 ip 0000556a65d1bee5 sp 00007f35739922f0 error 4 in mysqld[556a65722000+1121000]
May 13 09:45:42 prod-looker-db-lv-101 kernel: [11085260.650045] mysqld[15932]: segfault at 78 ip 000055a9ffdfbee5 sp 00007f81198a52f0 error 4 in mysqld[55a9ff802000+1121000]

I find some extra info in the error log:

Thread pointer: 0x7f33df812008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f66f8564d00 thread_stack 0x49000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55b365baaa2e]
/usr/sbin/mysqld(handle_fatal_signal+0x30d)[0x55b365627b6d]
/lib64/libpthread.so.0(+0xf630)[0x7f67231a9630]
/usr/sbin/mysqld(_ZN11With_clause14find_table_defEP10TABLE_LISTP12With_element+0x35)[0x55b3655b8ee5]
/usr/sbin/mysqld(_Z30find_table_def_in_with_clausesP10TABLE_LISTP17st_unit_ctxt_elem+0x59)[0x55b3655b8f89]
/usr/sbin/mysqld(_ZN12With_element28check_dependencies_in_selectEP13st_select_lexP17st_unit_ctxt_elembPy+0x174)[0x55b3655b92e4]
/usr/sbin/mysqld(_ZN12With_element26check_dependencies_in_unitEP18st_select_lex_unitP17st_unit_ctxt_elembPy+0x7c)[0x55b3655b914c]
/usr/sbin/mysqld(_ZN12With_element33check_dependencies_in_with_clauseEP11With_clauseP17st_unit_ctxt_elembPy+0x48)[0x55b3655b90b8]
/usr/sbin/mysqld(_ZN12With_element26check_dependencies_in_unitEP18st_select_lex_unitP17st_unit_ctxt_elembPy+0x36)[0x55b3655b9106]
/usr/sbin/mysqld(_ZN12With_element28check_dependencies_in_selectEP13st_select_lexP17st_unit_ctxt_elembPy+0x104)[0x55b3655b9274]
/usr/sbin/mysqld(_ZN12With_element26check_dependencies_in_specEv+0x72)[0x55b3655b9392]
/usr/sbin/mysqld(_ZN11With_clause18check_dependenciesEv+0x2b)[0x55b3655b93eb]
/usr/sbin/mysqld(_Z34check_dependencies_in_with_clausesP11With_clause+0x3d)[0x55b3655b9e8d]
/usr/sbin/mysqld(+0x41545e)[0x55b3653d445e]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x6688)[0x55b36549d418]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x2de)[0x55b36549ffde]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x202d)[0x55b3654a2f7d]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x149)[0x55b3654a3b79]
/usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x1aa)[0x55b365567b7a]
/usr/sbin/mysqld(handle_one_connection+0x3d)[0x55b365567c9d]
/usr/sbin/mysqld(+0x81e4ed)[0x55b3657dd4ed]
/lib64/libpthread.so.0(+0x7ea5)[0x7f67231a1ea5]
/lib64/libc.so.6(clone+0x6d)[0x7f67215419fd]
 
..
Query (0x7f33df840020): EXPLAIN WITH yarn_memory AS (WITH yarn_memory_ndt_test AS (WITH event_yarn AS (select * from looker.event_yarn where date(start_time)>'2022-05-08' ) SELECT     (TIMESTAMP(DATE(event_yarn2.start_time ))) AS `start_time_date`,         (event_yarn2.start_time ) AS `start_time_time`,         (event_yarn2.finished_time ) AS `end_time_time`,     substring_index(substring_index(substring_index(substring_index(substring_index(event_yarn2.name,'2019',1),'@',1),'/',1),':W=',-1),':',1)  AS `job_name`,     FLOOR((UNIX_TIMESTAMP(event_yarn2.finished_time)+300)/300) - floor(UNIX_TIMESTAMP(event_yarn2.start_time)/300)   AS `intervals`,         (date_add(date(event_yarn2.start_time),interval floor((TIME_TO_SEC(event_yarn2.start_time))/300)*300 second) ) AS `lower_bucket`,     (cast(event_yarn2.memory_seconds as decimal(20,7))/(event_yarn2.elapsed_time))/1024  AS `interval_memory`,     event_yarn2.elapsed_time/1000  AS `elapsed_seconds`,     event_yarn_intervals.seq AS `seq`,     COALESCE((0E0 +  ( SUM(DISTINCT (CAST(FLOOR(COALESCE( event_yarn2.memory_seconds  ,0)*(1000000*1.0)) AS DECIMAL(65,0))) + (CAST(CONV(SUBSTR(MD5( event_yarn2.id  ),1,16),16,10) AS DECIMAL(65)) *18446744073709551616 + CAST(CONV(SUBSTR(MD5( event_yarn2.id  ), 17, 16), 16, 10) AS DECIMAL(65))) ) - SUM(DISTINCT (CAST(CONV(SUBSTR(MD5( event_yarn2.id  ),1,16),16,10) AS DECIMAL(65)) *18446744073709551616 + CAST(CONV(SUBSTR(MD5( event_yarn2.id  ), 17, 16), 16, 10) AS DECIMAL(65)))) ) ) / (0E0 + (1000000*1.0)), 0) AS `memory_seconds` FROM event_yarn AS event_yarn2 LEFT JOIN looker.event_yarn_intervals  AS event_yarn_intervals ON (FLOOR((UNIX_TIMESTAMP(event_yarn2.finished_time)+300)/300) - floor(UNIX_TIMESTAMP(event_yarn2.start_time)/300))>event_yarn_intervals.seq  WHERE ((( event_yarn2.start_time  ) >= ((DATE_ADD(CURDATE(),INTERVAL -28 day))) AND ( event_yarn2.start_time  ) < ((DATE_ADD(DATE_ADD(CURDATE(),INTERVAL -28 day),INTERVAL 29 day))))) AND (if((substring_index(substring_index(substring_index(substring_index(substring_index(event_yarn2.name,'2019',1),'@',1),'/',1),':W=',-1),':',1)) in (select * from event_yarn_jobs ),1,0) ) = 1 GROUP BY     1,     2,     3,     4,     5,     6,     7,     8,     9) SELECT     (date_add(yarn_memory_ndt_test.lower_bucket,interval yarn_memory_ndt_test.seq*300 second) ) AS `lower_time_interval`,     COALESCE(SUM(yarn_memory_ndt_test.interval_memory), 0) AS `sum_memory_used`,     COALESCE(SUM(case when  yarn_memory_ndt_test.job_name  = 'BulkPoWriter' then  yarn_memory_ndt_test.interval_memory  else 0 end  ), 0) AS `sum_BulkPoWriter`,     COALESCE(SUM(case when  yarn_memory_ndt_test.job_name  = 'bid-post-process-for-bid-recommender-for-multiplier-adjusted-segment' then  yarn_memory_ndt_test.interval_memory  else 0 end  ), 0) AS `sum_bid_post_process`,     COALESCE(SUM(case when  yarn_memory_ndt_test.job_name  = 'marin-tracker-hourlysessions-oozie' then  yarn_memory_ndt_test.interval_memory  else 0 end  ), 0) AS `sum_marin_tracker_hourlysessions_oozie`,     COALESCE(SUM(CASE WHEN (yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_kw_bid' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_kw' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_pt_bid' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_cr' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_pt' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_sp' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_gp' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_gp_bid' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_dt' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_sl' OR yarn_memory_ndt_test.job_name LIKE 'test-marin-hybrid-odc-oozie-dim_dt_bid' OR yarn_memory_ndt_test.job_name LIKE 'BulkPoWriter_AD_CREATIVES' OR yarn_memory_ndt_test.job_name LIKE 'BulkPoWriter_TRACKING_VALUES' OR yarn_memory_ndt_test.job_name LIKE 'BulkPoWriter_KEYWORDS' OR yarn_memory_ndt_test.job_name IN (...) )) THEN yarn_memory_ndt_test.interval_memory ELSE NULL END), 0) AS `sum_other` FROM yarn_memory_ndt_test GROUP BY     1) SELECT     (DATE(date((CAST(yarn_memory.lower_time_interval AS CHAR(19)))) )) AS `yarn_memory.date_key_date` FROM yarn_memory GROUP BY     1 ORDER BY     (DATE(date((CAST(yarn_memory.lower_time_interval AS CHAR(19)))) )) DESC LIMIT 500
 
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on

Could someone know what's the root cause of this? Thanks in advance.



 Comments   
Comment by Daniel Black [ 2022-05-13 ]

can you include the show create table looker.event_yarn and all the other tables of the explain query.

Comment by Alice Sherepa [ 2022-05-13 ]

Thank you for the report!
This is the same bug as MDEV-28504. Please let us know if the problem will still exist even after MDEV-28504's fix, then the bug will be reopened and investigated further

Comment by Luke [ 2022-05-16 ]

@Alice Sherepa when I try to switch to MDEV-28504, the page says :

You can't view this issue
It may have been deleted or you don't have permission to view it.

Comment by Luke [ 2022-05-16 ]

@Daniel Black The table structure:

# event_yarn
MariaDB [looker]> show create table event_yarn\G
*************************** 1. row ***************************
       Table: event_yarn
Create Table: CREATE TABLE `event_yarn` (
  `id` text DEFAULT NULL,
  `name` text DEFAULT NULL,
  `application_type` text DEFAULT NULL,
  `final_status` text DEFAULT NULL,
  `state` text DEFAULT NULL,
  `memory_seconds` bigint(20) DEFAULT NULL,
  `vcore_seconds` bigint(20) DEFAULT NULL,
  `queue` text DEFAULT NULL,
  `user` text DEFAULT NULL,
  `start_time` datetime DEFAULT NULL,
  `finished_time` datetime DEFAULT NULL,
  `elapsed_time` bigint(20) DEFAULT NULL,
  `derived_name` text DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
 
# event_yarn_jobs
MariaDB [looker]> show create table event_yarn_jobs\G
*************************** 1. row ***************************
       Table: event_yarn_jobs
Create Table: CREATE TABLE `event_yarn_jobs` (
  `job_name` varchar(255) NOT NULL,
  PRIMARY KEY (`job_name`),
  KEY `job_name` (`job_name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)
 
# event_yarn_intervals
MariaDB [looker]> show create table event_yarn_intervals\G
*************************** 1. row ***************************
       Table: event_yarn_intervals
Create Table: CREATE TABLE `event_yarn_intervals` (
  `seq` bigint(20) unsigned NOT NULL,
  KEY `seq` (`seq`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)

Comment by Daniel Black [ 2022-05-16 ]

Thanks lujinke. MDEV-28504 (and this but a little less so), provide a way to crash a running instance, so we're keeping it private until a fix is released upon which it will be opened up again.

Generated at Thu Feb 08 10:01:39 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.