[MDEV-13552] Bug: Server Shutsdown when using LEAD Window Function within a CASE Function Created: 2017-08-16  Updated: 2019-03-27  Resolved: 2019-03-27

Status: Closed
Project: MariaDB Server
Component/s: Optimizer - Window functions
Affects Version/s: 10.2.6
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Juan Telleria Assignee: Vicențiu Ciorbaru
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Windows


Attachments: Microsoft Word MDEV-13552.xlsx     Text File NGALARRETA-P700-error.log     File mdev13552.test     File my.ini    
Issue Links:
Duplicate
duplicates MDEV-13649 Server crashes in set_field_to_null_w... Closed
Relates
relates to MDEV-13683 crash in Item_window_func::update_use... Closed
relates to MDEV-12851 Window functions with is null functio... Closed
relates to MDEV-13774 Server Crash on Execute of SQL Statement Closed

 Description   

When executing the following query, the Server shutdown:

CREATE OR REPLACE TABLE rel595.t3_scada_alarms_oper_mod_ots_states_cycles_2017_Events
ENGINE = Aria
  WITH t3_scada_alarms_oper_mod_ots_states_cycles_2017_aux
  AS
  (
    SELECT
      A.id_cycles,
      A.MachineID,
      A.MachineUT,
      A.ENVELOPE,
      A.Cycle_DownTime_State,
      A.Cycle_DownTime_State_Description,
      A.Cycle_DateTimeON,
      A.Cycle_DateTimeOFF,
      A.`Cycle_Time[sec]`,
      A.`Working Hours`,
      A.Cycle_Alarm_Code_OrderByAlarm,
      A.Cycle_Alarm_Description_OrderByAlarm,
      A.Cycle_First_Alarm_Code,
      A.Cycle_First_Alarm_Description,
      A.Cycle_Alarm_OrderByDateTimeON,
      A.Cycle_Alarm_Functional_Group,
      A.Cycle_Alarm_Turbine_State_Request,
      A.Cycle_Operation_Mode_Code,
      A.Cycle_Operation_Mode_Description,
      A.Cycle_Operation_Mode_Turbine_State_Request,
      A.Cycle_Final_Reset_Type,
      A.Cycle_Final_Reset_Criteria,
      A.Cycle_Final_Availability,
      A.Cycle_OT_NumOT,
      A.Cycle_OT_Description,
      A.Cycle_OT_AlarmCode,
      A.Cycle_OT_AlarmDescription,
      A.Cycle_OT_TypeCode,
      A.Cycle_OT_TypeDescription,
      A.Cycle_OT_Subtipo,
      A.Cycle_OT_LCNId,
      A.Cycle_OT_Planned,
      A.Cycle_OT_With_Consumption,
      A.Cycle_OT_Availability_Stop,
      B.Event_Code,
      B.Event_Description,
      B.DateTimeON
        AS Event_DateTimeON,
      B.DateTimeOFF
        AS Event_DateTimeOFF,
      MAX(B.DateTimeOFF) 
        OVER (PARTITION BY A.id_cycles)
        AS Event_Max_DateTimeOFF
    FROM
      rel595.t3_scada_alarms_oper_mod_ots_states_cycles_2017 AS A
      LEFT JOIN rel595.t1_scada_events_joined_dbvar_17543 AS B
        ON
          B.DateTimeON BETWEEN A.Cycle_DateTimeON AND A.Cycle_DateTimeOFF
          AND A.MachineUT = B.MachineUT
  )
    SELECT
      MachineID,
      MachineUT,
      ENVELOPE,
      Cycle_DownTime_State,
      Cycle_DownTime_State_Description,
      Cycle_DateTimeON,
      Cycle_DateTimeOFF,
      `Cycle_Time[sec]`,
      `Working Hours`,
      Cycle_Alarm_Code_OrderByAlarm,
      Cycle_Alarm_Description_OrderByAlarm,
      Cycle_First_Alarm_Code,
      Cycle_First_Alarm_Description,
      Cycle_Alarm_OrderByDateTimeON,
      Cycle_Alarm_Functional_Group,
      Cycle_Alarm_Turbine_State_Request,
      Cycle_Operation_Mode_Code,
      Cycle_Operation_Mode_Description,
      Cycle_Operation_Mode_Turbine_State_Request,
      Cycle_Final_Reset_Type,
      Cycle_Final_Reset_Criteria,
      Cycle_Final_Availability,
      Cycle_OT_NumOT,
      Cycle_OT_Description,
      Cycle_OT_AlarmCode,
      Cycle_OT_AlarmDescription,
      Cycle_OT_TypeCode,
      Cycle_OT_TypeDescription,
      Cycle_OT_Subtipo,
      Cycle_OT_LCNId,
      Cycle_OT_Planned,
      Cycle_OT_With_Consumption,
      Cycle_OT_Availability_Stop,
      CASE
        WHEN
          Event_Code IS NOT NULL
          AND NOT Cycle_Final_Reset_Criteria = 'LOCAL-WO'
          AND NOT Cycle_Alarm_Code_OrderByAlarm NOT REGEXP '903|904'
          -- Not Alarm Codes "903: Authoriz. person in turbine"
          -- nor "904:Presence of unauth. person in turbine".
        THEN
          1
        ELSE
          0
      END AS `WTG_Reset_by_SCADA_Command`,
      CASE
        WHEN
          Event_DateTimeOFF BETWEEN SUBTIME(Cycle_DateTimeOFF, '00:10:00') AND ADDTIME(Cycle_DateTimeOFF, '00:10:00')
          AND 
          (
            Event_Code IS NOT NULL
            AND NOT Cycle_Final_Reset_Criteria = 'LOCAL-WO'
            AND NOT Cycle_Alarm_Code_OrderByAlarm NOT REGEXP '903|904'
            -- Not Alarm Codes "903: Authoriz. person in turbine"
            -- nor "904:Presence of unauth. person in turbine".
          )
          AND 
          (
            ADDTIME(Cycle_DateTimeOFF, '00:30:00') < LEAD(Cycle_DateTimeON) OVER (PARTITION BY MachineID ORDER BY MachineID, Cycle_DateTimeON)
            OR (LEAD(Cycle_DateTimeON) OVER (PARTITION BY MachineID ORDER BY MachineID, Cycle_DateTimeON) IS NULL)
          )
        THEN
          1
        ELSE
          0
      END AS `WTG_SCADA_Command_Success_Offset10`,
      TIMESTAMPDIFF(MINUTE, Event_DateTimeOFF, Cycle_DateTimeOFF)
        AS DateTimeDiff_StateOFF_EventOFF,
      Event_Code,
      Event_Description,
      Event_DateTimeON,
      Event_DateTimeOFF
    FROM
      t3_scada_alarms_oper_mod_ots_states_cycles_2017_aux
    WHERE
      (Event_DateTimeOFF = Event_Max_DateTimeOFF)
      OR Event_DateTimeOFF IS NULL
    GROUP BY
      id_cycles
    ORDER BY
      MachineID,
      Cycle_DateTimeON

Thank you



 Comments   
Comment by Alice Sherepa [ 2017-08-17 ]

this looks similar to MDEV-12851 https://jira.mariadb.org/browse/MDEV-12851, please try MariaDB 10.2.7 or suggested workaround. If problem will still appear, please provide structures of the tables, that are involved in the query.

Comment by Juan Telleria [ 2017-08-25 ]

The bug still appears in MariaDB 10.2.8, and causes a Server Shutdown / Crash.

In fact, I think it is due to the fact of using a CASE functions along with LEAD / LAG.

Comment by Elena Stepanova [ 2017-08-25 ]

Juan,

Thanks for the report. I can reproduce a crash with a similar query, but the stack trace looks different from yours. To make sure we are fixing the same problem that you encounter and not anything else, could you please provide the output of

SHOW CREATE TABLE rel595.t3_scada_alarms_oper_mod_ots_states_cycles_2017;
SHOW INDEX IN rel595.t3_scada_alarms_oper_mod_ots_states_cycles_2017;
SHOW CREATE TABLE rel595.t1_scada_events_joined_dbvar_17543;
SHOW INDEX IN rel595.t1_scada_events_joined_dbvar_17543;

and paste or attach your cnf file(s)?

Comment by Juan Telleria [ 2017-08-28 ]

By the way, Window Functions are better documented in MariaDB Knowledge Base in the ColumnStore Section than in the general Server Section:

For example, LEAD and LAG aren't explained in the Server Section.

Comment by Juan Telleria [ 2017-08-28 ]

Attached CREATE TABLE and Indexes + my.ini

Comment by Elena Stepanova [ 2017-08-29 ]

Thank you.

So, your case seems to be a bit different. I can reproduce the exact crash on 10.2.6 both with your original structures and with the simplified test case below:

CREATE TABLE t1 (c VARCHAR(3), d DATE);
INSERT INTO t1 VALUES ('foo','2017-12-12'),('bar','2016-11-11'); # Optional
SELECT (LEAD(d) OVER (PARTITION BY c) IS NULL) AS f FROM t1;

10.2.6 release

2  <signal handler called>
#3  0x00005580549b9f63 in Item_field::used_tables (this=0x7f8f7c00f350) at /home/buildbot/buildbot/build/sql/item.cc:2869
#4  0x0000558054ae7aef in Item_window_func::update_used_tables (this=0x7f8f7c00f518) at /home/buildbot/buildbot/build/sql/item_windowfunc.cc:58
#5  0x00005580549d6235 in update_used_tables (this=0x7f8f7c00f5f8) at /home/buildbot/buildbot/build/sql/item_cmpfunc.h:1807
#6  Item_func_isnull::update_used_tables (this=0x7f8f7c00f5f8) at /home/buildbot/buildbot/build/sql/item_cmpfunc.h:1798
#7  0x00005580549f0c5c in fix_fields (thd=0x7f8f7c0009a8, this=0x7f8f7c00f5f8, ref=<optimized out>) at /home/buildbot/buildbot/build/sql/item_func.cc:236
#8  Item_func::fix_fields (this=0x7f8f7c00f5f8, thd=0x7f8f7c0009a8, ref=<optimized out>) at /home/buildbot/buildbot/build/sql/item_func.cc:176
#9  0x00005580547e175b in setup_fields (thd=0x7f8f7c0009a8, ref_pointer_array=..., fields=..., mark_used_columns=mark_used_columns@entry=MARK_COLUMNS_READ, sum_func_list=sum_func_list@entry=0x7f8f7c010138, allow_sum_func=allow_sum_func@entry=true) at /home/buildbot/buildbot/build/sql/sql_base.cc:7053
#10 0x000055805485d967 in prepare (unit_arg=0x7f8f7c004348, select_lex_arg=0x7f8f7c004a80, proc_param_init=0x0, having_init=0x0, group_init=0x0, skip_order_by=false, order_init=<optimized out>, og_num=<optimized out>, conds_init=<optimized out>, wild_num=0, tables_init=<optimized out>, this=<optimized out>) at /home/buildbot/buildbot/build/sql/sql_select.cc:806
#11 JOIN::prepare (this=<optimized out>, tables_init=<optimized out>, wild_num=0, conds_init=<optimized out>, og_num=<optimized out>, order_init=<optimized out>, skip_order_by=false, group_init=0x0, having_init=0x0, proc_param_init=0x0, select_lex_arg=0x7f8f7c004a80, unit_arg=0x7f8f7c004348) at /home/buildbot/buildbot/build/sql/sql_select.cc:688
#12 0x000055805486ed96 in mysql_select (thd=thd@entry=0x7f8f7c0009a8, tables=0x7f8f7c00f710, wild_num=0, fields=..., conds=<optimized out>, og_num=<optimized out>, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2147748608, result=result@entry=0x7f8f7c00fe00, unit=unit@entry=0x7f8f7c004348, select_lex=select_lex@entry=0x7f8f7c004a80) at /home/buildbot/buildbot/build/sql/sql_select.cc:3644
#13 0x000055805486fe44 in handle_select (thd=thd@entry=0x7f8f7c0009a8, lex=lex@entry=0x7f8f7c004280, result=result@entry=0x7f8f7c00fe00, setup_tables_done_option=setup_tables_done_option@entry=0) at /home/buildbot/buildbot/build/sql/sql_select.cc:373
#14 0x0000558054817486 in execute_sqlcom_select (thd=thd@entry=0x7f8f7c0009a8, all_tables=0x7f8f7c00f710) at /home/buildbot/buildbot/build/sql/sql_parse.cc:6431
#15 0x0000558054821a83 in mysql_execute_command (thd=thd@entry=0x7f8f7c0009a8) at /home/buildbot/buildbot/build/sql/sql_parse.cc:3448
#16 0x000055805482496a in mysql_parse (thd=thd@entry=0x7f8f7c0009a8, rawbuf=<optimized out>, length=59, parser_state=parser_state@entry=0x7f8f8cb71260, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /home/buildbot/buildbot/build/sql/sql_parse.cc:7874
#17 0x0000558054826bed in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f8f7c0009a8, packet=<optimized out>, packet_length=2080436112, packet_length@entry=59, is_com_multi=is_com_multi@entry=false, is_next_command=is_next_command@entry=false) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1812
#18 0x00005580548271b1 in do_command (thd=0x7f8f7c0009a8) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1362
#19 0x00005580548e2bbf in do_handle_one_connection (connect=connect@entry=0x5580566eecb8) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1354
#20 0x00005580548e2ce4 in handle_one_connection (arg=arg@entry=0x5580566eecb8) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1260
#21 0x0000558054df382d in pfs_spawn_thread (arg=0x5580566bade8) at /home/buildbot/buildbot/build/storage/perfschema/pfs.cc:1862
#22 0x00007f8f93498494 in start_thread (arg=0x7f8f8cb72700) at pthread_create.c:333
#23 0x00007f8f92b5593f in clone () from /lib/x86_64-linux-gnu/libc.so.6

This particular issue was fixed in 10.2.7 in scope of MDEV-12851. So, if you upgrade, this crash should go away.
However, your original query hits another problem, with a different stack trace, the one described in MDEV-13649, and it is still happening on the current version of 10.2.

cvicentiu, I will attach the original unsimplified test case – it consists of the attached tables, a lot of auto-generated garbage data, and the query reported in the bug description. It should not be put into the regression suite, but after fixing MDEV-13649, please check it too, to make sure everything is fixed.

mdev13552.test

Comment by Alice Sherepa [ 2019-03-27 ]

The query from mdev13552.test does to crash, I am closing the bug

Generated at Thu Feb 08 08:06:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.