[MCOL-3369] ExeMgr stops processing queries Created: 2019-06-06  Updated: 2021-01-15

Status: Closed
Project: MariaDB ColumnStore
Component/s: ExeMgr
Affects Version/s: 1.0.0
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Zdravelina Sokolovska (Inactive) Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: regression
Environment:

1UM-1PM ; mcs deployed on CentOS7


Attachments: Text File dump_ExeMg.txt     Text File dump_PrimProc.txt     Text File um_logs.txt    

 Description   

ExeMgr stops processing queries

Issue appears sporadically after some number of queries were being processed.

First occurrence was observed when
query stream composed of 46 queries was sent towards DB loaded with 1TB data.
the 45th query from the stream hung for hours .

The seconds occurrence was observed after 2 times successful executes of the entire stream
with the 45 queries , but after passing more 10 queues, this time one by one and getting the exec time of each – it appeared again hung of query.
It remained in the processlist for hours in case of it's expected exec tine is 5-6 mins.

Both times the getActiveSQLStatements returned nothing, the ExeMgr takes 100% CPU usage

[root@um1 ~]#  /usr/local/mariadb/columnstore/bin/mcsadmin  getActiveSQLStatements
 
WARNING: running on non Parent OAM Module, can't make configuration changes in this session.
         Access Console from 'pm1' if you need to make changes.
 
getactivesqlstatements   Thu Jun  6 15:02:34 2019
 
Get List of Active SQL Statements
=================================
 
No Active SQL Statements at this time
 
[root@um1 ~]# /usr/local/mariadb/columnstore/mysql/bin/mysql -u root -p1
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 111
Server version: 10.1.38-MariaDB Columnstore 1.0.17-1
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [(none)]> show processlist ;
+-----+------+-----------+---------------------+---------+-------+----------------+-----------------------------------------------------------------------------+----------+
| Id  | User | Host      | db                  | Command | Time  | State          | Info                                                                        | Progress |
+-----+------+-----------+---------------------+---------+-------+----------------+-----------------------------------------------------------------------------+----------+
| 107 | root | um1:41054 | tpcds_1000          | Query   | 52745 | Sorting result | select *  from infinidb_vtable.$vtable_107 order by 1, 2, 3, 4 limit 0, 100 |    0.000 |
| 108 | root | localhost | infinidb_querystats | Sleep   | 12128 |                | NULL                                                                        |    0.000 |
| 111 | root | localhost | NULL                | Query   |     0 | init           | show processlist                                                            |    0.000 |
+-----+------+-----------+---------------------+---------+-------+----------------+-----------------------------------------------------------------------------+----------+
3 rows in set (0.00 sec)
 

top - 15:04:47 up 1 day, 55 min,  6 users,  load average: 1.00, 1.00, 1.04
Tasks: 219 total,   1 running, 210 sleeping,   8 stopped,   0 zombie
%Cpu(s): 25.2 us,  0.2 sy,  0.0 ni, 74.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 61848088 total, 38212916 free, 19745964 used,  3889208 buff/cache
KiB Swap:  1048572 total,  1048572 free,        0 used. 41593956 avail Mem
 
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
20561 root      19  -1   26.7g  17.8g  12060 S 100.0 30.2   1111:25 ExeMgr
 5074 root      20   0  162020   2392   1592 R   0.3  0.0   0:00.07 top
20301 mysql     20   0 1369188 265140  20252 S   0.3  0.4   1:22.24 mysqld
    1 root      20   0  190968   3916   2580 S   0.0  0.0   0:01.44 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.02 kthreadd
    3 root      20   0       0      0      0 S   0.0  0.0   0:00.83 ksoftirqd/0
    5 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H

First hanging observed at query:

select
        s_store_name,
        i_item_desc,
        sc.revenue,
        i_current_price,
        i_wholesale_cost,
        i_brand
 from store, item,
     (select ss_store_sk, avg(revenue) as ave
        from
            (select  ss_store_sk, ss_item_sk,
                     sum(ss_sales_price) as revenue
                from store_sales, date_dim
                where ss_sold_date_sk = d_date_sk and d_month_seq between 1187 and 1187+11
                group by ss_store_sk, ss_item_sk) sa
        group by ss_store_sk) sb,
     (select  ss_store_sk, ss_item_sk, sum(ss_sales_price) as revenue
        from store_sales, date_dim
        where ss_sold_date_sk = d_date_sk and d_month_seq between 1187 and 1187+11
        group by ss_store_sk, ss_item_sk) sc
 where sb.ss_store_sk = sc.ss_store_sk and
       sc.revenue <= 0.1 * sb.ave and
       s_store_sk = sc.ss_store_sk and
       i_item_sk = sc.ss_item_sk
 order by s_store_name, i_item_desc
limit 100;

Second hanging observed at query:

select
 i_item_id
 ,i_item_desc
 ,s_store_id
 ,s_store_name
 ,sum(ss_net_profit) as store_sales_profit
 ,sum(sr_net_loss) as store_returns_loss
 ,sum(cs_net_profit) as catalog_sales_profit
 from
 store_sales
 ,store_returns
 ,catalog_sales
 ,date_dim d1
 ,date_dim d2
 ,date_dim d3
 ,store
 ,item
 where
 d1.d_moy = 4
 and d1.d_year = 2001
 and d1.d_date_sk = ss_sold_date_sk
 and i_item_sk = ss_item_sk
 and s_store_sk = ss_store_sk
 and ss_customer_sk = sr_customer_sk
 and ss_item_sk = sr_item_sk
 and ss_ticket_number = sr_ticket_number
 and sr_returned_date_sk = d2.d_date_sk
 and d2.d_moy between 4 and 10
 and d2.d_year = 2001
 and sr_customer_sk = cs_bill_customer_sk
 and sr_item_sk = cs_item_sk
 and cs_sold_date_sk = d3.d_date_sk
 and d3.d_moy between 4 and  10
 and d3.d_year = 2001
 group by
 i_item_id
 ,i_item_desc
 ,s_store_id
 ,s_store_name
 order by
 i_item_id
 ,i_item_desc
 ,s_store_id
 ,s_store_name
 limit 100;


Generated at Thu Feb 08 02:42:14 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.