[MCOL-2202]  ExeMgr stops processing queries with set infinidb_vtable_mode 2 Created: 2019-02-28  Updated: 2021-02-25  Resolved: 2021-02-25

Status: Closed
Project: MariaDB ColumnStore
Component/s: ExeMgr
Affects Version/s: 1.2.3
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Zdravelina Sokolovska (Inactive) Assignee: Unassigned
Resolution: Won't Do Votes: 1
Labels: None
Environment: um1-pm1 / CentOS 7

Attachments: logs_um1.txt, trace_ PrimProc.txt, trace_ExeMgr.txt
Issue Links:
Relates
relates to MCOL-1636 Query hangs after being warmed up Closed

 Description   

ExeMgr stops processing queries with set infinidb_vtable_mode 2

The problem was observed after running queries with infinidb_vtable_mode=2.
Several queries were killed, and the error "ExeMgr has caught an exception" was received.
After that, even after explicitly switching back to infinidb_vtable_mode=1, all queries sent to ColumnStore hang.

 ERROR: ExeMgr has caught an exception. ExeMgr: error projecting rows for tableOID: 100; rowCnt: 8192; prevTotRowCnt: 662040; InetStreamSocket::write error: Broken pipe -- write from InetStreamSocket: sd: 56 inet: 172.20.2.206 port: 33034

MariaDB [(none)]> select count(*) from foo1_1.item   ;
 
The query hangs, even though table item contains only 18000 rows.

[root@um1 mariadb]# mcsadmin getsystemi
 
WARNING: running on non Parent OAM Module, can't make configuration changes in this session.
         Access Console from 'pm1' if you need to make changes.
 
getsysteminfo   Thu Feb 28 15:36:32 2019
 
System columnstore-1
 
System and Module statuses
 
Component     Status                       Last Status Change
------------  --------------------------   ------------------------
System        ACTIVE                       Mon Feb 25 17:28:07 2019
 
Module um1    ACTIVE                       Mon Feb 25 17:28:03 2019
Module pm1    ACTIVE                       Mon Feb 25 17:27:53 2019
 
Active Parent OAM Performance Module is 'pm1'
MariaDB ColumnStore Replication Feature is enabled
MariaDB ColumnStore set for Distributed Install
 
 
MariaDB ColumnStore Process statuses
 
Process             Module    Status            Last Status Change        Process ID
------------------  ------    ---------------   ------------------------  ----------
ProcessMonitor      um1       ACTIVE            Mon Feb 25 01:36:49 2019        7941
ServerMonitor       um1       ACTIVE            Mon Feb 25 17:27:51 2019       27573
DBRMWorkerNode      um1       ACTIVE            Mon Feb 25 17:27:52 2019       27605
ExeMgr              um1       ACTIVE            Mon Feb 25 17:27:56 2019       27652
DDLProc             um1       ACTIVE            Mon Feb 25 17:28:00 2019       27700
DMLProc             um1       ACTIVE            Mon Feb 25 17:28:04 2019       27713
mysqld              um1       ACTIVE            Mon Feb 25 17:27:48 2019       27497
 
ProcessMonitor      pm1       ACTIVE            Mon Feb 25 01:36:31 2019        7874
ProcessManager      pm1       ACTIVE            Mon Feb 25 01:36:38 2019        7974
DBRMControllerNode  pm1       ACTIVE            Mon Feb 25 17:27:47 2019       17901
ServerMonitor       pm1       ACTIVE            Mon Feb 25 17:27:49 2019       17919
DBRMWorkerNode      pm1       ACTIVE            Mon Feb 25 17:27:49 2019       17951
PrimProc            pm1       ACTIVE            Mon Feb 25 17:27:53 2019       18050
WriteEngineServer   pm1       ACTIVE            Mon Feb 25 17:27:54 2019       18075
 
Active Alarm Counts: Critical = 2, Major = 1, Minor = 0, Warning = 0, Info = 0

MariaDB [(none)]> show processlist ;
+-----+-------------+-----------+--------------------+---------+-------+--------------------------+------------------------------------------------------------------------------------------------------+----------+
| Id  | User        | Host      | db                 | Command | Time  | State                    | Info                                                                                                 | Progress |
+-----+-------------+-----------+--------------------+---------+-------+--------------------------+------------------------------------------------------------------------------------------------------+----------+
|   1 | system user |           | NULL               | Daemon  |  NULL | InnoDB purge coordinator | NULL                                                                                                 |    0.000 |
|   2 | system user |           | NULL               | Daemon  |  NULL | InnoDB purge worker      | NULL                                                                                                 |    0.000 |
|   4 | system user |           | NULL               | Daemon  |  NULL | InnoDB purge worker      | NULL                                                                                                 |    0.000 |
|   3 | system user |           | NULL               | Daemon  |  NULL | InnoDB purge worker      | NULL                                                                                                 |    0.000 |
|   5 | system user |           | NULL               | Daemon  |  NULL | InnoDB shutdown handler  | NULL                                                                                                 |    0.000 |
| 197 | root        | localhost | information_schema | Sleep   | 15324 |                          | NULL                                                                                                 |    0.000 |
| 246 | root        | um1:43888 | tpcds_1            | Killed  | 10859 | Sending data             | with v1 as(  select i_category, i_brand,         s_store_name, s_company_name,         d_year, d_moy |    0.000 |
| 254 | root        | um1:57134 | tpcds_1            | Killed  |  7058 | Executing                | create temporary table infinidb_vtable.$vtable_254 engine = aria as select sum(ss_quantity)  from st |    0.000 |
| 259 | root        | um1:37260 | tpcds_1            | Killed  |  4639 | Executing                | create temporary table infinidb_vtable.$vtable_259 engine = aria as select channel, item, TRIM(TRAIL |    0.000 |
| 263 | root        | um1:39574 | tpcds_1            | Killed  |  3984 | Executing                | create temporary table infinidb_vtable.$vtable_263 engine = aria as select      s_store_name   ,s_co |    0.000 |
| 269 | root        | um1:39962 | NULL               | Killed  |  3879 | Executing                | create temporary table infinidb_vtable.$vtable_269 engine = aria as select count(*) from  tpcds_1.ca |    0.000 |
| 270 | root        | localhost | NULL               | Killed  |  3468 | Executing                | create temporary table infinidb_vtable.$vtable_270 engine = aria as select count(*) from tpcds_1.cal |    0.000 |
| 278 | root        | um1:40880 | tpcds_1            | Killed  |  3621 | Executing                | create temporary table infinidb_vtable.$vtable_278 engine = aria as select  dt.d_year   ,item.i_bran |    0.000 |
| 284 | root        | um1:41210 | NULL               | Killed  |  3529 | Executing                | create temporary table infinidb_vtable.$vtable_284 engine = aria as select count(*) from  tpcds_1.ca |    0.000 |
| 289 | root        | localhost | NULL               | Killed  |  3387 | Executing                | create temporary table infinidb_vtable.$vtable_289 engine = aria as select count(*) from tpcds_1.ite |    0.000 |
| 293 | root        | localhost | NULL               | Killed  |  2731 | Executing                | create temporary table infinidb_vtable.$vtable_293 engine = aria as select count(*) from tpcds_1.ite |    0.000 |
| 296 | root        | localhost | NULL               | Query   |  2681 | Executing                | create temporary table infinidb_vtable.$vtable_296 engine = aria as select count(*) from foo1_1.item |    0.000 |
| 297 | root        | localhost | NULL               | Query   |     0 | Init                     | show processlist                                                                                     |    0.000 |
+-----+-------------+-----------+--------------------+---------+-------+--------------------------+------------------------------------------------------------------------------------------------------+----------+
18 rows in set (0.000 sec)
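
When triaging a listing like the one above, it can help to pull out only the sessions that are still stuck. The following is a minimal sketch, not part of ColumnStore or the MariaDB client: the function names `parse_processlist` and `stuck_sessions` and the 60-second threshold are assumptions made for illustration, and the sample is an abbreviated excerpt of the table above with the Info column shortened. It assumes no cell contains a literal `|`.

```python
# Hypothetical helper for triage: summarise long-running sessions from
# pasted `show processlist` output. Not part of any MariaDB tooling.

SAMPLE = """
| Id  | User | Host      | db   | Command | Time | State     | Info             | Progress |
| 269 | root | um1:39962 | NULL | Killed  | 3879 | Executing | create temporary |    0.000 |
| 296 | root | localhost | NULL | Query   | 2681 | Executing | create temporary |    0.000 |
| 297 | root | localhost | NULL | Query   |    0 | Init      | show processlist |    0.000 |
"""


def parse_processlist(text):
    """Turn the ASCII table into a list of dicts keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders, blank lines, "N rows in set"
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # first |...| line is the column header
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def stuck_sessions(rows, min_seconds=60):
    """Return (Id, Command, Time) for Killed/Query sessions older than min_seconds."""
    stuck = []
    for row in rows:
        time_field = row.get("Time", "")
        if row.get("Command") not in ("Killed", "Query"):
            continue
        if not time_field.isdigit():
            continue  # daemon threads report NULL
        if int(time_field) >= min_seconds:
            stuck.append((int(row["Id"]), row["Command"], int(time_field)))
    return stuck


if __name__ == "__main__":
    for session in stuck_sessions(parse_processlist(SAMPLE)):
        print(session)
```

Run against the full listing, this would report the sessions in state Killed that never went away as well as session 296, the count(*) on foo1_1.item that is still executing.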
 



 Comments   
Comment by Patrick LeBlanc (Inactive) [ 2019-03-04 ]

ZD is testing commit e849af0

Comment by Patrick LeBlanc (Inactive) [ 2019-03-04 ]

The backtraces are interesting. PrimProc seems to be idle, but ExeMgr is definitely processing and sending data to the server.

I noticed that the place ExeMgr is sending the data from is a special debug/trace mode in which it sends only syscat data and the final 'EOF' msg to the server. I don't remember exactly how it gets set; I suspect 'calsettrace(8)' will do it, but the flag that enables that behavior is CalpontSelectExecutionPlan::TRACE_NO_ROWS3. Maybe that's useful to somebody.

Zdravelina, could you post steps to reproduce the problem? Right now what we have is:
1) Running some stuff in vtable mode 2 causes a broken pipe exception. (I'm guessing the server dropped its connection there.)
2) After the exception, running some stuff in vtable mode 1 causes it to hang.

Comment by Todd Stoffel (Inactive) [ 2021-02-25 ]

Obsolete

Generated at Thu Feb 08 02:34:30 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.