Uploaded image for project: 'MariaDB ColumnStore'
  1. MariaDB ColumnStore
  2. MCOL-4784

Replication lag due to waiting for metadata lock on CS table

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 5.5.2
    • Fix Version/s: N/A
    • Component/s: N/A
    • Labels:
      None
    • Environment:
      CentOS Linux release 7.9

      Description

      Truncate table is not able to execute due to waiting for metadata lock (thread id 1122427). There are many selects running for more than 7hrs. 3 of them are in Killed state (thread id 1361504, 1361536, 1361567) and truncate may be waiting for this

      MariaDB [(none)]> show processlist;
      +---------+-----------------+------------------+-----------+-----------+--------+----------------------------------+------------------------------------------------------------------------------------------------------+----------+
      | Id | User | Host | db | Command | Time | State | Info | Progress |
      +---------+-----------------+------------------+-----------+-----------+--------+----------------------------------+------------------------------------------------------------------------------------------------------+----------+
      | 1122426 | system user | | NULL | Slave_IO | 237516 | Waiting for master to send event | NULL | 0.000 |
      | 1122427 | system user | | dm_fait | Slave_SQL | 5560 | Waiting for table metadata lock | TRUNCATE TABLE dm_master.dim_structure | 0.000 |
      | 1355372 | userbi | mxs1:55280 | dm_fait | Query | 28634 | Executing | SELECT
      u.CODE_TYPE_USER AS CODE_TYPE_USER_UT,
      GROUP_CONCAT(DISTINCT RA.HEAD_ID) RUBRIQUE
      FROM
      dm | 0.000 |
      | 1355574 | userbi | mxs1:56124 | dm_fait | Query | 28694 | Executing | SELECT
      u.CODE_TYPE_USER AS CODE_TYPE_USER_UT,
      GROUP_CONCAT(DISTINCT RA.HEAD_ID) RUBRIQUE
      FROM
      dm | 0.000 |
      | 1360885 | userbi | mxs1:50452 | dm_fait | Query | 28770 | Executing | SELECT
      'P' AS CODE_TYPE_USER_P,
      GROUP_CONCAT(DISTINCT HEAD_ID) AS RUBRIQUE
      FROM
      dm_master.dim_ | 0.000 |
      | 1361504 | userbi | mxs1:53054 | dm_fait | Killed | 29006 | Executing | select * from (select `INVOICE`.`INV_NUMBER` as `c0`, `INVOICE`.`INV_DATE` as `c1`, COALESCE(CAST(CA | 0.000 |
      | 1361536 | userbi | mxs1:53168 | dm_fait | Killed | 28988 | Executing | select * from (select `INVOICE`.`INV_NUMBER` as `c0`, `INVOICE`.`INV_DATE` as `c1`, COALESCE(CAST(CA | 0.000 |
      | 1361567 | userbi | mxs1:53276 | dm_fait | Killed | 28970 | Executing | select * from (select `INVOICE`.`INV_NUMBER` as `c0`, `INVOICE`.`INV_DATE` as `c1`, COALESCE(CAST(CA | 0.000 |
      | 1361587 | userbi | mxs1:53346 | dm_fait | Query | 28961 | Executing | select * from (select `INVOICE`.`INV_NUMBER` as `c0`, `INVOICE`.`INV_DATE` as `c1`, COALESCE(CAST(CA | 0.000 |
      | 1361829 | userbi | mxs1:54582 | dm_fait | Query | 28768 | Executing | SELECT IF(COUNT(1)<>0,1,0) AS TEST_A_SUPP
      FROM dm_fait.fait_rav
      WHERE ENRG_A_SUPP = 1 | 0.000 |
      

      The truncate is running on a columnstore table so I have installed metadata_lock_info plugin to check what is holding the lock. I tried to kill one of the blocking thread but it remains in killed state and does not clear. To fix this for now, I have killed the mariadb process and started up again and the processlist is clear and there is no replication lag.

      --threads which are holding metadata locks.
      MariaDB [(none)]> SELECT CONCAT('Thread ',P.ID,' executing "',P.INFO,'" IS LOCKED BY Thread ',M.THREAD_ID) WhoLocksWho FROM INFORMATION_SCHEMA.PROCESSLIST P,INFORMATION_SCHEMA.METADATA_LOCK_INFO M WHERE LOCATE(lcase(LOCK_TYPE), lcase(STATE))>0 limit 5;
      +-----------------------------------------------------------------------------------------------+
      | WhoLocksWho |
      +-----------------------------------------------------------------------------------------------+
      | Thread 1122427 executing "TRUNCATE TABLE dm_master.dim_structure" IS LOCKED BY Thread 1396804 |
      | Thread 1122427 executing "TRUNCATE TABLE dm_master.dim_structure" IS LOCKED BY Thread 1355574 |
      | Thread 1122427 executing "TRUNCATE TABLE dm_master.dim_structure" IS LOCKED BY Thread 1361915 |
      | Thread 1122427 executing "TRUNCATE TABLE dm_master.dim_structure" IS LOCKED BY Thread 1355372 |
      | Thread 1122427 executing "TRUNCATE TABLE dm_master.dim_structure" IS LOCKED BY Thread 1361977 |
      +-----------------------------------------------------------------------------------------------+
      5 rows in set (0.008 sec)
       
      MariaDB [(none)]> select ID,USER,HOST,DB,COMMAND,TIME,INFO from information_schema.processlist where ID=1396804;
      +---------+--------+------------+---------+---------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | ID | USER | HOST | DB | COMMAND | TIME | INFO |
      +---------+--------+------------+---------+---------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | 1396804 | userbi | mxs1:37548 | dm_fait | Query | 5562 | SELECT 'dm_fait.fait_orders' AS 'COL', ENRG_A_SUPP AS ENRG_A_SUPP, COUNT(1) AS NB_LIGNES, CAST(MIN(DATE_MAJ) AS CHAR(20)) AS 'DATE_MIN', CAST(MAX(DATE_MAJ) AS CHAR(20)) AS 'DATE_MAX'
      FROM dm_fait.fait_orders GROUP BY 1,2 |
      +---------+--------+------------+---------+---------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      MariaDB [(none)]> select ID,USER,HOST,DB,COMMAND,TIME,INFO from information_schema.processlist where ID=1355574;
      +---------+--------+------------+---------+---------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | ID | USER | HOST | DB | COMMAND | TIME | INFO |
      +---------+--------+------------+---------+---------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | 1355574 | userbi | mxs1:56124 | dm_fait | Query | 39752 | SELECT
      u.CODE_TYPE_USER AS CODE_TYPE_USER_UT,
      GROUP_CONCAT(DISTINCT RA.HEAD_ID) RUBRIQUE
      FROM
      dm_fait.fait_ra AS RA
      INNER JOIN dm_master.uti_uti_org AS uo ON
      RA.ORG_ID_STRUCTURE_EU = uo.ORG_ID
      INNER JOIN dm_master.uti_utilisateur AS u ON
      u.uti_id = uo.uti_id
      WHERE
      RA.ENRG_A_SUPP = 0
      AND uo.UTI_LDAP_ID = 'supervista2@pixid.biz'
      GROUP BY 1 |
      +---------+--------+------------+---------+---------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.003 sec)
       
      MariaDB [(none)]> select ID,USER,HOST,DB,COMMAND,TIME,INFO from information_schema.processlist where ID=1361915;
      +---------+--------+------------+---------+---------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | ID | USER | HOST | DB | COMMAND | TIME | INFO |
      +---------+--------+------------+---------+---------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | 1361915 | userbi | mxs1:55092 | dm_fait | Query | 39819 | SELECT
      u.CODE_TYPE_USER AS CODE_TYPE_USER_UT,
      GROUP_CONCAT(DISTINCT RA.HEAD_ID) RUBRIQUE
      FROM
      dm_fait.fait_ra AS RA
      INNER JOIN dm_master.uti_uti_org AS uo ON
      RA.ORG_ID_STRUCTURE_EU = uo.ORG_ID
      INNER JOIN dm_master.uti_utilisateur AS u ON
      u.uti_id = uo.uti_id
      WHERE
      RA.ENRG_A_SUPP = 0
      AND uo.UTI_LDAP_ID = 'supervista2@pixid.biz'
      GROUP BY 1 |
      +---------+--------+------------+---------+---------+-------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.001 sec)
       
      --killing one of the blocking queries and thread remains in killed state
      MariaDB [(none)]> kill 1396804;
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [(none)]> select ID,USER,HOST,DB,COMMAND,TIME,INFO from information_schema.processlist where ID=1396804;
      +---------+--------+------------+---------+---------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | ID | USER | HOST | DB | COMMAND | TIME | INFO |
      +---------+--------+------------+---------+---------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      | 1396804 | userbi | mxs1:37548 | dm_fait | Killed | 6162 | SELECT 'dm_fait.fait_orders' AS 'COL', ENRG_A_SUPP AS ENRG_A_SUPP, COUNT(1) AS NB_LIGNES, CAST(MIN(DATE_MAJ) AS CHAR(20)) AS 'DATE_MIN', CAST(MAX(DATE_MAJ) AS CHAR(20)) AS 'DATE_MAX'
      FROM dm_fait.fait_orders GROUP BY 1,2 |
      +---------+--------+------------+---------+---------+------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      1 row in set (0.003 sec)
       
      -- after mariadb restart
      MariaDB [(none)]> show processlist;
      +-----+-------------+------------------+------+-----------+------+--------------------------------------------------------+------------------+----------+
      | Id | User | Host | db | Command | Time | State | Info | Progress |
      +-----+-------------+------------------+------+-----------+------+--------------------------------------------------------+------------------+----------+
      | 5 | system user | | NULL | Slave_IO | 18 | Waiting for master to send event | NULL | 0.000 |
      | 6 | system user | | NULL | Slave_SQL | 0 | Slave has read all relay log; waiting for more updates | NULL | 0.000 |
      | 7 | mxs | mxs1:44510 | NULL | Sleep | 0 | | NULL | 0.000 |
      | 10 | monitoring | 10.10.1.98:38794 | NULL | Sleep | 0 | | NULL | 0.000 |
      | 24 | monitoring | mxs1:44520 | NULL | Sleep | 0 | | NULL | 0.000 |
      | 105 | root | localhost | NULL | Query | 0 | starting | show processlist | 0.000 |
      +-----+-------------+------------------+------+-----------+------+--------------------------------------------------------+------------------+----------+
      6 rows in set (0.000 sec)
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              Unassigned Unassigned
              Reporter:
              sandhyabalakrishnan Sandhya Balakrishnan
              Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.