[MCOL-3324] Table remains locked after force restart of columnstore system Created: 2019-05-22  Updated: 2021-04-19  Resolved: 2021-01-05

Status: Closed
Project: MariaDB ColumnStore
Component/s: cpimport
Affects Version/s: 1.2.4
Fix Version/s: 5.5.2, 5.6.1

Type: Bug Priority: Major
Reporter: Zdravelina Sokolovska (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 2
Labels: None
Environment:

UM1_PM1-PM2


Attachments: logs_PM1.txt, logs_PM2.txt

 Description   

The table remains locked after a forced restart of the ColumnStore system.

Expected: table locks are cleared after a forced restart of the ColumnStore system.

Start loading data into the table catalog_sales with cpimport in mode m1.
Interrupt cpimport from the client.
Kill the cpimport process on PM1.
Force-restart the MCS system. ColumnStore restarts successfully and all MCS processes are active.
Drop the table catalog_sales. An error is returned although the table is empty; it appears that the table lock was not cleared after restarting.

Note: cleartablelock cannot clear the lock, because the PID of the no-longer-existing cpimport process is still held in the table lock list after the restart.

 
[root@pm1_1 ~]# mcsadmin shut
shutdownsystem   Wed May 22 14:13:02 2019
 
This command stops the processing of applications on all Modules within the MariaDB ColumnStore System
 
   Checking for active transactions
The following tables are locked:
LockID Name                     Process  PID   Session  CreationTime           State    DBRoots
304    tpcds_1000.catalog_sales cpimport 18854 BulkLoad 2019-05-22 01:10:03 PM LOADING  1      ,2
Your options are:
    Cancel    -- Cancel the shutdown request
    Wait      -- Wait for write operations to end and then shutdown
    Force     -- Force a shutdown
What would you like to do: [Cancel]: Force
 
   Stopping System...
   Successful stop of System
 
   Shutting Down System...
   Successful shutdown of System
 
[root@pm1_1 ~]# mcsadmin start
startsystem   Wed May 22 14:14:08 2019
 
startSystem command, 'columnstore' service is down, sending command to
start the 'columnstore' service on all modules
 
 
   System being started, please wait..........
 
   System Not Ready, DMLProc is checking/processing rollback of abandoned transactions. Processing could take some time, please wait....
   Successful start of System
 

After the system has been restarted:

MariaDB [(none)]> drop table tpcds_1000.catalog_sales ;
ERROR 1815 (HY000): Internal error: CAL0009: Drop table failed due to  IDB-2009: Unable to perform the drop table operation because cpimport with PID 16301 is currently holding the table lock for session -1.
MariaDB [(none)]> select count(*) from tpcds_1000.catalog_sales ;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.098 sec)
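
To cross-check the error above against the table lock list, the holder's process name, PID and session can be pulled out of the IDB-2009 message. This is a minimal sketch: the function name `parse_idb2009` is hypothetical, and the message pattern is taken only from the single error shown in this ticket, so other ColumnStore versions may word it differently.

```python
import re

# Pattern derived from the one IDB-2009 message quoted in this ticket
# (an assumption; not taken from ColumnStore source or documentation).
HOLDER = re.compile(
    r"because (?P<process>\S+) with PID (?P<pid>\d+) "
    r"is currently holding the table lock for session (?P<session>-?\d+)"
)


def parse_idb2009(message):
    """Return the lock holder described in the message, or None if no match."""
    m = HOLDER.search(message)
    if m is None:
        return None
    return {
        "process": m.group("process"),
        "pid": int(m.group("pid")),
        "session": int(m.group("session")),
    }
```

The returned PID can then be compared with the PID column of viewtablelock and with the process list on the module that ran the import.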
 

After the system has been restarted, the lock is still listed and cannot be cleared:

[root@pm1_1 ~]# /usr/local/mariadb/columnstore/bin/viewtablelock
 There is 1 table lock
 
  Table                     LockID  Process   PID    Session   Txn  CreationTime              State    DBRoots
  tpcds_1000.catalog_sales  1       cpimport  16301  BulkLoad  n/a  Wed May 22 14:20:42 2019  LOADING  1,2
[root@pm1_1 ~]# /usr/local/mariadb/columnstore/bin/cleartablelock 1
Rolling back and clearing table lock for table tpcds_1000.catalog_sales; table lock 1
 
Rollback error: Unable to grab lock; Lock not found or still in use.
Table lock 1 for table tpcds_1000.catalog_sales is not cleared.

[root@pm1_1 ~]# ps aux | grep cpimport | grep -v grep
[root@pm2_2 ~]# ps aux | grep cpimport | grep -v grep
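
The manual check above (read the PID from viewtablelock, then look for that process with ps) can be sketched in Python. This is an illustration under assumptions, not part of ColumnStore: the names `parse_locks`, `pid_alive` and `stale_locks` are hypothetical, the row pattern is derived only from the viewtablelock output quoted in this ticket, and the liveness check is local, so it is only meaningful when run on the module where cpimport was running.

```python
import os
import re

# Matches a data row of the viewtablelock output as quoted in this ticket, e.g.
#   tpcds_1000.catalog_sales  1   cpimport            16301  BulkLoad ...
#   dhall.lineitem            40  cpimport.bin (pm1)  24813  BulkLoad ...
# The layout is an assumption based on those samples only.
ROW = re.compile(
    r"^\s*(?P<table>\S+\.\S+)\s+(?P<lock_id>\d+)\s+"
    r"(?P<process>\S+(?: \(\w+\))?)\s+(?P<pid>\d+)\s+"
)


def parse_locks(text):
    """Extract table, lock ID, process name and PID from viewtablelock output."""
    locks = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            locks.append({
                "table": m.group("table"),
                "lock_id": int(m.group("lock_id")),
                "process": m.group("process"),
                "pid": int(m.group("pid")),
            })
    return locks


def pid_alive(pid):
    """Return True if a process with this PID exists on the local host."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stale_locks(text, alive=pid_alive):
    """Return the locks whose holder process is no longer running."""
    return [lock for lock in parse_locks(text) if not alive(lock["pid"])]
```

Passing the viewtablelock output to `stale_locks` lists the locks whose holder has gone away, which is the situation reported here. The `alive` parameter lets a caller substitute a different check, for example one that queries a remote module.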



 Comments   
Comment by David Hall (Inactive) [ 2020-12-14 ]

I tried this in 5.5:

use dhall
create table lineitem (
        l_orderkey int,
        l_partkey int,
        l_suppkey int,
        l_linenumber bigint,
        l_quantity decimal(12,2),
        l_extendedprice decimal(12,2),
        l_discount decimal(12,2),
        l_tax decimal(12,2),
        l_returnflag char (1),
        l_linestatus char (1),
        l_shipdate date,
        l_commitdate date,
        l_receiptdate date,
        l_shipinstruct char (25),
        l_shipmode char (10),
        l_comment varchar (44)
) engine=columnstore;
 
#cpimport dhall lineitem /shared/100g/lineitem.tbl
 
kill -9 <pid of cpimport.bin>
 
# viewtablelock 
 There is 1 table lock
 
  Table           LockID  Process             PID    Session   Txn  CreationTime              State    DBRoots  
  dhall.lineitem  40      cpimport.bin (pm1)  24813  BulkLoad  n/a  Mon Dec 14 11:13:44 2020  LOADING  1        
 
#systemctl stop mariadb
#systemctl stop mariadb-columnstore
#systemctl start mariadb-columnstore
#systemctl start mariadb
 
# viewtablelock 
 No tables are locked in the database.
 
drop table lineitem;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id:    3
Current database: dhall
 
Query OK, 0 rows affected (1.378 sec)

Comment by Daniel Lee (Inactive) [ 2021-01-05 ]

Build tested: 5.6.1-1 (drone #1437) and develop branch

The locking issue no longer exists in 5.6.1-1. cleartablelock also successfully cleared the lock set by the killed cpimport process, without restarting MariaDB or mariadb-columnstore.

The old expected behavior was that if the lock was set by a process that no longer exists, such as a killed cpimport job, cleartablelock would fail to clear it, and a restartsystem command would be needed to clear it. Also, in 5.6.1-1 there is no longer an OAM module.

Comment by Daniel Lee (Inactive) [ 2021-01-05 ]

BTW, the "ERROR 2006 (HY000): MySQL server has gone away" msg is expected since the server has been restarted.
