[MCOL-700] Failure of clientrotator.cpp causes DMLProc to lock up Created: 2017-05-04 Updated: 2017-05-08 Resolved: 2017-05-08 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | DMLProc, ExeMgr, ProcMgr |
| Affects Version/s: | 1.0.8 |
| Fix Version/s: | Icebox |
| Type: | Bug | Priority: | Major |
| Reporter: | Allan | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Environment: |
CentOS 7; XFS; 32GB mem; % ll /usr/local/mariadb % df -h /data |
||
| Description |
|
A Java program was written to work around the issue of bulk-deleting data in the current partition scheme (see https://jira.mariadb.org/browse/MCOL-685 and https://jira.mariadb.org/browse/MCOL-680). The program works by issuing repeated DELETE SQL calls of a certain LIMIT size until the data for a particular column value is gone.

The program starts fine and, with repeated tries, completes some number of successful deletes until the following occurs:

SQLState: HY000; Error Code: 1815; Internal error: /home/builder/mariadb-columnstore-server/mariadb-columnstore-engine/dbcon/execplan/clientrotator.cpp: Could not get a connection to a ExeMgr

Subsequent attempts always fail with the following:

SQLState: HY000; Error Code: 1815; Internal error: IDB-2009: Unable to perform the delete operation because DMLProc with PID 27675 is currently holding the table lock for session 71

The problem is resolved by running kill -TERM on the DMLProc process. The program then runs fine again for some number of block deletes until the problem recurs, and killing DMLProc again resolves it temporarily.

The pertinent section of code doing the delete is the following. The whole program is available if needed:

{{
private static void doDelete(Connection connection) {
    String deleteRecords = "DELETE FROM " + database + "." + table +
    // log.info("Creating statement ...");
    try (Statement statement = connection.createStatement()) {
        // See if we can get rid of a lot using the fast method, then take care
        log.info("Executing » " + partitionDrop);
        catch (Exception e) {
            log.info("Ignoring » " + e);
        }
        // Now get rid of the rest the hard way
        do {
            log.info("Executing » " + deleteRecords);
            numberDeleted = statement.executeUpdate(deleteRecords);
            if (numberDeleted > 0)
                log.info("Deleted block of " + numberDeleted + " records.");
        } while (numberDeleted > 0);
    catch (Exception e) {
        exitCode = 1;
        log.error("Deletion Failure", e);
    }
}
}} |
| Comments |
| Comment by Andrew Hutchings (Inactive) [ 2017-05-04 ] | |||
|
This sounds like it could be In the mean time if it is
| |||
| Comment by Allan [ 2017-05-04 ] | |||
|
Update: Adding a one-second pause between DELETE calls seems to make the problem go away. The test is still running and it has been an hour. I never got this far before. | |||
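For anyone wanting to try the same workaround, here is a minimal sketch of a block-delete loop with a pause between DELETE calls. It is not the reporter's actual program: the class name ThrottledDelete, the methods deleteInBlocks and deleteWithPause, and the pauseMillis parameter are all hypothetical names chosen for illustration. The only JDBC calls it relies on are Connection.createStatement() and Statement.executeUpdate(String). The pause merely appeared to avoid the lock-up in the test above, so treat it as a mitigation and not a fix.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.function.IntSupplier;

// Hypothetical helper, not part of the original program.
class ThrottledDelete {

    /**
     * Calls deleteBlock repeatedly until it reports that no rows were deleted,
     * sleeping pauseMillis after each successful block.
     * Returns the total number of rows deleted.
     */
    static long deleteInBlocks(IntSupplier deleteBlock, long pauseMillis) {
        long total = 0;
        while (true) {
            int deleted = deleteBlock.getAsInt();
            if (deleted <= 0) {
                return total; // nothing left to delete
            }
            total += deleted;
            try {
                // Pause between DELETE calls (1000 ms in the test described above).
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop early.
                Thread.currentThread().interrupt();
                return total;
            }
        }
    }

    /**
     * Assumed JDBC wiring: deleteSql is expected to be a DELETE statement
     * with a LIMIT clause, as in the original report.
     */
    static long deleteWithPause(Connection connection, String deleteSql, long pauseMillis)
            throws SQLException {
        try (Statement statement = connection.createStatement()) {
            IntSupplier block = () -> {
                try {
                    return statement.executeUpdate(deleteSql);
                } catch (SQLException e) {
                    throw new IllegalStateException("Deletion failure", e);
                }
            };
            return deleteInBlocks(block, pauseMillis);
        }
    }
}
```

Separating the loop from the JDBC call keeps the pause logic easy to exercise without a database; in real use one would call something like ThrottledDelete.deleteWithPause(connection, deleteRecords, 1000L).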
| Comment by Allan [ 2017-05-04 ] | |||
|
[root@dev2 ~]# /usr/local/mariadb/columnstore/bin/columnstoreSupport -a
Note: This output shows SysV services only and does not include native
If you want to list systemd services use 'systemctl list-unit-files'.
Note: This output shows SysV services only and does not include native
If you want to list systemd services use 'systemctl list-unit-files'.
Get log report data for pm1
Columnstore Support Script Successfully completed, files located in columnstoreSupportReport.columnstore-1.tar.gz | |||
| Comment by Andrew Hutchings (Inactive) [ 2017-05-04 ] | |||
|
This definitely makes it sound like | |||
| Comment by David Thompson (Inactive) [ 2017-05-08 ] | |||
|
Please reopen if 1.0.9 does not fix this (has fix for | |||
| Comment by Allan [ 2017-05-08 ] | |||
|
Do you have a reference to where I can download 1.0.9 to try it out? | |||
| Comment by David Thompson (Inactive) [ 2017-05-08 ] | |||
|
We are working on final bug fixes / stabilization, so hopefully within the next week. | |||
| Comment by Allan [ 2017-05-08 ] | |||
|
|