  MariaDB ColumnStore / MCOL-3296

ctrl+c sometimes leaves DMLProc in bad state

Details

    • Bug
    • Status: Closed
    • Critical
    • Resolution: Fixed
    • 1.1.7
    • 1.1.0, 1.2.4
    • DMLProc
    • None
    • 2019-05

    Description

      PackageHandler::synchTableAccess() is a kludge that serializes DML against a single table while still allowing parallel processing against different tables. The kludge is required because the VSS can't accept transaction IDs out of numerical order on a single table, and asserts if they're not in order. With multiple transactions running on multiple threads, there's no guarantee of order.
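The per-table serialization described above can be sketched roughly as follows. This is a minimal illustration, not the ColumnStore source: the class `TableSerializer`, its methods, and the choice of first-come-first-served ordering are all assumptions made for the example. Each table gets its own queue of transaction IDs, a transaction runs only while it is at the front of its table's queue, and transactions on different tables never wait on each other.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <map>
#include <mutex>

// Hypothetical illustration only; this is not the ColumnStore source.
class TableSerializer {
public:
    using TableId = std::uint32_t;
    using TxnId = std::uint64_t;

    // Join the table's queue, then block until this transaction reaches
    // the front. Transactions on other tables are never waited on.
    void acquire(TableId table, TxnId txn) {
        std::unique_lock<std::mutex> lock(mutex_);
        queues_[table].push_back(txn);
        cond_.wait(lock, [&] { return queues_[table].front() == txn; });
    }

    // Called by the transaction that currently holds the table: drop its
    // entry from the front and wake the waiters so the next one can run.
    void release(TableId table, TxnId txn) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::deque<TxnId>& queue = queues_[table];
            // Assumed invariant: only the front transaction releases.
            assert(!queue.empty() && queue.front() == txn);
            queue.pop_front();
        }
        cond_.notify_all();
    }

    // Number of transactions running or waiting on a table.
    std::size_t pending(TableId table) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = queues_.find(table);
        return it == queues_.end() ? 0 : it->second.size();
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::map<TableId, std::deque<TxnId>> queues_;
};
```

In this sketch a worker thread would call `acquire(table, txn)` before touching the table and `release(table, txn)` when done. The invariant in `release` holds only if nothing else removes entries from the queue in the meantime.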

      Something occasionally breaks if ctrl+c is hit while processing DML. The internal tables and synch conditions of synchTableAccess() can get out of whack. This has caused two kinds of catastrophic failure: in one case, DMLProc segfaulted while accessing the synchro map (27656); in other cases, DML statements blocked indefinitely.

        Activity

          When CTRL+C is hit, the query is removed from the queue that keeps track of the queries for this table, and the query is marked as cancelled. The query, running in a different thread, begins cleaning up and removes the top item (which should be the running item) from the queue. Its own entry is no longer in there, of course, so the queue gets corrupted. Different, but similar, breakage occurs when the query is waiting.

          A query that blocks doesn't log anything yet, as it's blocked before the normal logging occurs. This led to significant confusion during analysis. Added logging at the point where a query blocks.

          David Hall (David.Hall) added the comment above.
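The corruption described in the comment above can be reproduced on a plain queue. The ticket does not say how the fix was implemented, so the sketch below is an assumption: `releaseFront` models the cleanup that blindly removes the top item, and `releaseById` is one possible safer alternative that removes only the caller's own entry and does nothing if the entry is already gone. Both function names and the type aliases are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>

using TxnId = std::uint64_t;
using TxnQueue = std::deque<TxnId>;

// Cleanup as described in the ticket: assumes the caller is still at the
// front of the queue. If a cancel already removed the caller's entry,
// this pops a different transaction's entry instead.
inline void releaseFront(TxnQueue& queue) {
    if (!queue.empty()) {
        queue.pop_front();
    }
}

// Assumed alternative (not taken from the actual fix): remove exactly the
// caller's entry if it is still present. Returns true if an entry was
// removed, false if it was already gone, so calling it twice is harmless.
inline bool releaseById(TxnQueue& queue, TxnId txn) {
    auto it = std::find(queue.begin(), queue.end(), txn);
    if (it == queue.end()) {
        return false;
    }
    queue.erase(it);
    return true;
}
```

Take a queue holding transactions 10, 11, 12, where 10 is cancelled (its entry is removed) and its thread then runs cleanup. With `releaseFront`, the cleanup pops 11 and leaves only 12, so 11 can wait forever. With `releaseById(queue, 10)`, the second call finds nothing to remove and the queue still holds 11 and 12.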

          For QA:
          Before fix: Start any update or insert (not using cpimport) and hit CTRL+C during execution. It should work fine. However, after this, DML on this table is likely to never complete. CTRL+C that statement and it might start working again, or it might not. Start multiple DML statements against the same table and CTRL+C one that is not the first. This may eventually cause problems.

          After fix: All of the above should work as expected: whichever DML statement is cancelled via CTRL+C is cancelled, and all other queries continue (in order). Check the debug log for lines showing when a query is blocked.

          David Hall (David.Hall) added the comment above.

          Build tested: 1.1.7-1, 1.2.3-1

          Finally reproduced the issue in the above releases.

          At first, I failed many times to reproduce it using a smaller table. I eventually reproduced it when updating a 10 GB dbt3 lineitem table.

          Waiting for a nightly build with the fix.

          Daniel Lee (dleeyh) added the comment above.

          Build verified: 1.1.8-1 nightly

          server commit:
          01cc1ef
          engine commit:
          0af6994

          Still waiting for 1.2.4-1

          Daniel Lee (dleeyh) added the comment above.

          Build tested: 1.2.4-1 GitHub source

          Made a build with the latest source and verified the fix.

          /root/columnstore/mariadb-columnstore-server
          commit e3d99393916f0231db02564dd5e316e803bdbbe9
          Author: Andrew Hutchings <andrew@linuxjedi.co.uk>
          Date: Mon Jan 14 16:20:01 2019 +0000

          Disable Travis triggering on pull requests

          /root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine
          commit 122038e36a8a4d4bd632eb137bde31a09a88d2b9
          Merge: 8afc3f8 ea2ff9c
          Author: Roman Nozdrin <drrtuy@gmail.com>
          Date: Mon May 20 13:50:36 2019 +0300

          Merge pull request #767 from mariadb-corporation/develop-1.2-merge-up-20190517

          Merge develop-1.1 into develop-1.2

          Daniel Lee (dleeyh) added the comment above.

          People

            dleeyh Daniel Lee (Inactive)
            David.Hall David Hall (Inactive)