Details
Bug
Status: Closed
Major
Resolution: Fixed
1.0.12, 1.1.1
2017-22, 2017-23
Description
Build tested: 1.1.1-1
Testing stack: Single server, with 1 dbroot, using local VM.
This issue was identified by the Autopilot test concurrency.concurDML. The same test passed on 1.1.0-1 beta but now fails on 1.1.1-1.
The test suite runs the same test for update, delete, and ldi (LOAD DATA INFILE). For the update and delete tests, the dbt3 tables were loaded with 1 GB of data before the test. The ldi test starts with empty tables. The update and delete tests worked fine, but the ldi tests failed.
The test basically does the following:
1) Run ldi, which uses cpimport, concurrently for all 8 dbt3 tables, loading 1 GB of data
2) Concurrently, run select count queries on the tables being tested
The purpose of this test is to ensure data consistency: each command should commit either all of its changes or none.
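The all-or-nothing check described above can be sketched roughly as follows. This is a minimal illustration, not the Autopilot test itself: `find_partial_counts`, `poll_during_load`, and `FakeTable` are hypothetical names, and `FakeTable` is an in-memory stand-in for a ColumnStore table. A real run would replace `load` with an ldi/cpimport invocation and `count_rows` with a select count query. It assumes the table starts empty, as in the ldi test, so every observed count should be either 0 or the full row count.

```python
import threading
from typing import Callable, List


def find_partial_counts(observed: List[int], expected_rows: int) -> List[int]:
    """Return every observed count that is neither 0 nor the full row count."""
    return [c for c in observed if c not in (0, expected_rows)]


def poll_during_load(load: Callable[[], None],
                     count_rows: Callable[[], int],
                     min_polls: int = 5) -> List[int]:
    """Run load() in a background thread while repeatedly calling count_rows()."""
    observed: List[int] = []
    loader = threading.Thread(target=load)
    loader.start()
    # Keep polling while the load is running (and at least min_polls times).
    while loader.is_alive() or len(observed) < min_polls:
        observed.append(count_rows())
    loader.join()
    observed.append(count_rows())  # one final count after the load has finished
    return observed


class FakeTable:
    """Hypothetical in-memory stand-in for a table; not a ColumnStore API."""

    def __init__(self) -> None:
        self._rows = 0
        self._lock = threading.Lock()

    def atomic_load(self, n: int) -> None:
        # Stage the rows privately, then publish them in a single step.
        staged = n
        with self._lock:
            self._rows = staged

    def count(self) -> int:
        with self._lock:
            return self._rows


if __name__ == "__main__":
    table = FakeTable()
    counts = poll_during_load(lambda: table.atomic_load(1000), table.count)
    print("partial counts seen:", find_partial_counts(counts, 1000))
```

Any value returned by `find_partial_counts` would indicate that a query saw a partially committed load, which is the kind of inconsistency the test is meant to catch.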
During the ldi tests, mysqld crashed, leaving table locks held by the cpimport processes. Sometimes, queries also returned "missing data block" errors.
I spent hours debugging this issue, and this is what I found:
1) Instead of testing with all 8 tables, testing only nation and region went fine
2) Testing lineitem and orders also went fine
3) Testing lineitem, orders, and partsupp failed
4) Testing nation, region, and customer worked sometimes. The nation and region tables are small, so their loads can finish quickly.
I think this could be related to persisting data blocks, updating or invalidating the cache, updating the extent map, or the timing of these operations.