[MCOL-455] redistribute data's "START REMOVE" option did not move data from the requested dbroot Created: 2016-12-08 Updated: 2023-10-26 Resolved: 2017-01-23 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | None |
| Affects Version/s: | 1.0.6 |
| Fix Version/s: | 1.0.7 |
| Type: | Bug | Priority: | Minor |
| Reporter: | Daniel Lee (Inactive) | Assignee: | Ben Thompson (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Environment: | |
| Issue Links: | |
| Sprint: | 2016-24, 2016-25, 2017-01, 2017-2 |
| Description |
|
Build tested: 1.0.6-1
mcsadmin> getsoft
Name : mariadb-columnstore-platform

This issue was identified while testing "redistributedata START REMOVE": the command does not move any data. It finished immediately.

mcsadmin> redistributedata start remove 3
WriteEngineServer returned status 1: Cleared. [6:10] |
| Comments |
| Comment by David Hall (Inactive) [ 2017-01-06 ] |
|
There are two issues involved here. The first is a logic error that caused the algorithm to stop too soon in certain circumstances. The second is built-in logic that prevents moving segments of a partition to a dbroot that already contains segments from that partition. This logic is designed to keep segments distributed across the dbroots. However, when removing a dbroot, it may be necessary to move segments to a dbroot that already holds that partition. This is especially true when a partition is spread across all dbroots. Logic was added to allow segments to be added to dbroots that already hold the partition, but only when a dbroot is being removed.

This is not an ideal solution. If one were to remove one or more dbroots, say because the hardware was being replaced, and then add them back later, our current algorithm would not redistribute the segments properly. It would move data so that the new hardware holds the correct amount, but the segments that had been piled together on one dbroot would stay together indefinitely. The algorithm will eventually be changed to account for segment distribution, keeping segments as distributed as feasible. |
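The target-selection rule described in the comment above can be sketched as follows. This is a minimal illustration in Python, not the ColumnStore code (which is C++); the function name `pick_target`, its parameters, and the use of a per-dbroot segment count as the load measure are all assumptions made for the example.

```python
def pick_target(partition, candidates, partitions_on, load, removing=False):
    """Return the dbroot that should receive a segment of `partition`, or None.

    Hypothetical inputs (not taken from ColumnStore):
      candidates    -- dbroot ids that may receive data (source excluded)
      partitions_on -- dict: dbroot id -> set of partition ids it already holds
      load          -- dict: dbroot id -> number of segments currently stored
      removing      -- True when the source dbroot is being emptied
    """
    # Preferred targets: dbroots holding no segment of this partition yet,
    # so that segments of one partition stay spread out.
    free = [d for d in candidates if partition not in partitions_on.get(d, set())]
    if free:
        return min(free, key=lambda d: load.get(d, 0))

    # Every candidate already holds the partition. Stacking segments is
    # allowed only when a dbroot is being removed.
    if removing and candidates:
        return min(candidates, key=lambda d: load.get(d, 0))

    return None


# Partition 0 is spread across dbroots 1, 2 and 3; dbroot 3 is the source.
partitions_on = {1: {0}, 2: {0}, 3: {0}}
load = {1: 5, 2: 3, 3: 4}
normal_target = pick_target(0, [1, 2], partitions_on, load)
removal_target = pick_target(0, [1, 2], partitions_on, load, removing=True)
```

In this sketch a normal redistribution finds no valid target when the partition is already on every dbroot, while the removal case falls back to the least-loaded candidate. It also shows the limitation noted above: once segments are stacked on a dbroot, nothing here spreads them out again.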
| Comment by Daniel Lee (Inactive) [ 2017-01-20 ] |
|
Build tested: 1.0.7-1
mcsadmin> getsoft
Name : mariadb-columnstore-platform

Found a couple of issues:

1) writeengine crashed and redistributedata returned a failure status.

mcsadmin> redistributedata start
WriteEngineServer returned status 1: Cleared. [1:36] [1:37] [1:37]
WriteEngineServer returned status 1: Cleared.
mcsadmin> redistributedata status
mcsadmin> redistributedata status
[1:38] [1:38]

2) The "start remove n" option does not distribute data among the remaining dbroots.

MariaDB [mytest]> select idbdbroot(o_orderkey) d, count
[8:29] |
| Comment by Daniel Lee (Inactive) [ 2017-01-20 ] |
|
MariaDB [mytest]> select idbdbroot(o_orderkey) d, idbpartition(o_orderkey) p, count |
| Comment by Daniel Lee (Inactive) [ 2017-01-20 ] |
|
I forgot to mention issue #3:

3) The help text for redistributedata does not list the "START REMOVE n" option. |
| Comment by David Hall (Inactive) [ 2017-01-20 ] |
|
The crash is caused by a rewind of an unopened plan file when no data is scheduled to be moved. This happens in the displayPlan() function. |
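The failure mode described above, and the usual guard against it, can be illustrated with a short sketch. It is written in Python for brevity rather than ColumnStore's C++, and it is not the actual fix: the function `display_plan` only mirrors the name displayPlan(), and the convention that a missing plan file is represented by `None` is an assumption.

```python
import io


def display_plan(plan_file):
    """Return the plan entries, or an empty list when no plan file was opened.

    `plan_file` is assumed to be None when nothing was scheduled to move,
    which is the case that triggered the crash described above.
    """
    if plan_file is None:
        # No plan file exists, so there is nothing to rewind or read.
        return []
    plan_file.seek(0)  # rewind to the beginning before reading
    return [line.rstrip("\n") for line in plan_file]


# Empty plan: no file was ever opened.
empty_plan = display_plan(None)

# Non-empty plan: the file position is at the end after writing.
plan = io.StringIO()
plan.write("segment A: dbroot 3 -> dbroot 1\n")
plan.write("segment B: dbroot 3 -> dbroot 2\n")
entries = display_plan(plan)
```

The point of the guard is that the "nothing to move" case becomes an ordinary empty result instead of an operation on a file handle that was never opened.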
| Comment by Andrew Hutchings (Inactive) [ 2017-01-20 ] |
|
The fix for the problem Daniel found has been merged. |
| Comment by Daniel Lee (Inactive) [ 2017-01-21 ] |
|
Build tested: 1.0.7-1
mcsadmin> getsoft
Name : mariadb-columnstore-platform

Retested; the write engine crash no longer occurs. Still waiting for the help text to be fixed. |
| Comment by David Thompson (Inactive) [ 2017-01-21 ] |
|
Update the help text to the following (the only change is the new START REMOVE line in the arguments):

Command: redistributeData
Description: Redistribute table data accross all dbroots to balance disk usage
Arguments: START to begin a redistribution |
| Comment by David Hill (Inactive) [ 2017-01-21 ] |
|
Review pull request. |
| Comment by David Hill (Inactive) [ 2017-01-21 ] |
|
Development test with new build:

mcsadmin> help redistributeData
Command: redistributeData
Description: Redistribute table data accross all dbroots to balance disk usage
Arguments: START to begin a redistribution |
| Comment by Daniel Lee (Inactive) [ 2017-01-23 ] |
|
Build verified: 1.0.7-1
Name : mariadb-columnstore-platform

mcsadmin> help redistributedata
Command: redistributeData
Description: Redistribute table data accross all dbroots to balance disk usage
Arguments: START to begin a redistribution

Made a suggestion on improving the help text, which is in

For redistributedata, we may want to consider changing this line of help text. We can do it later, and we don't need to redo the packages, though.

"START REMOVE n to being a redistribution where data is removed from dbroot 'n'".

The word "removed" is just a bit too alarming for me. Something like "...where data is moved off dbroot n" may be a bit better. |