[MCOL-307] implement redistribution logic Created: 2016-09-22 Updated: 2023-10-26 Resolved: 2016-12-08 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | ? |
| Affects Version/s: | None |
| Fix Version/s: | 1.0.6 |
| Type: | New Feature | Priority: | Critical |
| Reporter: | David Thompson (Inactive) | Assignee: | Daniel Lee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Sprint: | 2016-21, 2016-22, 2016-23, 2016-24 |
| Description |
|
A data rebalancing function is needed to support redistribution of data between nodes. It should be surfaced as a command in mcsadmin and should check that writes are disabled before proceeding; writes can be suspended beforehand to satisfy this check. |
| Comments |
| Comment by David Thompson (Inactive) [ 2016-10-25 ] | ||||||||||
|
This should be implemented as a set of mcsadmin commands to support:
This should disable writes while running. Not sure if it's realistic or not to consider concurrent queries? | ||||||||||
| Comment by David Hall (Inactive) [ 2016-11-14 ] | ||||||||||
|
Added command 4 - Redistribute to the mcsadmin code | ||||||||||
| Comment by David Hall (Inactive) [ 2016-11-14 ] | ||||||||||
|
I couldn't get this thing into review mode, so it's in test for now. DHill, please review. | ||||||||||
| Comment by David Hill (Inactive) [ 2016-12-02 ] | ||||||||||
|
Review completed. Needs some test cases from David Hall. Two of them should be: a dbroot assigned to pm1, and a dbroot assigned to another PM. | ||||||||||
| Comment by David Hill (Inactive) [ 2016-12-06 ] | ||||||||||
|
I recommended changing the command name from redistribute to redistributeData. Here is a link to the document, which will be published in the 1.0.6 release: https://mariadb.com/kb/en/mariadb/columnstore-redistribute-data/ | ||||||||||
| Comment by David Hall (Inactive) [ 2016-12-06 ] | ||||||||||
|
Name changed to redistributeData. Because this command works at partition granularity, it won't move any data for smaller tables (see documentation). There are three tests that make sense, each starting from a multi-PM system:
1) Add large amounts of data using cpimport mode 2 or 3 to create one or more unbalanced tables. Run redistributeData and watch the result. For example using lineitem, run this query before and after to see how many rows moved. The counts should be about the same (+/- 64000000).
2) Have an extra PM ready to add (or at least an extra dbroot or two). Add a bunch of data, then add the new PM and run redistributeData. See what happens.
3) After test (2), run redistributeData again, but with "remove" and the newly added dbroots. It should empty those new dbroots, moving the data to the old dbroots. | ||||||||||
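The before/after comparison described in test 1 could be scripted roughly as below. This is only a sketch under stated assumptions: the per-dbroot row counts are taken to come from whatever query QA runs before and after redistributeData (the query itself is not reproduced here), "about the same" is read as the largest and smallest per-dbroot counts differing by at most 64000000 rows, and the helper names `rows_moved`, `is_balanced`, and `check_redistribution` are hypothetical, not part of ColumnStore or mcsadmin.

```python
# Hypothetical helpers for comparing per-dbroot row counts before and after
# a redistributeData run. Counts are plain dicts: {dbroot_id: row_count}.

TOLERANCE_ROWS = 64_000_000  # tolerance quoted in the comment above


def rows_moved(before, after):
    """Sum the rows that left each dbroot (net outflow per dbroot)."""
    dbroots = set(before) | set(after)
    return sum(max(before.get(r, 0) - after.get(r, 0), 0) for r in dbroots)


def is_balanced(counts, tolerance=TOLERANCE_ROWS):
    """Assumed reading of 'about the same': max and min differ by <= tolerance."""
    return max(counts.values()) - min(counts.values()) <= tolerance


def check_redistribution(before, after, tolerance=TOLERANCE_ROWS):
    """Verify no rows were lost or gained, then report movement and balance."""
    if sum(before.values()) != sum(after.values()):
        raise ValueError("total row count changed during redistribution")
    return {
        "rows_moved": rows_moved(before, after),
        "balanced": is_balanced(after, tolerance),
    }


if __name__ == "__main__":
    # Made-up example numbers, not measured results.
    before = {1: 600_000_000, 2: 0}
    after = {1: 320_000_000, 2: 280_000_000}
    print(check_redistribution(before, after))
```

The total-count check is included because a redistribution should only relocate rows; any change in the total would point to a problem unrelated to balance.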
| Comment by Daniel Lee (Inactive) [ 2016-12-07 ] | ||||||||||
|
Per discussion on the Slack channel today, redistributeData does not check for database write suspension for now. It will run whenever the user requests it. That sets the QA testing scope for this feature in this release. | ||||||||||
| Comment by Daniel Lee (Inactive) [ 2016-12-08 ] | ||||||||||
|
Build tested: 1.0.6-1
mcsadmin> getsoft
Name : mariadb-columnstore-platform
"START REMOVE" does not move any data. It finished immediately [6:10] WriteEngineServer returned status 1: Cleared. [6:10]
----------
---------- In the KB article, the statement "Even TRUNCATE doesn't really get rid of the partition, though the space may eventually get re-used. DROP PARTITION will affect the balancing of Redistribute." is not correct. TRUNCATE drops all files for the columns and pre-allocates each column with an abbreviated extent file. | ||||||||||
| Comment by Daniel Lee (Inactive) [ 2016-12-08 ] | ||||||||||
|
Reopened per my last comment. | ||||||||||
| Comment by David Hall (Inactive) [ 2016-12-08 ] | ||||||||||
|
The KB has been changed to remove the comment about TRUNCATE. All references to start remove have been removed from the help file and KB. Closing this. | ||||||||||
| Comment by David Hall (Inactive) [ 2016-12-08 ] | ||||||||||
|
Needs testing | ||||||||||
| Comment by David Hall (Inactive) [ 2016-12-08 ] | ||||||||||
|
Fixed the problem with the help file | ||||||||||
| Comment by Daniel Lee (Inactive) [ 2016-12-08 ] | ||||||||||
|
Build verified: 1.0.6-1 GitHub source
[root@localhost mariadb-columnstore-server]# git show
Update README.md
[root@localhost mariadb-columnstore-server]# cd mariadb-columnstore-engine/
Merge pull request #76 from mariadb-corporation/
mcsadmin> redistributedata
Command: redistributeData
Description: Redistribute table data accross all dbroots to balance disk usage
Arguments: START to begin a redistribution
Also verified KB article regarding TRUNCATE. "Deleted records still take up space, so deleting a bunch of rows won't affect Redistribute."
The "START REMOVE" issue is not being tracked by ticket |