[MCOL-680] cpimport not creating clean (new) partitions Created: 2017-04-24 Updated: 2017-11-27 Resolved: 2017-11-27 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | cpimport |
| Affects Version/s: | 1.0.8 |
| Fix Version/s: | Icebox |
| Type: | Bug | Priority: | Major |
| Reporter: | Allan | Assignee: | Andrew Hutchings (Inactive) |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Environment: |
centos 7; XFS file system |
||
| Attachments: |
|
||||||||||||
| Issue Links: |
|
||||||||||||
| Description |
|
Under Centos 6 when loading data with cpimport the successfully loaded data occupied a fraction of the (apparently) last partition and added new partitions. Under Centos 7 with the XFS file system it appears to interleave the new data with the old. This make calDropPartitionByValue not work and also produces the error: Internal error: IDB-2040: No partition found for the given value range Yet when you show partitions, they are there. |
| Comments |
| Comment by David Thompson (Inactive) [ 2017-04-25 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thanks, i think i can reproduce this. What i think is happening is that the partition id parts are not being made unique causing the partition procs to get confused when referencing partitions.
as you can see there are 8 entries in the extent map but 4 (with overlapping date values). | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Andrew Hutchings (Inactive) [ 2017-04-25 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
OK. I think this behaviour is correct and I'll try my best to explain why (warning: some of the terms used here are pretty overloaded, particularly 'partition' and 'segment')... ColumnStore's algorithm for filling extents goes as follows, I'm assuming single server here as it is easier to explain: 1. Make 4 segments for the extent Step 2 is the interesting part here, it adds a second extent segment to each file. When using information_schema.columnstore_extents you can see this effect:
When the block offset is 4096 that means it starts at block 4096 on the previous file. Now, the two segments for each segment ID are stored in the same file and the partition functions are based on segment files. The IDs in the partition functions are in the format partition_id.segment_id.dbroot. It cannot distinguish between the two segments in the segment file and effectively merges the min/max for the purposes of these functions. So, to summarise the sequence when filling extents is: segment 0 As far as the partition functions go, this will look like a full partition gets appended because it works at the file level and a segment file can contain 2 segments. I hope this helps. Please let me know if any of this needs explaining in more detail. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by David Thompson (Inactive) [ 2017-04-25 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I've updated the partition document to clarify this: https://mariadb.com/kb/en/mariadb/columnstore-partition-management The behavior still seems wrong to me but need to think through why it was done this way and if we are losing something with this. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Allan [ 2017-04-25 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I agree that the behavior is wrong and this is why it impacts us in a very large way. We load 100 million records or more every day (for just one machine) and we also need to age out old records. Because the data loaded 'today' by cpimport is not distinguished by partition we can't do a drop partition to get rid of 'old' data. We have to result to a MySQL DELETE statement that is 3 orders of magnitude slower, literally. We did not have this problem under InfiniDB | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Allan [ 2017-04-25 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The second issue in this report is the error message (which was never addressed): Internal error: IDB-2040: No partition found for the given value range What's going on with this. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Andrew Hutchings (Inactive) [ 2017-04-25 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Sorry, I missed that part with my initial message. I've noticed you have opened a new bug report for this ( | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Andrew Hutchings (Inactive) [ 2017-04-28 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Another suggestion by Allan in "I suggest that if bulk loading is going to fill the extents of partitions that some bulk delete/drop method be made available to function like calDropPartitionByValue. Perhaps calDropExtentByValue" We could do this once Alternatively we could re-use extents marked as free. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by David Thompson (Inactive) [ 2017-11-27 ] | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Have changed |