[MCOL-2044] Update CPRange max/min for extent doing DML Created: 2018-12-20  Updated: 2021-07-09  Resolved: 2021-06-25

Status: Closed
Project: MariaDB ColumnStore
Component/s: ?
Affects Version/s: 1.0.0, 1.2
Fix Version/s: 6.1.1

Type: Bug Priority: Major
Reporter: Richard Stracke Assignee: Sergey Zefirov
Resolution: Fixed Votes: 0
Labels: None

Attachments: File gen-cross-extent-and-insert-again.py     File gen-cross-extent.py
Issue Links:
PartOf
includes MCOL-4722 Add extent range keeping for single-v... Open
is part of MCOL-4343 umbrella for tech debt issues Open
Problem/Incident
causes MCOL-4766 UPDATE/INSERT in a transaction does n... Closed
causes MCOL-4779 Invalid extent ranges for short chara... Closed
Relates
relates to MCOL-4673 Regression: calShowPartition returns N/A Closed
Epic Link: ColumnStore Performance Improvements
Sprint: 2021-1, 2021-2, 2021-3, 2021-4, 2021-5, 2021-6, 2021-7, 2021-8, 2021-9

 Description   

An extent is marked as invalid when a DELETE, INSERT, or UPDATE touches it. The extent has to be reread by a SELECT to become valid again. If one then bulk inserts into the invalid extent, the min/max values are not updated in the EM.
One should read the data back after DML operations, and before bulk inserting data, to make the extents touched by DML valid again. To reduce the number of extents to read, one can set an impossible filter to eliminate all valid extents.



 Comments   
Comment by Sergey Zefirov [ 2020-11-16 ]

I will talk below about the MAX operation and/or property. The same logic applies to the MIN operation as well.

The max operation is saturating: if you add an element to a set, the max changes only if the new element is greater than the old max. Basically, INSERT operations can maintain the max on the fly.

A DELETE operation may require recomputation if the element to delete is equal to MAX.

An UPDATE operation may require recomputation if the element to be updated is equal to MAX and the new value is less than MAX.

Comment by Sergey Zefirov [ 2020-11-16 ]

The typical sequence of actions is "process INSERT/UPDATE/DELETE, collecting LBIDs; mark the extents for those LBIDs invalid".

The collecting phase is done with the method AddLBIDToList in writeengine/wrapper/writeengine.* - it is called from the processing of the operations.

The second part is a call to the DBRM instance's method markExtentsInvalid (note the "s" in Extents - it operates over a vector). At that point it is too late to change anything.

From the notes above and the general structure of the code, it looks like we cannot write generic code for this. We must patch the INSERT/UPDATE/DELETE operations instead.

The idea is to fetch the extent ranges before calling AddLBIDToList, check them, and not call AddLBIDToList if the ranges do not change.

Comment by Sergey Zefirov [ 2020-11-17 ]

CPRange contains int64_t (uint64_t) ranges.

So we have to work with a subset of all database types, namely integers.

The "invalid" state is when the upper bound of the range is smaller than the lower bound.

Comment by Sergey Zefirov [ 2020-11-17 ]

A thorough look through the code revealed that the actual calls for extent invalidation live in writeengine/wrapper/writeengine.cpp:

  1. Bulk inserts should store correct min/max values - there is a call to setExtentsMinMax which does the actual work.
  2. Single inserts mark extents as invalid.
  3. Single-row updates mark extents as invalid.
  4. Bulk row updates mark extents as invalid.
  5. Deletes internally use updates and, consequently, mark extents as invalid.

So I think I will run an experiment to see whether batch inserts actually record that information. If so, I will use them as an example of how to proceed with the other operations.

Comment by Sergey Zefirov [ 2020-11-17 ]

No, bulk insert processing does not store correct min/max values. For some reason it clears the extent, overwriting the explicitly created cprange values with invalid marks.

It also sets the extents as invalid twice - at the beginning, for a new PM (what that is, I would like to know), and later, if all of the processing was done without errors.

Comment by Sergey Zefirov [ 2020-11-18 ]

From what I gather, the number of rows that get inserted is bounded by the extent size: the call to allocRowId sets the number of rows to be added (by changing the rowsLeft variable).

Comment by Sergey Zefirov [ 2020-11-20 ]

Relevant branch in my fork is here.

Not much there yet.

I am still gathering information.

Comment by Sergey Zefirov [ 2020-11-20 ]

Proper handling of the ranges seems to be more complex than simply setting them.

We have to keep two copies of the cpranges - the originals for rollback and the current ones. At the commit point we either do nothing or immediately copy the current ranges into the rollback copy. At rollback we do the opposite - the ranges for the rolled-back extents are copied back into the current ones.

Because of this, we must remove the wholesale drop that goes through AddLBIDToList (writeengine.cpp) and the subsequent marking of the extents in that list as invalid.

The WriteEngine class contains a map of transactions (a sparse array represented as a map) to TxnLBIDRecs. These TxnLBIDRecs contain a map of the LBIDs processed. We can record the cprange changes there.

I think I'll start with that.

Comment by Roman [ 2020-11-20 ]

As I said, we don't need to account for rollback in this task. Mark the extents as invalid on rollback.

Regards,
Roman Nozdrin
ColumnStore Engineering
MariaDB Corporation


Comment by Sergey Zefirov [ 2020-11-25 ]

Relevant code branch: https://github.com/mariadb-SergeyZefirov/mariadb-columnstore-engine/tree/MCOL-2044-update-ranges-during-DML

Right now I am rewriting and fixing various things in the main write engine.

Comment by Sergey Zefirov [ 2020-11-26 ]

Tests:

MariaDB [test]> use test; create table t (x integer) engine=columnstore; insert into t(x) values (1), (2);
Database changed
ERROR 1815 (HY000): Internal error: Lost connection to DDLProc
ERROR 1146 (42S02): Table 'test.t' doesn't exist

It is strange that the problem arises in the DDL part.

Comment by Sergey Zefirov [ 2020-11-27 ]

The refactored "mark used extents as invalid" logic works.

I still have to review and test it for possible memory leaks, though.

Comment by Sergey Zefirov [ 2020-11-27 ]

The branch with the refactoring done is here.

No leaks reported or found during review.

Comment by Sergey Zefirov [ 2020-11-30 ]

PR for the refactored code.

There is another structure for keeping a map of LBIDs and their ranges, called CPMaxMinMergeMap_t (brmtypes.h). It is quite close to, but not completely the same as, what I am doing.

Comment by Sergey Zefirov [ 2020-12-01 ]

There is a useful piece of new functionality: type handlers.

They allow finding a type handler for a column type (which I have in inserts, etc.) and performing various tasks there.

Also, there is a clear need to make bridges to the WriteEngine:

  boost::any getNullValueForType(const SystemCatalog::TypeAttributesStd &attr)
                                                                const override
  {
    //TODO: How to communicate with write engine?
    return boost::any();
  }

(it is from datatypes/mcs_datatypes.h)

I guess I have to make at least one bridge then - convert the boost::any that is in ColTuple to a value.

Comment by Sergey Zefirov [ 2020-12-07 ]

First tries show no core dumps, but no extent information changes either. I will review the code again; most probably I forgot something.

Comment by Sergey Zefirov [ 2020-12-08 ]

Currently I have an issue:

MariaDB [(none)]> use test; create table t (x integer) engine=columnstore;
Database changed
Query OK, 0 rows affected (0.307 sec)
 
MariaDB [test]> select * from information_schema.columnstore_tables;
Empty set (0.040 sec)
 
MariaDB [test]> show create table test.t;
+-------+-------------------------------------------------------------------------------------------+
| Table | Create Table                                                                              |
+-------+-------------------------------------------------------------------------------------------+
| t     | CREATE TABLE `t` (
  `x` int(11) DEFAULT NULL
) ENGINE=Columnstore DEFAULT CHARSET=latin1 |
+-------+-------------------------------------------------------------------------------------------+
1 row in set (0.001 sec)
 
MariaDB [test]> Bye

I can create the table, but CS does not recognize it or record it in its internal records.

I am looking into what is causing this.

Comment by Roman [ 2020-12-09 ]

I'd tested the case and it works for me in develop.

Comment by Sergey Zefirov [ 2020-12-14 ]

Porting my changes from MCOL-2044-update-ranges-after-DML into new branch from current develop.

So far, so good, but slow.

Comment by Sergey Zefirov [ 2021-01-11 ]

The culprit: parameters with default values - bool fromList = true plus my addition, CPInfo* maxmins. And also my insufficient familiarity with the code base.

The pointer (maxmins) was silently converted to a boolean value, because I forgot to pass an explicit "true" for fromList in the call to convertValArray.

And result is:

MariaDB [(none)]>use test; create table t (x integer) engine=columnstore; insert into t(x) values (42), (42);
Database changed
Query OK, 0 rows affected (0.289 sec)
 
ERROR 1815 (HY000): Internal error: CAL0001: Insert Failed:   BRM error getting extent max/min [BRM error status: image inconsistency]  
MariaDB [test]> select * from INFORMATION_SCHEMA.COLUMNSTORE_TABLES;
+--------------+------------+-----------+---------------------+--------------+---------------+
| TABLE_SCHEMA | TABLE_NAME | OBJECT_ID | CREATION_DATE       | COLUMN_COUNT | AUTOINCREMENT |
+--------------+------------+-----------+---------------------+--------------+---------------+
| test         | t          |      3000 | 2021-01-11 00:00:00 |            1 |          NULL |
+--------------+------------+-----------+---------------------+--------------+---------------+
1 row in set (0.022 sec)

So the table gets added, but DML (insertion) does not work due to the error above.

Comment by Sergey Zefirov [ 2021-01-12 ]

The "image inconsistency" error is due to different semantics of the "int" return code in different parts of the system. It has nothing to do with the distributed work.

Comment by Sergey Zefirov [ 2021-01-12 ]

The plan:

  1. Finish the multiple-inserts part - single-extent and cross-extent inserts. The first part is expected tomorrow (13 Jan). The second part needs more complex tests than are used for the single extent, and I hope to have them ready and debugged by the end of the week (15 Jan).
  2. Add the same code for updates. The difference is that an update can overwrite the minimum or maximum value, and it is not possible to know whether such values exist in other blocks. But we can rescan the blocks we are updating for the old max/min values and not mark the extent range invalid if these values are still in the block after the update. A week, until 22 Jan.
  3. Add the same code for deletes. The problem with updates extends to deletes too. A week, until 22 Jan.
Comment by Sergey Zefirov [ 2021-01-13 ]

Some operations can drop ranges to the "unknown range" value: for example, deletion of a value that is equal to the maximum or minimum.

Thus, there can be an insertion into an extent that has an "unknown range" value. Such an insertion must also result in "unknown range".

The code to test:

CREATE TABLE t (x INTEGER) ENGINE=COLUMNSTORE;
INSERT INTO t (x) VALUES (1), (2), (4), (3); -- range is 1..4.
DELETE FROM t WHERE x=1; -- range may drop to unknown, but not necessarily so.
INSERT INTO t(x) VALUES (5); -- may result in unknown range or in range 1..5.

The code above does not test cross-block or cross-extent operations.

Comment by Sergey Zefirov [ 2021-01-13 ]

As a side note, immutable data structures, as in log-structured merge trees (or any other static structure, for that matter), do not suffer from this proliferation of corner cases.

A sequence of values has a range of values (including counted ranges, where each max and min value has an associated count - an explanation follows). The range is computed for that concrete sequence of values and will not change, because the sequence will not change either.

Instead of updating we insert; instead of deleting we write a deletion mark. The insertions form a sequence of values which has a range (possibly counted). The deletions also form a sequence of values which likewise has a range, possibly counted.

We read sequences of values according to their ranges. If some sequence of values does not have values from the range we are interested in, we will not read it.

We may have a hierarchy of sequences, just like a B-tree index, with ranges associated with the coarser levels.

I will stop here - I have to work on concrete code for the concrete problem. But I am not conceding.

Comment by Sergey Zefirov [ 2021-01-13 ]

Okay, debugging the updates.

So far I see no regressions, but the ranges do not get updated either. Investigating the latter.

Comment by Sergey Zefirov [ 2021-01-14 ]

Now the culprit is in the commit logic - commit does a call to a bulk set of HWM and CPInfo with a sequence number of -1 (a magic sequence number indicating the need for a reset).

It does so for all modified extents.

Comment by Sergey Zefirov [ 2021-01-18 ]

Last Friday I succeeded with the range updates.

Now I am working on the test case for the cross-extent scenario.

Comment by Sergey Zefirov [ 2021-01-19 ]

I see some strange behavior here:

MariaDB [(none)]> USE test;
Database changed
MariaDB [test]> CREATE TABLE t (x integer) ENGINE=columnstore;
Query OK, 0 rows affected (0.318 sec)
 
MariaDB [test]> INSERT INTO t(x) SELECT 1 FROM seq_1_to_8388600;
Query OK, 8388600 rows affected (8.222 sec)
Records: 8388600  Duplicates: 0  Warnings: 0
 
MariaDB [test]> INSERT INTO t(x) VALUES (-42), (43), (44);
Query OK, 3 rows affected (0.140 sec)
Records: 3  Duplicates: 0  Warnings: 0
 
MariaDB [test]> SELECT * FROM information_schema.columnstore_extents;
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
| OBJECT_ID | OBJECT_TYPE | LOGICAL_BLOCK_START | LOGICAL_BLOCK_END | MIN_VALUE | MAX_VALUE | WIDTH | DBROOT | PARTITION_ID | SEGMENT_ID | BLOCK_OFFSET | MAX_BLOCKS | HIGH_WATER_MARK | STATE | STATUS    | DATA_SIZE |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
|      3001 | Column      |              234496 |            238591 |         1 |         1 |     4 |      1 |            0 |          0 |            0 |       4096 |            4095 | Valid | Available |  33554432 |
|      3001 | Column      |              238592 |            242687 |       -42 |        44 |     4 |      1 |            0 |          1 |            0 |       4096 |               0 | Valid | Available |         0 |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
2 rows in set (0.014 sec)
 
MariaDB [test]>

I insert 8388600 rows (0x7FFFF8 - 8 less than 2^23), which obviously goes through cpimport. Then I insert three values, and all three go into a second extent - an extent that should not exist in the first place.

So I will work on the update and delete parts from now on.

Comment by Sergey Zefirov [ 2021-01-20 ]

Partial success with cross-extent insertion.

The test gen-cross-extent.py is in the attachments; it should produce two extents, one filled with 1s and ending with 42 (range 1..42), and a second with the two values 43 and 44 (with the obvious range 43..44).

I've got a correct result for the first extent and an incorrect one for the second (below). But I see no exceptions, assertions, or other regressions.

MariaDB [(none)]> SELECT * FROM information_schema.columnstore_extents;
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+----------+-----------+-----------+
| OBJECT_ID | OBJECT_TYPE | LOGICAL_BLOCK_START | LOGICAL_BLOCK_END | MIN_VALUE | MAX_VALUE | WIDTH | DBROOT | PARTITION_ID | SEGMENT_ID | BLOCK_OFFSET | MAX_BLOCKS | HIGH_WATER_MARK | STATE    | STATUS    | DATA_SIZE |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+----------+-----------+-----------+
|      3001 | Column      |              234496 |            238591 |         1 |        42 |     4 |      1 |            0 |          0 |            0 |       4096 |            4095 | Valid    | Available |  33554432 |
|      3001 | Column      |              238592 |            242687 |      NULL |      NULL |     4 |      1 |            0 |          1 |            0 |       4096 |               0 | Updating | Available |         0 |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+----------+-----------+-----------+
MariaDB [(none)]> select * from test.t where x > 1;
+------+
| x    |
+------+
|   43 |
|   44 |
|   42 |
+------+
3 rows in set (0.220 sec)

Comment by Sergey Zefirov [ 2021-01-22 ]

The results for gen-cross-extent-and-insert-again.py:

[root@sergueyz-devel-1 serguey_zefirov_mariadb_com]# cs-docker-tools/testmarks/test-insertion/gen-cross-extent-and-insert-again.py |mysql
[root@sergueyz-devel-1 serguey_zefirov_mariadb_com]# mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 12
Server version: 10.6.0-MariaDB-debug MariaDB Server
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [(none)]> SELECT * FROM information_schema.columnstore_extents;
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+---------+-----------+-----------+
| OBJECT_ID | OBJECT_TYPE | LOGICAL_BLOCK_START | LOGICAL_BLOCK_END | MIN_VALUE | MAX_VALUE | WIDTH | DBROOT | PARTITION_ID | SEGMENT_ID | BLOCK_OFFSET | MAX_BLOCKS | HIGH_WATER_MARK | STATE   | STATUS    | DATA_SIZE |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+---------+-----------+-----------+
|      3001 | Column      |              234496 |            238591 |         1 |        42 |     4 |      1 |            0 |          0 |            0 |       4096 |            4095 | Valid   | Available |  33554432 |
|      3001 | Column      |              238592 |            242687 |      NULL |      NULL |     4 |      1 |            0 |          1 |            0 |       4096 |               0 | Invalid | Available |         0 |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+---------+-----------+-----------+
[root@sergueyz-devel-1 serguey_zefirov_mariadb_com]# mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 17
Server version: 10.6.0-MariaDB-debug MariaDB Server
 
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
 
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
 
MariaDB [(none)]> select * from test.t where x > 1;
+--------+
| x      |
+--------+
|     43 |
|     44 |
| 424242 |
|     42 |
+--------+

I've added an insertion after the cross-extent insertion. I expected the following:

  1. In case everything is working fine, I would get ranges 1..42 for the first extent and 43..424242 for the second.
  2. In the (worst) case where the update works only for the values being inserted, I would get ranges 1..42 and 424242..424242.
  3. The result above indicates that the range update does not work at all for the second extent. Possible reasons: the sequence numbers are off, we do not get the extent range at all, etc.

The bright side: I see no other regressions.

Comment by Sergey Zefirov [ 2021-01-22 ]

The reason for the not-quite-expected result in the previous attempt was a single-row insert. Single-row and batch inserts have different code paths in the write engine (a problem in itself), and single-row insertion does not have the range update code (yet).

I've changed the test code from a single insert to a batch insert ("insert into t ( x ) values (424242),(424243);" instead of inserting just 424242) and got the following result:

MariaDB [(none)]> select * from test.t where x < 43 and x > 1;
+------+
| x    |
+------+
|   42 |
+------+
1 row in set (0.058 sec)
 
MariaDB [(none)]> SELECT * FROM information_schema.columnstore_extents;
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
| OBJECT_ID | OBJECT_TYPE | LOGICAL_BLOCK_START | LOGICAL_BLOCK_END | MIN_VALUE | MAX_VALUE | WIDTH | DBROOT | PARTITION_ID | SEGMENT_ID | BLOCK_OFFSET | MAX_BLOCKS | HIGH_WATER_MARK | STATE | STATUS    | DATA_SIZE |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
|      3009 | Column      |              234496 |            238591 |         1 |        42 |     4 |      1 |            0 |          0 |            0 |       4096 |            4095 | Valid | Available |  33554432 |
|      3009 | Column      |              238592 |            242687 |    424242 |    424243 |     4 |      1 |            0 |          1 |            0 |       4096 |               0 | Valid | Available |         0 |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+

The result of the first select should be 43 and 44 - the values we inserted with the second insert of our cross-extent insertion. These values were not recorded (the range for that extent was set to an empty range in the "updating" state).

The values 424242 and 424243 were recorded into the range, which resulted in a wrong range and, consequently, a wrong selection result.

If we are updating ranges, we have to do so consistently.

Comment by Sergey Zefirov [ 2021-01-25 ]

About consistency of updates.

Terms:

  • "value is strictly within range" - the value is less than the range's max and greater than the range's min.
  • "to drop a range" is to make the range invalid.

Deletion of a single element from a block:

  • we keep the range intact if the element we are deleting is strictly within the range.
  • we drop the range if the element is equal to the range's max or to its min.

Please note that it is possible to re-examine the block from which the element(s) are deleted and check whether it still has elements that define the same range (i.e., elements equal to the max and to the min). It is like the rescan operation, but for a single block only.

Update of an element:

  • we keep the range if both the old value and the new value are strictly within the range.
  • we should drop the range if the old value is on the boundary of the range and the new value is not equal to the old one.
  • we update the range if the new value is outside the range's boundaries (less than min or greater than max).

It is clear that we can end up with some extents having invalid ranges even if we try to keep the ranges updated.

This means that insertion can encounter extents with invalid ranges, and not only in the case of a new extent.

For example, imagine we inserted enough values, alternating between 42 and 44, to cross block boundaries, so now we have at least two blocks. Then, in some block, we update everything to 43. Now this block will not reproduce the old range after a block rescan, and that forces us to drop the range of the whole extent.

This, basically, forces us to adopt the following rule: on insertion, we update an invalid extent range to a valid one only if we are inserting into a new extent.

  • If the first rowid for insertion is not zero, we update only valid ranges and keep invalid ones.
Comment by Sergey Zefirov [ 2021-01-25 ]

There are two separate allocation paths in the WriteEngine::insert* code: one for isFirstExtentOnThisPM==true and another for the regular situation when an existing extent overflows.

This indicates a need for tests which create the "first extent on this partition manager" situation.

Comment by Sergey Zefirov [ 2021-01-25 ]

https://jira.mariadb.org/browse/MCOL-2044?focusedCommentId=178041&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel - now the whole test works all right:

MariaDB [(none)]> SELECT * FROM information_schema.columnstore_extents;
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
| OBJECT_ID | OBJECT_TYPE | LOGICAL_BLOCK_START | LOGICAL_BLOCK_END | MIN_VALUE | MAX_VALUE | WIDTH | DBROOT | PARTITION_ID | SEGMENT_ID | BLOCK_OFFSET | MAX_BLOCKS | HIGH_WATER_MARK | STATE | STATUS    | DATA_SIZE |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
|      3001 | Column      |              234496 |            238591 |         1 |        42 |     4 |      1 |            0 |          0 |            0 |       4096 |            4095 | Valid | Available |  33554432 |
|      3001 | Column      |              238592 |            242687 |        43 |    424243 |     4 |      1 |            0 |          1 |            0 |       4096 |               0 | Valid | Available |         0 |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+

Next I am going to write tests for updates, etc., using the MTR framework.

Comment by Sergey Zefirov [ 2021-01-26 ]

Update and delete operations work the same way and, quite fortunately, they operate on columns split into extents.

The column operation for these operations reads a block from an extent (based on the rowid), updates it, and then writes it back.

The "write column records" operation of the write engine also differs between bulk inserts and update/delete.

This means we can more or less safely change the logic of the specific routines.

Comment by Sergey Zefirov [ 2021-01-28 ]

Degenerate corner case.

Suppose we have an extent with just a single value, 5, in one or more rows.

The current update check logic is:

if ((oldValue <= currentMin && value > oldValue) ||
    (oldValue >= currentMax && value < oldValue))
{
    // old value at a boundary, and the new value does not extend or preserve that boundary
    mark extent range invalid;
}
else
{
    update range with new value;
}

Updating a row from 5 to 6 triggers the left subexpression of the logical OR in the condition above, and updating from 5 to 4 triggers the right one. Either way, the extent range gets invalidated.

Comment by Sergey Zefirov [ 2021-01-29 ]

Tests are here.
Partial success with updates:

MariaDB [(none)]> use test;
Database changed
MariaDB [test]> drop table if exists t;
Query OK, 0 rows affected, 1 warning (0.093 sec)
 
MariaDB [test]> create table t(x integer) engine=columnstore;
Query OK, 0 rows affected (0.330 sec)
 
MariaDB [test]> insert into t(x) values (44),(55),(66); -- range must be 44..66.
Query OK, 3 rows affected (0.314 sec)
Records: 3  Duplicates: 0  Warnings: 0
 
MariaDB [test]> update t set x=65 where x=44; -- range must drop to invalid
Query OK, 0 rows affected (0.201 sec)
Rows matched: 0  Changed: 0  Warnings: 0
 
MariaDB [test]> select * from t;
+------+
| x    |
+------+
|   44 |
|   55 |
|   66 |
+------+
3 rows in set (0.028 sec)
 
MariaDB [test]> select * from information_schema.columnstore_extents;
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
| OBJECT_ID | OBJECT_TYPE | LOGICAL_BLOCK_START | LOGICAL_BLOCK_END | MIN_VALUE | MAX_VALUE | WIDTH | DBROOT | PARTITION_ID | SEGMENT_ID | BLOCK_OFFSET | MAX_BLOCKS | HIGH_WATER_MARK | STATE | STATUS    | DATA_SIZE |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
|      3001 | Column      |              234496 |            238591 |      NULL |      NULL |     4 |      1 |            0 |          0 |            0 |       4096 |               0 | Valid | Available |         0 |
+-----------+-------------+---------------------+-------------------+-----------+-----------+-------+--------+--------------+------------+--------------+------------+-----------------+-------+-----------+-----------+
1 row in set (0.029 sec)

The range is dropped, but the update itself does not work correctly (it matched 0 rows).

Comment by Sergey Zefirov [ 2021-01-29 ]

The regression is in the insert code - insertion should update an invalid range only if that invalid range belongs to a completely empty extent.

Comment by Sergey Zefirov [ 2021-01-29 ]

The regression was due to an incorrect operation type being set in the WriteEngine.

Comment by Sergey Zefirov [ 2021-02-01 ]

That regression is fixed; now debugging the update's failure to keep ranges.

Comment by Sergey Zefirov [ 2021-02-03 ]

Fixed a regression in "drop table" for columnstore-managed tables.

Comment by Sergey Zefirov [ 2021-02-03 ]

First non-trivial test runs:

-- tests updates within range.
use test;
drop table if exists t;
create table t(x integer) engine=columnstore;
insert into t(x) values (44),(55),(66); -- range must be 44..66.
update t set x=65 where x=55; -- range must stay 44..66.

We now have a four-tier line of defense in range keeping: not so long ago we dropped ranges in writeengine, in we_dml, in dmlproc and, finally, in commandpackageprocessor in the COMMIT/ROLLBACK part.

Comment by Sergey Zefirov [ 2021-02-05 ]

My by-hand test runs indicate that most cases work. Tests that should preserve or extend ranges do so, and tests that should drop ranges also do so.

I am putting the tests into MTR and will then form a PR.

Comment by Sergey Zefirov [ 2021-02-10 ]

The MTR tests successfully fail - I am now providing the proper expected results and validating the MTR output.

Comment by Sergey Zefirov [ 2021-02-10 ]

Current results:

The servers were restarted 0 times
Spent 0.800 of 11 seconds executing testcases
 
Too many failed: Failed 10/11 tests, 9.09% were successful.

Comment by Sergey Zefirov [ 2021-02-12 ]

Current state of tests: Failed 3/11 tests, 72.73% were successful.

Deletes do not work properly - investigating.

Comment by Sergey Zefirov [ 2021-02-12 ]

Okay, only one case remains to fix - deletion of an element inside the range.

Comment by Sergey Zefirov [ 2021-02-12 ]

All current tests pass.

Comment by Sergey Zefirov [ 2021-02-15 ]

Some tests for unsigned integer types fail due to invalid range being displayed as max=0, min=-1 instead of NULL, NULL.

MTR output below:

--- ../storage/columnstore/columnstore/mtr/csinternal/extents_ranges/unsigned_integer//r/update-drop-min.result	2021-02-15 13:45:37.850433247 +0000
+++ ../storage/columnstore/columnstore/mtr/csinternal/extents_ranges/unsigned_integer//r/update-drop-min.reject	2021-02-15 19:07:47.997655426 +0000
@@ -6,4 +6,4 @@
 update t set x=65 where x=44;
 select max_value, min_value from information_schema.columnstore_extents;
 max_value	min_value
-NULL	NULL
+0	-1

The test expects two NULL values but gets 0 and -1.

Comment by Sergey Zefirov [ 2021-02-15 ]

Tests for signed and unsigned integers and big ints have successfully passed.

Comment by Sergey Zefirov [ 2021-02-16 ]

Tests for tinyints, smallints, integers and bigints are passing, for both signed and unsigned variants.

Comment by Sergey Zefirov [ 2021-02-16 ]

Relevant pull request: https://github.com/mariadb-corporation/mariadb-columnstore-engine/pull/1771

Comment by Sergey Zefirov [ 2021-02-18 ]

There are two regressions: mcs30_update_table and mcs81_self_join.

The mcs30_update_table regression is due to a SELECT comparing an integer column with a constant wrongly returning an empty set. I suspect the invalid range is in state CP_VALID instead of CP_UPDATE.

Comment by Sergey Zefirov [ 2021-02-18 ]

The code that reports on extent ranges does not take the unsigned types (especially unsigned bigint) into account. It is in the function generate_result() in the file is_columnstore_extents.cpp:

if (iter->partition.cprange.loVal == std::numeric_limits<int64_t>::max() ||
    iter->partition.cprange.loVal <= (std::numeric_limits<int64_t>::min() + 1))
{
    table->field[4]->set_null();
}
else
{
    table->field[4]->set_notnull();
    table->field[4]->store(iter->partition.cprange.loVal);
}

if (iter->partition.cprange.hiVal <= (std::numeric_limits<int64_t>::min() + 1))
{
    table->field[5]->set_null();
}
else
{
    table->field[5]->set_notnull();
    table->field[5]->store(iter->partition.cprange.hiVal);
}

It may bite some curious souls.

Comment by Sergey Zefirov [ 2021-02-19 ]

With recent changes (do not set invalid ranges to CP_VALID state) mcs30_update_table test passes. mcs81_self_join does not.

Comment by Sergey Zefirov [ 2021-02-19 ]

The same thing (in a different place in the write engine code) was the reason mcs81_self_join was failing.

Comment by Sergey Zefirov [ 2021-02-24 ]

I now have one test failing: mcs211_idbExtentId_function

I suspect it is due to my tests not clearing database/tables they created.

Comment by Sergey Zefirov [ 2021-02-24 ]

mcs211_idbExtentId_function is green now. I was right in my suspicion.

Comment by Sergey Zefirov [ 2021-03-01 ]

I am investigating a strange discrepancy in results:

select max_value, min_value from information_schema.columnstore_extents;
max_value min_value
-NULL NULL
+9223372036854776000 NULL

The value in bold is ALMOST NULL: the NULL magic is 0x8000000000000000, while the value above is 0x80000000000000c0.

Logs for the extent range setting code indicate that we set everything correctly, but the virtual table SELECT returns something wrong.

Comment by Sergey Zefirov [ 2021-03-01 ]

One more interesting discrepancy:

+++ ../storage/columnstore/columnstore/mtr/basic/r/mcol2044_unsigned_tinyint_update-keeps-invalid-range.reject	2021-03-01 18:26:14.127176717 +0000
@@ -7,5 +7,5 @@
 update t set x=65 where x=55;
 select max_value, min_value from information_schema.columnstore_extents;
 max_value	min_value
-NULL	NULL
+0	-1
 drop database test_ranges;

-1..0 is a valid encoding of an invalid range for unsigned columns, yet generate_result() in the columnstore_extents virtual table code fails to recognize it as such.

Fixing that may break other tests.

Comment by Sergey Zefirov [ 2021-03-03 ]

I have genuine regressions in 8 tests: WE update operations sometimes fail, and even BRM operations fail.

Comment by Sergey Zefirov [ 2021-03-03 ]

Here: https://ci.columnstore.mariadb.net/mariadb-corporation/mariadb-columnstore-engine/1788/1/9

The "mcol_2044_unsigned_.*" tests are fixed (spaces vs tabs); the remaining ones are in progress.

Comment by Sergey Zefirov [ 2021-03-03 ]

The failure of the mcs12_alter_table test was due to an inconsistent assignment of types in the write engine.

During operations on the system catalog, token columns aren't marked as WR_TOKEN but as WR_ULONGLONG, and for these columns a special flag (tokenColumn) is set in the column information record.

Some other tests (extent functions in particular) are failing due to incomplete cleanup in failing tests.

Comment by Sergey Zefirov [ 2021-03-04 ]

Because other DML operations trigger an extent rescan during their internal select, it is not possible to properly test that update and delete keep an invalid range.

This is because we have to use a selection (WHERE x=y), and that triggers an extent rescan for invalid extents.

So I deleted these tests (16 in total). MTR is now green.

Comment by Sergey Zefirov [ 2021-03-11 ]

Fixed the DECIMAL(38) tests; they should pass now.

I will also see whether the mcs30_update_table failure persists.

Comment by Sergey Zefirov [ 2021-03-16 ]

The test001.sh from the old regression tests (mariadb-columnstore-regression-tests) shows no difference between 10.6 and my PR, except for slightly better run times in my PR run.

Comment by Sergey Zefirov [ 2021-04-06 ]

The concerns I have:

  1. It is not easy to test rollback during a single DML operation. I simply do not have tools for that. There might be bugs despite my best efforts to account for rollback.
  2. It is not easy to test distributed DML operations. Bulk insertion in a distributed setting has a special case for when one of the nodes doing the work has just come up.

Comment by Sergey Zefirov [ 2021-04-19 ]

The algorithm for keeping ranges in the update operation

Values and their meaning

old_value - the old value corresponding to some RID being updated; this value is overwritten.
new_value - the new value for the same RID as old_value; it overwrites the old one above.

range_max - the max bound of a range; all values in the extent are <= range_max
range_min - the min bound of a range; all values in the extent are >= range_min
range_max >= range_min is a prerequisite (the range is not empty)

The condition for dropping a range to the invalid state (empty range)

if (old_value == range_min && new_value > range_min) || (old_value == range_max && new_value < range_max): mark the range invalid because of possible contraction of the range.

Tests for the condition

The tests that demonstrate range dropping: https://github.com/mariadb-SergeyZefirov/mariadb-columnstore-engine/blob/develop/mysql-test/columnstore/basic/t/mcol2044_signed_bigint_update-drop-max.test and https://github.com/mariadb-SergeyZefirov/mariadb-columnstore-engine/blob/develop/mysql-test/columnstore/basic/t/mcol2044_signed_bigint_update-drop-min.test

The test that demonstrates range keeping for update operations that do not touch the bounds: https://github.com/mariadb-SergeyZefirov/mariadb-columnstore-engine/blob/develop/mysql-test/columnstore/basic/t/mcol2044_signed_bigint_update-within-range.test

The tests that extend min and max: https://github.com/mariadb-SergeyZefirov/mariadb-columnstore-engine/blob/develop/mysql-test/columnstore/basic/t/mcol2044_signed_bigint_update-extends-min.test and https://github.com/mariadb-SergeyZefirov/mariadb-columnstore-engine/blob/develop/mysql-test/columnstore/basic/t/mcol2044_signed_bigint_update-extends-max.test

They demonstrate how the algorithm behaves.

Comment by Sergey Zefirov [ 2021-04-20 ]

Range contraction problem

The problem is that we do not look at the whole extent when doing an update; we look at the data at the block level.

If we update a block and overwrite a value that lies on the minimum bound, and the new value is bigger than the old minimum, what new minimum bound should we assign? Another block may hold the exact same minimum value.

Let me offer an example.

Imagine an extent with three values 1, 1 and 10, where each block holds exactly one value. We overwrite the first 1 with 2. From the information accessible to us, we could set the new range to either 1..10 or 2..10. The second choice is clearly wrong. The first choice (never shrink a range) is partially wrong and leads to overextended ranges: e.g., we may end up with a range like -1e6..1e6 for actual values of -100..100 due to events long past (one customer was rich, then took a huge overdraft, and now they are all middle class).

Marking ranges as invalid and setting them to the "empty set" value triggers a rescan, which sets the ranges properly.

Probable cpimport problems

I was wrong.

The case where the extent is set into valid state is here: https://github.com/mariadb-corporation/mariadb-columnstore-engine/blob/develop/versioning/BRM/extentmap.cpp#L1018

It does not trigger during normal operation.

Comment by Sergey Zefirov [ 2021-04-20 ]

https://jira.mariadb.org/browse/MCOL-4621 - the ticket where I've fixed cpimport problems.

Comment by Sergey Zefirov [ 2021-04-21 ]

Just in case: in MCOL-4621 I provided an explanation of how the defect manifests outside the test I fixed and what trouble it may cause.

Comment by Sergey Zefirov [ 2021-06-23 ]

This is interesting: the Write Engine treats such short varchars as sufficiently wide integers and does no additional processing.

Comment by Sergey Zefirov [ 2021-06-24 ]

tntnatbry, what is bug 4594? JIRA says I cannot see this task, either because of insufficient permissions or because the task was deleted.

As for the problem with calshowpartitionsbyvalue, the extents linger in the UPDATING state after insert. I will look into why. It is much like https://jira.mariadb.org/browse/MCOL-4673, only with character columns.

Generated at Thu Feb 08 02:33:18 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.