[MCOL-2055] insert..select inserts shorter strings when batch insert is disabled, for data types BLOB, LONGBLOB, TEXT, LONGTEXT Created: 2018-12-31 Updated: 2021-04-29 Resolved: 2021-04-29 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | ExeMgr |
| Affects Version/s: | 1.2.2 |
| Fix Version/s: | 5.6.1 |
| Type: | Bug | Priority: | Major |
| Reporter: | Daniel Lee (Inactive) | Assignee: | Daniel Lee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Attachments: | |
| Issue Links: | |
| Sprint: | 2021-7 |
| Description |
|
Build tested: 1.2.2-1

When executing an insert..select statement from a table that has 5 rows, with batch insert off, all 5 rows were inserted correctly. But when I repeat the same insert..select statement, the first inserted row is shorter than the source row. I simplified the test case to insert..select only one row; the first execution is still correct, but from the 2nd execution onward the insert is incorrect. When batch insert is enabled (using cpimport), this issue does not occur. The same issue occurs for the BLOB, LONGBLOB, TEXT, and LONGTEXT data types.

MariaDB [mytest]> set infinidb_use_import_for_batchinsert=0;
MariaDB [mytest]> desc text1;                          -- (output elided)
MariaDB [mytest]> desc text2;                          -- (output elided)
MariaDB [mytest]> select * from text1;
MariaDB [mytest]> select c1, length(c2) from text2;    -- (output elided)
MariaDB [mytest]> insert into text1 select * from text2;
MariaDB [mytest]> select c1, length(c2) from text1;    -- (output elided)
MariaDB [mytest]> truncate table text1;
MariaDB [mytest]> insert into text1 select * from text2 where text2.c1 = 1;
MariaDB [mytest]> select c1, length(c2) from text1;    -- (output elided)
MariaDB [mytest]> insert into text1 select * from text2 where text2.c1 = 1;
MariaDB [mytest]> select c1, length(c2) from text1;    -- (output elided)
MariaDB [mytest]> insert into text1 select * from text2 where text2.c1 = 1;
MariaDB [mytest]> select c1, length(c2) from text1;    -- (output elided)
MariaDB [mytest]> insert into text1 select * from text2 where text2.c1 = 1;
MariaDB [mytest]> select c1, length(c2) from text1;    -- (output elided)

When batch insert is enabled (using cpimport), the same insert..select statement worked:

MariaDB [mytest]> insert into text1 select * from text2 where text2.c1 = 1;
MariaDB [mytest]> select c1, length(c2) from text1;    -- (output elided) |
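The reproduction steps above can be condensed into a single standalone script. This is a sketch only: the table definitions and seed values below are illustrative assumptions, not taken from the original session.

```sql
-- Hypothetical standalone reproduction; schema and seed data are assumptions.
SET infinidb_use_import_for_batchinsert = 0;  -- disable the cpimport batch path

CREATE TABLE text2 (c1 INT, c2 TEXT) ENGINE=Columnstore;
CREATE TABLE text1 (c1 INT, c2 TEXT) ENGINE=Columnstore;

INSERT INTO text2 VALUES (1, REPEAT('x', 10000));  -- one long source row

-- First insert..select into the empty target: length(c2) matches the source.
INSERT INTO text1 SELECT * FROM text2 WHERE text2.c1 = 1;
SELECT c1, LENGTH(c2) FROM text1;

-- Repeating the same statement: per the report, the newly inserted row
-- comes back shorter from the second execution onward.
INSERT INTO text1 SELECT * FROM text2 WHERE text2.c1 = 1;
SELECT c1, LENGTH(c2) FROM text1;
```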
| Comments |
| Comment by Daniel Lee (Inactive) [ 2019-04-24 ] |
|
Build tested: 1.2.4-1 (nightly)

ColumnStore: DML update also has the same issue for LONGBLOB and LONGTEXT.

(Result tables showing tableName and length(c2), and the MariaDB Server reference output from [root@localhost datatypesTextBlob]# cat longtext.n.update.column.sql.ref.log, are elided.) |
| Comment by David Hall (Inactive) [ 2020-10-29 ] |
|
CREATE TABLE `text1` (`c1` int(11) DEFAULT NULL, `c2` text(65535) DEFAULT NULL) ENGINE=Columnstore DEFAULT CHARSET=latin1; |
| Comment by David Hall (Inactive) [ 2020-10-29 ] |
|
The problem seems to occur only if a query is made before the insert; it never occurs on the first insert into an empty table.
(One statement sequence will work, but another will break; the example statements are elided.) |
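The order dependence described above might be sketched as follows. These sequences are hypothetical reconstructions, assuming the text1/text2 tables from the description, not the original attached examples.

```sql
-- Works: insert..select into a freshly truncated (empty) table,
-- with no query against the target table beforehand.
TRUNCATE TABLE text1;
INSERT INTO text1 SELECT * FROM text2;

-- Breaks: querying the target table first, then running the same
-- insert..select, is the pattern reported to produce a truncated row.
TRUNCATE TABLE text1;
SELECT c1, LENGTH(c2) FROM text1;          -- prior read populates the cache
INSERT INTO text1 SELECT * FROM text2;
SELECT c1, LENGTH(c2) FROM text1;          -- inserted row may come back shorter
```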
| Comment by Jose Rojas (Inactive) [ 2020-12-02 ] |
|
After investigating, I discovered that the first block the dictionary WriteEngineServer writes to during the insert/select is not found when trying to retrieve the newly inserted value(s). Restarting DMLProc after the insert/select would fix the issue, due to a PrimProc cache flush. This fix flushes the dict OIDs from the PrimProc cache after the batch insert, which allows retrieving the entire text/blob without having to restart DMLProc. |
| Comment by Jose Rojas (Inactive) [ 2020-12-02 ] |
|
For QA: To test the fix, take the attached file, rerun the queries from the description, and observe that there are no longer issues retrieving columns with missing data. The fix only adds a call to flush the dict OIDs from the cache, so running the queries from the description is a sufficient test. |
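Such a verification pass could be sketched as below. This is an illustrative check, assuming the text1/text2 tables from the description; the join on c1 is an assumption for comparing inserted rows against their source.

```sql
-- Repeat the insert..select several times (the failure mode appeared
-- from the second execution onward).
INSERT INTO text1 SELECT * FROM text2 WHERE text2.c1 = 1;
INSERT INTO text1 SELECT * FROM text2 WHERE text2.c1 = 1;
INSERT INTO text1 SELECT * FROM text2 WHERE text2.c1 = 1;

-- With the fix, every inserted row should report the same length
-- as its source row.
SELECT t1.c1,
       LENGTH(t1.c2) AS inserted_len,
       LENGTH(t2.c2) AS source_len
FROM text1 t1
JOIN text2 t2 ON t1.c1 = t2.c1;
```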
| Comment by Roman [ 2020-12-03 ] |
|
I'm very sorry, but we need to investigate this case a bit more, because it doesn't look right to simply flush the cache for all of the dict OIDs involved. |
| Comment by Roman [ 2020-12-03 ] |
|
There is another way. Please prove somehow that the approach used doesn't increase I/O or affect the performance of read queries running concurrently with the ingestion. |
| Comment by David Hall (Inactive) [ 2020-12-09 ] |
|
columnstore-5.5.1-1 has been built with the modified cache-flush algorithm. I'm moving this to 5.6 for further optimizations. |
| Comment by David Hall (Inactive) [ 2021-04-27 ] |
|
|
| Comment by Daniel Lee (Inactive) [ 2021-04-29 ] |
|
Build 5.6.1 (Drone #2270): the mentioned test case no longer shows the reported issue. |