[MCOL-4276] LOAD DATA INFILE DOES NOT WORK - TABLE DATA IS MUNGED Created: 2020-08-25  Updated: 2023-07-02  Resolved: 2023-07-02

Status: Closed
Project: MariaDB ColumnStore
Component/s: N/A
Affects Version/s: None
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Justin Swanhart Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 1
Labels: None


 Description   

> Create a text file with the following values:
> 1,1,221401,620758,24819,"19960102","5-LOW",0,17,2853824,19179206,4,2739671,100723,2,"19960212","TRUCK"
> 1,2,221401,269237,4888,"19960102","5-LOW",0,36,4342392,19179206,9,3951576,72373,6,"19960228","MAIL"
> 1,3,221401,254800,2126,"19960102","5-LOW",0,8,1403832,19179206,10,1263448,105287,2,"19960305","REG AIR"
> 1,4,221401,8526,28263,"19960102","5-LOW",0,28,4016656,19179206,9,3655156,86071,6,"19960330","AIR"
> 1,5,221401,96107,48736,"19960102","5-LOW",0,24,2647440,19179206,10,2382696,66186,4,"19960314","FOB"
> 1,6,221401,62538,41016,"19960102","5-LOW",0,32,4801696,19179206,7,4465577,90031,2,"19960207","MAIL"
> 2,1,468010,424679,31970,"19961201","1-URGENT",0,38,6093870,6398563,0,6093870,96219,5,"19970114","RAIL"
> 3,1,739885,17188,58754,"19931014","5-LOW",0,45,4973310,22250151,6,4674911,66310,0,"19940104","AIR"
> 3,2,739885,76142,49996,"19931014","5-LOW",0,49,5478886,22250151,10,4930997,67088,0,"19931220","RAIL"
> 3,3,739885,513793,42270,"19931014","5-LOW",0,27,4878279,22250151,6,4585582,108406,7,"19931122","SHIP"
> Create the following table:
> CREATE TABLE `lineorder` (
> `LO_OrderKey` bigint(20) NOT NULL,
> `LO_LineNumber` tinyint(4) NOT NULL,
> `LO_CustKey` int(11) NOT NULL,
> `LO_PartKey` int(11) NOT NULL,
> `LO_SuppKey` int(11) NOT NULL,
> `LO_OrderDateKey` int(11) NOT NULL,
> `LO_OrderPriority` varchar(15) DEFAULT NULL,
> `LO_ShipPriority` char(1) DEFAULT NULL,
> `LO_Quantity` tinyint(4) DEFAULT NULL,
> `LO_ExtendedPrice` int(11) DEFAULT NULL,
> `LO_OrdTotalPrice` int(11) DEFAULT NULL,
> `LO_Discount` int(11) DEFAULT NULL,
> `LO_Revenue` int(11) DEFAULT NULL,
> `LO_SupplyCost` int(11) DEFAULT NULL,
> `LO_Tax` tinyint(4) DEFAULT NULL,
> `LO_CommitDateKey` int(11) NOT NULL,
> `LO_ShipMode` varchar(10) DEFAULT NULL
> ) ENGINE=Columnstore DEFAULT CHARSET=utf8mb4;
> LOAD DATA LOCAL INFILE '/path/to/file' INTO TABLE lineorder FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '""
> The LOAD DATA LOCAL INFILE returned warnings:
> udo mariadb test2
> Query OK, 179998372 rows affected, 65535 warnings (13 min 10.477 sec)
> Records: 179998372 Deleted: 0 Skipped: 0 Warnings: 1407119
> MariaDB [test2]> show warnings;
> -----------------------------
> | Level | Code | Message |
> -----------------------------
> | Warning | 9999 | Values saturated |
> ...
> The data is completely munged:
> MariaDB [test2]> select * from lineorder where LO_OrderKey =1 and LO_LineNumber =1 \G
> *************************** 1. row ***************************
> LO_OrderKey: 1
> LO_LineNumber: 1
> LO_CustKey: 221401
> LO_PartKey: 620758
> LO_SuppKey: 24819
> LO_OrderDateKey: 19960102
> LO_OrderPriority: 5-LOW
> LO_ShipPriority: NULL
> LO_Quantity: 0
> LO_ExtendedPrice: 0
> LO_OrdTotalPrice: 0
> LO_Discount: 805306368
> LO_Revenue: 287318048
> LO_SupplyCost: 2853824
> LO_Tax: -58
> LO_CommitDateKey: 67183782
> LO_ShipMode: NULL
> 1 row in set (0.134 sec)
> The data file contains no NULL values, nor does it have any negative values (see LO_Tax and LO_ShipMode). Everything past the "5-LOW" field appears to be munged.



 Comments   
Comment by Justin Swanhart [ 2020-08-25 ]

I tested LOAD DATA INFILE both with and without enclosures, and it munges the data EVERY TIME. cpimport loaded the data properly. The file is formatted properly. The file can be loaded on InnoDB.

Do not delete this bug. LOAD DATA INFILE does not work. Tested on Ubuntu 18 LTE.

Comment by Valerii Kravchuk [ 2020-08-25 ]

See https://jira.mariadb.org/browse/MCOL-4103 and try not to use utf8mb4 in this context.

Comment by Justin Swanhart [ 2020-08-25 ]

Note that the database is in STRICT mode, and it loads the data and leaves it in a bad state, it should have rolled back the LOAD DATA INFILE.

Everything after LO_OrderPriority is munged. There are no NULL values in the file, there are no negative values in the file. It is the SSB lineorder table created by the ssb data generator. It should not have committed the data with saturated values, and it should not have written any "saturated" values, since there are not bad values in the table. It loads perfectly into InnoDB, but can not be loaded into columnstore.

Comment by Justin Swanhart [ 2020-08-25 ]

Valerii: it works from cpimport though - and none of the columns have anything but latin1 characters. And the data loads into InnoDB.

Comment by Todd Stoffel (Inactive) [ 2023-07-02 ]

Tested in Community Server 11.1.1 with ColumnStore 23.02.3 and this is no longer a problem.

Generated at Thu Feb 08 02:49:08 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.