[MDEV-29548] LOAD DATA INFILE doesn't read NULL values accurately Created: 2022-09-15  Updated: 2023-10-31  Resolved: 2023-10-31

Status: Closed
Project: MariaDB Server
Component/s: Documentation
Affects Version/s: 10.9.2, 10.3, 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Duong Thien Ly Assignee: Ian Gilfillan
Resolution: Not a Bug Votes: 0
Labels: None


 Description   

Steps to reproduce.

SET SESSION sql_mode = "NO_BACKSLASH_ESCAPES";
SELECT NULL, 0, '' INTO OUTFILE 'data-server.csv';
CREATE
OR REPLACE TABLE loading (col1 VARCHAR(255) NULL, col2 INT, col3 VARCHAR(255));
 
LOAD DATA LOCAL INFILE 'data-server.csv' INTO TABLE loading;
 
SELECT
   *
FROM
   loading
WHERE
   col1 IS NULL;

It reads NULL as a string.



 Comments   
Comment by Alice Sherepa [ 2022-09-19 ]

Thank you for the report!I repeated as described on 10.3-10.10.

CREATE TABLE t1 (col1 VARCHAR(255));
 
SELECT NULL INTO OUTFILE 't1.data';
LOAD DATA INFILE 't1.data' INTO TABLE t1;
 
SELECT col1, col1 is NULL FROM t1;
 
SET SQL_MODE= 'NO_BACKSLASH_ESCAPES';
 
SELECT NULL INTO OUTFILE 't2.data';
LOAD DATA INFILE 't2.data' INTO TABLE t1; 
SELECT col1, col1 is NULL FROM t1;
 
# Cleanup
--let $datadir= `SELECT @@datadir`
--remove_file $datadir/test/t1.data
--remove_file $datadir/test/t2.data
DROP TABLE t1;

SELECT col1, col1 is NULL FROM t1;
col1	col1 is NULL
NULL	1
NULL	0

Comment by Sergei Golubchik [ 2022-09-19 ]

This doesn't seem to be a bug, but a very insufficient documentation related to how FIELDS ... and LINES ... clauses work and interact. Compare with https://dev.mysql.com/doc/refman/8.0/en/load-data.html — it says that

If you specify no FIELDS or LINES clause, the defaults are the same as if you had written this:

FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
LINES TERMINATED BY '\n' STARTING BY ''

...

  • For the default FIELDS and LINES values, NULL is written as a field value of \N for output, and a field value of \N is read as NULL for input (assuming that the ESCAPED BY character is \ ).
  • If FIELDS ENCLOSED BY is not empty, a field containing the literal word NULL as its value is read as a NULL value. This differs from the word NULL enclosed within FIELDS ENCLOSED BY characters, which is read as the string 'NULL'.
  • If FIELDS ESCAPED BY is empty, NULL is written as the word NULL.

In your case the ENCLOSED BY is empty, and ESCAPED BY is empty too, because of NO_BACKSLASH_ESCAPES. As a result, both NULL value and a 'NULL' string literal are written a literal word NULL, which is ambiguous. LOAD DATA prefers to interpret it as a string literal 'NULL'.

Comment by Michael Widenius [ 2023-10-31 ]

Closed based on Serg's last comment

Generated at Thu Feb 08 10:09:27 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.