[MDEV-17254] Load failing on json data Created: 2018-09-20  Updated: 2018-10-25  Resolved: 2018-10-25

Status: Closed
Project: MariaDB Server
Component/s: JSON
Affects Version/s: 10.3.9
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Ramesh Assignee: Unassigned
Resolution: Incomplete Votes: 0
Labels: need_feedback
Environment:

RedHat 7.2



 Description   

We we try to load data from json file the escape characters dont get padded and the load fails.

DROP TABLE IF EXISTS mdb.dn;
 
CREATE TABLE dn
(
  id                        INTEGER     NOT NULL PRIMARY KEY AUTO_INCREMENT,
  jsondata                  json,
check (json_valid(jsondata))
)
 
ALTER TABLE dn ADD   an                         VARCHAR(100) AS (JSON_VALUE (jsondata, '$.an'));
ALTER TABLE dn ADD   modification_datetime      BIGINT     AS (JSON_VALUE (jsondata, '$.modification_datetime'));
ALTER TABLE dn ADD   body  LONGTEXT AS (JSON_VALUE (jsondata, '$.body'));
 
CREATE UNIQUE INDEX dn_an_modification_datetime_pk ON dn (an, modification_datetime);

Sample row:

{"an" : "XXXXEDX", "modification_datetime" : 1535027267000, "body" : " This is a test \" I think it does not work\n However when we add an extra ba
ckslash escape, \n Then it works\n"}

load data local infile '/usr/local/mysql/scripts/dn.json' into table dn(jsondata);
 
 load data local infile '/usr/local/mysql/scripts/dn.json' into table dn(jsondata);
Query OK, 0 rows affected, 1 warning (0.000 sec)     
Records: 0  Deleted: 0  Skipped: 0  Warnings: 1
 
MariaDB[mdb]> show warnings;
+---------+------+-----------------------------------------------------+
| Level   | Code | Message                                             |
+---------+------+-----------------------------------------------------+
| Warning | 4025 | CONSTRAINT `CONSTRAINT_1` failed for `mdb`.`dn` |
+---------+------+-----------------------------------------------------+
1 row in set (0.000 sec)

{"an" : "XXXXEDX", "modification_datetime" : 1535027267000, "body" : " This is a test \\" I think it does not work\\n However when we add an extra ba
ckslash escape, \\n Then it works\\n"}

MariaDB [mdb]> load data local infile '/usr/local/mysql/scripts/dn.json' into table dn(jsondata);
Query OK, 1 row affected (0.001 sec)                 
Records: 1  Deleted: 0  Skipped: 0  Warnings: 0
 
MariaDB [mdb]> select * from dn;
+----+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+---------+-----------------------+
| id | jsondata                                                                                                                                                                                | body                                                                                                        | an      | modification_datetime |
+----+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+---------+-----------------------+
|  1 | {"an" : "XXXXEDX", "modification_datetime" : 1535027267000, "body" : " This is a test \" I think it does not work\n However when we add an extra backslash escape, \n Then it works\n"} |  This is a test " I think it does not work
 However when we add an extra backslash escape, 
 Then it works
 | XXXXEDX |         1535027267000 |
+----+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------+---------+-----------------------+
1 row in set (0.001 sec)

Not sure how we can load the data as it come from the source like shown on the first sample line.}}



 Comments   
Comment by Alice Sherepa [ 2018-09-26 ]

You should use ESCAPED BY '' in this case. (https://mariadb.com/kb/en/library/load-data-infile/)

| dn    | CREATE TABLE `dn` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `jsondata` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL,
  `an` varchar(100) GENERATED ALWAYS AS (json_value(`jsondata`,'$.an')) VIRTUAL,
  `modification_datetime` bigint(20) GENERATED ALWAYS AS (json_value(`jsondata`,'$.modification_datetime')) VIRTUAL,
  `body` longtext GENERATED ALWAYS AS (json_value(`jsondata`,'$.body')) VIRTUAL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `dn_an_modification_datetime_pk` (`an`,`modification_datetime`),
  CONSTRAINT `CONSTRAINT_1` CHECK (json_valid(`jsondata`))
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
1 row in set (0.000 sec)
 
MariaDB [test]> LOAD DATA LOCAL INFILE '~/1.json' INTO TABLE dn
    -> FIELDS ESCAPED BY ''
    -> LINES TERMINATED BY '\n'
    -> (jsondata);
Query OK, 3 rows affected (0.001 sec)                
Records: 3  Deleted: 0  Skipped: 0  Warnings: 0
 
MariaDB [test]> select * from dn;
+----+--------------------+------+-----------------------+--------+
| id | jsondata           | an   | modification_datetime | body   |
+----+--------------------+------+-----------------------+--------+
|  1 | {"body":"a\a"}     | NULL |                  NULL | aa     |
|  2 | {"body":"bb'\\bb"} | NULL |                  NULL | bb'\bb |
|  3 | {"body":"cd\nd"}   | NULL |                  NULL | cd
d   |
+----+--------------------+------+-----------------------+--------+
3 rows in set (0.001 sec)
 
MariaDB [test]> \! cat ~/1.json
{"body":"a\a"}
{"body":"bb'\\bb"}
{"body":"cd\nd"}

And maybe in the column 'body' it should be JSON_EXTRACT function instead of JSON_Value.

MariaDB [test]> SELECT json_extract(jsondata,'$.body') from dn;
+---------------------------------+
| json_extract(jsondata,'$.body') |
+---------------------------------+
| "a\a"                           |
| "bb'\\bb"                       |
| "cd\nd"                         |
+---------------------------------+
3 rows in set (0.001 sec)

Generated at Thu Feb 08 08:35:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.