Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-40154

Bulk load cause inconsistent output

    XMLWordPrintable

Details

    Description

      Create the CSV file

      cat /tmp/test_2col.csv
      AAA,BBB,value1
      AAA ,BBB,value2
      AAA ,BBB,value2
      AAA ,BBB  ,value2
      AAA ,BBB ,value2
      ZZZ,BBB,value3
      

      CREATE TABLE `test_pk_2col` (
        `col1` varchar(50) NOT NULL,
        `col2` varchar(50) NOT NULL,
        `value` varchar(50) DEFAULT NULL,
        PRIMARY KEY (`col1`,`col2`)
      ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
       
      SET unique_checks=0; set foreign_key_checks=0;
       
      LOAD DATA LOCAL INFILE '/tmp/test_2col.csv' IGNORE INTO TABLE test_pk_2col FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' (col1,col2,value);
      

      LOAD DATA LOCAL INFILE '/tmp/test_2col.csv' IGNORE INTO TABLE test_pk_2col FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n' (col1,col2,value);
      Query OK, 4 rows affected, 2 warnings (0.007 sec)    
      Records: 6  Deleted: 0  Skipped: 2  Warnings: 2
       
       
       select * from test_pk_2col;
      +------+------+--------+
      | col1 | col2 | value  |
      +------+------+--------+
      | AAA  | BBB  | value2 |
      | ZZZ  | BBB  | value3 |
      +------+------+--------+
      

      Expected: `Records:6 *Skipped:4 * COUNT(1):2`
      Actual: `Records:6 Skipped:2 COUNT(1):2`
      → 2 duplicate rows silently vanished from Skipped counter with no trace.

      When we set SET unique_checks=0; set foreign_key_checks=1;
      It works as expected Records:6 Skipped:4 COUNT(1):2

      When the Pk has more than 10 columns, like below example:

      Create a CSV and below table structure for reproduce the issue:

      cat /tmp/test_prod.csv
      AAA,BBB,CCC,DDD,EEE,FFF,GGG,HHH,III,JJJ,KKK,LLL,MMM,value1
      AAA ,BBB,CCC,DDD,EEE,FFF,GGG,HHH,III,JJJ,KKK,LLL,MMM,value2
      AAA,BBB,CCC ,DDD,EEE,FFF,GGG,HHH,III,JJJ,KKK,LLL,MMM,value3
      AAA ,BBB ,CCC,DDD,EEE,FFF,GGG,HHH,III,JJJ,KKK,LLL,MMM,value4
      ZZZ,BBB,CCC,DDD,EEE,FFF,GGG,HHH,III,JJJ,KKK,LLL,MMM,value5
      

      CREATE TABLE test_pk (
          col1  VARCHAR(50) NOT NULL,
          col2  VARCHAR(50) NOT NULL,
          col3  VARCHAR(50) NOT NULL,
          col4  VARCHAR(50) NOT NULL,
          col5  VARCHAR(50) NOT NULL,
          col6  VARCHAR(50) NOT NULL,
          col7  VARCHAR(50) NOT NULL,
          col8  VARCHAR(50) NOT NULL,
          col9  VARCHAR(50) NOT NULL,
          col10 VARCHAR(50) NOT NULL,
          col11 VARCHAR(50) NOT NULL,
          col12 VARCHAR(50) NOT NULL,
          col13 VARCHAR(50) NOT NULL,
          value VARCHAR(50),
          PRIMARY KEY (col1,col2,col3,col4,col5,
                       col6,col7,col8,col9,col10,
                       col11,col12,col13)
      ) ENGINE=InnoDB
        CHARACTER SET utf8mb4
        COLLATE utf8mb4_bin;
       
      SET unique_checks=0; set foreign_key_checks=0;
      LOAD DATA LOCAL INFILE '/tmp/test_prod.csv' IGNORE
      INTO TABLE test_pk FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
      (col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,value);
       
      show warnings;
       
      SELECT COUNT(*) FROM test_pk;
      
      

      The output:

      SET unique_checks=0; set foreign_key_checks=0;
      Query OK, 0 rows affected (0.001 sec)
       
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [duplicatetest]> LOAD DATA LOCAL INFILE '/tmp/test_prod.csv' IGNORE
          -> INTO TABLE test_pk FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
          -> (col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,value);
      Query OK, 3 rows affected, 2 warnings (0.008 sec)    
      Records: 5  Deleted: 0  Skipped: 2  Warnings: 2
       
      MariaDB [duplicatetest]> show warnings;
      +---------+------+-------------------------------------------------------------------------------------------+
      | Level   | Code | Message                                                                                   |
      +---------+------+-------------------------------------------------------------------------------------------+
      | Warning | 1062 | Duplicate entry 'AAA -BBB-CCC-DDD-EEE-FFF-GGG-HHH-III-JJJ-KKK-LLL-MMM' for key 'PRIMARY'  |
      | Warning | 1062 | Duplicate entry 'AAA-BBB-CCC -DDD-EEE-FFF-GGG-HHH-III-JJJ-KKK-LLL-MMM' for key 'PRIMARY'  |
      | Warning | 1062 | Duplicate entry 'AAA -BBB -CCC-DDD-EEE-FFF-GGG-HHH-III-JJJ-KKK-LLL-MMM' for key 'PRIMARY' |
      +---------+------+-------------------------------------------------------------------------------------------+
      3 rows in set (0.000 sec)
       
       
      MariaDB [duplicatetest]> select * from test_pk;
      +------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+--------+
      | col1 | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 | col10 | col11 | col12 | col13 | value  |
      +------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+--------+
      | ZZZ  | BBB  | CCC  | DDD  | EEE  | FFF  | GGG  | HHH  | III  | JJJ   | KKK   | LLL   | MMM   | value5 |
      +------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+--------+
      1 row in set (0.000 sec)
      
      

      Expected: Query OK, 2 rows affected.
      Actual result: Query OK, 3 rows affected

      It has only one row available in the table; it silently skips another row that has multiple duplicates.

      when we set SET unique_checks=0; set foreign_key_checks=1;, the load data works as expected.

      select * from test_pk;
      +------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+--------+
      | col1 | col2 | col3 | col4 | col5 | col6 | col7 | col8 | col9 | col10 | col11 | col12 | col13 | value  |
      +------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+--------+
      | AAA  | BBB  | CCC  | DDD  | EEE  | FFF  | GGG  | HHH  | III  | JJJ   | KKK   | LLL   | MMM   | value1 |
      | ZZZ  | BBB  | CCC  | DDD  | EEE  | FFF  | GGG  | HHH  | III  | JJJ   | KKK   | LLL   | MMM   | value5 |
      +------+------+------+------+------+------+------+------+------+-------+-------+-------+-------+--------+
      

      The issue was identified on the 11.8 version

      select @@Version,@@sql_mode;
      +---------------------------------+-------------------------------------------------------------------------------------------+
      | @@Version                       | @@sql_mode                                                                                |
      +---------------------------------+-------------------------------------------------------------------------------------------+
      | 11.8.6-4-MariaDB-enterprise-log | STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION |
      +---------------------------------+-------------------------------------------------------------------------------------------+
      

      Attachments

        Issue Links

          Activity

            People

              serg Sergei Golubchik
              pandi.gurusamy Pandikrishnan Gurusamy
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Time Tracking

                  Estimated:
                  Original Estimate - 0d
                  0d
                  Remaining:
                  Remaining Estimate - 0d
                  0d
                  Logged:
                  Time Spent - 1.5h
                  1.5h

                  Git Integration

                    Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.