Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34883

LOAD DATA INFILE with geometry data fails

Details

    Description

      "LOAD DATA INFILE" fails, for binary (or, geometry) data , for the file created with SELECT * INTO OUTFILE

      I'm attaching example data where it can be demonstrated.

      To reproduce, extract the dump file and load it

      mariadb -uroot test < /mnt/e/airport_geo.mysqldump.sql
      mariadb -uroot test -e "SELECT * FROM airport_geo INTO OUTFILE '/mnt/e/airport_geo.tsv'"
      mariadb -uroot test -e "TRUNCATE airport_geo;  LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo"
      

      The last command fails with

      --------------
      LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo
      --------------
       
      ERROR 1263 (22004) at line 1: Column set to default value; NULL supplied to NOT NULL column 'geolocation' at row 42
      

      this is the DDL for the create table, column "geolocation" is a POINT

      CREATE TABLE `airport_geo` (
        `airport_id` smallint(6) NOT NULL,
        `name` varchar(50) NOT NULL,
        `city` varchar(50) DEFAULT NULL,
        `country` varchar(50) DEFAULT NULL,
        `latitude` decimal(11,8) NOT NULL,
        `longitude` decimal(11,8) NOT NULL,
        `geolocation` point NOT NULL,
        KEY `flughafen_idx` (`airport_id`),
        SPATIAL KEY `geolokation_spt` (`geolocation`)
      ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
      

      "git bisect" points to MDEV-19123 as first bad commit

      36eba98817339f53cc57f1d4884ace3efb38db8d is the first bad commit
      commit 36eba98817339f53cc57f1d4884ace3efb38db8d
      Date:   Tue May 28 09:08:51 2024 +0400
       
          MDEV-19123 Change default charset from latin1 to utf8mb4
       
          Changing the default server character set from latin1 to utf8mb4.
      

      Attachments

        Issue Links

          Activity

            bar Alexander Barkov added a comment - - edited

            A workaround is:

            mariadb -uroot test -e "TRUNCATE airport_geo;  LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo CHARACTER SET latin1"
            

            or better:

            mariadb -uroot test -e "TRUNCATE airport_geo;  LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo CHARACTER SET binary"
            

            This command also worked for me:

            mariadb -uroot test -e "TRUNCATE airport_geo;  SET character_set_database=latin1; LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo"
            

            bar Alexander Barkov added a comment - - edited A workaround is: mariadb -uroot test -e "TRUNCATE airport_geo; LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo CHARACTER SET latin1" or better: mariadb -uroot test -e "TRUNCATE airport_geo; LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo CHARACTER SET binary" This command also worked for me: mariadb -uroot test -e "TRUNCATE airport_geo; SET character_set_database=latin1; LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo"
            bar Alexander Barkov added a comment - - edited

            The safest way would be to dump binary data using some encoding: hex or base64.

            wlad, what was the exact command line with the mysqldump command?

            bar Alexander Barkov added a comment - - edited The safest way would be to dump binary data using some encoding: hex or base64. wlad , what was the exact command line with the mysqldump command?

            A smaller script demonstating the problem:

            CREATE OR REPLACE TABLE `airport_geo` (
              `airport_id` smallint(6) NOT NULL,
              `name` varchar(50) NOT NULL,
              `city` varchar(50) DEFAULT NULL,
              `country` varchar(50) DEFAULT NULL,
              `latitude` decimal(11,8) NOT NULL,
              `longitude` decimal(11,8) NOT NULL,
              `geolocation` point NOT NULL,
              KEY `flughafen_idx` (`airport_id`),
              SPATIAL KEY `geolokation_spt` (`geolocation`)
            ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
             
            INSERT INTO `airport_geo` VALUES (53,'ACCOMACK CO','MELFA','UNITED STATES',37.64694400,-75.76111100,GeomFromText('POINT(37.646944 -75.761111)'));
            SELECT * FROM airport_geo INTO OUTFILE 'airport_geo.tsv';
             
            CREATE OR REPLACE TABLE airport_geo2 LIKE airport_geo;
            LOAD DATA INFILE 'airport_geo.tsv' INTO TABLE airport_geo2;
            

            ERROR 1263 (22004): Column set to default value; NULL supplied to NOT NULL column 'geolocation' at row 2
            

            bar Alexander Barkov added a comment - A smaller script demonstating the problem: CREATE OR REPLACE TABLE `airport_geo` ( `airport_id` smallint (6) NOT NULL , ` name ` varchar (50) NOT NULL , `city` varchar (50) DEFAULT NULL , `country` varchar (50) DEFAULT NULL , `latitude` decimal (11,8) NOT NULL , `longitude` decimal (11,8) NOT NULL , `geolocation` point NOT NULL , KEY `flughafen_idx` (`airport_id`), SPATIAL KEY `geolokation_spt` (`geolocation`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE =utf8mb4_unicode_ci;   INSERT INTO `airport_geo` VALUES (53, 'ACCOMACK CO' , 'MELFA' , 'UNITED STATES' ,37.64694400,-75.76111100,GeomFromText( 'POINT(37.646944 -75.761111)' )); SELECT * FROM airport_geo INTO OUTFILE 'airport_geo.tsv' ;   CREATE OR REPLACE TABLE airport_geo2 LIKE airport_geo; LOAD DATA INFILE 'airport_geo.tsv' INTO TABLE airport_geo2; ERROR 1263 (22004): Column set to default value; NULL supplied to NOT NULL column 'geolocation' at row 2
            wlad Vladislav Vaintroub added a comment - - edited

            The exact mysqldump command line was, I think

            mysqldump -uroot flughafendb_large --tables airport_geo --hex-blob ...
            

            The important part was --hex-blob

            The original dataset is from https://github.com/stefanproell/flughafendb, and the my intention was trying something large-ish and more "real-life" for "mysqldump --dir / mariadb-import --dir" combo, which both do SELECT * into outfile rsp LOAD DATA INFILE for every table in the database internally.

            wlad Vladislav Vaintroub added a comment - - edited The exact mysqldump command line was, I think mysqldump -uroot flughafendb_large --tables airport_geo --hex-blob ... The important part was --hex-blob The original dataset is from https://github.com/stefanproell/flughafendb , and the my intention was trying something large-ish and more "real-life" for "mysqldump --dir / mariadb-import --dir" combo, which both do SELECT * into outfile rsp LOAD DATA INFILE for every table in the database internally.
            bar Alexander Barkov added a comment - - edited

            Even smaller script:

            CREATE OR REPLACE TABLE t1 (
              p point NOT NULL
            ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
            INSERT INTO t1 VALUES (GeomFromText('POINT(37.646944 -75.761111)'));
            SELECT * FROM t1 INTO OUTFILE 't1.tsv';
             
            CREATE OR REPLACE TABLE t2 LIKE t1;
             
            LOAD DATA INFILE 't1.tsv' INTO TABLE t2;
            

            ERROR 1416 (22003): Cannot get geometry object from data you send to the GEOMETRY field
            

            bar Alexander Barkov added a comment - - edited Even smaller script: CREATE OR REPLACE TABLE t1 ( p point NOT NULL ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE =utf8mb4_unicode_ci; INSERT INTO t1 VALUES (GeomFromText( 'POINT(37.646944 -75.761111)' )); SELECT * FROM t1 INTO OUTFILE 't1.tsv' ;   CREATE OR REPLACE TABLE t2 LIKE t1;   LOAD DATA INFILE 't1.tsv' INTO TABLE t2; ERROR 1416 (22003): Cannot get geometry object from data you send to the GEOMETRY field

            The problem is repeatable with 10.5+ with this script:

            CREATE DATABASE db1 CHARACTER SET utf8mb4;
            USE db1;
             
            CREATE OR REPLACE TABLE t1 (
              p point NOT NULL
            ) ENGINE=MyISAM;
            INSERT INTO t1 VALUES (GeomFromText('POINT(37.646944 -75.761111)'));
            SELECT * FROM t1 INTO OUTFILE 't1.tsv';
             
            CREATE OR REPLACE TABLE t2 LIKE t1;
             
            LOAD DATA INFILE 't1.tsv' INTO TABLE t2;
            

            bar Alexander Barkov added a comment - The problem is repeatable with 10.5+ with this script: CREATE DATABASE db1 CHARACTER SET utf8mb4; USE db1;   CREATE OR REPLACE TABLE t1 ( p point NOT NULL ) ENGINE=MyISAM; INSERT INTO t1 VALUES (GeomFromText( 'POINT(37.646944 -75.761111)' )); SELECT * FROM t1 INTO OUTFILE 't1.tsv' ;   CREATE OR REPLACE TABLE t2 LIKE t1;   LOAD DATA INFILE 't1.tsv' INTO TABLE t2;

            commit e8b1507dcb7300ba84a07fdb3a04ed4b65b0b566 (HEAD -> bb-10.5-MDEV-34883, origin/bb-10.5-MDEV-34883)
            Author: Oleksandr Byelkin <sanja@mariadb.com>
            Date:   Wed Oct 16 13:28:20 2024 +0200
             
                MDEV-34883 LOAD DATA INFILE with geometry data fails
                
                We rite field using field data charset, so we should read it
                using the field charset.
            

            sanja Oleksandr Byelkin added a comment - commit e8b1507dcb7300ba84a07fdb3a04ed4b65b0b566 (HEAD -> bb-10.5-MDEV-34883, origin/bb-10.5-MDEV-34883) Author: Oleksandr Byelkin <sanja@mariadb.com> Date: Wed Oct 16 13:28:20 2024 +0200   MDEV-34883 LOAD DATA INFILE with geometry data fails We rite field using field data charset, so we should read it using the field charset.

            commit c492d7fcc8b94720fc7b62402c39cc66f8892b50 (HEAD -> bb-10.5-MDEV-34883, origin/bb-10.5-MDEV-34883)
            Author: Oleksandr Byelkin <sanja@mariadb.com>
            Date:   Wed Oct 16 13:28:20 2024 +0200
             
                MDEV-34883 LOAD DATA INFILE with geometry data fails
                
                We rite field using field data charset, so we should read it
                using the field charset.
            

            sanja Oleksandr Byelkin added a comment - commit c492d7fcc8b94720fc7b62402c39cc66f8892b50 (HEAD -> bb-10.5-MDEV-34883, origin/bb-10.5-MDEV-34883) Author: Oleksandr Byelkin <sanja@mariadb.com> Date: Wed Oct 16 13:28:20 2024 +0200   MDEV-34883 LOAD DATA INFILE with geometry data fails We rite field using field data charset, so we should read it using the field charset.

            People

              bar Alexander Barkov
              wlad Vladislav Vaintroub
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.