Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34883

LOAD DATA INFILE with geometry data fails

Details

    Description

      "LOAD DATA INFILE" fails, for binary (or, geometry) data , for the file created with SELECT * INTO OUTFILE

      I'm attaching example data where it can be demonstrated.

      To reproduce, extract the dump file and load it

      mariadb -uroot test < /mnt/e/airport_geo.mysqldump.sql
      mariadb -uroot test -e "SELECT * FROM airport_geo INTO OUTFILE '/mnt/e/airport_geo.tsv'"
      mariadb -uroot test -e "TRUNCATE airport_geo;  LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo"
      

      The last command fails with

      --------------
      LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo
      --------------
       
      ERROR 1263 (22004) at line 1: Column set to default value; NULL supplied to NOT NULL column 'geolocation' at row 42
      

      this is the DDL for the create table, column "geolocation" is a POINT

      CREATE TABLE `airport_geo` (
        `airport_id` smallint(6) NOT NULL,
        `name` varchar(50) NOT NULL,
        `city` varchar(50) DEFAULT NULL,
        `country` varchar(50) DEFAULT NULL,
        `latitude` decimal(11,8) NOT NULL,
        `longitude` decimal(11,8) NOT NULL,
        `geolocation` point NOT NULL,
        KEY `flughafen_idx` (`airport_id`),
        SPATIAL KEY `geolokation_spt` (`geolocation`)
      ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
      

      "git bisect" points to MDEV-19123 as first bad commit

      36eba98817339f53cc57f1d4884ace3efb38db8d is the first bad commit
      commit 36eba98817339f53cc57f1d4884ace3efb38db8d
      Date:   Tue May 28 09:08:51 2024 +0400
       
          MDEV-19123 Change default charset from latin1 to utf8mb4
       
          Changing the default server character set from latin1 to utf8mb4.
      

      Attachments

        Issue Links

          Activity

            commit c492d7fcc8b94720fc7b62402c39cc66f8892b50 (HEAD -> bb-10.5-MDEV-34883, origin/bb-10.5-MDEV-34883)
            Author: Oleksandr Byelkin <sanja@mariadb.com>
            Date:   Wed Oct 16 13:28:20 2024 +0200
             
                MDEV-34883 LOAD DATA INFILE with geometry data fails
                
                We rite field using field data charset, so we should read it
                using the field charset.
            

            sanja Oleksandr Byelkin added a comment - commit c492d7fcc8b94720fc7b62402c39cc66f8892b50 (HEAD -> bb-10.5-MDEV-34883, origin/bb-10.5-MDEV-34883) Author: Oleksandr Byelkin <sanja@mariadb.com> Date: Wed Oct 16 13:28:20 2024 +0200   MDEV-34883 LOAD DATA INFILE with geometry data fails We rite field using field data charset, so we should read it using the field charset.

            commit e8b1507dcb7300ba84a07fdb3a04ed4b65b0b566 (HEAD -> bb-10.5-MDEV-34883, origin/bb-10.5-MDEV-34883)
            Author: Oleksandr Byelkin <sanja@mariadb.com>
            Date:   Wed Oct 16 13:28:20 2024 +0200
             
                MDEV-34883 LOAD DATA INFILE with geometry data fails
                
                We rite field using field data charset, so we should read it
                using the field charset.
            

            sanja Oleksandr Byelkin added a comment - commit e8b1507dcb7300ba84a07fdb3a04ed4b65b0b566 (HEAD -> bb-10.5-MDEV-34883, origin/bb-10.5-MDEV-34883) Author: Oleksandr Byelkin <sanja@mariadb.com> Date: Wed Oct 16 13:28:20 2024 +0200   MDEV-34883 LOAD DATA INFILE with geometry data fails We rite field using field data charset, so we should read it using the field charset.

            The problem is repeatable with 10.5+ with this script:

            CREATE DATABASE db1 CHARACTER SET utf8mb4;
            USE db1;
             
            CREATE OR REPLACE TABLE t1 (
              p point NOT NULL
            ) ENGINE=MyISAM;
            INSERT INTO t1 VALUES (GeomFromText('POINT(37.646944 -75.761111)'));
            SELECT * FROM t1 INTO OUTFILE 't1.tsv';
             
            CREATE OR REPLACE TABLE t2 LIKE t1;
             
            LOAD DATA INFILE 't1.tsv' INTO TABLE t2;
            

            bar Alexander Barkov added a comment - The problem is repeatable with 10.5+ with this script: CREATE DATABASE db1 CHARACTER SET utf8mb4; USE db1;   CREATE OR REPLACE TABLE t1 ( p point NOT NULL ) ENGINE=MyISAM; INSERT INTO t1 VALUES (GeomFromText( 'POINT(37.646944 -75.761111)' )); SELECT * FROM t1 INTO OUTFILE 't1.tsv' ;   CREATE OR REPLACE TABLE t2 LIKE t1;   LOAD DATA INFILE 't1.tsv' INTO TABLE t2;
            bar Alexander Barkov added a comment - - edited

            Even smaller script:

            CREATE OR REPLACE TABLE t1 (
              p point NOT NULL
            ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
            INSERT INTO t1 VALUES (GeomFromText('POINT(37.646944 -75.761111)'));
            SELECT * FROM t1 INTO OUTFILE 't1.tsv';
             
            CREATE OR REPLACE TABLE t2 LIKE t1;
             
            LOAD DATA INFILE 't1.tsv' INTO TABLE t2;
            

            ERROR 1416 (22003): Cannot get geometry object from data you send to the GEOMETRY field
            

            bar Alexander Barkov added a comment - - edited Even smaller script: CREATE OR REPLACE TABLE t1 ( p point NOT NULL ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE =utf8mb4_unicode_ci; INSERT INTO t1 VALUES (GeomFromText( 'POINT(37.646944 -75.761111)' )); SELECT * FROM t1 INTO OUTFILE 't1.tsv' ;   CREATE OR REPLACE TABLE t2 LIKE t1;   LOAD DATA INFILE 't1.tsv' INTO TABLE t2; ERROR 1416 (22003): Cannot get geometry object from data you send to the GEOMETRY field
            wlad Vladislav Vaintroub added a comment - - edited

            The exact mysqldump command line was, I think

            mysqldump -uroot flughafendb_large --tables airport_geo --hex-blob ...
            

            The important part was --hex-blob

            The original dataset is from https://github.com/stefanproell/flughafendb, and the my intention was trying something large-ish and more "real-life" for "mysqldump --dir / mariadb-import --dir" combo, which both do SELECT * into outfile rsp LOAD DATA INFILE for every table in the database internally.

            wlad Vladislav Vaintroub added a comment - - edited The exact mysqldump command line was, I think mysqldump -uroot flughafendb_large --tables airport_geo --hex-blob ... The important part was --hex-blob The original dataset is from https://github.com/stefanproell/flughafendb , and the my intention was trying something large-ish and more "real-life" for "mysqldump --dir / mariadb-import --dir" combo, which both do SELECT * into outfile rsp LOAD DATA INFILE for every table in the database internally.

            A smaller script demonstating the problem:

            CREATE OR REPLACE TABLE `airport_geo` (
              `airport_id` smallint(6) NOT NULL,
              `name` varchar(50) NOT NULL,
              `city` varchar(50) DEFAULT NULL,
              `country` varchar(50) DEFAULT NULL,
              `latitude` decimal(11,8) NOT NULL,
              `longitude` decimal(11,8) NOT NULL,
              `geolocation` point NOT NULL,
              KEY `flughafen_idx` (`airport_id`),
              SPATIAL KEY `geolokation_spt` (`geolocation`)
            ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
             
            INSERT INTO `airport_geo` VALUES (53,'ACCOMACK CO','MELFA','UNITED STATES',37.64694400,-75.76111100,GeomFromText('POINT(37.646944 -75.761111)'));
            SELECT * FROM airport_geo INTO OUTFILE 'airport_geo.tsv';
             
            CREATE OR REPLACE TABLE airport_geo2 LIKE airport_geo;
            LOAD DATA INFILE 'airport_geo.tsv' INTO TABLE airport_geo2;
            

            ERROR 1263 (22004): Column set to default value; NULL supplied to NOT NULL column 'geolocation' at row 2
            

            bar Alexander Barkov added a comment - A smaller script demonstating the problem: CREATE OR REPLACE TABLE `airport_geo` ( `airport_id` smallint (6) NOT NULL , ` name ` varchar (50) NOT NULL , `city` varchar (50) DEFAULT NULL , `country` varchar (50) DEFAULT NULL , `latitude` decimal (11,8) NOT NULL , `longitude` decimal (11,8) NOT NULL , `geolocation` point NOT NULL , KEY `flughafen_idx` (`airport_id`), SPATIAL KEY `geolokation_spt` (`geolocation`) ) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE =utf8mb4_unicode_ci;   INSERT INTO `airport_geo` VALUES (53, 'ACCOMACK CO' , 'MELFA' , 'UNITED STATES' ,37.64694400,-75.76111100,GeomFromText( 'POINT(37.646944 -75.761111)' )); SELECT * FROM airport_geo INTO OUTFILE 'airport_geo.tsv' ;   CREATE OR REPLACE TABLE airport_geo2 LIKE airport_geo; LOAD DATA INFILE 'airport_geo.tsv' INTO TABLE airport_geo2; ERROR 1263 (22004): Column set to default value; NULL supplied to NOT NULL column 'geolocation' at row 2
            bar Alexander Barkov added a comment - - edited

            The safest way would be to dump binary data using some encoding: hex or base64.

            wlad, what was the exact command line with the mysqldump command?

            bar Alexander Barkov added a comment - - edited The safest way would be to dump binary data using some encoding: hex or base64. wlad , what was the exact command line with the mysqldump command?
            bar Alexander Barkov added a comment - - edited

            A workaround is:

            mariadb -uroot test -e "TRUNCATE airport_geo;  LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo CHARACTER SET latin1"
            

            or better:

            mariadb -uroot test -e "TRUNCATE airport_geo;  LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo CHARACTER SET binary"
            

            This command also worked for me:

            mariadb -uroot test -e "TRUNCATE airport_geo;  SET character_set_database=latin1; LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo"
            

            bar Alexander Barkov added a comment - - edited A workaround is: mariadb -uroot test -e "TRUNCATE airport_geo; LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo CHARACTER SET latin1" or better: mariadb -uroot test -e "TRUNCATE airport_geo; LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo CHARACTER SET binary" This command also worked for me: mariadb -uroot test -e "TRUNCATE airport_geo; SET character_set_database=latin1; LOAD DATA INFILE '/mnt/e/airport_geo.tsv' INTO TABLE airport_geo"

            People

              bar Alexander Barkov
              wlad Vladislav Vaintroub
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.