[MDEV-9073] LOAD DATA INFILE and BIT datatype Created: 2015-11-03  Updated: 2015-11-03

Status: Open
Project: MariaDB Server
Component/s: None
Fix Version/s: None

Type: Task Priority: Minor
Reporter: roberto spadim Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Attachments: PNG File t.png     Text File t.txt    

 Description   

hi guys, today we need a workaround to load BIT fields:

BIT values cannot be loaded using binary notation (for example, b'011010'). To work around this, specify the values as regular integers and use the SET clause to convert them so that MySQL performs a numeric type conversion and loads them into the BIT column properly:

shell> cat /tmp/bit_test.txt
2
127
shell> mysql test
mysql> LOAD DATA INFILE '/tmp/bit_test.txt'
    -> INTO TABLE bit_test (@var1) SET b = CAST(@var1 AS UNSIGNED);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Deleted: 0  Skipped: 0  Warnings: 0

the problem is, when i have a big file, with CAST it takes 3 seconds and without it takes 1 second
maybe this isn't too difficult to optimize, i think all bit fields today use a CAST, why not implement it as a native solution? always cast to unsigned bigint when using BIT fields for example, i don't know if this optimize something, but i think the SET part isn't running as fast as it could, the idea is remove the SET part and avoid some CAST functions

without cast we have: WARNING 1024 Out of range value for column 'xxxx' at row yyy

maybe something here:
https://github.com/MariaDB/server/blob/e19a6f3dca7509eb8e042796a9311dba937ad9d7/sql/sql_load.cc#L853
that's something with field->store (i think)

        field->store((char*) pos,length,read_info.read_charset);

https://github.com/MariaDB/server/blob/e19a6f3dca7509eb8e042796a9311dba937ad9d7/sql/sql_load.cc#L1025

        field->store((char*) pos, length, read_info.read_charset);

like:

if(field->type()==MYSQL_TYPE_BIT){
// convert string to unsigned long long 
}else{
// use normal field->store()
}


set datatype do this:
https://github.com/MariaDB/server/blob/641644a8629b9b437a1886cf19451f31a0181860/sql/field.cc#L8739

patch idea at sql_load.cc L853 / L1025

    if(field->type()==MYSQL_TYPE_BIT){
      char *bit_end;
      int bit_err=0;
      ulonglong bit_tmp=my_strntoull(read_info.read_charset,(char*) pos,length,10,&bit_end,&bit_err);
      if (bit_err || bit_end != pos+length){
        bit_tmp=0;
        field->set_warning(WARN_DATA_TRUNCATED, 1);
      }
      field->store((longlong) bit_tmp,true);
    }else{
      field->store((char*) pos,length,read_info.read_charset);
    }


Test case:
file t.txt attached

drop table if exists a;
create table if not exists a(i bit(64)) ENGINE=MYISAM;
LOAD DATA LOCAL INFILE "c:\\spadim\\t.txt" 
IGNORE INTO TABLE `a` FIELDS TERMINATED BY ';' LINES TERMINATED BY '\n' (i);
select * from a;

RESULT:

i
00001101    (binary 00001101 = decimal 13 = hex 0d = octal 15)
00001101
00001101
00001101
00001101
00001101
00001101
00001101


Generated at Thu Feb 08 07:31:56 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.