[MDEV-9714] Connect CSV file sep_char option Created: 2016-03-11  Updated: 2016-03-16  Resolved: 2016-03-12

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Connect
Affects Version/s: 10.1
Fix Version/s: 10.0.25, 10.1.14

Type: Bug Priority: Major
Reporter: Btissam Assignee: Olivier Bertrand
Resolution: Not a Bug Votes: 0
Labels: None
Environment:

Windows



 Description   

Hi,

I have a file with separators "(#]" I need to load it ,using CONNECT engine:

name.csv

Name(#]birth(#]children
"Archibald"(#]17/05/01(#]3
"Nabucho"(#]12/08/03(#]2

create table people (
  name char(12) not null,
  birth date not null date_format='DD/MM/YY',
  children smallint(2) not null)
engine=CONNECT table_type=CSV file_name='C:\\MariaDB\\data\\name.csv'
header=1 sep_char='(#]' quoted=1;

It gives :

Erreur SQL (1296) : Got error 122 'Field 2 too long for birth line 1 of C:\MariaDB\data\name.csv' from CONNECT */

Is there any way to load this kind of file?

Thanks for help.

Regards



 Comments   
Comment by Elena Stepanova [ 2016-03-12 ]

From the documentation it looks like sep_char is supposed to be 1 symbol, but it's not said so explicitly. Besides, if it's so, it would make more sense to produce an error upon table creation, not upon SELECT.

Comment by Btissam [ 2016-03-12 ]

I got the error from the Select statement

Comment by Olivier Bertrand [ 2016-03-12 ]

Your file is not a CSV file. CSV files have a one character separator as said in the MariaDB documentation:

Many source data files are formatted with variable length fields and records. The simplest format, known as CSV (Comma Separated Variables), has column fields separated by a separator character. By default, the separator is a comma but can be specified by the SEP_CHAR option as any character, for instance a semi-colon.

I agree that this should be checked when creating the table and will fix it as soon as possible.

Meanwhile you can create and load your table as a FMT table:

create table people (
  name char(12) not null field_format='"%n%[^"]%n"',
  birth date not null date_format='DD/MM/YY' field_format='(#]%n%[^(]%n',
  children smallint(2) not null field_format='(#]%n%d%n')
engine=CONNECT table_type=FMT file_name='C:\\MariaDB\\data\\name.csv' header=1;

Note that FMT being read only tables, this enables you to load and use the table but not to modify it.

Comment by Btissam [ 2016-03-12 ]

Thanks a lot it works perfectly !

I need to index this table (let say on name) to load data fast , how to do it using this command? (I have a file that can reach 1Million records..).

Many Thanks

Comment by Olivier Bertrand [ 2016-03-12 ]

Like for any table. You can define name as key or simply execute:

create index pn on people(name);

Comment by Btissam [ 2016-03-14 ]

Thanks a lot for your feedback!

Generated at Thu Feb 08 07:36:46 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.