[MDEV-12603] Insert command truncates ZIP CSV table Created: 2017-04-27  Updated: 2017-05-11  Resolved: 2017-05-08

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Connect
Affects Version/s: 10.1.22
Fix Version/s: 10.1.24, 10.0.31

Type: Bug Priority: Major
Reporter: Alice Sherepa Assignee: Olivier Bertrand
Resolution: Fixed Votes: 0
Labels: None


 Description   

commands below show that after insert all rows in ZIP CSV table(CONNECT engine) disappear, zip file is also rewritten.

--------------
CREATE TABLE `c1` (
	`n1` varchar(255)  NOT NULL,
 	`n2` varchar(255)  NOT NULL
) ENGINE=CONNECT  `table_type`=CSV `file_name`='/home/alice/t/d1.zip' sep_char=';' zipped=1
--------------
Query OK, 0 rows affected (0.06 sec)
 
--------------
select count(*) from c1
--------------
+----------+
| count(*) |
+----------+
|      101 |
+----------+
1 row in set (0.00 sec)
 
--------------
insert into c1 select 1,1
--------------
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0
 
--------------
select count(*) from c1
--------------
+----------+
| count(*) |
+----------+
|        1 |
+----------+
1 row in set (0.00 sec)



 Comments   
Comment by Alice Sherepa [ 2017-05-02 ]

When table is created from zip file with mutiple files in it(option_list='Entry=.csv), all files in zip file are truncated with file with a name ".csv", if OPTION_LIST='MULENTRIES=1', then file inside zip file has name "-", so zip file is impossible to decompress it because of error in the name.

CREATE TABLE `c1` (
	`n1` varchar(255)  NOT NULL,
 	`n2` varchar(255)  NOT NULL
) ENGINE=CONNECT  `table_type`=CSV `file_name`='/home/alice/Downloads/Downloads.zip' zipped=1 sep_char="|" option_list='Entry=*.csv'
--------------
Query OK, 0 rows affected (0.05 sec)
 
--------------
insert into c1 values ('1','1')
--------------
Query OK, 1 row affected (0.00 sec)
 
 
CREATE TABLE `c2` (
	`n1` varchar(255)  NOT NULL,
 	`n2` varchar(255)  NOT NULL
) ENGINE=CONNECT  `table_type`=CSV `file_name`='/home/alice/Downloads/Downloads1.zip' zipped=1 sep_char="|" OPTION_LIST='MULENTRIES=1'
--------------
Query OK, 0 rows affected (0.02 sec)
 
--------------
insert into c2 values ('1','1')
--------------
Query OK, 1 row affected (0.00 sec)

Comment by Olivier Bertrand [ 2017-05-02 ]

Please read the documentation at:
https://mariadb.com/kb/en/mariadb/connect-zipped-file-tables/

This is explained in the paragraph Creating new zipped tables

Comment by Elena Stepanova [ 2017-05-03 ]

bertrandop:

Hi Olivier,

Regarding the initial issue (INSERT into an existing table overriding the contents), we know that it's the current design, but I'd like to ask you to reconsider before it's too late.

As we are trying to achieve the state when CONNECT engine can be officially recommended for production, it is important that the engine provides some safety which can be normally expected from a stable engine.

It is quite common that an engine has limitations upon certain types of DML/DDL, there is nothing wrong with it; but then it should actually refuse to perform the operation rather than silently damage the data. If you take ARCHIVE engine for example, it cannot do UPDATE, but then it just says so:

MariaDB [test]> update ta set i = 2;
ERROR 1031 (HY000): Storage engine ARCHIVE of the table `test`.`ta` doesn't have this option

I believe CONNECT engine, when restrictions apply, should behave in a similar way.

In this particular case, a more reasonable approach would be to allow to create the data file using the "Insert" method, but there is no functional value in allowing to replace the data the same way. If somebody wants to discard the existing data and create a new set, they can (and should) drop it explicitly. Replacing only opens the wide possibility for human errors.

Please also note that while technically documentation does warn about data replacement, it does so not very clearly.
First, it says

This is a read-only implementation. No insert, update or delete.

OK, for me it means that the user can be sure that no damage will be done – that's what "read-only" normally stands for.

Then it says

Creating new zipped tables
Tables can be created to access already existing zip files. However, is it also possible to make the zip file from an existing file or table. Two ways are available to make the zip file:
Insert method
...

OK, so as long as I don't need to create new zip files, it does not concern me.
But only there it finally says:

when executing an insert statement the inserted records are not added but replace the existing ones

Which contradicts the first statement and does not really have much to do with the section it is under.

We cannot rely on users to read and memorize the fine print, and while technically the data loss would be their fault, they won't be happy.

Comment by Olivier Bertrand [ 2017-05-03 ]

I agree. I shall reconsider this.

Comment by Olivier Bertrand [ 2017-05-08 ]

Done

Generated at Thu Feb 08 07:59:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.