[MDEV-12354] MariaDB CONNECT CSV (Zipped Tables) Created: 2017-03-24  Updated: 2017-05-22  Resolved: 2017-05-11

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Connect
Affects Version/s: 10.1.22
Fix Version/s: 10.1.24, 10.0.31, 10.2.6, 10.3.1

Type: Bug Priority: Critical
Reporter: Juan Telleria Assignee: Olivier Bertrand
Resolution: Fixed Votes: 0
Labels: None
Environment:

Windows


Attachments: Text File P700-error.log     Text File error log.log    
Issue Links:
Relates
relates to MDEV-11295 CONNECT Storage Engine - ZIP Engine Closed

 Description   

When using MULTIPLE=1 for importing 566 CSV Zipped Files, of 1000 rows each one, my Server ShutsDown.

My CREATE TABLE is as follows:

CREATE TABLE dispoin.t_CONNECT_CSV_Zipped_DISPOIN_Server
(
  `SCADA` VARCHAR(25) DEFAULT NULL,
  `TAG` VARCHAR(25) DEFAULT NULL,
  `ID_del_AEG` VARCHAR(25) DEFAULT NULL,
  `Descripcion` VARCHAR(255) DEFAULT NULL,
  `Time_ON` VARCHAR(19) DEFAULT NULL,
  `Time_OFF` VARCHAR(19) DEFAULT NULL,
  `Delta_Time` VARCHAR(19) DEFAULT NULL,
  `Comentario` VARCHAR(255) DEFAULT NULL,
  `Es_Alarma` VARCHAR(5) DEFAULT NULL,
  `Es_Ultima` VARCHAR(5) DEFAULT NULL,
  `Comentarios` VARCHAR(255) DEFAULT NULL
)
  ENGINE = CONNECT
  TABLE_TYPE = CSV
  FILE_NAME='//srvdiscsv/data/2017-03-23/*AL*.zip'
  HEADER=NO
  SEP_CHAR=';'
  MULTIPLE=1
  ZIPPED=YES
  READONLY=YES
  OPTION_LIST='maxerr=10000';

In the same folder. For 1 single CSV ZIP file, same format (MULTIPLE=NO) works perfect.

ERROR LOG
Thread pointer: 0x1e611018
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
ha_connect.dll!MAPFAM::ReadBuffer()[filamap.cpp:365]
ha_connect.dll!TDBCSV::ReadBuffer()[tabfmt.cpp:906]
ha_connect.dll!TDBMUL::ReadDB()[tabmul.cpp:520]
ha_connect.dll!CntReadNext()[connect.cc:464]
ha_connect.dll!ha_connect::rnd_next()[ha_connect.cc:3885]
mysqld.exe!handler::ha_rnd_next()[handler.cc:2578]
mysqld.exe!rr_sequential()[records.cc:470]
mysqld.exe!sub_select()[sql_select.cc:18332]
mysqld.exe!do_select()[sql_select.cc:17968]
mysqld.exe!JOIN::exec_inner()[sql_select.cc:3221]
mysqld.exe!JOIN::exec()[sql_select.cc:2512]
mysqld.exe!mysql_select()[sql_select.cc:3447]
mysqld.exe!handle_select()[sql_select.cc:372]
mysqld.exe!execute_sqlcom_select()[sql_parse.cc:5920]
mysqld.exe!mysql_execute_command()[sql_parse.cc:2979]
mysqld.exe!mysql_parse()[sql_parse.cc:7348]
mysqld.exe!dispatch_command()[sql_parse.cc:1492]
mysqld.exe!do_command()[sql_parse.cc:1109]
mysqld.exe!threadpool_process_request()[threadpool_common.cc:271]
mysqld.exe!io_completion_callback()[threadpool_win.cc:568]
kernel32.dll!BaseFormatTimeOut()
ntdll.dll!RtlEqualDomainName()
ntdll.dll!TpIsTimerSet()
kernel32.dll!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x1e488030): SELECT * FROM dispoin.t_CONNECT_CSV_Zipped_DISPOIN_Server
Connection ID (thread ID): 7
Status: NOT_KILLED

Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off



 Comments   
Comment by Olivier Bertrand [ 2017-05-08 ]

There has been many changes in CONNECT since this. Do you have still this problem?
If yes, can you attach new error log because line numbers do not match anymore?
Thanks.

Comment by Juan Telleria [ 2017-05-09 ]

I obtained this error by using Connect Version 1.05.0003, with mariaDB 10.1.22

I have not been able to test the new version of connect, as it was not included with MariaDB 10.1.23 distribution, as long as I know.

However, I obtained this error by using the test files I attached you in MDEV-11295 , filtering them with a WHERE clause, or leaving the engine to obtain the table definition by means of CREATE OR REPLACE TABLE myTable AS SELECT [...]

Comment by Olivier Bertrand [ 2017-05-09 ]

This query returns a result set of 996873 rows! Fortunately I have a GUI client program allowing to execute a query without displaying the result set.

With it, I could successfully execute this query responding:

996873 rows in set (38.96 sec) (64 warnings)

The warnings are:

Out of range value for column Time_ON at row x
Out of range value for column Time_OFF at row x

with x ranging from 1 to 32.

I was also able to execute the query:

SELECT SCADA, TAG, ID_del_AEG, Es_Ultima FROM tell
WHERE Descripcion = '534 Maximum slip rings heating time';

returning:

SCADA TAG ID_del_AEG Es_Ultima
W1061 W1061_026A18046 BD-26 True
W1061 W1061_031A18046 BD-31 True
W1061 W1061_034A18046 BD-34 False
W1061 W1061_037A18046 BD-37 True
W1063 W1063_004A18046 E07 False
W1063 W1063_004A18046 E07 True
W1081 W1081_005A18046 B-01 True
W1081 W1081_008A18046 B-72 True
W1081 W1081_012A18046 B-80 True
W1081 W1081_020A18046 B-84 True
W1081 W1081_022A18046 B-89 True
W1081 W1081_018A18046 B-91 True
W1081 W1081_002A18046 T-02 True
W1085 W1085_001A18046 G97-003 True
W1085 W1085_003A18046 G97-010 False
W1085 W1085_012A18046 G97-065 True
W1085 W1085_015A18046 G97-082 True
W1085 W1085_031A18046 G97-211 True
W1102 W1102_041A18046 T-41 True
W1102 W1102_043A18046 T-43 True
W1107 W1107_001A18046 CA501 True
Comment by Juan Telleria [ 2017-05-09 ]

It seems that the bug has been fixed.

Thank you

Really looking forward to try the new "CONNECT CSV Zipped File Table" Engine!

Comment by Olivier Bertrand [ 2017-05-09 ]

SELECT SCADA, COUNT(*) FROM tell GROUP BY SCADA HAVING COUNT(*) >= 10000;

SCADA COUNT
GA259 16269
GA260 12610
GA382 24981
GA490 24114
GA526 10647
GA561 18901
GA582 40583
GA583 21924
GA589 38923
GA591 14940
GA638 13796
GA654 13672
GA944 38843
MAETT 27189
MATKC 13946
MATME 24843
W0807 16240
W0876 26471
W1102 10277
Comment by Juan Telleria [ 2017-05-09 ]

Works perfectly!

Comment by Juan Telleria [ 2017-05-11 ]

For final check, what about:

CREATE OR REPLACE TABLE MyTable AS 
     SELECT SCADA, TAG, ID_del_AEG, Es_Ultima FROM tell
     WHERE Descripcion = '534 Maximum slip rings heating time';

¿This shall throw an error because no table definition and using Multiple = 1 or Multiple = 3?

¿When will it be available new CONNECT Version?

Comment by Olivier Bertrand [ 2017-05-11 ]

I don't think so. You are creating a InnoDB (or MyISAM) table with standard MySQL syntax (column definitions come from the selected table) and multiple applies to tell, not MyTable.

When will it be available new CONNECT Version
Good question. I wish I could know.

Comment by Juan Telleria [ 2017-05-11 ]

ok. Thank you

Comment by Juan Telleria [ 2017-05-22 ]

Does Multiple = 3 (For subfolders) already work for the actual version of CONNECT?

The currently Generally Available distribution of CONNECT (Connect 1.05.0003) does still not support this feature.

Thank you.

Looking forward for MariaDB 10.2.6

Generated at Thu Feb 08 07:57:04 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.