[MCOL-1335] Data load through PDi adapter appears corrupted in the database Created: 2018-04-12  Updated: 2023-10-26  Resolved: 2023-10-25

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 1.1.4
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Elena Kotsinova (Inactive) Assignee: Leonid Fedorov
Resolution: Won't Fix Votes: 7
Labels: None
Environment:

CentOS 7


Epic Link: Consolidate & Redevelop All Columnstore Tools (SDK, Adapters, Backup, Restore, mcsimport)
Sprint: 2018-14, 2018-15, 2018-16, 2018-17, 2018-18, 2018-19

 Description   

1. Start a bulk load of a large CSV file with the Pentaho bulk load adapter. The file contains 121 million rows (9 GB in size).
2. The load inserts data into a table where all columns are of type VARCHAR(256):

CREATE TABLE F_TRANS_DELTA (
trans_id       VARCHAR(256) NULL,
customer_id    VARCHAR(256) NULL,
merchant_id    VARCHAR(256) NULL,
card_id        VARCHAR(256) NULL,
trans_datetime VARCHAR(256) NULL,
trans_type     VARCHAR(256) NULL,
amount         VARCHAR(256) NULL,
reversed       VARCHAR(256) NULL,
response_code  VARCHAR(256) NULL,
is_first_topup VARCHAR(256) NULL
) ENGINE=Columnstore DEFAULT CHARSET=utf8;

3. The Pentaho job finishes successfully and the data appears to be inserted into the database.
4. The following queries execute successfully:

select * from f_trans_delta limit 10;
select count(*) from f_trans_delta;

5. Run query:

select * from f_trans_delta where merchant_id='12558';

Result:
1. Error in mcsmysql client:
ERROR 1815 (HY000): Internal error: IDB-2035: An internal error occurred. Check the error log file & contact support.
2. Error in messages log file on PM1:
Apr 12 13:12:04 mariadb-59f24c1f-215-1 Calpont[801]: 04.760635 |0|0|0| E 00 CAL0000: /data/buildbot/bb-worker/nightly-rpm-centos7/build/mariadb-columnstore-server/mariadb-columnstore-engine/primitives/primproc/dictstep.cpp@353: assertion 'pt[primMsg->NVALS].offsetIndex != 0' failed
Apr 12 13:12:04 mariadb-59f24c1f-215-1 PrimProc[801]: 04.761228 |0|0|0| W 28 CAL0000: IDB-2035: An internal error occurred. Check the error log file & contact support.

Expected:
Successful query execution.
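For anyone trying to reproduce this at a smaller scale, here is a minimal sketch that generates a CSV matching the F_TRANS_DELTA schema above. The column order comes from the CREATE TABLE statement; all values (and the comma delimiter) are made-up placeholders, since the original 9 GB file is not attached to the ticket:

```python
import csv
import random

# Columns taken from the CREATE TABLE F_TRANS_DELTA statement above.
COLUMNS = [
    "trans_id", "customer_id", "merchant_id", "card_id",
    "trans_datetime", "trans_type", "amount", "reversed",
    "response_code", "is_first_topup",
]

def generate_rows(n, seed=42):
    """Yield n synthetic rows; every value is a string, matching the
    all-VARCHAR(256) schema. All values are hypothetical sample data."""
    rng = random.Random(seed)
    for i in range(n):
        yield [
            str(i),                                # trans_id
            str(rng.randint(1, 1_000_000)),        # customer_id
            str(rng.randint(1, 50_000)),           # merchant_id
            str(rng.randint(1, 2_000_000)),        # card_id
            "2018-04-12 13:12:04",                 # trans_datetime
            rng.choice(["PURCHASE", "TOPUP"]),     # trans_type
            f"{rng.uniform(0.01, 500):.2f}",       # amount
            rng.choice(["Y", "N"]),                # reversed
            "00",                                  # response_code
            rng.choice(["Y", "N"]),                # is_first_topup
        ]

def write_csv(path, n):
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(generate_rows(n))

if __name__ == "__main__":
    # The original report used 121 million rows; scale down for a quick test.
    write_csv("f_trans_delta_sample.csv", 1000)
```

The same file can then be loaded once over the PDI adapter and once with cpimport into identical tables to compare, as done in the comments below.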



 Comments   
Comment by Elena Kotsinova (Inactive) [ 2018-04-13 ]

Executed more tests. The issue is related to the volume of data loaded by the PDI bulk adapter.

Comment by GUIDI [ 2018-06-22 ]

Clearing the cache (select calFlushCache();) can be useful. That worked for me; the table no longer appears corrupted.

Comment by Andrew Hutchings (Inactive) [ 2018-06-22 ]

elena.kotsinova can you please re-test with 1.1.5? I think this could be related to MCOL-1408.

Comment by Elena Kotsinova (Inactive) [ 2018-07-05 ]

The issue is not fixed in 1.1.5.
I have tested on both upgraded and freshly installed environments.
CS 1.1.5,
https://downloads.mariadb.com/Data-Adapters/mariadb-streaming-data-adapters/kettle-data-adapter/1.1.5/centos-7/mariadb-columnstore-kettle-bulk-exporter-plugin-1.1.5.zip

calFlushCache() doesn't repair the table in this case.

Comment by Assen Totin (Inactive) [ 2019-04-09 ]

Confirming, this is fully reproducible with MCS 1.2.2 and corresponding Pentaho bulk load component (Version: 1.2.2, Revision: fddec8b).

Importing 25M rows into an empty table using cpimport results in a fully functional table. Importing the same data into the same empty table over PDI results in a partially broken table - some queries work (consistently), others fail (consistently) with IDB-2035.

Flushing the cache or restarting MCS does not help. No alarms are observed during or after the error.

The only relevant entries, even at debug log level, are:

Apr 9 15:37:48 p2w5 ExeMgr[13750]: 48.454810 |22|0|0| D 16 CAL0041: Start SQL statement: select count from ebi.etl_measures_actions where act_name = '3.2 TEL: AMBERG ILS' LIMIT 0, 1000; |ebi|
Apr 9 15:37:48 p2w5 Calpont[13641]: 48.624945 |0|0|0| E 00 CAL0000: /data/buildbot/bb-worker/centos7/mariadb-columnstore-engine/primitives/primproc/dictstep.cpp@393: assertion 'pt[primMsg->NVALS].offsetIndex != 0' failed

The PID of the process that raises the assertion corresponds to PrimProc. Enabling per-process log for PrimProc does not give anything more - its log file contains the very same line as above.

This is affecting a customer of ours - how can we help this to be traced/resolved faster? I have a fully working setup and the broken data is available for inspection.

Comment by Assen Totin (Inactive) [ 2019-04-09 ]

The same issue is present in mcsimport, so it is not limited to the Pentaho adapter but rather lies somewhere in the MCS API code.

LinuxJedi, may I ask you to take a look? I can open a separate ticket if preferred. I have a dataset available for testing.

Comment by Andrew Hutchings (Inactive) [ 2019-04-10 ]

assen.totin can you give me a test case using mcsimport? This will be a lot easier for me to debug.

Comment by Assen Totin (Inactive) [ 2019-04-10 ]

Here are the sample table and the queries I run. When run in this order, the last one fails with IDB-2035. When the same data is loaded with cpimport, all queries succeed.

Database schema name is 'ebi'.

create table if not exists etl_measures_actions
(
op_id bigint,
msr_id bigint,
act_id bigint,
msr_name varchar(100),
act_name varchar(100),
msr_status varchar(100),
act_status varchar(100),
msr_prio varchar(15),
act_prio varchar(15),
msr_autostart_yn varchar(15),
msr_mandatory_yn varchar(15),
msr_wplc varchar(100),
act_wplc varchar(100),
msr_cyclical_yn varchar(15),
msr_cycles_no varchar(15),
msr_document_yn varchar(15),
msr_feedback_yn varchar(15),
digit_radio_yn varchar(15),
msr_start_time datetime,
act_start_time datetime,
msr_end_time datetime,
act_end_time datetime
) engine=columnstore default charset=utf8;

SELECT * FROM ebi.etl_measures_actions LIMIT 0, 1000;
SELECT * FROM ebi.etl_measures_actions WHERE msr_name = '3.2 TEL: FREMD ILS / RLST' LIMIT 0, 1000;
SELECT * FROM ebi.etl_measures_actions LIMIT 0, 1000;
SELECT * FROM ebi.etl_measures_actions WHERE act_name = '3.2 TEL: AMBERG ILS' LIMIT 0, 1000;
SELECT count(*) FROM ebi.etl_measures_actions WHERE act_name = '3.2 TEL: AMBERG ILS' LIMIT 0, 1000;

I'll send you a link to download the data for this table.
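While the actual dataset is only available via a private link, a small generator in the same spirit can produce rows in the etl_measures_actions layout. The column order and the filter strings ('3.2 TEL: FREMD ILS / RLST', '3.2 TEL: AMBERG ILS') come from the schema and queries above; every other value is a placeholder consistent with the column types, not real customer data:

```python
import csv
import random

# Filter values from the failing queries above; the rest are placeholders.
MSR_NAMES = ["3.2 TEL: FREMD ILS / RLST", "1.1 ORG: LAGE", "2.4 DOC: PLAN"]
ACT_NAMES = ["3.2 TEL: AMBERG ILS", "1.1 ORG: MELDUNG", "2.4 DOC: ABLAGE"]
YN = ["Y", "N"]

def make_row(i, rng):
    """One synthetic row in the 22-column etl_measures_actions layout.
    Datetimes are rendered as strings in MariaDB's default format."""
    return [
        i, i % 500, i % 900,                 # op_id, msr_id, act_id
        rng.choice(MSR_NAMES),               # msr_name
        rng.choice(ACT_NAMES),               # act_name
        "OPEN", "OPEN",                      # msr_status, act_status
        "HIGH", "HIGH",                      # msr_prio, act_prio
        rng.choice(YN), rng.choice(YN),      # msr_autostart_yn, msr_mandatory_yn
        "WPLC-1", "WPLC-2",                  # msr_wplc, act_wplc
        rng.choice(YN), "3",                 # msr_cyclical_yn, msr_cycles_no
        rng.choice(YN), rng.choice(YN),      # msr_document_yn, msr_feedback_yn
        rng.choice(YN),                      # digit_radio_yn
        "2019-04-09 15:37:48",               # msr_start_time
        "2019-04-09 15:37:48",               # act_start_time
        "2019-04-09 16:00:00",               # msr_end_time
        "2019-04-09 16:00:00",               # act_end_time
    ]

def write_sample(path, n, seed=1):
    rng = random.Random(seed)
    with open(path, "w", newline="") as f:
        w = csv.writer(f)
        for i in range(n):
            w.writerow(make_row(i, rng))
```

Because the filter strings are included in the generated data, the WHERE queries above return rows, so the success/failure pattern can be compared between a PDI load and a cpimport load of the same file.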

Comment by Abie Reifer [ 2019-12-25 ]

Wondering if there is a fix or workaround for this issue. I seem to be running into it. Thanks

Generated at Thu Feb 08 02:27:58 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.