[MCOL-4974] Importing data for a 600-column table caused core dump and errors Created: 2022-01-27  Updated: 2022-03-30  Resolved: 2022-03-30

Status: Closed
Project: MariaDB ColumnStore
Component/s: cpimport
Affects Version/s: 6.2.2
Fix Version/s: 6.3.1

Type: Bug Priority: Critical
Reporter: Daniel Lee (Inactive) Assignee: Sergey Zefirov
Resolution: Fixed Votes: 0
Labels: None
Environment:

Centos 8 VM, 32GB memory


Attachments: cpimport.sh, createTables.sh, create_t1.sql, genData.sh, t.dat, widetable.sql

 Description   

Build tested: 6.2.2-1

I have attached scripts and data for reproducing the reported issues.

Run genData.sh to produce a 1-million-row dataset.
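The contents of the attached genData.sh are not shown in this ticket, but a generator along those lines could be sketched as follows (the row/column counts and the pipe delimiter are assumptions; cpimport's default column delimiter is `|`). The sketch emits a small sample instead of the full million rows:

```shell
#!/bin/bash
# Hypothetical sketch of a generator like the attached genData.sh
# (the attachment's contents are not shown in this ticket).
# Emits ROWS lines of COLS pipe-delimited integer fields.
ROWS=5       # the attached script produces 1,000,000 rows
COLS=600
for ((r = 1; r <= ROWS; r++)); do
    line="$r"
    for ((c = 2; c <= COLS; c++)); do
        line+="|$((r + c))"
    done
    printf '%s\n' "$line"
done > /tmp/sample.dat
wc -l /tmp/sample.dat
```

Scaled up to 1,000,000 rows, this reproduces the shape of the dataset used below: 600 columns per row, one row per line.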

Scenario #1 (core dump)
Steps to reproduce

1. Use create_t1.sql to create a 600-column table.
2. Use cpimport.sh to load the 1-million-row dataset 50 times.
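The attached cpimport.sh is not reproduced in this ticket, but given the `iteration=N` lines in the logs below, the loop presumably looks something like the sketch here. The `run_import` helper is a stub introduced only to keep the sketch self-contained; in the real script it would be the `cpimport mytest t1 /tmp/1milRows.dat` invocation seen in the abort messages:

```shell
#!/bin/bash
# Hypothetical sketch of the attached cpimport.sh loop (its contents are
# not shown in this ticket): invoke cpimport repeatedly against the same
# table and stop at the first failure.
run_import() {
    # In the real script this would be: cpimport mytest t1 /tmp/1milRows.dat
    # Stubbed out here so the sketch runs on its own.
    return 0
}
failures=0
for i in $(seq 1 3); do   # the ticket's script runs 50 iterations
    echo "iteration=$i"
    run_import || { failures=$((failures + 1)); break; }
done
echo "failures=$failures"
```

With the real cpimport in place of the stub, the `break` on first failure matches the behavior in the logs below, where the run aborts on iteration 3.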

On the 3rd cpimport run, a core dump occurred. I ran the test twice and got two different malloc() errors:

malloc(): mismatching next->prev_size (unsorted)

iteration=3
Locale = en_US.UTF-8
Using table OID 3837 as the default JOB ID
Input file(s) will be read from : /root
Job description file : /var/log/mariadb/columnstore/data/bulk/tmpjob/3837_D20220127_T142723_S371596_Job_3837.xml
Log file for this job: /var/log/mariadb/columnstore/cpimport/Job_3837.log
2022-01-27 14:27:23 (5697) INFO : successfully loaded job file /var/log/mariadb/columnstore/data/bulk/tmpjob/3837_D20220127_T142723_S371596_Job_3837.xml
2022-01-27 14:27:23 (5697) INFO : Job file loaded, run time for this step : 0.141156 seconds
2022-01-27 14:27:23 (5697) INFO : PreProcessing check starts
2022-01-27 14:27:23 (5697) INFO : input data file /tmp/1milRows.dat
2022-01-27 14:27:25 (5697) INFO : PreProcessing check completed
2022-01-27 14:27:25 (5697) INFO : preProcess completed, run time for this step : 2.37146 seconds
2022-01-27 14:27:25 (5697) INFO : No of Read Threads Spawned = 1
2022-01-27 14:27:25 (5697) INFO : No of Parse Threads Spawned = 3
malloc(): mismatching next->prev_size (unsorted)
./cpimport.sh: line 12:  5697 Aborted                 (core dumped) cpimport mytest t1 /tmp/1milRows.dat

malloc(): unsorted double linked list corrupted

iteration=3
Locale = en_US.UTF-8
Using table OID 4637 as the default JOB ID
Input file(s) will be read from : /root
Job description file : /var/log/mariadb/columnstore/data/bulk/tmpjob/4637_D20220127_T143953_S265583_Job_4637.xml
Log file for this job: /var/log/mariadb/columnstore/cpimport/Job_4637.log
2022-01-27 14:39:53 (6488) INFO : successfully loaded job file /var/log/mariadb/columnstore/data/bulk/tmpjob/4637_D20220127_T143953_S265583_Job_4637.xml
2022-01-27 14:39:53 (6488) INFO : Job file loaded, run time for this step : 0.137996 seconds
2022-01-27 14:39:53 (6488) INFO : PreProcessing check starts
2022-01-27 14:39:53 (6488) INFO : input data file /tmp/1milRows.dat
2022-01-27 14:39:55 (6488) INFO : PreProcessing check completed
2022-01-27 14:39:55 (6488) INFO : preProcess completed, run time for this step : 2.4601 seconds
2022-01-27 14:39:55 (6488) INFO : No of Read Threads Spawned = 1
2022-01-27 14:39:55 (6488) INFO : No of Parse Threads Spawned = 3
malloc(): unsorted double linked list corrupted
./cpimport.sh: line 12:  6488 Aborted                 (core dumped) cpimport mytest t1 /tmp/1milRows.dat

Scenario #2

1. Run createTables.sh to create 300 tables (t1-t300, 600 columns each). The script will insert two rows into each table.
2. Run cpimport.sh to load data into t1 50 times.
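The contents of the attached createTables.sh are likewise not shown. A hedged sketch of what such a script might emit is below; the `gen_ddl` helper, the `cN` column names, and the INT column type are assumptions made only for illustration (the real DDL is in create_t1.sql). The ticket's script also inserts two rows into each table, which is noted but not implemented here:

```shell
#!/bin/bash
# Hypothetical sketch of DDL generation along the lines of createTables.sh
# (the attachment's contents are not shown in this ticket): CREATE TABLE
# statements for t1..tN, each with a fixed number of columns. The real
# script additionally inserts two rows into each table.
gen_ddl() {
    local tables=$1 cols=$2
    for ((t = 1; t <= tables; t++)); do
        local cols_sql="c1 INT"
        for ((c = 2; c <= cols; c++)); do
            cols_sql+=", c$c INT"
        done
        echo "CREATE TABLE t$t ($cols_sql) ENGINE=ColumnStore;"
    done
}
gen_ddl 2 5   # small sample; the ticket uses 300 tables x 600 columns
```

Piping the generated statements into `mysql mytest` (as in the command sequence in the first comment below) would create the tables.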

Eventually, an error similar to the following will occur, and the table will not be usable after that.

Locale = en_US.UTF-8
Using table OID 3037 as the default JOB ID
Input file(s) will be read from : /root
Job description file : /var/log/mariadb/columnstore/data/bulk/tmpjob/3037_D20220127_T003300_S225960_Job_3037.xml
Log file for this job: /var/log/mariadb/columnstore/cpimport/Job_3037.log
2022-01-27 00:33:00 (20908) INFO : successfully loaded job file /var/log/mariadb/columnstore/data/bulk/tmpjob/3037_D20220127_T003300_S225960_Job_3037.xml
2022-01-27 00:33:00 (20908) INFO : Job file loaded, run time for this step : 0.319326 seconds
2022-01-27 00:33:00 (20908) INFO : PreProcessing check starts
2022-01-27 00:33:00 (20908) INFO : input data file /tmp/1milRows.dat
2022-01-27 00:33:05 (20908) INFO : PreProcessing check completed
2022-01-27 00:33:05 (20908) INFO : preProcess completed, run time for this step : 4.63784 seconds
2022-01-27 00:33:05 (20908) INFO : No of Read Threads Spawned = 1
2022-01-27 00:33:05 (20908) INFO : No of Parse Threads Spawned = 3
2022-01-27 00:33:05 (20908) ERR  : Error uncompressing HWM chunk for: OID-3534; DBRoot-1; part-0; seg-1; hwm-5268; 
cpimport.bin[20908]: 05.692205 |0|0|0| E 34 CAL0087: BulkLoad Error: Error uncompressing HWM chunk for: OID-3534; DBRoot-1; part-0; seg-1; hwm-5268; 
2022-01-27 00:33:05 (20908) ERR  : writeToFile: error initializing to-be-compressed buffer for OID 3534;  Error in uncompressing data.  [1652]
cpimport.bin[20908]: 05.692339 |0|0|0| E 34 CAL0087: BulkLoad Error: writeToFile: error initializing to-be-compressed buffer for OID 3534;  Error in uncompressing data. 
2022-01-27 00:33:05 (20908) ERR  : writeToFileExtentCheck: write1 extent failed:  Error in uncompressing data.  [1652]
cpimport.bin[20908]: 05.692428 |0|0|0| E 34 CAL0087: BulkLoad Error: writeToFileExtentCheck: write1 extent failed:  Error in uncompressing data. 
2022-01-27 00:33:05 (20908) CRIT : Bulkload Parse (thread 2) Failed for Table mytest.t1 during parsing.  Terminating this job. [1652]
cpimport.bin[20908]: 05.692504 |0|0|0| E 34 CAL0087: BulkLoad Error: Bulkload Parse (thread 2) Failed for Table mytest.t1 during parsing.  Terminating this job.
2022-01-27 00:33:05 (20908) INFO : Bulkload Read (thread 0) Stopped reading Table mytest.t1.  TableInfo::readTableData(2) responding to job termination
2022-01-27 00:33:05 (20908) INFO : Bulkload Parse (thread 0) Stopped parsing Tables. BulkLoad::parse() responding to job termination
2022-01-27 00:33:05 (20908) INFO : Bulkload Parse (thread 1) Stopped parsing Tables. BulkLoad::parse() responding to job termination
2022-01-27 00:33:06 (20908) INFO : Table mytest.t1 (OID-3037) was not successfully loaded.  Rolling back.
Rollback flushing: 3038, 3039, 3040, 3041, 3042, 3043, 3044, 3045, 3046, 3047, 3048, 3049, 3050, 3051, 3052, 3053, 3054, 3055, 3056, 3057, 3058, 3059, 3060, 3061, 3062, 3063, 3064, 3065, 3066, 3067, 3068, 3069, 3070, 3071, 3072, 3073, 3074, 3075, 3076, 3077, 3078, 3079, 3080, 3081, 3082, 3083, 3084, 3085, 3086, 3087, 3088, 3089, 3090, 3091, 3092, 3093, 3094, 3095, 3096, 3097, 3098, 3099, 3100, 3101, 3102, 3103, 3104, 3105, 3106, 3107, 3108, 3109, 3110, 3111, 3112, 3113, 3114, 3115, 3116, 3117, 3118, 3119, 3120, 3121, 3122, 3123, 3124, 3125, 3126, 3127, 3128, 3129, 3130, 3131, 3132, 3133, 3134, 3135, 3136, 3137, 3138, 3139, 3140, 3141, 3142, 3143, 3144, 3145, 3146, 3147, 3148, 3149, 3150, 3151, 3152, 3153, 3154, 3155, 3156, 3157, 3158, 3159, 3160, 3161, 3162, 3163, 3164, 3165, 3166, 3167, 3168, 3169, 3170, 3171, 3172, 3173, 3174, 3175, 3176, 31



 Comments   
Comment by Sergey Zefirov [ 2022-03-04 ]

I used a recent 10.6 with a recent develop branch on CentOS 8.

The sequence of commands was:

# bash genData.sh
... some 500000 numbers
# echo "create database mytest;" | mysql
# mysql mytest <create_t1.sql
# bash cpimport.sh

The execution of `bash cpimport.sh` went well, without any problems: all 50 iterations of the cpimport invocation were successful.

Comment by Daniel Lee (Inactive) [ 2022-03-29 ]

This ticket was changed to "IN TESTING" more than 3 weeks ago but not assigned to me. Is this meant for QA to test? If so, please update the assignee so that it is on my radar. In any case, I will retest with the latest 6.3.1.

Comment by Daniel Lee (Inactive) [ 2022-03-30 ]

Build verified: 6.3.1-1 (#4101)

Retested both scenarios in 6.3.1-1; the reported issues are no longer occurring.

Generated at Thu Feb 08 02:54:25 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.