[MCOL-4741] in/like/equal(=) returns different result Created: 2021-06-02  Updated: 2021-09-03  Resolved: 2021-07-06

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: 5.5.2
Fix Version/s: 6.1.1

Type: Bug Priority: Blocker
Reporter: Allen Lee (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 1
Labels: None

Attachments: File client.cnf     File columnstore.cnf     File mariadb-enterprise.cnf     File mysql-clients.cnf     File repro.sql     File repro.sql     File server.cnf     File z-custom-my.cnf    
Sprint: 2021-8, 2021-9

 Description   

User reported that they are getting the different result when using in or like. Here is the example and table definition.

MariaDB > SELECT Call_Date,Hub_Site,Site_Code,Call_Direction FROM tb1 WHERE Site_Code IN ('stp') ORDER BY Call_Date DESC LIMIT 1\G;
*************************** 1. row ***************************
Call_Date: 2021-05-12 16:44:56
Hub_Site: _ai
Site_Code: stp
Call_Direction: Outbound
 
MariaDB > SELECT Call_Date,Hub_Site,Site_Code,Call_Direction FROM tb1 WHERE Site_Code LIKE 'stp' ORDER BY Call_Date DESC LIMIT 1\G;
*************************** 1. row ***************************
Call_Date: 2021-05-26 16:48:36
Hub_Site: _ai
Site_Code: stp
Call_Direction: Outbound
 
Create Table: CREATE TABLE `tb1` (
  `CallID` bigint(25) NOT NULL,
  `pkid` varchar(120) NOT NULL DEFAULT ' ',
  `Hub_Site` varchar(5) DEFAULT '',
  `Site_Code` varchar(5) DEFAULT '',
  `Call_Direction` varchar(25) DEFAULT '',
  `Call_Date` datetime DEFAULT NULL
) ENGINE=Columnstore DEFAULT CHARSET=latin1

When checking length of that

MariaDB> SELECT MIN(lengthb(Site_Code)), MAX(lengthb(Site_Code)) FROM tb1\G;
*************************** 1. row ***************************
MIN(lengthb(Site_Code)): 3
MAX(lengthb(Site_Code)): 3
1 row in set (0.504 sec)

User created the same table and imported data once again, but it did happen sometime after.
This also happened another table as below. FYI, tb2 table has the same structure as tb1 above.

MariaDB > SELECT Call_Date,Site_Code,code_dial FROM tb2 WHERE code_dial = '9' ORDER BY Call_Date DESC LIMIT 5\G;
*************************** 1. row ***************************
Call_Date: 2021-05-15 14:51:18
Site_Code: ahk
code_dial: 9
 
MariaDB > SELECT Call_Date,Site_Code,code_dial FROM tb2 WHERE code_dial LIKE '9' ORDER BY Call_Date DESC LIMIT 5\G;
*************************** 1. row ***************************
Call_Date: 2021-06-01 23:49:56
Site_Code: asc
code_dial: 9



 Comments   
Comment by Sergey Zefirov [ 2021-06-18 ]

I've run this on 10.6.

The reproduction script is this:

DROP DATABASE IF EXISTS no_results_bug;
CREATE DATABASE no_results_bug;
 
USE no_results_bug;
 
CREATE TABLE `avaya` (
`CallID` int(11) NOT NULL,
`callDate` datetime DEFAULT NULL,
`time` int(11) DEFAULT NULL,
`sec_dur` int(11) NOT NULL,
`cond_code` varchar(1) DEFAULT '',
`code_dial` varchar(4) DEFAULT '',
`code_used` varchar(4) DEFAULT '',
`dialed_num` varchar(23) DEFAULT '',
`calling_num` varchar(15) DEFAULT '',
`acct_code` varchar(15) DEFAULT '',
`auth_code` varchar(13) DEFAULT '',
`vdn` varchar(13) DEFAULT '',
`location_to` varchar(3) DEFAULT '',
`location_from` varchar(3) DEFAULT '',
`in_trk_code` varchar(4) DEFAULT '',
`frl` varchar(1) DEFAULT '',
`in_crt_id` varchar(3) DEFAULT '',
`out_crt_id` varchar(3) DEFAULT '',
`feat_flag` varchar(1) DEFAULT '',
`attd_console` varchar(2) DEFAULT '',
`node_num` varchar(2) DEFAULT '',
`ins` varchar(3) DEFAULT '',
`inx` varchar(3) DEFAULT '',
`ixc_code` varchar(3) DEFAULT '',
`bcc` varchar(1) DEFAULT '',
`ma_uui` varchar(1) DEFAULT '',
`res_flag` varchar(1) DEFAULT '',
`tsc_ct` varchar(4) DEFAULT '',
`packet_ct` varchar(4) DEFAULT '',
`tsc_flag` varchar(4) DEFAULT '',
`Hub_Site` varchar(5) DEFAULT '',
`Site_Code` varchar(5) DEFAULT '',
`Location_Code` varchar(55) DEFAULT '',
`Call_Type` varchar(25) DEFAULT '',
`Call_Direction` varchar(25) DEFAULT '',
`City` varchar(55) DEFAULT '',
`State` varchar(55) DEFAULT '',
`Country_Code` varchar(5) DEFAULT '',
`NPA` varchar(5) DEFAULT '',
`NXX` varchar(5) DEFAULT '',
`Tax` double(13,4) DEFAULT NULL,
`Cost` double(13,4) DEFAULT NULL,
`Call_Date` datetime DEFAULT NULL,
`Call_Time` time DEFAULT NULL,
`Incoming_Trunk` varchar(255) DEFAULT '',
`Outgoing_Trunk` varchar(255) DEFAULT '',
`Dialed_Digits` varchar(255) DEFAULT '',
`ANI_Digits` varchar(255) DEFAULT '',
`Call_Duration` int(11) DEFAULT NULL,
`Extension` varchar(255) DEFAULT '',
`Site_Timezone` varchar(15) DEFAULT ''
) ENGINE=Columnstore DEFAULT CHARSET=latin1;
 
LOAD DATA LOCAL infile '423_cdr_avaya.txt' INTO TABLE `avaya`;
SELECT COUNT(*) FROM `avaya` WHERE Site_Code='kch';
 
LOAD DATA LOCAL infile '423_avaya.problem' INTO TABLE `avaya`;
SELECT COUNT(*) FROM `avaya` WHERE Site_Code='kch';

The results of running "mysql <repro.sql" are as follows:

# mysql <repro.sql 
COUNT(*)
3465599
COUNT(*)
3232948

As you can see, after second cpimport run, the number of rows returned is smaller than after first run. We have lost some records.

Comment by Sergey Zefirov [ 2021-06-22 ]

Relevant PR: https://github.com/mariadb-corporation/mariadb-columnstore-engine/pull/1999

Comment by Sergey Zefirov [ 2021-06-22 ]

I've made a PR and asked tntnatbry to review it/ Thus, formally, this ticket goes to review.

Comment by Gagan Goel (Inactive) [ 2021-07-05 ]

sergey.zefirov Can you please share with Daniel the data and queries to reproduce the original customer issue and confirm the fix works?

Comment by Daniel Lee (Inactive) [ 2021-07-06 ]

Build verified: 6.1.1 (#2727)

Generated at Thu Feb 08 02:52:38 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.