[MCOL-4493] Wrong query results are produced when impossible conditions are used in ON join predicates Created: 2021-01-12  Updated: 2021-02-05  Resolved: 2021-02-05

Status: Closed
Project: MariaDB ColumnStore
Component/s: Columnstore Select Handler
Affects Version/s: 5.5.1
Fix Version/s: 5.5.2

Type: Bug Priority: Blocker
Reporter: Todd Stoffel (Inactive) Assignee: Daniel Lee (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Sprint: 2021-1, 2021-2

 Description   

Queries involving joins and aggregation could return incorrect results.

A work around is to

SET columnstore_select_handler = OFF;

However, this could lead to performance degradation. Below is a reproducible example:

drop schema if exists poc;
create schema poc;
use poc;
drop table if exists holiday_config;
create table holiday_config(
	region_id varchar(32) NOT NULL,
	metric_name varchar(64) NOT NULL,
	holiday_type varchar(32) NOT NULL
)ENGINE=innodb DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
drop table if exists surgery_holiday;
create table surgery_holiday(
	region_id varchar(32) NOT NULL,
	holiday_date date NOT NULL,
	holiday_type varchar(32) NOT NULL
)ENGINE=innodb DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
drop table if exists location_master;
create table location_master(
	region_id varchar(32) NOT NULL,
	location_id varchar(32) NOT NULL
)ENGINE=innodb DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
drop table if exists room_master;
create table room_master(
	room_id varchar(32) NOT NULL,
	location_id varchar(32) NOT NULL
)ENGINE=innodb DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
drop table if exists orcase_essentials;
CREATE TABLE orcase_essentials (
  surgery_date date NOT NULL,
  location varchar(32) DEFAULT NULL,
  room_id varchar(32) DEFAULT NULL
) ENGINE=innodb DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
drop table if exists orcase_essentials_cs;
CREATE TABLE orcase_essentials_cs (
  surgery_date date NOT NULL,
  location varchar(32) DEFAULT NULL,
  room_id varchar(32) DEFAULT NULL
) ENGINE=columnstore DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
truncate table holiday_config;
truncate table surgery_holiday;
truncate table location_master;
truncate table room_master;
truncate table orcase_essentials;
truncate table orcase_essentials_cs;
INSERT INTO holiday_config (region_id,metric_name,holiday_type) VALUES 
('R1','ADD_ON_CASE_RATIO','REGULAR')
,('R1','BLOCK_UTILIZATION','COVID')
,('R1','BLOCK_UTILIZATION','REGULAR')
,('R1','FIRST_CASE_DELAY_MINUTES','REGULAR')
,('R1','FIRST_CASE_ON_TIME_PERCENT','REGULAR')
,('R1','ROOM_UTILIZATION','COVID')
,('R1','ROOM_UTILIZATION','REGULAR')
,('R1','STAFFED_ROOM_UTILIZATION','COVID')
,('R1','STAFFED_ROOM_UTILIZATION','REGULAR')
,('R1','TURNOVER_MINUTES','REGULAR')
;
INSERT INTO location_master (region_id,location_id) VALUES 
('R1','Location_X')
;
INSERT INTO room_master (room_id,location_id) VALUES 
('room_1','Location_X')
;
INSERT INTO surgery_holiday (region_id,holiday_date,holiday_type) VALUES 
('R1','2020-05-07','COVID')
;
INSERT INTO orcase_essentials (surgery_date,location,room_id) VALUES 
('2020-05-07','Location_X','room_1')
,('2020-05-07','Location_X','room_1')
,('2020-05-07','Location_X','room_1')
;
INSERT INTO orcase_essentials_cs (surgery_date,location,room_id) VALUES 
('2020-05-07','Location_X','room_1')
,('2020-05-07','Location_X','room_1')
,('2020-05-07','Location_X','room_1')
;

This produces expected results:

SELECT *
  FROM orcase_essentials _e
  INNER JOIN location_master _lm 
    ON _e.location = _lm.location_id
  LEFT JOIN (
    SELECT DISTINCT rm.room_id, _sh.holiday_date
    FROM surgery_holiday _sh
    INNER JOIN holiday_config _hc 
      ON _sh.region_id = _hc.region_id 
    AND _hc.metric_name = 'CASE_LENGTH_ACCURACY' 
    AND _sh.holiday_type = _hc.holiday_type
    INNER JOIN location_master lm 
      ON _sh.region_id = lm.region_id
    INNER JOIN room_master rm
      ON lm.location_id = rm.location_id
    ) _h
  ON _e.room_id = _h.room_id 
  AND _e.surgery_date = _h.holiday_date
  WHERE _e.surgery_date = '2020-05-07' 
;

+--------------+------------+---------+-----------+-------------+---------+--------------+
| surgery_date | location   | room_id | region_id | location_id | room_id | holiday_date |
+--------------+------------+---------+-----------+-------------+---------+--------------+
| 2020-05-07   | Location_X | room_1  | R1        | Location_X  | NULL    | NULL         |
| 2020-05-07   | Location_X | room_1  | R1        | Location_X  | NULL    | NULL         |
| 2020-05-07   | Location_X | room_1  | R1        | Location_X  | NULL    | NULL         |
+--------------+------------+---------+-----------+-------------+---------+--------------+

This version does not:

SELECT *
  FROM orcase_essentials_cs _e
  INNER JOIN location_master _lm 
    ON _e.location = _lm.location_id
  LEFT JOIN (
    SELECT DISTINCT rm.room_id, _sh.holiday_date
    FROM surgery_holiday _sh
    INNER JOIN holiday_config _hc 
      ON _sh.region_id = _hc.region_id 
    AND _hc.metric_name = 'CASE_LENGTH_ACCURACY' 
    AND _sh.holiday_type = _hc.holiday_type
    INNER JOIN location_master lm 
      ON _sh.region_id = lm.region_id
    INNER JOIN room_master rm
      ON lm.location_id = rm.location_id
    ) _h
  ON _e.room_id = _h.room_id 
  AND _e.surgery_date = _h.holiday_date
  WHERE _e.surgery_date = '2020-05-07' 
;

+--------------+------------+---------+-----------+-------------+---------+--------------+
| surgery_date | location   | room_id | region_id | location_id | room_id | holiday_date |
+--------------+------------+---------+-----------+-------------+---------+--------------+
| 2020-05-07   | Location_X | room_1  | R1        | Location_X  | room_1  | 2020-05-07   |
| 2020-05-07   | Location_X | room_1  | R1        | Location_X  | room_1  | 2020-05-07   |
| 2020-05-07   | Location_X | room_1  | R1        | Location_X  | room_1  | 2020-05-07   |
+--------------+------------+---------+-----------+-------------+---------+--------------+



 Comments   
Comment by Gagan Goel (Inactive) [ 2021-02-01 ]

For QA: Queries to verify the fix are in the issue description.

Comment by Daniel Lee (Inactive) [ 2021-02-05 ]

Build verified: 5.5.2-1 (Drone #1627)

The mentioned query is now working as expected.

Generated at Thu Feb 08 02:50:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.