|
Interesting is, that with adding a limit, it works fine, so:
SELECT distinct col2 FROM spider_tab WHERE col2="00999" LIMIT 9999999;
|
|
|
The bug is reproducible on 10.4.20, 10.5.10, but not on 10.3.30. This is possibly regression.
|
|
Reproducible on 10.4.7, not reproducible on 10.4.6.
|
|
I created a simpler test case, which fails on 10.4 HEAD (78735dcaf757cd71c8f0ff3d21071b0f89018150) and 10.5 HEAD (35294053b281294bb49a6217da07b93eb6c2595b).
CREATE TABLE IF NOT EXISTS `spider_tab` (
|
`col1`int,
|
`col2`int,
|
PRIMARY KEY (`col1`)
|
) ENGINE=SPIDER DEFAULT CHARSET=utf8 COMMENT='table "spider_data_tab"'
|
PARTITION BY LIST COLUMNS(`col1`)
|
(PARTITION `pt1` DEFAULT COMMENT = 'srv "data1"' ENGINE = SPIDER);
|
|
CREATE TABLE IF NOT EXISTS `spider_data_tab` (
|
`col1` int,
|
`col2` int,
|
PRIMARY KEY (`col1`)
|
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
|
|
insert into `spider_data_tab` VALUES (1,999), (2,999);
|
|
SELECT distinct col2 FROM spider_tab WHERE col2=999;
|
SELECT distinct col2 FROM spider_data_tab WHERE col2=999;
|
|
|
OK: fa7051c419a653a002282eaca60f8202c1066d61
not OK: 9d6b601e797dd8333340dadaefae09ebafc787db
The bug is introduced by the commit 9d6b601e797dd8333340dadaefae09ebafc787db.
commit 9d6b601e797dd8333340dadaefae09ebafc787db (HEAD)
|
Author: Kentoku SHIBA <kentokushiba@gmail.com>
|
Date: Sat Jul 6 23:54:53 2019 +0900
|
|
MDEV-19866 With a Spider table, a SELECT with WHERE involving primary key breaks following SELECTs (#1356)
|
|
Change checking scanning partitions from part_spec to part_info->read_partitions
|
|
|
At first glance, 9d6b601 does not that related to the handling of the DISTINCT clause. I need to look into it further.
|
|
I compared debug traces of OK commit and NOT OK commit. The diff of 9d6b601 is not so large but difference between debug traces are significant.
OK:
vagrant@bld:~/repo
|
> cat OK_trace | grep distinct | grep -v 'distinct: 0'
|
T@20 : | | query: SELECT distinct b FROM tbl_a WHERE b=999
|
T@20 : | | | | | | | info: Query: SELECT distinct b FROM tbl_a WHERE b=999
|
T@20 : | | | | | | | | | | info: Query: SELECT distinct b FROM tbl_a WHERE b=999
|
T@20 : | | | | | | | | | | | | | | | | info: spider with distinct
|
T@20 : | | | | | | | | | | | | | | | | info: spider query=select distinct `b` from `auto_test_remote`.`tbl_a` where (`b` = 999)
|
T@20 : | | | | | | | | | | | | | | | | | | query: Query = 'select distinct `b` from `auto_test_remote`.`tbl_a` where (`b` = 999)'
|
NOT OK:
vagrant@bld:~/repo
|
> cat NOT_OK_trace | grep distinct | grep -v 'distinct: 0'
|
T@20 : | | query: SELECT distinct b FROM tbl_a WHERE b=999
|
T@20 : | | | | | | | info: Query: SELECT distinct b FROM tbl_a WHERE b=999
|
T@20 : | | | | | | | | | | info: Query: SELECT distinct b FROM tbl_a WHERE b=999
|
|
|
OK:
T@20 : | | | | | | | >do_select
|
T@20 : | | | | | | | | >sub_select
|
T@20 : | | | | | | | | | >join_tab_execution_startup
|
T@20 : | | | | | | | | | <join_tab_execution_startup
|
T@20 : | | | | | | | | | >init_read_record
|
...
|
T@20 : | | | | | | | | | >handler::ha_rnd_next
|
T@20 : | | | | | | | | | | >ha_partition::rnd_next
|
...
|
T@20 : | | | | | | | | | | | | | | >ha_spider::check_direct_order_limit
|
NOT OK:
T@20 : | | | | | | | >do_select
|
T@20 : | | | | | | | | >Pushdown_query::execute
|
T@20 : | | | | | | | | | >spider_group_by_handler::init_scan
|
T@20 : | | | | | | | | | | info: spider field_name=b
|
T@20 : | | | | | | | | | | >ha_spider::get_top_table
|
T@20 : | | | | | | | | | | <ha_spider::get_top_table
|
T@20 : | | | | | | | | | | >spider_get_parent_table_list
|
|
|
9d6b601 changes spider_create_group_by_handler(),
OK:
T@20 : | | | | | | | | | >spider_create_group_by_handler
|
T@20 : | | | | | | | | | | info: spider from=0x7f7654012778
|
T@20 : | | | | | | | | | | info: spider partition handler
|
T@20 : | | | | | | | | | | info: spider part_spec->start_part=4294967295
|
T@20 : | | | | | | | | | | info: spider part_spec->end_part=4294967295
|
T@20 : | | | | | | | | | | info: spider using multiple partitions is not supported by this feature yet
|
T@20 : | | | | | | | | | <spider_create_group_by_handler
|
NOT OK:
T@20 : | | | | | | | | | >spider_create_group_by_handler
|
T@20 : | | | | | | | | | | info: spider from=0x7f8388012778
|
T@20 : | | | | | | | | | | info: spider partition handler
|
T@20 : | | | | | | | | | | info: spider bits=1
|
T@20 : | | | | | | | | | | >spider_mysql_support_direct_join
|
T@20 : | | | | | | | | | | <spider_mysql_support_direct_join
|
T@20 : | | | | | | | | | | >spider_param_skip_default_condition
|
T@20 : | | | | | | | | | | | >intern_sys_var_ptr
|
T@20 : | | | | | | | | | | | <intern_sys_var_ptr
|
T@20 : | | | | | | | | | | <spider_param_skip_default_condition
|
T@20 : | | | | | | | | | | >spider_fields::spider_fields
|
T@20 : | | | | | | | | | | | info: spider this=0x7f838813aae0
|
T@20 : | | | | | | | | | | <spider_fields::spider_fields
|
T@20 : | | | | | | | | | | info: spider select item=0x7f8388012620
|
T@20 : | | | | | | | | | | >spider_db_print_item_type
|
T@20 : | | | | | | | | | | | info: spider COND type=0
|
...
|
|
- spider_group_by_handler::init_scan() generates a query for a data node.
- The function adds DISTINCT if and only if spider_group_by_handler::query::distinct is TRUE.
- spider_group_by_handler::query::distinct is set to the value of JOIN::select_distinct in JOIN::make_aggr_tables_info().
- In the test case, DISTINCT is not added because JOIN::select_distinct is FALSE at the call of JOIN::make_aggr_tables_info().
|
|
Even though JOIN::select_distinct is FALSE at the call of JOIN::make_aggr_tables_info(), there seems to be no problem for most storage engines that do not create group by handlers (e.g., InnoDB). However, this is a big problem for the Spider SE.
|
|
This might be an ugly fix but at least it works.
diff --git a/storage/spider/spd_group_by_handler.cc b/storage/spider/spd_group_by_handler.cc
|
index de041897239..b91732441db 100644
|
--- a/storage/spider/spd_group_by_handler.cc
|
+++ b/storage/spider/spd_group_by_handler.cc
|
@@ -1302,7 +1302,11 @@ int spider_group_by_handler::init_scan()
|
while ((dbton_id = fields->get_next_dbton_id()) < SPIDER_DBTON_SIZE)
|
{
|
dbton_hdl = spider->dbton_handler[dbton_id];
|
- result_list->direct_distinct = query.distinct;
|
+ if (select_lex && (select_lex->options & SELECT_DISTINCT))
|
+ {
|
+ DBUG_PRINT("info",("spider with distinct"));
|
+ result_list->direct_distinct = TRUE;
|
+ }
|
fields->set_pos_to_first_field_chain();
|
if ((error_num = dbton_hdl->reset_sql(SPIDER_SQL_TYPE_SELECT_SQL)))
|
{
|
|
|
The above fix is just to cover the wrong handling (at least wrong for federated storage engines) of JOIN::select_distinct in the MariaDB Server. It would be better to fix the server itself.
|
|
I dug the code of 10.4 HEAD (fe9a0bf). There are a few assignments to JOIN::select_distinct and most of them substitute FALSE to the variable. The variable can be set to TRUE only by JOIN::init()
void init(THD *thd_arg, List<Item> &fields_arg, ulonglong select_options_arg,
|
select_result *result_arg)
|
{
|
...
|
select_distinct= MY_TEST(select_options & SELECT_DISTINCT);
|
...
|
}
|
|
|
In fact, JOIN::select_distinct is TRUE at JOIN::init(). The variable is set to FALSE in JOIN::optimize_stage2().
(gdb) watch -location select_distinct
|
Hardware watchpoint 2: -location select_distinct
|
(gdb) c
|
Continuing.
|
...
|
Thread 27 "mysqld" hit Hardware watchpoint 2: -location select_distinct
|
|
Old value = true
|
New value = false
|
JOIN::optimize_stage2 (this=0x7fb798016258) at /home/vagrant/repo/mariadb-server/sql/sql_select.cc:2737
|
2735 /* Change DISTINCT to GROUP BY */
|
2736 select_distinct= 0;
|
2737 no_order= !order;
|
2738 if (all_order_fields_used
|
|
|
I think I finally understand the problem. The function JOIN::optimize_stage2() optimizes away DISTINCT (i.e., sets select_distinct = 0) and sets no_order = 1 but nullifies group_list (sql_select.cc:2765). This could result in a wrong handling of DISTINCT clause in group_by_handlers. So, a possible fix would be the following:
diff --git a/sql/sql_select.cc b/sql/sql_select.cc
|
index d8116b20cbe..72b5be0f63e 100644
|
--- a/sql/sql_select.cc
|
+++ b/sql/sql_select.cc
|
@@ -3275,7 +3275,7 @@ bool JOIN::make_aggr_tables_info()
|
if (ht && ht->create_group_by)
|
{
|
/* Check if the storage engine can intercept the query */
|
- Query query= {&all_fields, select_distinct, tables_list, conds,
|
+ Query query= {&all_fields, (select_distinct || (no_order && group_optimized_away)), tables_list, conds,
|
group_list, order ? order : group_list, having};
|
group_by_handler *gbh= ht->create_group_by(thd, &query);
|
|
|
serg I pushed my fix. Please review: https://github.com/MariaDB/server/commit/5e16a49f5d95488bf3297cd51a1a0772612a2e35
|
|
serg I replied to the last email from you. Please take another look at https://github.com/MariaDB/server/commit/ee59187906d87a71106080d1c5728f32cdf59382
|