[MDEV-18994] Fix the bug encountered when the size of (v1,v2,v3...) value list in “select * from tb where id in (v1, v2, v3...)” is greater than 1000 Created: 2019-03-21  Updated: 2022-04-25

Status: Stalled
Project: MariaDB Server
Component/s: Storage Engine - ColumnStore, Storage Engine - Connect, Storage Engine - Spider
Affects Version/s: 10.4, 10.5, 10.6
Fix Version/s: 10.4, 10.5, 10.6

Type: Bug Priority: Major
Reporter: willhan Assignee: Sergei Golubchik
Resolution: Unresolved Votes: 0
Labels: None


 Description   

Fix the bug encountered when the size of (v1,v2,v3...) value list in “select * from tb where id in (v1, v2, v3...)” is greater than 1000 (IN_SUBQUERY_CONVERSION_THRESHOLD‘s default value)
Set IN_SUBQUERY_CONVERSION_THRESHOLD's to uint_max to disable this optimization on spider



 Comments   
Comment by Kentoku Shiba (Inactive) [ 2019-05-02 ]

Some engines (like Spider) need to avoid this optimization. Would you please review the following commit especially about Item_cache_row::result_type()?
e91d075

Comment by willhan [ 2019-05-11 ]

kentoku
It's ok

Comment by Alexander Barkov [ 2019-05-13 ]

Kentoku, please don't change Item::result_type() to virtual.

Instead, you need to set type_handler_row as a handler for Item_cache_row in its constructor, like this:

  Item_cache_row(THD *thd):
    Item_cache(thd, &type_handler_row), values(0), item_count(2),
    save_array(0) {}

Comment by Alexander Barkov [ 2019-05-13 ]

Please also remove the constructor which get unused after changing the code:

--- a/sql/item.h
+++ b/sql/item.h
@@ -6442,17 +6442,6 @@ class Item_cache: public Item,
   bool value_cached;
 
   table_map used_table_map;
-public:
-  Item_cache(THD *thd):
-    Item(thd),
-    Type_handler_hybrid_field_type(&type_handler_string),
-    example(0), cached_field(0),
-    value_cached(0),
-    used_table_map(0)
-  {
-    maybe_null= 1;
-    null_value= 1;
-  }
 protected:
   Item_cache(THD *thd, const Type_handler *handler):
     Item(thd),
@@ -6883,7 +6872,7 @@ class Item_cache_row: public Item_cache
   bool save_array;
 public:
   Item_cache_row(THD *thd):
-    Item_cache(thd), values(0), item_count(2),
+    Item_cache(thd, &type_handler_row), values(0), item_count(2),
     save_array(0) {}
   
   /*

Comment by Kentoku Shiba (Inactive) [ 2019-05-23 ]

bar
I just fixed about above and pushed. Would you please review this again?
47e02ca

Comment by Sergei Golubchik [ 2022-01-08 ]

I think, ideally in→subquery should be happening after engine's cond_push() had change to accept the unmodified condition.

The problem is that cond_push() is done rather late, in make_join_select() after make_cond_for_table() which can only be done after the join order if fixed.

While in→subquery is done rather early, still in JOIN::prepare(). And anyway, even if it can be done later, it cannot be done as late as after make_cond_for_table().

Comment by Sergei Golubchik [ 2022-01-08 ]

The patch is not optimal, it's of course better than what columnstore does, but still not a proper solution.

Comment by Sergei Golubchik [ 2022-01-13 ]

it seems that make_cond_for_table() for cond_push() extracts the condition for one table only, so it's independent from the join order and can be done before creating a join order.

so may be it's possible to move cond_push() up and also move in→subquery down until they swap

Generated at Thu Feb 08 08:48:17 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.