[MDEV-20371] Invalid reads at plan refinement stage: join->positions instead of best_positions Created: 2019-08-17  Updated: 2020-10-20  Resolved: 2019-09-11

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.2, 10.3, 10.4
Fix Version/s: 10.2.28, 10.3.19, 10.4.9

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-9968 Subquery not optimized when join is b... Closed
relates to MDEV-20364 Optimizer chooses wrong access access... Closed

 Description   

The semi-join optimizer has two phases:

  • When building the join order, advance_sj_state() is invoked after a table has been added to the join prefix.
  • When the join order is picked, fix_semijoin_strategies_for_picked_join_order() is called to make the final fixes in semi-join optimization

Both function share some code (e.g. calls optimize_wo_join_buffering()). The issue is:

  • the first phase operates on current join prefix: join->positions.
  • the second phase operates on the picked join order, join->best_positions.

but some code just refers to the join->positions. That array contains the last-considered query plan. (Sometimes, it's the same as the picked query plan. In general case, it is not).

This can lead to hard-to-catch bugs.

Let's apply this patch to make the issue visible:

diff --git a/sql/opt_subselect.cc b/sql/opt_subselect.cc
index 599642b3a26..f0e74157f9a 100644
--- a/sql/opt_subselect.cc
+++ b/sql/opt_subselect.cc
@@ -2786,6 +2786,18 @@ void advance_sj_state(JOIN *join, table_map remaining_tables, uint idx,
     &pos->dups_weedout_picker,
     NULL,
   };
+  
+  MEM_UNDEFINED(&pos->firstmatch_picker, sizeof(pos->firstmatch_picker));
+  MEM_UNDEFINED(&pos->loosescan_picker, sizeof(pos->loosescan_picker));
+  MEM_UNDEFINED(&pos->sjmat_picker, sizeof(pos->sjmat_picker));
+  MEM_UNDEFINED(&pos->dups_weedout_picker, sizeof(pos->dups_weedout_picker));
+
+  // The following classes use virtual functions so their memory needs to
+  // be properly initialized
+  new (&pos->firstmatch_picker) Firstmatch_picker;
+  new (&pos->loosescan_picker) LooseScan_picker;
+  new (&pos->sjmat_picker) Sj_materialization_picker;
+  new (&pos->dups_weedout_picker) Duplicate_weedout_picker;
 
   if (join->emb_sjm_nest)
   {
diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index ff07a7aea89..b01769209fe 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -8146,6 +8146,9 @@ choose_plan(JOIN *join, table_map join_tables)
     if (greedy_search(join, join_tables, search_depth, prune_level,
                       use_cond_selectivity))
       DBUG_RETURN(TRUE);
+    for (uint k=0; k < join->table_count; k++) {
+      MEM_UNDEFINED(&join->positions[k], sizeof(join->positions[k]));
+    }
   }
 
   /* 

Running

./mysql-test-run --valgrind --mem --parallel=6 --force main/subselect*test

Produces a lot of valgrind failures.
Example error

main.subselect_exists2in                 w2 [ fail ]  Found warnings/errors in server log file!
        Test ended at 2019-08-17 21:45:39
line
==20460== Thread 6:
==20460== Use of uninitialised value of size 8
==20460==    at 0x95378C: prev_record_reads(st_position*, unsigned int, unsigned long long) (sql_select.cc:9775)
==20460==    by 0x94C2DC: best_access_path(JOIN*, st_join_table*, unsigned long long, unsigned int, bool, double, st_position*, st_position*) (sql_select.cc:7303)
==20460==    by 0xAD1970: fix_semijoin_strategies_for_picked_join_order(JOIN*) (opt_subselect.cc:3773)
==20460==    by 0x95419A: JOIN::get_best_combination() (sql_select.cc:10163)
==20460==    by 0x93B8CF: JOIN::optimize_stage2() (sql_select.cc:2241)
==20460==    by 0x93B732: JOIN::optimize_inner() (sql_select.cc:2220)
==20460==    by 0x9391D9: JOIN::optimize() (sql_select.cc:1563)
==20460==    by 0x94417B: mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:4592)
==20460==    by 0x93463C: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:413)
==20460==    by 0x8FD029: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:6356)
==20460==    by 0x8F2D64: mysql_execute_command(THD*) (sql_parse.cc:3898)
==20460==    by 0x900DBE: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:7908)
==20460==    by 0x8ECF02: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1842)
==20460==    by 0x8EB647: do_command(THD*) (sql_parse.cc:1359)
==20460==    by 0xA6B307: do_handle_one_connection(CONNECT*) (sql_connect.cc:1404)
==20460==    by 0xA6B056: handle_one_connection (sql_connect.cc:1306)



 Comments   
Comment by Sergei Petrunia [ 2019-08-17 ]

Another example:

==20241== Thread 27:
==20241== Use of uninitialised value of size 8
==20241==    at 0xB348F4: JOIN::fix_all_splittings_in_plan() (opt_split.cc:1148)
==20241==    by 0x93B6B6: JOIN::optimize_inner() (sql_select.cc:2211)
==20241==    by 0x9391D9: JOIN::optimize() (sql_select.cc:1563)
==20241==    by 0x94417B: mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:4592)
==20241==    by 0x93463C: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:413)
==20241==    by 0x8FD029: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:6356)
==20241==    by 0x8F2D64: mysql_execute_command(THD*) (sql_parse.cc:3898)
==20241==    by 0x900DBE: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:7908)
==20241==    by 0x8ECF02: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1842)
==20241==    by 0x8EB647: do_command(THD*) (sql_parse.cc:1359)
==20241==    by 0xA6B307: do_handle_one_connection(CONNECT*) (sql_connect.cc:1404)
==20241==    by 0xA6B056: handle_one_connection (sql_connect.cc:1306)
==20241==    by 0x13EE80F: pfs_spawn_thread (pfs.cc:1862)
==20241==    by 0x662D6DA: start_thread (pthread_create.c:463)
==20241==    by 0x74A988E: clone (clone.S:95)

Comment by Sergei Petrunia [ 2019-08-18 ]

The initial valgrind patch is incorrect ( join->positions[0...const_tables] must not be marked as invalid), but the issue remains - Semi-join optimization code does read join->positions[...] from inside fix_semijoin_stategies_for_picked_join order.

Which is essentially reading random data, because there's no guarantee that last considered join prefix has anything to do with the plan in join->best_positions.

Comment by Sergei Petrunia [ 2019-08-30 ]

A clean patch to demonstrate the problem:

diff --git a/sql/opt_subselect.cc b/sql/opt_subselect.cc                                                                                                  
index 599642b3a26..1562306cf0a 100644                                                                                                                     
--- a/sql/opt_subselect.cc                                                                                                                                
+++ b/sql/opt_subselect.cc                                                                                                                                
@@ -2787,6 +2787,11 @@ void advance_sj_state(JOIN *join, table_map remaining_tables, uint idx,                                                            
     NULL,                                                                                                                                                
   };                                                                                                                                                     
                                                                                                                                                          
+  new (&pos->firstmatch_picker) Firstmatch_picker;                                                                                                       
+  new (&pos->loosescan_picker) LooseScan_picker;                                                                                                         
+  new (&pos->sjmat_picker) Sj_materialization_picker;                                                                                                    
+  new (&pos->dups_weedout_picker) Duplicate_weedout_picker;                                                                                              
+                                                                                                                                                         
   if (join->emb_sjm_nest)                                                                                                                                
   {
     /* 
diff --git a/sql/sql_select.cc b/sql/sql_select.cc
index ff07a7aea89..4d22a9a1ee7 100644
--- a/sql/sql_select.cc
+++ b/sql/sql_select.cc
@@ -5481,6 +5481,11 @@ make_join_statistics(JOIN *join, List<TABLE_LIST> &tables_list,
     {
       if (choose_plan(join, all_table_map & ~join->const_table_map))
         goto error;
+      // JOIN::positions holds the current query plan. We've already 
+      // made the plan choice, so we should only use JOIN::best_positions
+      for (uint k=join->const_tables; k < join->table_count; k++) {
+        MEM_UNDEFINED(&join->positions[k], sizeof(join->positions[k]));
+      }
     }
     else
     {

Comment by Sergei Petrunia [ 2019-08-31 ]

Applying the patch and running main/subselect*.test produces a lot of failures
that look like so:

==27345== Use of uninitialised value of size 8
==27345==    at 0x9537E8: prev_record_reads(st_position*, unsigned int, unsigned long long) (sql_select.cc:9777)
==27345==    by 0x94C408: best_access_path(JOIN*, st_join_table*, unsigned long long, unsigned int, bool, double, st_position*, st_position*) (sql_select.cc:7308)
==27345==    by 0xAD16CE: fix_semijoin_strategies_for_picked_join_order(JOIN*) (opt_subselect.cc:3766)
==27345==    by 0x9541F6: JOIN::get_best_combination() (sql_select.cc:10165)
==27345==    by 0x93B94F: JOIN::optimize_stage2() (sql_select.cc:2241)
==27345==    by 0x93B7B2: JOIN::optimize_inner() (sql_select.cc:2220)
==27345==    by 0x939259: JOIN::optimize() (sql_select.cc:1563)
==27345==    by 0x9441FB: mysql_select(THD*, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) (sql_select.cc:4592)
==27345==    by 0x9346BC: handle_select(THD*, LEX*, select_result*, unsigned long) (sql_select.cc:413)
==27345==    by 0x8FD0A9: execute_sqlcom_select(THD*, TABLE_LIST*) (sql_parse.cc:6356)
==27345==    by 0x8F2DE4: mysql_execute_command(THD*) (sql_parse.cc:3898)
==27345==    by 0x900E3E: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:7908)
==27345==    by 0x8ECF82: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1842)
==27345==    by 0x8EB6C7: do_command(THD*) (sql_parse.cc:1359)
==27345==    by 0xA6B363: do_handle_one_connection(CONNECT*) (sql_connect.cc:1404)
==27345==    by 0xA6B0B2: handle_one_connection (sql_connect.cc:1306)

The top 2-3 frames vary slightly, but all stack traces include
fix_semijoin_strategies_for_picked_join_order(), at these locations:

  • (opt_subselect.cc:3726)
  • (opt_subselect.cc:3766)
  • (opt_subselect.cc:3804)

These locations are:

(opt_subselect.cc:3726)
This is best_access_path() call inside if (pos->sj_strategy == SJ_OPT_MATERIALIZE_SCAN)

(opt_subselect.cc:3766)
This is best_access_path() call inside if (pos->sj_strategy == SJ_OPT_FIRST_MATCH)

(opt_subselect.cc:3804)
This is best_access_path() call inside if (pos->sj_strategy == SJ_OPT_LOOSE_SCAN)

Comment by Sergei Petrunia [ 2019-09-10 ]

https://github.com/mariadb/server/commits/bb-10.2-mdev20371

Generated at Thu Feb 08 08:58:57 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.