[MDEV-30142] rowid_filter: negative access cost factor values Created: 2022-12-01  Updated: 2022-12-13  Resolved: 2022-12-13

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: None
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Duplicate Votes: 1
Labels: rowid_filtering

Issue Links:
PartOf
is part of MDEV-30218 Incorrect optimization for rowid_filt... Closed
Problem/Incident
is caused by MDEV-28846 Poor performance when rowid filter co... Closed

 Description   

The issue appears after this commit (further "The Commit"):

commit 58cd0bd59ef011be54f162237f2ff017c3148e7b (HEAD -> 10.4)
Author: Igor Babaev <igor@askmonty.org>
Date:   Mon Oct 17 16:44:10 2022 -0700
 
    MDEV-28846 Poor performance when rowid filter contains no elements
    
    When a range rowid filter was used with an index ref access the cost of
    accessing the index entries for the records rejected by the filter was not
    taken into account. For a ref access by an index with big average number
    of records per key this led to poor execution plans if selectivity of the
    used filter was high.

Consider this patch:

diff --git a/sql/rowid_filter.cc b/sql/rowid_filter.cc
index 75f2ab71ecd..e0673640deb 100644
--- a/sql/rowid_filter.cc
+++ b/sql/rowid_filter.cc
@@ -468,6 +468,9 @@ TABLE::best_range_rowid_filter_for_partial_join(uint access_key_no,
                                                 double records,
                                                 double access_cost_factor)
 {
+  if (access_cost_factor < 0.0)
+    fprintf(stderr, "access_cost_factor=%g query:%s\n", access_cost_factor,
+    current_thd->query());
   if (range_rowid_filter_cost_info_elems == 0 ||
       covering_keys.is_set(access_key_no))
     return 0;

If I apply that and run

./mysql-test-run main/*filter*test
grep -rn access_cost_factor var/

grep produces no output for the tree before The Commit.

After The Commit, grep produces a lot of output:

var/log/bootstrap.log:70:access_cost_factor=-0.00954473 query:SET @broken_routines = (select count(*) from mysql.proc where db='performance_schema');
var/log/mysqld.1.err:51:access_cost_factor=-0.00760976 query:select 1
var/log/mysqld.1.err:57:access_cost_factor=-0.00214634 query:select 1
var/log/mysqld.1.err:63:access_cost_factor=-0.392596 query:select 1
var/log/mysqld.1.err:69:access_cost_factor=-0.00760976 query:explain extended select 1
var/log/mysqld.1.err:75:access_cost_factor=-0.00214634 query:explain extended select 1
var/log/mysqld.1.err:81:access_cost_factor=-0.392596 query:explain extended select 1
var/log/mysqld.1.err:87:access_cost_factor=-0.00214634 query:SELECT * FROM t1 HAVING (7, 9) IN (SELECT t2.i1, t2.i2 FROM t2 WHERE t2.i1 = 3)
var/log/mysqld.1.err:89:access_cost_factor=-0.00214634 query:EXPLAIN EXTENDED
var/log/mysqld.1.err:91:access_cost_factor=-0.00214634 query:EXPLAIN EXTENDED
var/log/mysqld.1.err:93:access_cost_factor=-0.301951 query:SELECT * FROM t1 INNER JOIN t2 ON ( pk1+1 = pk2+2 AND a1 = a2 )
var/log/mysqld.1.err:95:access_cost_factor=-0.0263959 query:SELECT * FROM t1 INNER JOIN t2 ON ( pk1+1 = pk2+2 AND a1 = a2 )
var/log/mysqld.1.err:97:access_cost_factor=-0.301951 query:EXPLAIN EXTENDED SELECT * FROM t1 INNER JOIN t2 ON ( pk1+1 = pk2+2 AND a1 = a2 )
var/log/mysqld.1.err:99:access_cost_factor=-0.0263959 query:EXPLAIN EXTENDED SELECT * FROM t1 INNER JOIN t2 ON ( pk1+1 = pk2+2 AND a1 = a2 )



 Comments   
Comment by Sergei Petrunia [ 2022-12-01 ]

Suggested fix (Monty has other suggestions):

diff --git a/sql/rowid_filter.cc b/sql/rowid_filter.cc
index 75f2ab71ecd..f616881daef 100644
--- a/sql/rowid_filter.cc
+++ b/sql/rowid_filter.cc
@@ -468,6 +468,8 @@ TABLE::best_range_rowid_filter_for_partial_join(uint access_key_no,
                                                 double records,
                                                 double access_cost_factor)
 {
+  if (access_cost_factor < 0.0)
+    return 0;
 
   if (range_rowid_filter_cost_info_elems == 0 ||
       covering_keys.is_set(access_key_no))
     return 0;

The idea is:
if we got access_cost_factor<0.0, this means index-only scan was more expensive than non-index only. Well, don't apply rowid filter in this case.

Comment by Sergei Petrunia [ 2022-12-09 ]

Input from igor:

@spetrunia
: what's the problem with negative cost_access_factor? yes,  the access gain is negative 
with such factor, but still there is gain with less condition checking.

Comment by Sergei Petrunia [ 2022-12-09 ]

But the comments in the code say otherwise:

/*
 ...
   @note
     Currently we require that access_cost_factor should be a number between
     0.0 and 1.0
*/
 
inline
double Range_rowid_filter_cost_info::avg_adjusted_gain_per_row(
                                         double access_cost_factor)

Comment by Sergei Petrunia [ 2022-12-09 ]

If I take the tree with the tip cset

commit 58cd0bd59ef011be54f162237f2ff017c3148e7b (HEAD -> 10.4)
Author: Igor Babaev <igor@askmonty.org>
Date:   Mon Oct 17 16:44:10 2022 -0700
 
    MDEV-28846 Poor performance when rowid filter contains no elements

and add code to Range_rowid_filter_cost_info::avg_adjusted_gain_per_row that print access_cost_factor when it is negative, I get plenty of printouts when running mtr --suite=main.

Comment by Sergei Petrunia [ 2022-12-13 ]

Addressed in MDEV-30218.

Generated at Thu Feb 08 10:13:58 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.