[MDEV-27230] Estimation for filtered rows less precise with JSON histogram comparing to DOUBLE_PREC (#6) Created: 2021-12-11  Updated: 2022-01-19  Resolved: 2021-12-13

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: N/A
Fix Version/s: 10.8.0

Type: Bug Priority: Critical
Reporter: Elena Stepanova Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Problem/Incident
is caused by MDEV-26519 JSON Histograms: improve histogram co... Closed

 Description   

--source include/have_sequence.inc
 
create table t1 (a char(1)) engine=MyISAM;
insert into t1 select chr(seq%26+97) from seq_1_to_50;
insert into t1 select ':' from t1;
 
select count(*) from t1 where a <> 'a';
 
set histogram_type=JSON_HB;
analyze table t1 persistent for all;
 
explain format=json select COUNT(*) FROM t1 WHERE a <> 'a';
 
set histogram_type=DOUBLE_PREC_HB;
analyze table t1 persistent for all;
 
explain format=json select COUNT(*) FROM t1 WHERE a <> 'a';
 
# Cleanup
drop table t1;

The result of the query is 99 rows:

select count(*) from t1 where a <> 'a';
count(*)
99

So, the estimation with DOUBLE_PREC_HB is quite close:

preview-10.8-MDEV-26519-json-histograms 98cb4351

  "query_block": {
    "select_id": 1,
    "nested_loop": [
      {
        "table": {
          "table_name": "t1",
          "access_type": "ALL",
          "rows": 100,
          "filtered": 100,
          "attached_condition": "t1.a <> 'a'"
        }
      }
    ]
  }

The estimation with JSON_HB is off:

  "query_block": {
    "select_id": 1,
    "nested_loop": [
      {
        "table": {
          "table_name": "t1",
          "access_type": "ALL",
          "rows": 100,
          "filtered": 52.70370102,
          "attached_condition": "t1.a <> 'a'"
        }
      }
    ]
  }



 Comments   
Comment by Sergei Petrunia [ 2021-12-13 ]

Pushed a fix for this one.
We may need a cleanup in the code that handles edge cases like this, though. It is not apparent that the current code handles all such cases.

Generated at Thu Feb 08 09:51:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.