Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26901

Estimation for filtered rows less precise with JSON histogram comparing to DOUBLE_PREC (#4)

    XMLWordPrintable

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • N/A
    • 10.8.1
    • Optimizer
    • None

    Description

      create or replace table t (f int);
      insert into t values
        (7),(5),(0),(5),(112),(9),(9),(7),(5),(9),
        (1),(7),(0),(6),(6),(2),(1),(6),(169),(7);
       
      select f from t where f in (77, 1, 144, 73, 14, 12);
       
      set histogram_type= JSON_HB;
      analyze table t persistent for all;
       
      explain format=json select f from t where f in (77, 1, 144, 73, 14, 12);
       
      set histogram_type= DOUBLE_PREC_HB;
      analyze table t persistent for all;
       
      explain format=json select f from t where f in (77, 1, 144, 73, 14, 12);
       
      # Cleanup
      drop table t;
      

      The result set is 2 rows (out of 20):

      preview-10.7-MDEV-26519-json-histograms 7fa0542dc6

      select f from t where f in (77, 1, 144, 73, 14, 12);
      f
      1
      1
      

      With DOUBLE_PREC_HB, filtered value is 10.98, which is quite good:

      set histogram_type= DOUBLE_PREC_HB;
      analyze table t persistent for all;
      Table	Op	Msg_type	Msg_text
      test.t	analyze	status	Engine-independent statistics collected
      test.t	analyze	status	OK
      explain format=json select f from t where f in (77, 1, 144, 73, 14, 12);
      EXPLAIN
      {
        "query_block": {
          "select_id": 1,
          "table": {
            "table_name": "t",
            "access_type": "ALL",
            "rows": 20,
            "filtered": 10.98332787,
            "attached_condition": "t.f in (77,1,144,73,14,12)"
          }
        }
      }
      

      With JSON_HB, it's 35, which isn't so good:

      set histogram_type= JSON_HB;
      analyze table t persistent for all;
      Table	Op	Msg_type	Msg_text
      test.t	analyze	status	Engine-independent statistics collected
      test.t	analyze	status	OK
      explain format=json select f from t where f in (77, 1, 144, 73, 14, 12);
      EXPLAIN
      {
        "query_block": {
          "select_id": 1,
          "table": {
            "table_name": "t",
            "access_type": "ALL",
            "rows": 20,
            "filtered": 35,
            "attached_condition": "t.f in (77,1,144,73,14,12)"
          }
        }
      }
      

      Attachments

        Issue Links

          Activity

            People

              psergei Sergei Petrunia
              elenst Elena Stepanova
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.