Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-26728

Histogram may return selectivity > 1.0

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Open (View Workflow)
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 10.2, 10.3, 10.4, 10.5, 10.6
    • Fix Version/s: 10.2
    • Component/s: Optimizer
    • Labels:
      None

      Description

      I've hit this example when investigating MDEV-26718.

      Consider this testcase:

      --source include/have_sequence.inc
      CREATE OR REPLACE TABLE t (
        f2 int
      ) ENGINE=MyISAM;
       
      INSERT INTO t SELECT
        seq%8+1
      FROM seq_1_to_128;
       
      ANALYZE TABLE t PERSISTENT FOR ALL;
      

      set optimizer_trace=1;
      analyze select * from t where f2 != 5;
       
      select * from information_schema.optimizer_trace;
      

      The trace will show:

      "rows_estimation": [
                    {
                      "selectivity_for_indexes": [],
                      "selectivity_for_columns": [
                        {
                          "column_name": "f2",
                          "ranges": ["NULL < f2 < 5", "5 < f2"],
                          "selectivity_from_histogram": 1.1171875
                        }
                      ],
                      "cond_selectivity": 1
      

      Note this part:

                          "selectivity_from_histogram": 1.1171875
      

      this is because this call:

                rows= records_in_column_ranges(&param, idx, key);
      

      returns 143 rows, while the table has 128 rows (and the table statistics says so, too).

      We can see below that the incorrect selectivity doesn't spread:

                      ],
                      "cond_selectivity": 1
      

      .

      This is thanks to these lines:

                rows= records_in_column_ranges(&param, idx, key);
                if (rows != DBL_MAX)
                {
                  key->field->cond_selectivity_= rows/table_records;
                  selectivity_for_column.add("selectivity_from_histogram",
                                             key->field->cond_selectivity);
                }
              }
            }
          }
       
          for (Field **field_ptr= table->field; *field_ptr; field_ptr++)
          {
            Field *table_field= *field_ptr;   
            if (bitmap_is_set(used_fields, table_field->field_index) &&
                table_field->cond_selectivity < 1.0)
            {
              if (!bitmap_is_set(&handled_columns, table_field->field_index))
                table->cond_selectivity*= table_field->cond_selectivity;
      

      Note the table_field->cond_selectivity < 1.0 condition.

      However, field->cond_selectivity still has the value that's greater than 1, which looks dangerous.

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              monty Michael Widenius
              Reporter:
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

                Dates

                Created:
                Updated:

                  Git Integration