  1. MariaDB Server
  2. MDEV-18758

Test histograms precision

Details

    Description

      This issue tracks the work on testing the precision of histograms. (Some tests have already been done; the results will be posted here.)

      We are going to measure the precision of the selectivity estimate for equality predicates (range predicates do not make much sense for names, I guess).

      explain select * from pop1980 where firstname=$CONST
      select count(*) from pop1980 where firstname=$CONST
      

      I would like to test with a few constants:

      • the 3 most frequent names (the top 3)
      • 3 different names at the end of the first quartile

      (To find the end of the first quartile, count the total number of distinct names, which is 17711, and rank all names by their frequency:

      select firstname, count(*) as CNT from pop1980 group by firstname order by CNT desc


      The end of the first quartile is at rank 17711/4 ≈ 4427.)

      Pick the 4428th, 4429th and 4430th names.

      Then pick 3 names at the end of the second quartile, and likewise for the 3rd and 4th quartiles.

      Then repeat the above "selectivity test" for each constant.
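      The rank positions described above can be computed with a short sketch. The function name `sample_ranks` is hypothetical. How to pick names at the end of the 4th quartile is not spelled out in this issue, so the sketch assumes the last three names of the ranking are meant.

```python
def sample_ranks(n_distinct, per_group=3):
    """Return the 1-based frequency ranks to sample, keyed by group name."""
    # Top group: the most frequent names (ranks 1..per_group).
    ranks = {"top": list(range(1, per_group + 1))}
    for q in (1, 2, 3, 4):
        end = n_distinct * q // 4  # e.g. 17711 * 1 // 4 = 4427
        # Take the names right after the quartile boundary.
        # Assumption: clamp so the 4th-quartile group stays inside the ranking.
        start = min(end + 1, n_distinct - per_group + 1)
        ranks[f"q{q}"] = list(range(start, start + per_group))
    return ranks


ranks = sample_ranks(17711)
for group, rs in ranks.items():
    # OFFSET is 0-based, so rank r corresponds to OFFSET r - 1.
    print(group, rs, "-> LIMIT", len(rs), "OFFSET", rs[0] - 1)
```

      Each group maps onto the ranking query with a LIMIT and OFFSET appended. Names with equal counts have no defined order under `order by CNT desc` alone, so adding `firstname` as a secondary sort key would make the picks reproducible.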

      We need to compare:

      • MariaDB, analyze with sampling
      • MariaDB, analyze with full scan
      • MySQL, with 100 buckets
      • MySQL, with 1024 buckets
      • PostgreSQL.

      For MySQL/MariaDB, use EXPLAIN FORMAT=JSON as it prints selectivity with greater precision.
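      To compare the systems on equal terms, each estimate has to be reduced to a single error number. This issue does not prescribe a metric, so the sketch below is only one possible choice: the ratio between estimated and actual row counts (often called q-error). The function name `estimation_error` is hypothetical, and the selectivity is assumed to be a fraction in [0, 1]; if the EXPLAIN output reports a percentage, divide it by 100 first.

```python
def estimation_error(est_selectivity, actual_count, total_rows):
    """Return (estimated_rows, q_error) for one equality test.

    est_selectivity: optimizer's estimate as a fraction (assumption: 0..1)
    actual_count:    result of the select count(*) query
    total_rows:      number of rows in the table
    """
    est_rows = est_selectivity * total_rows
    # Clamp both sides to at least one row to avoid division by zero.
    est = max(est_rows, 1.0)
    act = max(float(actual_count), 1.0)
    # The ratio is symmetric: over- and underestimates are penalized equally.
    q_error = max(est / act, act / est)
    return est_rows, q_error


# Made-up numbers for illustration only, not measurements:
est_rows, q_err = estimation_error(0.25, 500, 1000)
print(est_rows, q_err)
```

      A value of 1.0 means a perfect estimate. A ratio-based metric is convenient here because the actual counts of the top names and the rare names differ by orders of magnitude, which makes absolute differences hard to compare.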

            People

              psergei Sergei Petrunia