Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-32472

Test histogram precision for ANALYZE, 2023

    XMLWordPrintable

Details

    Description

      We need to test precision of histograms (both DOUBLE_PREC_HB and JSON_HB) with different analyze_sample_percentage settings.

      Goals:

      • Catch possible bugs, or just undesirable estimates.
      • Validate analyze_sample_percentage=0 before making it the default.

      Means:
      Running full benchmarks is not a good way to achieve this:

      • It is difficult to see if better query plans are caused by better/worse selectivity estimates.
      • Benchmark queries do only a few requests to histogram data.

      We reuse the approach from histogram-test [1] script:

      • Get some dataset somewhere
      • Collect histogram
      • Run small test queries and see what estimates the optimizer gets from histogram code for various ranges.
        • ask for common/uncommon values
        • ask for wide/narrow intervals
        • etc

      Things to check for

      • Height-balanced histograms should not produce errors that are larger than bucket_size (right?)
        • But for DOUBLE_PREC_HB there is also an error due to imprecise storage of bucket endpoint. Not sure what are the bounds on that.
      • Using sampling instead of the full dataset should not reduce the precision much.
        • what happens to n_distinct estimate?
        • Monitor disk space. Does it increase less with the percentage 0. How much less?

      [1] https://github.com/spetrunia/histogram-test.

      Attachments

        Issue Links

          Activity

            People

              lstartseva Lena Startseva
              psergei Sergei Petrunia
              Votes:
              1 Vote for this issue
              Watchers:
              10 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.