Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-38711

MDEV-37330 : performance evaluation

    XMLWordPrintable

Details

    • Task
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Done
    • N/A
    • Optimizer
    • None
    • Q1/2026 Server Development

    Description

      The idea is to check performance improvement of MDEV-37330.

      Test Result Summary

      The speedup depends on the query pattern, the number of partitions used, and dataset size.
      We ran the tests on a relatively small dataset (~ 5G, so it was likely all in buffer pool). Testing on bigger dataset which is not necessarily in memory might bring more benefit.
      The test used 48 partitions, which is average amount.

      The results are:

      Query pattern Speedup
      MAX(parition_date_col) 3.3x
      ORDER BY partition_date_col DESC LIMIT 1 1.63x
      ordered scan 1.094x

      Details can be found below.


      Details

      Test 1

      Take the query from its description, a mock dataset jan27-sales-12partitions.sql

      select * from t1 where user_id=... ORDER BY date_created DESC LIMIT 1
      

      and run it.

      Result 1 : not much improvement

      According to mariadb-pavithrapandith, there wasn't much improvement.

      The only big difference in ANALYZE FORMAT=JSON output was:

      On MDEV-37330 branch : "pages_accessed": 46
      On main branch : "pages_accessed": 90
      

      Note that the query is supposed to access 1 partition instead of 12. But there is still only 2x savings in pages_accessed.

      Test1, experiment1: disable range optimization

      Tried this:

      analyze format=json 
      select /*+ NO_RANGE_OPTIMIZATION(sales) */ * from sales where user_id=5 ORDER BY sales_date desc limit 1;
      

      Got:

      On MDEV-37330 branch: "pages_accessed": 4
      On main branch: "pages_accessed": 48
      

      This confirms the theory that Range Optimization was doing about half of the page accesses.
      On main branch, we get 90 -> 48 page accesses.
      On the MDEV-37330 code we get 45 - > 4 page accesses.
      The query speeds are :

        "r_total_time_ms": 0.087955664
        "r_total_time_ms": 0.204567552,
      

      about 2x speedup now.

      Result 2: Larger artificial query

      Take the same table with 12 partitions but construct a query that

      • would not benefit from range optimizer pre-reading data in all partitions.
      • would lookup "last date from some partition" repeatedly
        This is jan30-big-artifical-query.sql

      With subquery cache it makes about 100 lookups (the dataset has 100 users):

      "r_engine_stats": {  "pages_accessed": 400  },
      "r_engine_stats": { "pages_accessed": 4804},
      

      "r_total_time_ms": 6.166011166,
      "r_total_time_ms": 9.675008024,
      

      With disabled subquery cache, subquery will be executed 10K times and we will get:

      "pages_accessed":  40,000
      "pages_accessed": 480,419
      

      and

      "r_total_time_ms": 57.19026293,
      "r_total_time_ms": 218.0788524,
      

      No result for customer ticket query

      A related customer ticket doesn't have enough info to reconstruct their case.

      Result 3: Query with MAX

      link
      12 partitions: 1.5 or 1.8x speedup depending on data size
      48 partitions: 2.2 or 3.3x speedup depending on data size (3.6x on another machine)

      Result 4: ORDER BY ... LIMIT 1 again, on a larger dataset

      link
      48 partitions: 1.63x speedup.

      Result 5: Long ordered scan

      A query doing a long ordered scan: link
      1.094x speedup

      Attachments

        Issue Links

          Activity

            People

              mariadb-pavithrapandith Pavithra Pandith
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.