MariaDB Server / MDEV-24820

benchmark performance of FLUSH TABLES

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: N/A
    • Component/s: Locking
    • Labels: None

    Description

      There is a suspected performance regression in FLUSH TABLES with many tables (some 100,000) between 10.3 and 10.4.

      Attachments

        1. flush_benchmark_1.test
          1 kB
          Sergei Golubchik
        2. log2.pdf
          37 kB
          Axel Schwenke

        Activity

          monty Michael Widenius added a comment:

          I did a check of malloc() calls, as these could in theory have an effect on the OS.

          The number of my_malloc() calls in 10.3 was 7186304 and in 10.4 it was 7219090. The difference mainly comes from the optimizer trace in 10.4,
          but should not have any notable effect. I was not able to get the number of malloc() calls.
          serg Sergei Golubchik added a comment (edited):

          Looking at the code, there is no possible way that a larger table definition cache would behave any differently (slower or faster) than a smaller one, as long as all tables fit into the cache.
          That is, flush_benchmark_1.test makes little sense: it compares two cache sizes (50,000 and 500,000) with 50,000 tables, so in both cases the cache is large enough to hold all tables.

          To make it meaningful, change the test to compare table_definition_cache=5000 with 500000. A flush with the smaller cache is indeed faster: in my benchmarks on Windows it takes 0.25s to flush 5k cached tables vs 1.5s to flush 50k. This isn't unexpected. On the other hand, updating the 50k tables is faster when they all fit into the cache (about 1300s with the small cache vs 850s with the large one).

          These are 10.4 numbers; 10.3 is about the same (within 10%).
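          The attached flush_benchmark_1.test is not quoted in this issue; a minimal setup producing the tables used in the loops below might look like the following sketch (assumptions: 50,000 single-column InnoDB tables named t_1 .. t_50000, client delimiter set to ||; the real definitions are in the attachment):

```sql
-- Hypothetical setup sketch: create the 50,000 tables t_1 .. t_50000
-- that the benchmark loops update. The table layout here is an
-- assumption; the actual one is defined in flush_benchmark_1.test.
for i in 1..50000 do
  execute immediate concat('create table t_', i, ' (a int) engine=InnoDB');
  execute immediate concat('insert into t_', i, ' values (1)');
end for||
```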


          serg Sergei Golubchik added a comment:

          Just for the record:

          10.3, 1e859d4abcfd7e3b2c238e5dc8c909254661082a

          set global table_definition_cache=5000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1281.5512
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1331.8555
          flush tables||
          --> 0.2177
          set global table_definition_cache=500000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1100.1275
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 870.1276
          flush tables||
          --> 1.6023
          

          10.4, c62843a055f52b27230926b12d9ee4f7aa68e1a0

          set global table_definition_cache=5000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1335.7895
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1298.4948
          flush tables||
          --> 0.2585
          set global table_definition_cache=500000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1134.9305
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 847.3583
          flush tables||
          --> 1.4667
          


          serg Sergei Golubchik added a comment:

          I've extended the test, adding 10 connections, each doing

          send while 1 do set @a=(select sum(seq) from (select seq from seq_1_to_10000 order by seq-1 limit 9000) x); end while||
          

          This query uses filesort, which means 10 connections constantly doing malloc/free in parallel.
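          That the query goes through filesort can be double-checked with EXPLAIN (a sketch; seq_1_to_10000 is provided by the SEQUENCE engine, and ordering by the expression seq-1 cannot use the index on seq):

```sql
-- The Extra column is expected to report "Using filesort",
-- since ORDER BY on the expression seq-1 cannot use an index.
explain select seq from seq_1_to_10000 order by seq-1 limit 9000;
```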
          Timings of FLUSH TABLES varied wildly; I ran the test three times each for 10.3 and 10.4:

          10.3, 1e859d4abcfd7e3b2c238e5dc8c909254661082a

          set global table_definition_cache=5000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6931.8189 6940.7756 6404.4770
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6911.4679 6812.9442 7807.4025
          flush tables||
          --> 226.1672 29.0306 167.4894
          set global table_definition_cache=500000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 5974.9393 6181.5528 6156.5681
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6798.4645 6419.2754 6597.2620
          flush tables||
          --> 55.8149 198.7723 123.2874
          

          10.4, c62843a055f52b27230926b12d9ee4f7aa68e1a0

          set global table_definition_cache=5000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 7011.0454 7098.4546 7223.2147
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6959.7857 7218.5944 6988.3271
          flush tables||
          --> 468.6933 151.8975 110.3296
          set global table_definition_cache=500000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6268.1949 6679.8739 6747.0793
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6408.3037 6681.3635 6184.1227
          flush tables||
          --> 99.8943 90.8010 202.2116
          

          But even here one cannot say that "with a huge table_definition_cache, FLUSH TABLES is much slower on 10.4 than on 10.3".

          serg Sergei Golubchik added a comment (edited):

          Despite all the testing, I was not able to reproduce the reported issue.

          People

            Assignee: serg Sergei Golubchik
            Reporter: axel Axel Schwenke
            Votes: 1
            Watchers: 11

