MariaDB Server / MDEV-24820

benchmark performance of FLUSH TABLES

Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: N/A
    • Component/s: Locking
    • Labels: None

    Description

      There is a suspected performance regression in FLUSH TABLES with many tables (some 100,000) between 10.3 and 10.4.

      Attachments

        1. flush_benchmark_1.test
          1 kB
          Sergei Golubchik
        2. log2.pdf
          37 kB
          Axel Schwenke

        Activity

          monty Michael Widenius added a comment:

          I did a check of malloc() calls, as these could in theory have an effect on the OS.

          The number of my_malloc() calls in 10.3 was 7186304 and in 10.4 it was 7219090. The difference mainly comes from the optimizer trace in 10.4,
          but should not have any notable effect. I was not able to get the number of malloc() calls.
          serg Sergei Golubchik added a comment (edited):

          Looking at the code, there is no possible way that a larger table definition cache would behave any differently (slower or faster) than a smaller one, as long as all tables fit into the cache.
          That is, flush_benchmark_1.test makes little sense: it compares two cache sizes (50,000 and 500,000) with 50,000 tables, so in both cases the cache is large enough to hold all tables.

          To make it meaningful, change the test to compare table_definition_cache=5000 with 500000. A flush with the smaller cache is indeed faster: in my benchmarks on Windows it takes 0.25s to flush 5k cached tables vs 1.5s to flush 50k. This isn't unexpected. On the other hand, updating the 50k tables is faster when they all fit into the cache (about 1300s with the small cache vs 850s with the large one).

          These are 10.4 numbers; 10.3 is about the same (within 10%).
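          The attached flush_benchmark_1.test is not quoted in this issue; a minimal setup producing the tables used in the loops below might look like the following sketch (assumptions: 50,000 single-column InnoDB tables named t_1 .. t_50000, client delimiter set to ||; the real definitions are in the attachment):

```sql
-- Hypothetical setup sketch: create the 50,000 tables t_1 .. t_50000
-- that the benchmark loops update. The table layout here is an
-- assumption; the actual one is defined in flush_benchmark_1.test.
for i in 1..50000 do
  execute immediate concat('create table t_', i, ' (a int) engine=InnoDB');
  execute immediate concat('insert into t_', i, ' values (1)');
end for||
```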


          serg Sergei Golubchik added a comment:

          Just for the record:

          10.3, 1e859d4abcfd7e3b2c238e5dc8c909254661082a

          set global table_definition_cache=5000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1281.5512
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1331.8555
          flush tables||
          --> 0.2177
          set global table_definition_cache=500000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1100.1275
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 870.1276
          flush tables||
          --> 1.6023
          

          10.4, c62843a055f52b27230926b12d9ee4f7aa68e1a0

          set global table_definition_cache=5000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1335.7895
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1298.4948
          flush tables||
          --> 0.2585
          set global table_definition_cache=500000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 1134.9305
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 847.3583
          flush tables||
          --> 1.4667
          


          serg Sergei Golubchik added a comment:

          I've extended the test, adding 10 connections, each doing

          send while 1 do set @a=(select sum(seq) from (select seq from seq_1_to_10000 order by seq-1 limit 9000) x); end while||
          

          This query uses filesort, which means 10 connections constantly doing malloc/free in parallel.
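          That the query goes through filesort can be double-checked with EXPLAIN (a sketch; seq_1_to_10000 is provided by the SEQUENCE engine, and ordering by the expression seq-1 cannot use the index on seq):

```sql
-- The Extra column is expected to report "Using filesort",
-- since ORDER BY on the expression seq-1 cannot use an index.
explain select seq from seq_1_to_10000 order by seq-1 limit 9000;
```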
          Timings of FLUSH TABLES varied wildly; I ran the test three times each for 10.3 and 10.4:

          10.3, 1e859d4abcfd7e3b2c238e5dc8c909254661082a

          set global table_definition_cache=5000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6931.8189 6940.7756 6404.4770
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6911.4679 6812.9442 7807.4025
          flush tables||
          --> 226.1672 29.0306 167.4894
          set global table_definition_cache=500000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 5974.9393 6181.5528 6156.5681
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6798.4645 6419.2754 6597.2620
          flush tables||
          --> 55.8149 198.7723 123.2874
          

          10.4, c62843a055f52b27230926b12d9ee4f7aa68e1a0

          set global table_definition_cache=5000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 7011.0454 7098.4546 7223.2147
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6959.7857 7218.5944 6988.3271
          flush tables||
          --> 468.6933 151.8975 110.3296
          set global table_definition_cache=500000||
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6268.1949 6679.8739 6747.0793
          for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
          --> 6408.3037 6681.3635 6184.1227
          flush tables||
          --> 99.8943 90.8010 202.2116
          

          But even here one cannot say that "with a huge table_definition_cache, FLUSH TABLES is much slower on 10.4 than on 10.3".

          serg Sergei Golubchik added a comment (edited):

          Despite all the testing, I was not able to reproduce the reported issue.

          People

            Assignee: serg Sergei Golubchik
            Reporter: axel Axel Schwenke
            Votes: 1
            Watchers: 11

