[MDEV-24820] benchmark performance of FLUSH TABLES Created: 2021-02-09  Updated: 2022-04-19  Resolved: 2022-04-10

Status: Closed
Project: MariaDB Server
Component/s: Locking
Affects Version/s: None
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Axel Schwenke Assignee: Sergei Golubchik
Resolution: Cannot Reproduce Votes: 1
Labels: None

Attachments: File flush_benchmark_1.test     PDF File log2.pdf    

 Description   

There is a suspected regression with FLUSH TABLES with many (some 100.000) tables from 10.3 to 10.4



 Comments   
Comment by Axel Schwenke [ 2021-02-09 ]

Tested time for FLUSH TABLES with 2 setups

  1. create 1000 tables, flush, repeat 500 times (creating 500.000 tables)
  2. open existing tables, create 1000 tables, flush, repeat 500 times

Outcome see log2.pdf. Case 1 delivers constant performance, case 2 linear performance. As expected. Flushing 500.000 tables takes about 2 seconds.

Comment by Axel Schwenke [ 2021-02-16 ]

I just updated log2.pdf. It contains now additional data for the latest 10.3 and 10.4 releases. There seems to be a very slight regression from 10.3.26 to 10.4.17 from about 1.5s to 1.7s for closing 500.000 tables. But nothing like what the customer sees.

Comment by Nilnandan Joshi [ 2021-04-21 ]

Hi julien.fritsch, Currently, we have asked some information from the customer and waiting for their answers. We can't decide yet that it is Windows problem or it is problem with new implementation of Backup stage. But I think this MDEV was created for benchmarking and it is done so I don't think this is required now. We'll report new bug if we found problem at MariaDB end. This is what my thinking but you are the best person to decide further.

Comment by Vladislav Vaintroub [ 2021-09-24 ]

The status is "we are also waiting for someone in Support to reproduce the problem".
I can let those people use shared Window box via RDP, if they need, and there is some sysbench on it, too.

Comment by Sergei Golubchik [ 2022-02-21 ]

still nothing

Comment by Oleksandr Byelkin [ 2022-03-21 ]

I do not understand why so unusual operation as FLUSH TABLES should work fast? (I have no time to read all thread, just read the titel and was surprised).

Comment by Valerii Kravchuk [ 2022-03-25 ]

I do not have good Windows or Linux server at hand to continue working on this, but I've got some results with current 10.4 from GitHub (with minor patches to make successful build) on macOS. The test was done with table_definition_cache of 500000 and then 50000 and included creation of 50000 tables 1 row each with sysbench and then running FLUSH TABLES. The results are like these:

Yuliyas-MacBook-Air:sysbench Valerii$ time ./src/sysbench oltp_read_write --threads=2 --tables=50000 --table-size=1 --db-driver=mysql --mysql-socket=/tmp/mysql.sock --mysql-user=Valerii prepare
...
real	42m18.544s
user	0m7.911s
sys	0m8.262s
 
Yuliyas-MacBook-Air:sysbench Valerii$ ~/dbs/maria10.4/bin/mysql -A sbtest -e "show global status like 'open%'; show global variables like 'table_%';"
+--------------------------+---------+
| Variable_name      | Value  |
+--------------------------+---------+
| Open_files        | 10   |
| Open_streams       | 0    |
| Open_table_definitions  | 50003  |
| Open_tables       | 200   |
| Opened_files       | 1726313 |
| Opened_plugin_libraries | 0    |
| Opened_table_definitions | 932482 |
| Opened_tables      | 525461 |
| Opened_views       | 0    |
+--------------------------+---------+
+----------------------------+--------+
| Variable_name       | Value |
+----------------------------+--------+
| table_definition_cache   | 500000 |
| table_open_cache      | 200  |
| table_open_cache_instances | 1   |
+----------------------------+--------+
Yuliyas-MacBook-Air:sysbench Valerii$ time ~/dbs/maria10.4/bin/mysql -A sbtest -e "flush tables"
real	0m2.176s
user	0m0.014s
sys	0m0.007s

Now, let's decrease table definition cache to 50000 and retry 9after dropping and recreating sbtest database):

Yuliyas-MacBook-Air:sysbench Valerii$ ~/dbs/maria10.4/bin/mysql -e "set global table_definition_cache=50000"
Yuliyas-MacBook-Air:sysbench Valerii$ ~/dbs/maria10.4/bin/mysql -A sbtest -e "show global status like 'open%'; show global variables like 'table_%';"
+--------------------------+---------+
| Variable_name      | Value  |
+--------------------------+---------+
| Open_files        | 14   |
| Open_streams       | 0    |
| Open_table_definitions  | 5    |
| Open_tables       | 5    |
| Opened_files       | 1776329 |
| Opened_plugin_libraries | 0    |
| Opened_table_definitions | 932487 |
| Opened_tables      | 525466 |
| Opened_views       | 0    |
+--------------------------+---------+
+----------------------------+-------+
| Variable_name       | Value |
+----------------------------+-------+
| table_definition_cache   | 50000 |
| table_open_cache      | 200  |
| table_open_cache_instances | 1   |
+----------------------------+-------+
 
Yuliyas-MacBook-Air:sysbench Valerii$ time ./src/sysbench oltp_read_write --threads=2 --tables=50000 --table-size=1 --db-driver=mysql --mysql-socket=/tmp/mysql.sock --mysql-user=Valerii prepare
...
Creating a secondary index on 'sbtest49999'...
real	35m35.986s
user	0m7.160s
sys	0m7.258s
 
Yuliyas-MacBook-Air:sysbench Valerii$ time ~/dbs/maria10.4/bin/mysql -A sbtest -e "flush tables"
real	0m1.789s
user	0m0.016s
sys	0m0.007s

So, we see that both creation of 50K tables takes less with smaller (50K vs 500K) table_definition_cache (35 minutes 36 seconds vs 42 minutes 18 seconds) and FLUSH TABLES after that (1.8 seconds vs 2.2 seconds). This "negative scalability" is something to study and probably fix or explain.

As for regression vs. 10.3, more tests on macOS are planned for this weekend. But IMHO there is a point already, even if not in the same environment as customer had, and not of the same scale.

Comment by Valerii Kravchuk [ 2022-03-26 ]

Now on regression part. On the same macOS with 10.3 that was started with single non-default setting, same as 10.4 before:

Yuliyas-MacBook-Air:sysbench Valerii$ bin/mysqld_safe --no-defaults --table-definition-cache=500000 &

I see the following:

Yuliyas-MacBook-Air:sysbench Valerii$ time ./src/sysbench oltp_read_write --threads=2 --tables=50000 --table-size=1 --db-driver=mysql --mysql-socket=/tmp/mysql.sock --mysql-user=root prepare
...
Creating a secondary index on 'sbtest49999'...
real	25m10.129s
user	0m7.162s
sys	0m7.452s
 
Yuliyas-MacBook-Air:sysbench Valerii$ ~/dbs/maria10.3/bin/mysql -A sbtest -uroot -e "select version(); show global status like 'open%'; show global variables like 'table_%';"
+-----------------+
| version()    |
+-----------------+
| 10.3.35-MariaDB |
+-----------------+
+--------------------------+--------+
| Variable_name      | Value |
+--------------------------+--------+
| Open_files        | 5   |
| Open_streams       | 0   |
| Open_table_definitions  | 50018 |
| Open_tables       | 200  |
| Opened_files       | 199920 |
| Opened_plugin_libraries | 0   |
| Opened_table_definitions | 200018 |
| Opened_tables      | 100018 |
| Opened_views       | 0   |
+--------------------------+--------+
+----------------------------+--------+
| Variable_name       | Value |
+----------------------------+--------+
| table_definition_cache   | 500000 |
| table_open_cache      | 200  |
| table_open_cache_instances | 1   |
+----------------------------+--------+
 
Yuliyas-MacBook-Air:sysbench Valerii$ time ~/dbs/maria10.3/bin/mysql -A sbtest -uroot -e "flush tables"
real	0m1.558s
user	0m0.014s
sys	0m0.008s

So, it worked notably faster than 10.4 (25 m 10 seconds and 1.5 seconds for FLUSH vs 42 minutes 18 seconds and 2.2 seconds for FLUSH on 104). Moreover, if we compare to table_open_cache of 50000:

Yuliyas-MacBook-Air:sysbench Valerii$ ~/dbs/maria10.3/bin/mysql -u root -e "set global table_definition_cache=50000"
 
Yuliyas-MacBook-Air:sysbench Valerii$ ~/dbs/maria10.3/bin/mysql -uroot -e "select version(); show global status like 'open%'; show global variables like 'table_%';"
+-----------------+
| version()    |
+-----------------+
| 10.3.35-MariaDB |
+-----------------+
+--------------------------+--------+
| Variable_name      | Value |
+--------------------------+--------+
| Open_files        | 15   |
| Open_streams       | 0   |
| Open_table_definitions  | 5   |
| Open_tables       | 5   |
| Opened_files       | 249935 |
| Opened_plugin_libraries | 0   |
| Opened_table_definitions | 200023 |
| Opened_tables      | 100023 |
| Opened_views       | 0   |
+--------------------------+--------+
+----------------------------+-------+
| Variable_name       | Value |
+----------------------------+-------+
| table_definition_cache   | 50000 |
| table_open_cache      | 200  |
| table_open_cache_instances | 1   |
+----------------------------+-------+
 
Yuliyas-MacBook-Air:sysbench Valerii$ time ./src/sysbench oltp_read_write --threads=2 --tables=50000 --table-size=1 --db-driver=mysql --mysql-socket=/tmp/mysql.sock --mysql-user=root prepare
...
Inserting 1 records into 'sbtest49999'
Creating a secondary index on 'sbtest49999'...
real	24m32.612s
user	0m7.144s
sys	0m7.327s
 
Yuliyas-MacBook-Air:sysbench Valerii$ ~/dbs/maria10.3/bin/mysql -A sbtest -uroot -e "select version(); show global status like 'open%'; show global variables like 'table_%';"
+-----------------+
| version()    |
+-----------------+
| 10.3.35-MariaDB |
+-----------------+
+--------------------------+--------+
| Variable_name      | Value |
+--------------------------+--------+
| Open_files        | 5   |
| Open_streams       | 0   |
| Open_table_definitions  | 50000 |
| Open_tables       | 200  |
| Opened_files       | 449804 |
| Opened_plugin_libraries | 0   |
| Opened_table_definitions | 400023 |
| Opened_tables      | 200023 |
| Opened_views       | 0   |
+--------------------------+--------+
+----------------------------+-------+
| Variable_name       | Value |
+----------------------------+-------+
| table_definition_cache   | 50000 |
| table_open_cache      | 200  |
| table_open_cache_instances | 1   |
+----------------------------+-------+
Yuliyas-MacBook-Air:sysbench Valerii$
 
Yuliyas-MacBook-Air:sysbench Valerii$ time ~/dbs/maria10.3/bin/mysql -A sbtest -uroot -e "flush tables"
real	0m1.787s
user	0m0.020s
sys	0m0.016s

it's a bit faster for 50K tables creation (24 minutes 32 seconds) and bit slower for FLUSH . We do not see "negative scalability" for larger table definition cache as in 10,4. So, this is a "regression" part to check.

I know macOS is not supported and testing on old laptop with custom builds is far from perfect, but I think I gave the idea on what to do on decent Windows box, to confirm that there is indeed a problem.

I can try to study it a bit more in depth on macOS, with dtrace and such, if needed. But I'd prefer for someone to continue this i a proper environment and prove that my results are NOT repeatable there.

Comment by Sergei Golubchik [ 2022-04-01 ]

I've tried to reproduce it with, as you suggested, 50,000 one-row tables, table_definition_cache being 50,000 and 500,000.
Without sysbench, I used flush_benchmark_1.test mtr test. On my linux laptop, times for 50k and 500k caches are:

1       0.08199579      flush tables
2       0.08333350      flush tables

On Windows they are

1       0.16712770      flush tables
2       0.21896320      flush tables

so, indeed, there's a small slowdown with larger cache on Windows. But flush is still fast, so doesn't seem to be an issue.

Comment by Valerii Kravchuk [ 2022-04-02 ]

Good to know that you have an easier to use test for this, and it confirmed both all my attempts on Linux (almost no difference) and shown some clear difference on Windows (what exact version it was?). It means my macOS-based attempts were not a waste of time entirely.

Now I'd also check it with 10.3 (customer stated there was no problem there), with small table_definition_cache value like 1000 as a base line and, based on suggestion I've got in the process, with concurrent load that does a lot of memory allocations. I tried to use something like this:

openxs@ao756:~/dbs/maria10.4$ bin/mysqlslap --socket=/tmp/mariadb.sock --concurrency=2 --create-schema=test --number-of-queries=1000000 --query='select id, c1 from tb'

where the table is:

MariaDB [test]> show create table tb\G
*************************** 1. row ***************************
       Table: tb
Create Table: CREATE TABLE `tb` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `c1` text DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=16371 DEFAULT CHARSET=latin1
1 row in set (0.043 sec)
 
MariaDB [test]> select avg(length(c1)), count(*) from tb;
+-----------------+----------+
| avg(length(c1)) | count(*) |
+-----------------+----------+
|      18274.0479 |     8192 |
+-----------------+----------+
1 row in set (0.285 sec)

but again, on Linux in all my lame tests it had NOT made any notable difference. Maybe on Windows it will show something new/useful.

Comment by Sergei Golubchik [ 2022-04-02 ]

repeated on Windows with SSD: the full test execution time (not flush only): 729 seconds on the latest 10.4, and 731 on the latest 10.3, so no regression here.

Comment by Michael Widenius [ 2022-04-04 ]

First a note: There should be no difference between a small or large table definition cache if both cases uses the same number of tables. This is because the cache is allocated dynamically and will use exactly the same memory allocation in both cases.

Benchmarking on Liunx
I had to run with --mem as I have xfs and file creating is quite slow on
xfs, so it was not possible to run without -mem; I could create about 18
tables per second.

Running with --mem should be ok as there should not be any difference in disk operations between 10.3 and 10.4 when it comes to the table cache.

This is a run with posted benchmark, but adding timing between all important operations:

Description 10.3.35 10.4.25
create seconds 35 36
flush seconds 0 0
update seconds 18 19
update2 seconds 15 14
flush2 seconds 0 1
update3 seconds 16 16
update4 seconds 15 14
flush3 seconds 0 0
alter seconds 48 46
drop seconds 12 12

In conclusion, no notable difference between 10.3 and 10.4 on Linux

Comment by Michael Widenius [ 2022-04-04 ]

I did a check of malloc() calls, as these could in theory have an affect on the OS.

Number of my_malloc calls in 10.3 was 7186304 and in 10.4 7219090. The difference mainly comes from the optimizer trace in 10.4,
but should not have any notable effect. I was not able to get the number of malloc() calls.

Comment by Sergei Golubchik [ 2022-04-09 ]

Looking at the code there is no possible way that larger table definition cache would behave in any way differently (slower or faster) than a smaller table definition cache as long as all tables fit into the cache.
That is, flush_benchmark_1.test makes little sense, as it compares two cache sizes (50,000 and 500,000) with 50,000 tables, that is, in both cases the cache is large enough to fit all tables.

To make it meaningful, change the test to compare table_definition_cache=5000 with 500000. Flush with a smaller cache indeed is faster, on my benchmarks on Windows it's 0.25s to flush 5k tables vs 1.5s to flush 50k tables. This isn't unexpected. On the other hand, updating of 50k tables is faster when they all fit into the cache (1300s vs 850s).

These are 10.4 numbers, 10.3 is about the same (within 10%).

Comment by Sergei Golubchik [ 2022-04-09 ]

just for the record:

10.3, 1e859d4abcfd7e3b2c238e5dc8c909254661082a

set global table_definition_cache=5000||
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 1281.5512
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 1331.8555
flush tables||
--> 0.2177
set global table_definition_cache=500000||
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 1100.1275
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 870.1276
flush tables||
--> 1.6023

10.4, c62843a055f52b27230926b12d9ee4f7aa68e1a0

set global table_definition_cache=5000||
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 1335.7895
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 1298.4948
flush tables||
--> 0.2585
set global table_definition_cache=500000||
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 1134.9305
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 847.3583
flush tables||
--> 1.4667

Comment by Sergei Golubchik [ 2022-04-10 ]

I've extended the test, adding 10 connections doing

send while 1 do set @a=(select sum(seq) from (select seq from seq_1_to_10000 order by seq-1 limit 9000) x); end while||

This query uses filesort, so, it means 10 connections constantly doing malloc/free in parallel.
Timing of FLUSH TABLES were varying wildly, I've run the test three times for 10.3 and for 10.4:

10.3, 1e859d4abcfd7e3b2c238e5dc8c909254661082a

set global table_definition_cache=5000||
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 6931.8189 6940.7756 6404.4770
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 6911.4679 6812.9442 7807.4025
flush tables||
--> 226.1672 29.0306 167.4894
set global table_definition_cache=500000||
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 5974.9393 6181.5528 6156.5681
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 6798.4645 6419.2754 6597.2620
flush tables||
--> 55.8149 198.7723 123.2874

10.4, c62843a055f52b27230926b12d9ee4f7aa68e1a0

set global table_definition_cache=5000||
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 7011.0454 7098.4546 7223.2147
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 6959.7857 7218.5944 6988.3271
flush tables||
--> 468.6933 151.8975 110.3296
set global table_definition_cache=500000||
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 6268.1949 6679.8739 6747.0793
for i in 1..50000 do execute immediate concat('update t_', i,' set a=a+1'); end for||
--> 6408.3037 6681.3635 6184.1227
flush tables||
--> 99.8943 90.8010 202.2116

But even here one cannot say that "with a huge table_definition_cache, FLUSH TABLES is much slower on 10.4 compared to 10.3"

Comment by Sergei Golubchik [ 2022-04-10 ]

Despite all the testing I was not able to reproduce the reported issue.

Generated at Thu Feb 08 09:32:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.