[MDEV-16796] TRUNCATE TABLE slowdown with innodb_file_per_table=ON Created: 2018-07-21 Updated: 2021-09-30 Resolved: 2018-10-18 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - InnoDB |
| Affects Version/s: | 10.1.34 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major |
| Reporter: | Geoff Montee (Inactive) | Assignee: | Eugene Kosov (Inactive) |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | innodb, truncate | ||
| Attachments: |
|
||||||||||||||||||||
| Issue Links: |
|
||||||||||||||||||||
| Description |
|
The documentation for TRUNCATE TABLE says the following:
https://mariadb.com/kb/en/library/truncate-table/ What is the basis for the statement that it should be faster if innodb_file_per_table=ON is set? This does not seem to be true for a relatively small table with only 50k rows. For example, let's create a table and populate it with the following:
Then let's run the following statements with both innodb_file_per_table set to ON and OFF:
Here are the results:
It seems to be consistently faster with innodb_file_per_table=OFF set. The difference seems to be in the Table lock state, and there seems to be a difference in the Block_ops_out value. Is this a bug, or is the statement in the documentation incorrect, or is there an exception to this rule for relatively small tables? |
| Comments |
| Comment by Elena Stepanova [ 2018-07-30 ] | ||||||||||||||||||||||||||||||||||||||
|
marko, do you want to provide an "official" comment on this? | ||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-09-14 ] | ||||||||||||||||||||||||||||||||||||||
|
I created a mysql-test-run test case for this:
Running cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo with mysql-test-run --mem I see a small but noticeable difference in favour of innodb_file_per_table=1 (which is contrary to the claim):
With mysql-test-run -
On 10.3 with the
Again, if the buffer pool scan for dropping the adaptive hash index is removed, the relative difference becomes smaller:
I would like to see more detailed profile output (Linux perf or similar) that proves that TRUNCATE on an .ibd file is slower than for a table that is located in the InnoDB system tablespace, because I am consistently seeing the opposite. Could the file system I/O play a role? Note: My numbers are for optimized non-debug builds. | ||||||||||||||||||||||||||||||||||||||
| Comment by Geoff Montee (Inactive) [ 2018-10-16 ] | ||||||||||||||||||||||||||||||||||||||
|
The test case that you shared above sets innodb_file_per_table=0 before creating both t0 and t1. Shouldn't it have set innodb_file_per_table=1 before creating one of those if you wanted to test the difference? Are your test results invalid, or was this just a copy/paste error? | ||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-10-17 ] | ||||||||||||||||||||||||||||||||||||||
|
GeoffMontee, my intention was to SET GLOBAL innodb_file_per_table=1 before CREATE TABLE t1. When repeating the (corrected) test today, I do get slower results for t1. It is likely that I accidentally had the tables t0 and t1 swapped in my previous run, and also in the previous test, indeed TRUNCATE is slower for innodb_file_per_table=1 tables:
With innodb_adaptive_hash_index=OFF the difference is bigger:
The difference between the last two runs show the impact of | ||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-10-17 ] | ||||||||||||||||||||||||||||||||||||||
|
Please check with perf record or similar where the difference between innodb_file_per_table=1 and innodb_file_per_table=0 is coming from, with innodb_adaptive_hash_index=0. Side note: Why is 10.2 so much slower than 10.1? Could it be because of page_size_t and page_id_t? | ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-17 ] | ||||||||||||||||||||||||||||||||||||||
|
For 10.1 I used manually started perf record -g -p `pgrep mysqld` just before TRUNCATE. Thame flamegraphs. One query is roughly two times faster for me:
And I see a bit similar staff for row_truncate_table_for_mysql(): 34 vs 74 samples. per_table_0.svg Flame graphs contains a lot of irrelevant info. If you open files in browser you may click on different function to zoom like here: http://www.brendangregg.com/FlameGraphs/example-perf.svg | ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-17 ] | ||||||||||||||||||||||||||||||||||||||
|
One slow thing is a tablespace removal. I mean a system call which unlinks file. Example times: 0.0076 vs 0.0091. But file_per_table=0 is even faster 0.0070. That's a times for seq_1_to_500000. | ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-18 ] | ||||||||||||||||||||||||||||||||||||||
|
file_per_table=0 TRUNCATE: iterates all records on disk and removes them from adaptive hash. file_per_table=1 TRUNCATE: iterates pages in buffer pool, iterates all records in interesting pages and removes them from adaptive hash, after that it recreates tablespace. So, performance of both cases depends on a table size, records size and buffer pool size. I'm not post here my benchmarks but in every case per_table=0 was faster. Even when table is bit and buffer pool is small when per_table=0 iterates a lot and per_table=1 iterates a little the former is slower because of file removal. Major note: do not benchmark both cases simultaneously. And my results are for 10.1. | ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-18 ] | ||||||||||||||||||||||||||||||||||||||
|
Now with innodb_adapt_hash_index=0 per_table=0 is roughly 0.0081 The first one still iterates index pages. | ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-18 ] | ||||||||||||||||||||||||||||||||||||||
|
And now for --mem:
| ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-18 ] | ||||||||||||||||||||||||||||||||||||||
|
And flame graphs. Nothing very new. Performance schema code became noticeable with per_table=0. | ||||||||||||||||||||||||||||||||||||||
| Comment by Marko Mäkelä [ 2018-10-18 ] | ||||||||||||||||||||||||||||||||||||||
|
kevg’s flame graphs are very useful at visualizing this. My summary of the problems found (or previously known):
The fixes require changing the undo log and redo log formats, so they cannot be done in a GA version. | ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-18 ] | ||||||||||||||||||||||||||||||||||||||
|
Now graphs for 10.2 safe_truncate=1 is ha_innobase::create + ha_innobase::delete in profile. safe_truncate=0 does a little bit more stuff than in 10.1 mem_adapt_1_per_table_0_10.2_safe_truncate_0.svg | ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-18 ] | ||||||||||||||||||||||||||||||||||||||
|
And now some benchmark results. They're too variative, I know. But after looking at flame graphs I have an optinion than 10.2 is a little bit slower. safe_truncate=1 is significantly slower, indeed.
| ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-18 ] | ||||||||||||||||||||||||||||||||||||||
|
Nothing more to do with this issue. AHI should be benchmarked to find cases where it brinds performance benefits. And if there aren't many such cases it probably should be removed. Here is the benchmarking task https://jira.mariadb.org/browse/MDEV-17492 | ||||||||||||||||||||||||||||||||||||||
| Comment by Geoff Montee (Inactive) [ 2018-10-18 ] | ||||||||||||||||||||||||||||||||||||||
|
Thanks for those thorough benchmarks, kevg. Thanks for the analysis. It's probably a good idea to fix the incorrect statement in the documentation. I've changed it to say the following:
https://mariadb.com/kb/en/library/truncate-table/ Did I forget or misunderstand anything important? Thanks! | ||||||||||||||||||||||||||||||||||||||
| Comment by Eugene Kosov (Inactive) [ 2018-10-19 ] | ||||||||||||||||||||||||||||||||||||||
|
Hi!. I'll summarise: adapt=0 per_file=0: adapt=0 per_file=1: adapt=1 per_file=0: adapt=1 per_file=1: > TRUNCATE TABLE can also perform poorly in cases where the InnoDB buffer pool is very large. That's a bit imprecise. Only adapt=1 per_file=1 case iterates through buffer pool. Everything else seems correct. Btw, I haven't checked https://mariadb.com/kb/en/library/truncate-table/ because it doesn't work for me today. | ||||||||||||||||||||||||||||||||||||||
| Comment by Geoff Montee (Inactive) [ 2018-10-19 ] | ||||||||||||||||||||||||||||||||||||||
The note about TRUNCATE and large buffer pools in the documentation was not originally related to this issue. It was related to https://bugs.mysql.com/bug.php?id=68184 After looking more closely at the upstream bug report, I see that it is indeed directly related to the issues with innodb_adaptive_hash_index. Thanks for pointing that out. I've clarified the documentation.
The knowledge base site has been having some intermittent issues after the latest web site update. |