[MDEV-5777] Significant performance regression using TokuDB tables in mariadb.org builds Created: 2014-03-03 Updated: 2014-03-26 Resolved: 2014-03-26 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 5.5.36, 10.0.8 |
| Fix Version/s: | 5.5.37, 10.0.10 |
| Type: | Bug | Priority: | Major |
| Reporter: | Tim Callaghan (Inactive) | Assignee: | Sergei Golubchik |
| Resolution: | Fixed | Votes: | 3 |
| Labels: | benchmarking, performance, tokudb | ||
| Environment: |
Ubuntu 13.10, (2) x Intel Xeon E5540, 48GB, 8 x 10K SAS in RAID10, 256MB Cache, XFS |
||
| Attachments: |
|
| Description |
|
In testing the performance of TokuDB tables using the MariaDB builds (available from mariadb.org), I noticed significant performance regressions. The following is for Sysbench (data > cache, IO bound): tables were bulk-loaded, then the test was run with 64 and 128 client threads, 5 minutes per thread count. Results as follows:
MariaDB 10.0.8 : source = mariadb.org
MariaDB 5.5.36 : source = mariadb.org
MariaDB 5.5.30 : source = tokutek.com
MySQL 5.5.30 : source = tokutek.com
MySQL 5.5.36 : source = tokutek.com |
| Comments |
| Comment by Elena Stepanova [ 2014-03-03 ] |
|
XL, Could you please take a look at the results and probably run tests on our side? From IRC: <tmcallaghan_> build scripts are in github |
| Comment by Tim Callaghan (Inactive) [ 2014-03-05 ] |
|
We suspect that the issue is build-related. I'm confirming today and will provide more information when my tests complete. |
| Comment by Tim Callaghan (Inactive) [ 2014-03-05 ] |
|
I believe that ~5% of this performance regression is due to a misconfigured debug build parameter; optimized builds should be produced using "-DTOKU_DEBUG_PARANOID=OFF". Here is my performance with your 10.0.8 build, versus a 10.0.8 build with -DTOKU_DEBUG_PARANOID=OFF. Our MariaDB 5.5.30 builds are significantly faster. |
| Comment by Axel Schwenke [ 2014-03-07 ] |
|
I was able to reproduce the issue. I tested the following versions: The benchmark description is in the attached .ods file |
| Comment by Tim Callaghan (Inactive) [ 2014-03-07 ] |
|
Axel, is your reproducer enough for MariaDB to continue researching or do you need something else from me? |
| Comment by Axel Schwenke [ 2014-03-07 ] |
|
Thanks Tim, I think we can continue from here on our own. Next step should be pulling profiles for the different versions. There is however one question: you mention that you bulk-loaded the tables. Is that with sysbench or did you use the tokudb loader? I also notice some oddities that might need attention later:
|
| Comment by Tim Callaghan (Inactive) [ 2014-03-07 ] |
|
Axel, bulk loading has become less important as we've improved TokuDB's concurrency over time, so I do not think it matters for this test. The performance dips are indeed checkpoint-related: our checkpointing is where most of the writing (and thus compression) occurs, so CPU availability is important. |
| Comment by Axel Schwenke [ 2014-03-13 ] |
|
Tim, can you give me more information on how exactly you created the datadir in your tests? I guess you create a fresh datadir with mysql_install_db and then populate the sysbench tables with sysbench ... prepare? Which version of sysbench? The reason for asking is this: I observe different performance - even for read-only OLTP - when the datadir was created by different versions of MariaDB. That is, when the datadir is created with our 5.5.35 build, both our 5.5.35 and your 5.5.30 are slow. When the data is created with 5.5.30, both are fast(er). Right now I'm running another test to check this in detail. Related question: if the above behavior is confirmed, is there any way to examine the tokudb table files for differences, like fragmentation or whatever? Any ideas what might cause this slowdown? FYI, here is how I create the datadir: I start with a fresh datadir from mysql_install_db. Then I run sysbench with parallel_prepare.lua to create all OLTP tables in parallel. I'm using sysbench trunk from https://code.launchpad.net/sysbench |
| Comment by Axel Schwenke [ 2014-03-13 ] |
|
Here are the results. Indeed it seems that the performance in the OLTP benchmark is determined by the server version that was used to load the tokudb tables. If both servers run on the same datadir, the performance is virtually the same. |
| Comment by Axel Schwenke [ 2014-03-14 ] |
|
Here comes another piece of information. I pulled SHOW GLOBAL VARIABLES from both 5.5.30 (tokutek) and 5.5.35. This file shows only those variables that differ. However none of those variables explains the performance difference. |
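The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the script that produced the attached file: it assumes each server's `SHOW GLOBAL VARIABLES` output was saved as tab-separated `name<TAB>value` lines (for example from the mysql client in batch mode), and the function names `parse_variables` and `diff_variables` are hypothetical.

```python
from typing import Dict, Optional, Tuple


def parse_variables(dump: str) -> Dict[str, str]:
    """Parse tab-separated `name<TAB>value` lines into a dict.

    Assumes one variable per line; blank lines are skipped.
    """
    variables: Dict[str, str] = {}
    for line in dump.splitlines():
        if not line.strip():
            continue
        name, _, value = line.partition("\t")
        variables[name.strip()] = value.strip()
    return variables


def diff_variables(
    a: Dict[str, str], b: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return {name: (value_in_a, value_in_b)} for every variable that differs.

    A variable present on only one side is reported with None on the other.
    """
    differing: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for name in sorted(set(a) | set(b)):
        left, right = a.get(name), b.get(name)
        if left != right:
            differing[name] = (left, right)
    return differing
```

Feeding it the two saved dumps yields only the variables whose values differ, plus any variable that exists on one server but not the other.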
| Comment by Joel Epstein (Inactive) [ 2014-03-20 ] |
|
Hi Axel, My name is Joel and I am a QA Engineer here at Tokutek. Tim asked me to do some further investigation on this issue and it appears that the compression is not being set properly at the CREATE statement. When the sysbench prepare phase runs with MariaDB 5.5.36 and 10.0.9, none of the tables are compressed and the datadir sizes reflect this. When the sysbench prepare phase runs with TokuDB's version of MariaDB, the compression can be set successfully. Is there a way such that the TokuDB compression could be set to zlib by default, rather than what it is currently defaulting to (none)? |
| Comment by Tim Callaghan (Inactive) [ 2014-03-20 ] |
|
It appears that the COMPRESSION table create option does not pick up the tokudb_row_format default that is set for the session/global variable. Therefore, users who do not specify "COMPRESSION=" in their CREATE TABLE statements get uncompressed tables, which is the current MariaDB default. |
| Comment by Axel Schwenke [ 2014-03-21 ] |
|
Hi Joel, Tim! Good catch! How embarrassing that I didn't notice the size difference for the data directories. Note to self: whenever looking at a performance problem with a compressing engine, check table space size first. Anyway: Serg has already fixed this in his working tree and it should soon be pushed to the main trees. For 5.5 the compression default will be zlib; for 10.0 the compression default will be taken from a session variable (which in turn is initialized from the global variable). |
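A minimal sketch of the "check the size first" step, assuming the two data directories are reachable on the local filesystem. The helper name `directory_size_bytes` is hypothetical and nothing here is TokuDB-specific; it simply sums file sizes so that an uncompressed datadir stands out next to a compressed one.

```python
import os


def directory_size_bytes(path: str) -> int:
    """Return the total size in bytes of all regular files under `path`.

    Symbolic links are skipped so that linked files are not counted twice.
    """
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            if not os.path.islink(full_path):
                total += os.path.getsize(full_path)
    return total


# Example use (paths are placeholders, not the ones used in this report):
#   size_a = directory_size_bytes("/path/to/datadir_a")
#   size_b = directory_size_bytes("/path/to/datadir_b")
#   print(size_a, size_b)
```

A large gap between the two totals for the same sysbench data set would suggest that the tables were created with different compression settings.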
| Comment by Tim Callaghan (Inactive) [ 2014-03-21 ] |
|
I modified the KB page at https://mariadb.com/kb/en/tokudb-differences/, noting this behavioral difference. |
| Comment by Tim Callaghan (Inactive) [ 2014-03-21 ] |
|
Thanks for the update, Axel. I wish the default could come from the session variable in both, but zlib is much better than uncompressed. |
| Comment by Axel Schwenke [ 2014-03-21 ] |
|
Serg, please close this issue when you see fit. |