[MDEV-15034] Jemalloc 5 breaks TokuDB Created: 2018-01-22  Updated: 2023-04-27

Status: Open
Project: MariaDB Server
Component/s: Server, Storage Engine - TokuDB
Affects Version/s: 10.2.12
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Michal Schorm Assignee: Vicențiu Ciorbaru
Resolution: Unresolved Votes: 2
Labels: upstream
Environment:

Fedora Rawhide - Koji Building system



 Description   

Hello,

I can't build MariaDB with TokuDB and have the testsuite pass on x86_64.

I've built 10.2.12 into Fedora without TokuDB without any problem. However, users have a broken update path if a new TokuDB subpackage is not provided.

The testsuite freezes on one test, which is not always the same one and does not belong to the TokuDB testsuite.
It is mostly somewhere in the roles. and sys_vars. testsuites.

A few times I was lucky and the testsuite didn't freeze. But in that case more than 500 tests fail (when none should).

I filed this report now, because I need to start working on it and any help will be appreciated.

I'll gather as much info as I can, and I'll try to make a reproducer somewhere other than the Fedora build system with the Rawhide buildroot.
So, do expect more info later this week.



 Comments   
Comment by Daniel Black [ 2018-01-22 ]

Is a URL for that build available? I only see the ones from after you disabled TokuDB.

https://koji.fedoraproject.org/koji/builds?userID=mschorm&order=-build_id&prefix=m

Comment by Michal Schorm [ 2018-01-23 ]

I made a fresh scratch build to ensure that the issue hasn't vanished in the meantime:
https://koji.fedoraproject.org/koji/getfile?taskID=24373850&volume=DEFAULT&name=build.log

You can see that:

  • the job is still not yet finished
  • the TokuDB testsuite part was retried several hundred times, with a strange output format

CURRENT_TEST: tokudb.locks-update-deadlock-1
CURRENT_TEST: tokudb.mvcc-1
CURRENT_TEST: tokudb.mvcc-10
CURRENT_TEST: tokudb.mvcc-11

instead of

roles.set_and_drop                       [ pass ]     22
roles.set_default_role_clear             [ pass ]      7
roles.set_default_role_for               [ pass ]      6
roles.set_default_role_invalid           [ pass ]      5
roles.set_default_role_new_connection    [ pass ]     19

  • There is a repeated error message (probably the cause?)

Failed to start mysqld.1
 - skipping '/builddir/build/BUILD/mariadb-10.2.12/mysql-test/var/log/tokudb_mariadb.compression/'
***Warnings generated in error logs during shutdown after running tests: tokudb_mariadb.compression
2018-01-22 19:12:59 139891771803840 [ERROR] Couldn't load plugins from 'ha_tokudb.so'.
2018-01-22 19:12:59 139891771803840 [ERROR] /builddir/build/BUILD/mariadb-10.2.12/sql/mysqld: unknown option '--tokudb'
2018-01-22 19:12:59 139891771803840 [ERROR] Aborting

or

Failed to start mysqld.1
 - skipping '/builddir/build/BUILD/mariadb-10.2.12/mysql-test/var/log/tokudb_mariadb.commit_5396/'
***Warnings generated in error logs during shutdown after running tests: tokudb_mariadb.commit_5396
2018-01-22 19:12:59 140019852216512 [ERROR] Couldn't load plugins from 'ha_tokudb.so'.
2018-01-22 19:12:59 140019852216512 [ERROR] /builddir/build/BUILD/mariadb-10.2.12/sql/mysqld: unknown option '--tokudb'
2018-01-22 19:12:59 140019852216512 [ERROR] Aborting
tokudb_mariadb.compression               [ fail ]

  • even though the plugin should be there:

-- Installing: /builddir/build/BUILDROOT/mariadb-10.2.12-3.fc28.x86_64/usr/lib64/mariadb/plugin/ha_tokudb.so

  • Why doesn't mysqld know the option '--tokudb'?
Comment by Elena Stepanova [ 2018-01-23 ]

It doesn't know the option --tokudb because as it says earlier, it cannot load plugins from ha_tokudb.so, including the engine itself. Why it cannot load plugins, that's the question.
As I understand, it only happens on Fedora 28? We don't have the builder yet, so I can't compare it with ours.
Do you have the server log? It might show why the library cannot be loaded.
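
The cause-and-effect reading above (the option is unknown because the plugin library failed to load first) can be checked mechanically on a server log. The following is only a minimal sketch: it assumes the log line layout seen in the excerpts in this report, and the function name triage is made up for illustration.

```python
import re

# Layout assumed from the excerpts in this report:
# "<date> <time> <thread id> [LEVEL] message"
LOG_LINE = re.compile(r"^\S+ \S+ \d+ \[(?P<level>\w+)\] (?P<msg>.*)$")
PLUGIN_FAIL = re.compile(r"Couldn't load plugins from '(?P<lib>[^']+)'")
UNKNOWN_OPT = re.compile(r"unknown option '--(?P<opt>[^']+)'")


def triage(log_text):
    """Pair each 'unknown option' error with the latest plugin-load failure."""
    findings = []
    failed_lib = None
    for line in log_text.splitlines():
        m = LOG_LINE.match(line.strip())
        if not m or m.group("level") != "ERROR":
            continue
        msg = m.group("msg")
        plugin = PLUGIN_FAIL.search(msg)
        if plugin:
            # Remember which library failed; later errors may follow from it.
            failed_lib = plugin.group("lib")
            continue
        unknown = UNKNOWN_OPT.search(msg)
        if unknown:
            findings.append((unknown.group("opt"), failed_lib))
    return findings


sample = """\
2018-01-22 19:12:59 139891771803840 [ERROR] Couldn't load plugins from 'ha_tokudb.so'.
2018-01-22 19:12:59 139891771803840 [ERROR] /builddir/build/BUILD/mariadb-10.2.12/sql/mysqld: unknown option '--tokudb'
2018-01-22 19:12:59 139891771803840 [ERROR] Aborting
"""

for opt, lib in triage(sample):
    print(f"--{opt} is unknown; preceding load failure: {lib}")
```

The sketch only correlates the two messages. It does not say why the library failed to load, which is still the open question here.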

Comment by Michal Schorm [ 2018-01-23 ]

I forgot to paste the link to the whole build, my bad:
https://koji.fedoraproject.org/koji/taskinfo?taskID=24373850

All the available logs are accessible from there.
However I didn't find anything in them that seemed suspicious to me.

Comment by Elena Stepanova [ 2018-01-23 ]

It might be the problem that cvicentiu investigated just recently, with jemalloc 5.0. This log shows jemalloc 5.0.1:
https://kojipkgs.fedoraproject.org//work/tasks/3850/24373850/root.log

DEBUG util.py:439:   jemalloc                     x86_64 5.0.1-1.fc28                   build 189 k

Comment by Vicențiu Ciorbaru [ 2018-01-23 ]

Hi mschorm
There was an architectural change in jemalloc 5.0.0 that broke backwards compatibility. TokuDB specifically relies on an introspection feature in jemalloc to set up some variables. Search for opt.lg_chunk in:

https://raw.githubusercontent.com/jemalloc/jemalloc/master/ChangeLog

Currently I don't have a solution to this.

Comment by Vicențiu Ciorbaru [ 2018-01-23 ]

From jemalloc's ChangeLog (quoted here because I'm not sure how stable the link is):

- Remove mallctl interfaces (various authors):
    + config.munmap
    + config.tcache
    + config.tls
    + config.valgrind
    + opt.lg_chunk
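
Whether a given jemalloc build still answers the opt.lg_chunk query can be probed from outside the server. The sketch below is not TokuDB's code. It assumes the documented mallctl(name, oldp, oldlenp, newp, newlen) entry point, assumes the symbol is exported as mallctl or je_mallctl, and uses a made-up helper name, probe_mallctl. Loading the library with dlopen() into a running process may itself fail on some builds, so that case is reported separately.

```python
import ctypes
import ctypes.util
import errno


def probe_mallctl(name, lib_path=None):
    """Read a size_t-valued mallctl name; return (status, value)."""
    path = lib_path or ctypes.util.find_library("jemalloc")
    if path is None:
        return ("no-jemalloc", None)
    try:
        lib = ctypes.CDLL(path)  # dlopen() into the already running interpreter
    except OSError:
        return ("load-failed", None)
    # Assumption: the symbol is mallctl, or je_mallctl on prefixed builds.
    fn = getattr(lib, "mallctl", None)
    if fn is None:
        fn = getattr(lib, "je_mallctl", None)
    if fn is None:
        return ("no-symbol", None)
    fn.restype = ctypes.c_int
    fn.argtypes = [ctypes.c_char_p, ctypes.c_void_p,
                   ctypes.POINTER(ctypes.c_size_t),
                   ctypes.c_void_p, ctypes.c_size_t]
    value = ctypes.c_size_t(0)
    length = ctypes.c_size_t(ctypes.sizeof(value))
    rc = fn(name.encode(), ctypes.byref(value), ctypes.byref(length), None, 0)
    if rc == 0:
        return ("ok", value.value)
    if rc == errno.ENOENT:
        return ("unknown-name", None)  # the name is not in the mallctl namespace
    return ("error", rc)


if __name__ == "__main__":
    print(probe_mallctl("opt.lg_chunk"))
```

With a library that has dropped the interface, the expected outcome is the unknown-name status, since mallctl reports ENOENT for names it does not recognize.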

Comment by Michal Schorm [ 2018-02-18 ]

Discussion on the Percona upstream: TDB-108

Comment by Michal Schorm [ 2019-01-23 ]

It seems everything works (in the built package).

However I still hit:

TokuDB is enabled, but jemalloc is not. This configuration is not supported

CMake warning message originating from "storage/tokudb/CMakeLists.txt" line 51.

And from looking at "storage/tokudb/CMakeLists.txt" and "cmake/jemalloc.cmake" I wasn't able to tell whether it's a real issue or just a false positive.


My CMake configuration is:
-DPLUGIN_TOKUDB=DYNAMIC
-DWITH_JEMALLOC=YES

I have "jemalloc-devel" package in the buildroot.

SPECfile:
https://src.fedoraproject.org/rpms/mariadb/blob/master/f/mariadb.spec
Build log:
https://kojipkgs.fedoraproject.org//packages/mariadb/10.3.12/6.fc30/data/logs/x86_64/build.log


I then ship the built package with:

[Service]
Environment="LD_PRELOAD=%{_libdir}/libjemalloc.so.2"

in the /usr/lib/systemd/system/mariadb.service.d/tokudb.conf

(but I don't have this configuration in the buildroot - can that be an issue?)

Comment by Michal Schorm [ 2019-05-10 ]

Can be closed.

I finally found a solution.

"WITH_JEMALLOC" seems to be the only option that does not accept uppercase values, because it's not a CMake bool but a string, and the uppercase values are not among the strings it defines.

When using "yes" or "no", it works as expected.
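
The difference between the two kinds of option can be modelled in a few lines. This is only a sketch: the boolean constants are the ones CMake's if() recognizes case-insensitively, while the accepted string set is an assumption taken from this comment ("yes" and "no") and is compared case-sensitively, the way STREQUAL does. The actual contents of cmake/jemalloc.cmake are not reproduced here.

```python
# Constants that CMake's if() treats as booleans, compared case-insensitively.
CMAKE_TRUE = {"1", "ON", "YES", "TRUE", "Y"}
CMAKE_FALSE = {"0", "OFF", "NO", "FALSE", "N", "IGNORE", "NOTFOUND", ""}

# Assumption: the values this comment reports as working for WITH_JEMALLOC.
STRING_VALUES = {"yes", "no"}


def cmake_bool(value):
    """Approximate how a boolean option value is evaluated."""
    upper = value.upper()
    if upper in CMAKE_TRUE:
        return True
    if upper in CMAKE_FALSE or upper.endswith("-NOTFOUND"):
        return False
    return None  # not a boolean constant (numbers other than 0/1 not modelled)


def string_option_recognized(value, accepted=STRING_VALUES):
    """Case-sensitive match against fixed strings, as with STREQUAL."""
    return value in accepted


for candidate in ("YES", "yes"):
    print(candidate,
          "as bool:", cmake_bool(candidate),
          "as string option:", string_option_recognized(candidate))
```

Under these assumptions "YES" is a valid boolean but is not recognized as a string value, which matches the behaviour described above.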

Offtopic question:
Why doesn't the ha_tokudb.so plugin link the jemalloc library directly, so that LD_PRELOAD has to be configured instead?

Comment by Sergei Golubchik [ 2019-05-10 ]

We used to link ha_tokudb.so with jemalloc. But that meant jemalloc was loaded at run time (together with ha_tokudb.so) into an already running process (mysqld). That sounds like a questionable idea. Still, it worked until jemalloc 5.

With jemalloc 5 it broke completely: jemalloc failed to initialize when loaded at run time. It wanted to handle the whole process.

So I came up with this LD_PRELOAD approach. When the tokudb rpm is not installed, there is no jemalloc. The tokudb rpm brings in /usr/lib/systemd/system/mariadb.service.d/tokudb.conf, which LD_PRELOADs jemalloc for the whole server.

As a bonus, a user has the freedom not to load jemalloc at all, or to load, say, tcmalloc instead; some users have requested that.
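
To confirm which allocator a running server actually got through LD_PRELOAD, the process's memory map can be inspected. The following is a sketch only: it assumes a Linux /proc filesystem, assumes library file names of the form libjemalloc*.so* or libtcmalloc*.so*, and uses made-up helper names.

```python
import re

# Matches a mapped path whose file name is libjemalloc*.so* or libtcmalloc*.so*.
ALLOCATOR = re.compile(r"/lib(?P<name>jemalloc|tcmalloc)[^/\s]*\.so[^/\s]*$")


def preloaded_allocators(maps_text):
    """Return the sorted allocator names found in /proc/<pid>/maps content."""
    names = set()
    for line in maps_text.splitlines():
        m = ALLOCATOR.search(line.strip())
        if m:
            names.add(m.group("name"))
    return sorted(names)


def allocators_for_pid(pid):
    """Read the map of a live process (Linux only; needs read permission)."""
    with open(f"/proc/{pid}/maps") as fh:
        return preloaded_allocators(fh.read())


# Made-up map excerpt for illustration, not taken from a real server.
sample_maps = """\
55d0c0a00000-55d0c1e00000 r-xp 00000000 fd:01 111 /usr/sbin/mysqld
7f1c2a000000-7f1c2a04e000 r-xp 00000000 fd:01 222 /usr/lib64/libjemalloc.so.2
7f1c2a400000-7f1c2a5b0000 r-xp 00000000 fd:01 333 /usr/lib64/libc.so.6
"""

print(preloaded_allocators(sample_maps))
```

Calling allocators_for_pid with the pid of the running server would show whether the drop-in took effect. An empty result would suggest the default allocator is in use.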

Generated at Thu Feb 08 08:18:11 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.