[MDEV-12439] MariaRocks produces numerous (spurious?) valgrind failures Created: 2017-04-04  Updated: 2019-08-20  Resolved: 2019-08-13

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - RocksDB
Affects Version/s: 10.2
Fix Version/s: 10.2.27, 10.3.18, 10.4.8

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
PartOf
is part of MDEV-9658 Make MyRocks in MariaDB stable Closed
Relates
relates to MDEV-20315 MyRocks tests produce valgrind failures Closed

 Description   

Any test with MariaRocks produces a lot of valgrind failures.

  • A lot are probably harmless ("still reachable" failures for objects that are allocated once and then never freed)
  • However, we should have suppressions for them
  • There are also "blocks are indirectly lost" and "blocks are definitely lost" failures. These do not look harmless, and they come from upstream: https://github.com/facebook/mysql-5.6/issues/586 .
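
To separate the probably harmless records from the ones that need attention, the leak kinds in a Memcheck log can be tallied. The following is a minimal sketch, not part of the MariaDB test tooling: the helper name summarize_leaks is hypothetical, and the regular expression assumes the usual Memcheck leak-record line format ("N bytes in M blocks are <kind> in loss record ...").

```python
import re
from collections import Counter

# Assumed Memcheck leak-record format, for example:
#   ==14138== 24 bytes in 1 blocks are still reachable in loss record 1 of 10
#   ==14138== 1,024 (512 direct, 512 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 10
LEAK_RE = re.compile(
    r"^==\d+==\s+([\d,]+) (?:\([^)]*\) )?bytes in [\d,]+ blocks are "
    r"(still reachable|possibly lost|indirectly lost|definitely lost) "
    r"in loss record"
)


def summarize_leaks(log_text):
    """Return {leak kind: (record count, total bytes)} for a Memcheck log.

    Hypothetical helper for triage; lines that are not leak records are ignored.
    """
    records = Counter()
    total_bytes = Counter()
    for line in log_text.splitlines():
        match = LEAK_RE.match(line)
        if match:
            kind = match.group(2)
            records[kind] += 1
            # Memcheck prints large numbers with thousands separators.
            total_bytes[kind] += int(match.group(1).replace(",", ""))
    return {kind: (records[kind], total_bytes[kind]) for kind in records}
```

Records reported as "still reachable" are the candidates for suppressions, while "indirectly lost" and "definitely lost" records are the ones to investigate.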


 Comments   
Comment by Sergei Petrunia [ 2018-05-18 ]

Running the testsuite under ASAN does not produce any errors.

Running it under valgrind still does. There are many kinds of "still reachable" errors, with the still-reachable data being in:

  • background threads/global info. This can be suppressed as it is allocated in myrocks::rocksdb_init_func.
  • Block caches. These are hard to suppress meaningfully as the cache is filled during e.g. Get() operation.
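
A suppression for the first category could look roughly like the sketch below. It uses Memcheck's standard suppression-file syntax; the entry name is made up for illustration, and the wildcard around rocksdb_init_func is an assumption chosen to avoid spelling out the mangled C++ symbol (suppression patterns are matched against mangled names).

```
# Illustrative only: suppress "still reachable" blocks whose allocation
# stack passes through myrocks::rocksdb_init_func.
{
   myrocks_init_still_reachable
   Memcheck:Leak
   match-leak-kinds: reachable
   ...
   fun:*rocksdb_init_func*
}
```

The "..." line matches any number of intermediate frames, so the entry applies to anything allocated below the init function. The same approach does not help with block caches, because their allocation stacks start from ordinary read paths such as Get().
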
Comment by Sergei Petrunia [ 2018-05-21 ]

Running valgrind on current upstream fails like so: https://gist.github.com/spetrunia/f5b314b6c039c04ff963838f78a447aa. This is valgrind-3.11 in Ubuntu LTS.

Comment by Sergei Petrunia [ 2018-05-21 ]

Upgraded to latest valgrind (3.13) and the error becomes a warning:

==14138== Warning: unimplemented fcntl command: 1036
==14138== Warning: unimplemented fcntl command: 1036
.... repeated 91 times: ==14138== Warning: unimplemented fcntl command: 1036
^ Found warnings in /home/ubuntu/mysql-5.6/mysql-test/var/log/mysqld.1.err

.. but I still get plenty of failures similar to what I observe in MariaDB:
https://gist.github.com/spetrunia/0e0351e0404d7c854e6549004c19ff7b

Comment by Sergei Petrunia [ 2018-05-22 ]

Discussed with the upstream. Upstream MyRocks is expected to work under valgrind. Will need to see why it fails in so many places for MariaDB.

Comment by Sergei Petrunia [ 2018-05-25 ]

Ran the upstream tests under valgrind (3.13). Some tests hang or time out, but I don't get any valgrind errors.

Comment by Sergei Petrunia [ 2018-05-25 ]

Examples:
https://gist.github.com/spetrunia/bf52c59e8fa6dc45d6aae2302db5d006
https://gist.github.com/spetrunia/3981596638ce90169fe0162f2fae5e30

Comment by Sergei Petrunia [ 2018-05-28 ]

> Ran the upstream tests under valgrind (3.13). Some tests hang or time out, but I don't get any valgrind errors.

However, if I run the valgrind tests on the upstream revision that we merged from, I do see errors:
https://gist.github.com/spetrunia/d6125021b35850d5ab7da111d77e6352

So the [first part of the] solution is to wait for the merge from the upstream.

Comment by Sergei Petrunia [ 2019-08-13 ]

After fixing MDEV-20315 (rocksdb_sys_vars test suite and relevant tests in the mariabackup test suite), I enabled Valgrind for the rocksdb test suite and ran the tests:

...
valgrind_report                          w0 [ pass ]       
--------------------------------------------------------------------------
The servers were restarted 117 times
Spent 44520.265 of 15806 seconds executing testcases

I got no failures.

Comment by Sergei Petrunia [ 2019-08-13 ]

Re-enabled valgrind for the rocksdb test suite.

Generated at Thu Feb 08 07:57:44 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.