[MDEV-13093] Leak of Datafile::m_crypt_info on shutdown after failed startup Created: 2017-06-14  Updated: 2019-01-25  Resolved: 2017-08-31

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.2
Fix Version/s: 10.2.9, 10.3.2

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Jan Lindström (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by MDEV-13315 test xb_compressed_encrypted may cras... Closed
is duplicated by MDEV-13319 REST encryption incremental backup af... Closed
is duplicated by MDEV-13334 Server crashes upon shutdown after un... Closed
is duplicated by MDEV-13557 Startup failure, unable to decrypt ib... Closed
is duplicated by MDEV-18382 Server crashes upon shutdown after un... Closed
Relates
relates to MDEV-13488 InnoDB writes CRYPT_INFO even though ... Closed
relates to MDEV-13485 MTR tests fail massively with --innod... Closed

 Description   

--disable_abort_on_error
--source include/have_innodb.inc
--let restart_parameters= --innodb-page-size=32K --innodb-buffer-pool-size=32M
--source include/restart_mysqld.inc
--echo # All done

10.2 227bfe446

2017-06-14 20:33:39 140704865052416 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2017-06-14 20:33:39 140705234473024 [ERROR] InnoDB: Data file './ibdata1' uses page size 16384, but the innodb_page_size start-up parameter is 32768
2017-06-14 20:33:39 140705234473024 [ERROR] InnoDB: Corrupted page [page id: space=0, page number=0] of datafile './ibdata1' could not be found in the doublewrite buffer.
2017-06-14 20:33:39 140705234473024 [ERROR] InnoDB: Plugin initialization aborted at srv0start.cc[1941] with error Generic error
2017-06-14 20:33:39 140705234473024 [Note] InnoDB: Starting shutdown...
170614 20:33:39 [ERROR] mysqld got signal 6 ;

#5  0x00007ff87b4f73fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007ff87b533bd0 in __libc_message () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x00007ff87b539f96 in malloc_printerr () from /lib/x86_64-linux-gnu/libc.so.6
#8  0x000055bb26d1bf2a in LatchCounter::~LatchCounter (this=0x55bb291e9a98, __in_chrg=<optimized out>) at /data/src/10.2/storage/innobase/include/sync0types.h:598
#9  0x000055bb26d1db48 in LatchMeta<LatchCounter>::~LatchMeta (this=0x55bb291e9a70, __in_chrg=<optimized out>) at /data/src/10.2/storage/innobase/include/sync0types.h:785
#10 0x000055bb26d1bb17 in sync_latch_meta_destroy () at /data/src/10.2/storage/innobase/sync/sync0debug.cc:1620
#11 0x000055bb26d1bdb5 in sync_check_close () at /data/src/10.2/storage/innobase/sync/sync0debug.cc:1837
#12 0x000055bb26d0ac99 in innodb_shutdown () at /data/src/10.2/storage/innobase/srv/srv0start.cc:2922
#13 0x000055bb26b49592 in innobase_init (p=0x55bb291e6ba0) at /data/src/10.2/storage/innobase/handler/ha_innodb.cc:4492
#14 0x000055bb2683fdd3 in ha_initialize_handlerton (plugin=0x55bb290f4230) at /data/src/10.2/sql/handler.cc:512
#15 0x000055bb265eac0e in plugin_initialize (tmp_root=0x7ffdeec0e5b0, plugin=0x55bb290f4230, argc=0x55bb27b146b0 <remaining_argc>, argv=0x55bb28faaf50, options_only=false) at /data/src/10.2/sql/sql_plugin.cc:1407
#16 0x000055bb265eb817 in plugin_init (argc=0x55bb27b146b0 <remaining_argc>, argv=0x55bb28faaf50, flags=2) at /data/src/10.2/sql/sql_plugin.cc:1689
#17 0x000055bb264f8a35 in init_server_components () at /data/src/10.2/sql/mysqld.cc:5253
#18 0x000055bb264f9aba in mysqld_main (argc=146, argv=0x55bb28faaf50) at /data/src/10.2/sql/mysqld.cc:5843
#19 0x000055bb264ee9e0 in main (argc=24, argv=0x7ffdeec0f348) at /data/src/10.2/sql/main.cc:25

Also reproducible on bb-10.2-marko 8f2043673c.



 Comments   
Comment by Marko Mäkelä [ 2017-06-15 ]

In a build configured with cmake -DWITH_ASAN:BOOL=ON, some useful detail is available:

2017-06-15 10:01:40 140541280180096 [Note] InnoDB: Starting shutdown...
=================================================================
==3168==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x61200001b708 in thread T0
    #0 0x736d10 in operator delete(void*) (/mariadb/10.2/build/sql/mysqld+0x736d10)
    #1 0x1e927df in LatchCounter::~LatchCounter() /mariadb/10.2/storage/innobase/include/sync0types.h:598:4
    #2 0x1e92568 in LatchMeta<LatchCounter>::~LatchMeta() /mariadb/10.2/storage/innobase/include/sync0types.h:785:17
    #3 0x1e80129 in sync_latch_meta_destroy() /mariadb/10.2/storage/innobase/sync/sync0debug.cc:1620:3
    #4 0x1e7ff34 in sync_check_close() /mariadb/10.2/storage/innobase/sync/sync0debug.cc:1837:2
    #5 0x1e4809f in innodb_shutdown() /mariadb/10.2/storage/innobase/srv/srv0start.cc:2922:2
0x61200001b708 is located 200 bytes inside of 312-byte region [0x61200001b640,0x61200001b778)
allocated by thread T0 here:
    #0 0x6fd120 in calloc (/mariadb/10.2/build/sql/mysqld+0x6fd120)
    #1 0x2201346 in fil_space_create_crypt_data(unsigned int, fil_encryption_t, unsigned int, unsigned int) /mariadb/10.2/storage/innobase/fil/fil0crypt.cc:207:18
    #2 0x22012fe in fil_space_create_crypt_data(fil_encryption_t, unsigned int) /mariadb/10.2/storage/innobase/fil/fil0crypt.cc:233:10
    #3 0x2202373 in fil_space_read_crypt_data(page_size_t const&, unsigned char const*) /mariadb/10.2/storage/innobase/fil/fil0crypt.cc:308:15
    #4 0x2235cb6 in Datafile::read_first_page(bool) /mariadb/10.2/storage/innobase/fsp/fsp0file.cc:375:17
    #5 0x224cc42 in SysTablespace::read_lsn_and_check_flags(unsigned long*) /mariadb/10.2/storage/innobase/fsp/fsp0sysspace.cc:562:12
    #6 0x2250437 in SysTablespace::open_or_create(bool, bool, unsigned long*, unsigned long*) /mariadb/10.2/storage/innobase/fsp/fsp0sysspace.cc:903:9
    #7 0x1e392e3 in innobase_start_or_create_for_mysql() /mariadb/10.2/storage/innobase/srv/srv0start.cc:1921:22

The reason for this crash is that fil_space_crypt_t::~fil_space_crypt_t() was never called on this object. The fil_space_crypt_t is only allocated by fil_space_create_crypt_data() and only freed by fil_space_destroy_crypt_data(). We are clearly missing a call to the latter.

Comment by Marko Mäkelä [ 2017-06-15 ]

As I have pointed out earlier, the ownership rules of Datafile::m_crypt_info and SysTablespace::m_crypt_info need to be clarified. Why does SysTablespace::m_crypt_info exist at all?
Could the Datafile::m_crypt_info follow something like the C++11 std::unique_ptr move semantics (set Datafile::m_crypt_info to NULL when it is being passed to fil_space_create())?

It is possible that a similar resource leak exists in 10.1 already but is not being caught.

Comment by Marko Mäkelä [ 2017-06-15 ]

An even better approach would be remove Tablespace and Datafile altogether and move the code to member functions of fil_space_t and fil_node_t. But that is something longer-term, too big to do in a GA release.

Comment by Marko Mäkelä [ 2017-06-26 ]

This memory leak affects Mariabackup (bb-10.2-backup) on Windows and Linux. Debug like this:

cmake -DWITH_ASAN=ON ..
make -j$(nproc)
cd mysql-test
./mtr mariabackup.xb_compressed_encrypted mariabackup.incremental_encrypted

Comment by Marko Mäkelä [ 2017-08-31 ]

jplindst, I pushed into 10.2 an amended version of the commit.

Generated at Thu Feb 08 08:02:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.