[MDEV-32818] InnoDB: Failing assertion: UT_LIST_GET_LEN(trx_list) == 0 in innobase/trx/trx0sys.cc Created: 2023-11-16  Updated: 2023-11-16

Status: Confirmed
Project: MariaDB Server
Component/s: Plugins
Affects Version/s: 10.4
Fix Version/s: 10.4

Type: Bug Priority: Major
Reporter: Ramesh Sivaraman Assignee: Sergei Golubchik
Resolution: Unresolved Votes: 1
Labels: rr-profile-analyzed


 Description   

Replay using CLI

install plugin test_sql_service soname 'test_sql_service';
shutdown;

Leads to

10.4.32 f7552313d4e54a1cbfa5cb9bb9d06a55df7d0e95 (Optimized)

InnoDB: Failing assertion: UT_LIST_GET_LEN(trx_list) == 0

10.4.32 f7552313d4e54a1cbfa5cb9bb9d06a55df7d0e95 (Optimized)

Core was generated by `/test/mtest/MD111123-mariadb-10.4.32-linux-x86_64-opt/bin/mariadbd --no-default'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
[Current thread is 1 (Thread 0x155299fdc7c0 (LWP 3299198))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x000015529a1b5859 in __GI_abort () at abort.c:79
#2  0x00005603754f534c in ut_dbg_assertion_failed (expr=expr@entry=0x560376122838 "UT_LIST_GET_LEN(trx_list) == 0", file=file@entry=0x5603761226a8 "/test/mtest/10.4_opt/storage/innobase/trx/trx0sys.cc", line=line@entry=340) at /test/mtest/10.4_opt/storage/innobase/ut/ut0dbg.cc:60
#3  0x00005603754f39ca in trx_sys_t::close (this=0x5603766baf80 <trx_sys>) at /test/mtest/10.4_opt/storage/innobase/trx/trx0sys.cc:340
#4  0x0000560375c1fc3b in innodb_shutdown () at /test/mtest/10.4_opt/storage/innobase/srv/srv0start.cc:2539
#5  0x0000560375b0bf7b in innobase_end () at /test/mtest/10.4_opt/storage/innobase/handler/ha_innodb.cc:4464
#6  innobase_end () at /test/mtest/10.4_opt/storage/innobase/handler/ha_innodb.cc:4440
#7  0x0000560375828d4b in ha_finalize_handlerton (plugin=0x560377482370) at /test/mtest/10.4_opt/sql/handler.cc:510
#8  0x000056037562030c in plugin_deinitialize (plugin=0x560377482370, ref_check=ref_check@entry=true) at /test/mtest/10.4_opt/sql/sql_plugin.cc:1244
#9  0x000056037562253e in reap_plugins () at /test/mtest/10.4_opt/sql/sql_plugin.cc:1320
#10 0x0000560375623d6d in plugin_shutdown () at /test/mtest/10.4_opt/sql/sql_plugin.cc:2060
#11 0x000056037553e6c7 in clean_up (print_message=print_message@entry=true) at /test/mtest/10.4_opt/sql/mysqld.cc:2010
#12 0x0000560375549d5d in clean_up (print_message=true) at /test/mtest/10.4_opt/sql/mysqld.cc:5979
#13 mysqld_main (argc=<optimized out>, argv=<optimized out>) at /test/mtest/10.4_opt/sql/mysqld.cc:5979
#14 0x000015529a1b7083 in __libc_start_main (main=0x560375521520 <main(int, char**)>, argc=10, argv=0x7fffce977038, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffce977028) at ../csu/libc-start.c:308
#15 0x000056037553ccde in _start () at /test/mtest/10.4_opt/include/mysql/plugin.h:211



 Comments   
Comment by Marko Mäkelä [ 2023-11-16 ]

I was not able to reproduce this with the following steps on a fresh build of 10.4 d4be70afb4eaffbd1c2dd3326e19752770b08bc6:

cd mysql-test
./mtr main.1st
rm -fr /dev/shm/rr
_RR_TRACE_DIR=/dev/shm/rr rr record \
../sql/mariadbd --no-defaults --innodb-log-file-size=5m --plugin-dir $(pwd)/var/plugins --datadir $(pwd)/var/mysqld.1/data &
echo "install plugin test_sql_service soname 'test_sql_service'; shutdown;"|
../client/mysql -uroot test --socket /tmp/mysql.sock
fg
rr replay /dev/shm/rr/latest-trace

By executing watch -l trx_sys.trx_list.count and continue in rr replay I see that some transactions are being created, among other things, for CREATE TABLE and DROP TABLE statements executed by the plugin during the installation time. But, the count will reach 1 during test_sql_service_plugin_deinit() inside plugin_shutdown() and later on 0 during purge_sys_t::close().

Can you please provide an rr replay trace of the failure?

Comment by Ramesh Sivaraman [ 2023-11-16 ]

marko rr trace is saved on galapq server: RR trace location - /test/mtest/MD111123-mariadb-10.4.32-linux-x86_64-opt/rr

Comment by Marko Mäkelä [ 2023-11-16 ]

ramesh, thank you. One reference is indeed being leaked. At the time the shutdown signal SIGUSR1 was received, we had trx_sys.trx_list.count=2, and after that, only purge_sys_t::close() released the one reference that it had. The transaction object once belonged to a transaction that last committed a write here:

10.4 f7552313d4e54a1cbfa5cb9bb9d06a55df7d0e95

#9  0x0000555f787d7d61 in trans_commit_stmt (thd=thd@entry=0x739750023978) at /test/mtest/10.4_opt/sql/transaction.cc:437
#10 0x0000555f786c1bcb in mysql_execute_command (thd=thd@entry=0x739750023978) at /test/mtest/10.4_opt/sql/sql_parse.cc:6276
#11 0x0000555f786da18b in execute_server_code (thd=0x739750023978, sql_text=<optimized out>, sql_len=<optimized out>) at /test/mtest/10.4_opt/sql/sql_prepare.cc:3926
#12 0x0000555f786e1aa6 in loc_advanced_command (mysql=<optimized out>, command=<optimized out>, header=<optimized out>, header_length=<optimized out>, 
    arg=0xf54f8ff080 "INSERT INTO test.ts_table VALUES('1234567890', NULL)", arg_length=52, skip_check=1 '\001', stmt=0x0) at /test/mtest/10.4_opt/sql/sql_prepare.cc:6032
#13 0x0000555f788a9fd4 in server_mysql_send_query (mysql=mysql@entry=0x739750023248, query=<optimized out>, length=<optimized out>) at /test/mtest/10.4_opt/sql-common/client.c:3577
#14 0x0000555f788aa036 in server_mysql_real_query (mysql=0x739750023248, query=<optimized out>, length=<optimized out>) at /test/mtest/10.4_opt/sql-common/client.c:3587
#15 0x000000f54f8fe16c in run_queries (mysql=0x739750023248) at /test/mtest/10.4_opt/plugin/test_sql_service/test_sql_service.c:82
#16 0x000000f54f8fe49f in do_tests () at /test/mtest/10.4_opt/plugin/test_sql_service/test_sql_service.c:113
#17 test_sql_service_plugin_init (p=<optimized out>) at /test/mtest/10.4_opt/plugin/test_sql_service/test_sql_service.c:226
#18 test_sql_service_plugin_init (p=<optimized out>) at /test/mtest/10.4_opt/plugin/test_sql_service/test_sql_service.c:215
#19 0x0000555f787c5c35 in initialize_audit_plugin (plugin=0x555f7a9177e0) at /test/mtest/10.4_opt/sql/sql_audit.cc:276

The object executed one more read-only transaction for select * from test.ts_table, in the same thread that later executed the SHUTDOWN statement.

The problem is outside InnoDB. I see two calls to server_mysql_init and two to server_mysql_close, for both returned handles. The first server_mysql_close would invoke storage engine cleanup:

10.4 f7552313d4e54a1cbfa5cb9bb9d06a55df7d0e95

#0  trx_sys_t::deregister_trx (trx=0x555f79769f80 <trx_sys>, this=<optimized out>) at /test/mtest/10.4_opt/storage/innobase/include/trx0sys.h:1097
#1  trx_t::free (this=this@entry=0x605738f86090) at /test/mtest/10.4_opt/storage/innobase/trx/trx0trx.cc:406
#2  0x0000555f78bb80df in innobase_close_connection (hton=0x555f7a9cdc68, thd=<optimized out>) at /test/mtest/10.4_opt/storage/innobase/handler/ha_innodb.cc:5224
#3  0x0000555f788d57d1 in closecon_handlerton (thd=thd@entry=0x73975002b708, plugin=<optimized out>, unused=unused@entry=0x0) at /test/mtest/10.4_opt/sql/handler.cc:844
#4  0x0000555f786d386a in plugin_foreach_with_mask (thd=thd@entry=0x73975002b708, func=func@entry=0x555f788d5790 <closecon_handlerton(THD*, plugin_ref, void*)>, type=type@entry=1, 
    state_mask=state_mask@entry=10, arg=arg@entry=0x0) at /test/mtest/10.4_opt/sql/sql_plugin.cc:2521
#5  0x0000555f788d83cd in ha_close_connection (thd=thd@entry=0x73975002b708) at /test/mtest/10.4_opt/sql/handler.cc:857
#6  0x0000555f78673a6e in THD::free_connection (this=this@entry=0x73975002b708) at /test/mtest/10.4_opt/sql/sql_class.cc:1599
#7  0x0000555f786792d8 in THD::~THD (this=0x73975002b708, __in_chrg=<optimized out>) at /test/mtest/10.4_opt/sql/sql_class.cc:1682
#8  0x0000555f78679406 in THD::~THD (this=0x73975002b708, __in_chrg=<optimized out>) at /test/mtest/10.4_opt/sql/sql_class.cc:1654
#9  0x0000555f786d9bfe in loc_on_close_free (mysql=0x73975002afd8) at /test/mtest/10.4_opt/sql/sql_prepare.cc:6137
#10 0x0000555f788a9f16 in mysql_close_free (mysql=0x73975002afd8) at /test/mtest/10.4_opt/sql-common/client.c:3355
#11 server_mysql_close (mysql=0x73975002afd8) at /test/mtest/10.4_opt/sql-common/client.c:3472
#12 server_mysql_close (mysql=0x73975002afd8) at /test/mtest/10.4_opt/sql-common/client.c:3463
#13 0x000000f54f8fe4ae in do_tests () at /test/mtest/10.4_opt/plugin/test_sql_service/test_sql_service.c:118
#14 test_sql_service_plugin_init (p=<optimized out>) at /test/mtest/10.4_opt/plugin/test_sql_service/test_sql_service.c:226
#15 test_sql_service_plugin_init (p=<optimized out>) at /test/mtest/10.4_opt/plugin/test_sql_service/test_sql_service.c:215
#16 0x0000555f787c5c35 in initialize_audit_plugin (plugin=0x555f7a9177e0) at /test/mtest/10.4_opt/sql/sql_audit.cc:276
#17 0x0000555f786cfc93 in plugin_do_initialize (plugin=plugin@entry=0x555f7a9177e0, state=@0x75875404724c: 4) at /test/mtest/10.4_opt/sql/sql_plugin.cc:1429

The second call fails to invoke innobase_close_connection in ha_close_connection, because the state_mask=2|8 does not match:

10.4 f7552313d4e54a1cbfa5cb9bb9d06a55df7d0e95

Thread 1 hit Breakpoint 12, 0x0000555f786d37e9 in intern_plugin_lock (state_mask=<optimized out>, rc=<optimized out>, lex=<optimized out>) at /test/mtest/10.4_opt/sql/sql_plugin.cc:2511
2511	in /test/mtest/10.4_opt/sql/sql_plugin.cc
1: x/i $pc
=> 0x555f786d37e9 <plugin_foreach_with_mask(THD*, char (*)(THD*, st_plugin_int*, void*), int, unsigned int, void*)+345>:	test   %edi,0x2c(%rax)
2: *(st_plugin_int*)$rax = {name = {str = 0x555f791a5846 <innobase_hton_name> "InnoDB", length = 6}, plugin = 0x555f7975d320 <builtin_maria_innobase_plugin>, plugin_dl = 0x0, ptr_backup = 0x555f7a9cae10, 
  nbackups = 160, state = 16, ref_count = 0, locks_total = 15, data = 0x555f7a9cdc68, mem_root = {free = 0x0, used = 0x0, pre_alloc = 0x0, min_malloc = 0, block_size = 0, total_alloc = 0, block_num = 0, 
    first_block_usage = 0, error_handler = 0x0, name = 0x0}, system_vars = 0x555f7a91ca48, load_option = PLUGIN_ON}
(rr) bt
#0  0x0000555f786d37e9 in intern_plugin_lock (state_mask=<optimized out>, rc=<optimized out>, lex=<optimized out>) at /test/mtest/10.4_opt/sql/sql_plugin.cc:2511
#1  plugin_foreach_with_mask (thd=thd@entry=0x739750023978, func=func@entry=0x555f788d5790 <closecon_handlerton(THD*, plugin_ref, void*)>, type=type@entry=1, state_mask=state_mask@entry=10, arg=arg@entry=0x0)
    at /test/mtest/10.4_opt/sql/sql_plugin.cc:2511
#2  0x0000555f788d83cd in ha_close_connection (thd=thd@entry=0x739750023978) at /test/mtest/10.4_opt/sql/handler.cc:857
#3  0x0000555f78673a6e in THD::free_connection (this=this@entry=0x739750023978) at /test/mtest/10.4_opt/sql/sql_class.cc:1599
#4  0x0000555f786792d8 in THD::~THD (this=0x739750023978, __in_chrg=<optimized out>) at /test/mtest/10.4_opt/sql/sql_class.cc:1682
#5  0x0000555f78679406 in THD::~THD (this=0x739750023978, __in_chrg=<optimized out>) at /test/mtest/10.4_opt/sql/sql_class.cc:1654
#6  0x0000555f786d9bfe in loc_on_close_free (mysql=0x739750023248) at /test/mtest/10.4_opt/sql/sql_prepare.cc:6137
#7  0x0000555f788a9f16 in mysql_close_free (mysql=0x739750023248) at /test/mtest/10.4_opt/sql-common/client.c:3355
#8  server_mysql_close (mysql=0x739750023248) at /test/mtest/10.4_opt/sql-common/client.c:3472
#9  server_mysql_close (mysql=0x739750023248) at /test/mtest/10.4_opt/sql-common/client.c:3463
#10 0x000000f54f8fe4d6 in test_sql_service_plugin_deinit (p=<optimized out>) at /test/mtest/10.4_opt/plugin/test_sql_service/test_sql_service.c:238

The field state had been changed from 2 to 16 (PLUGIN_IS_DYING) here:

10.4 f7552313d4e54a1cbfa5cb9bb9d06a55df7d0e95

#0  reap_plugins () at /test/mtest/10.4_opt/sql/sql_plugin.cc:1310
#1  0x0000555f786d2d6d in plugin_shutdown () at /test/mtest/10.4_opt/sql/sql_plugin.cc:2060
#2  0x0000555f785ed6c7 in clean_up (print_message=print_message@entry=true) at /test/mtest/10.4_opt/sql/mysqld.cc:2010
#3  0x0000555f785f8d5d in clean_up (print_message=true) at /test/mtest/10.4_opt/sql/mysqld.cc:5979
#4  mysqld_main (argc=<optimized out>, argv=<optimized out>) at /test/mtest/10.4_opt/sql/mysqld.cc:5979
#5  0x000014bb07f1d083 in __libc_start_main (main=0x555f785d0520 <main(int, char**)>, argc=12, argv=0x7ffd2062d9d8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, 
    stack_end=0x7ffd2062d9c8) at ../csu/libc-start.c:308
#6  0x0000555f785ebcde in _start () at /test/mtest/10.4_opt/include/mysql/plugin.h:211

      if (plugin->state == PLUGIN_IS_DELETED && !plugin->ref_count)
      {
        /* change the status flag to prevent reaping by another thread */
        plugin->state= PLUGIN_IS_DYING;
        *(reap++)= plugin;
      }

Generated at Thu Feb 08 10:34:15 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.