[MDEV-24422] Server crashes in GetTypeID / ha_connect::GetRealType upon altering table engine Created: 2020-12-16  Updated: 2021-03-15  Resolved: 2021-03-10

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Connect
Affects Version/s: 10.5, 10.6
Fix Version/s: 10.5.10

Type: Bug Priority: Major
Reporter: Elena Stepanova Assignee: Michael Widenius
Resolution: Fixed Votes: 0
Labels: None


 Description   

--source include/have_innodb.inc
 
INSTALL SONAME 'ha_connect';
 
CREATE TABLE t (f INT) ENGINE=CONNECT;
ALTER TABLE t ENGINE InnoDB;
 
# Cleanup
DROP TABLE t;
UNINSTALL SONAME 'ha_connect';

10.5 6bb3949e

#3  <signal handler called>
#4  __strcasecmp_l_avx () at ../sysdeps/x86_64/multiarch/strcmp-sse42.S:270
#5  0x00007fa2ad4da34d in GetTypeID (type=0xa5a5a5a5a5a5a500 <error: Cannot access memory at address 0xa5a5a5a5a5a5a500>) at /data/src/10.5/storage/connect/mycat.cc:125
#6  0x00007fa2ad4bf274 in ha_connect::GetRealType (this=0x7fa288016428, pos=0x7fa2881aea28) at /data/src/10.5/storage/connect/ha_connect.cc:1056
#7  0x00007fa2ad4ca002 in ha_connect::check_privileges (this=0x7fa288016428, thd=0x7fa288000db8, options=0x7fa2881aea28, dbn=0x7fa2bcc4a560 "test", quick=false) at /data/src/10.5/storage/connect/ha_connect.cc:4479
#8  0x00007fa2ad4cbb85 in ha_connect::delete_or_rename_table (this=0x7fa288016428, name=0x7fa2bcc4b4e0 "./test/t", to=0x0) at /data/src/10.5/storage/connect/ha_connect.cc:5216
#9  0x00007fa2ad4cbd2c in ha_connect::delete_table (this=0x7fa288016428, name=0x7fa2bcc4b4e0 "./test/t") at /data/src/10.5/storage/connect/ha_connect.cc:5245
#10 0x000055757c8e763d in hton_drop_table (hton=0x7fa2881b3ca8, path=0x7fa2bcc4b4e0 "./test/t") at /data/src/10.5/sql/handler.cc:564
#11 0x000055757c8ecd71 in ha_delete_table (thd=0x7fa288000db8, hton=0x7fa2881b3ca8, path=0x7fa2bcc4b4e0 "./test/t", db=0x7fa2bcc4c820, alias=0x7fa2bcc4c830, generate_warning=false) at /data/src/10.5/sql/handler.cc:2770
#12 0x000055757c66fd14 in quick_rm_table (thd=0x7fa288000db8, base=0x7fa2881b3ca8, db=0x7fa2bcc4c820, table_name=0x7fa2bcc4c830, flags=4, table_path=0x0) at /data/src/10.5/sql/sql_table.cc:2884
#13 0x000055757c687be5 in mysql_alter_table (thd=0x7fa288000db8, new_db=0x7fa288005800, new_name=0x7fa288005c00, create_info=0x7fa2bcc4d420, table_list=0x7fa288014018, alter_info=0x7fa2bcc4d350, order_num=0, order=0x0, ignore=false, if_exists=false) at /data/src/10.5/sql/sql_table.cc:11014
#14 0x000055757c72ea94 in Sql_cmd_alter_table::execute (this=0x7fa2880146f8, thd=0x7fa288000db8) at /data/src/10.5/sql/sql_alter.cc:539
#15 0x000055757c5859cc in mysql_execute_command (thd=0x7fa288000db8) at /data/src/10.5/sql/sql_parse.cc:6006
#16 0x000055757c58bd88 in mysql_parse (thd=0x7fa288000db8, rawbuf=0x7fa288013f30 "ALTER TABLE t ENGINE InnoDB", length=27, parser_state=0x7fa2bcc4e510, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:8042
#17 0x000055757c577d6b in dispatch_command (command=COM_QUERY, thd=0x7fa288000db8, packet=0x7fa2880090a9 "", packet_length=27, is_com_multi=false, is_next_command=false) at /data/src/10.5/sql/sql_parse.cc:1872
#18 0x000055757c57655f in do_command (thd=0x7fa288000db8) at /data/src/10.5/sql/sql_parse.cc:1353
#19 0x000055757c723e57 in do_handle_one_connection (connect=0x55757f167608, put_in_cache=true) at /data/src/10.5/sql/sql_connect.cc:1410
#20 0x000055757c723bba in handle_one_connection (arg=0x55757f181068) at /data/src/10.5/sql/sql_connect.cc:1312
#21 0x000055757cc8248f in pfs_spawn_thread (arg=0x55757f167248) at /data/src/10.5/storage/perfschema/pfs.cc:2201
#22 0x00007fa2c3dbe609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#23 0x00007fa2c3992293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Reproducible on 10.5, 10.6.
A non-debug build doesn't crash on my machine, but a non-debug ASAN build does (also with SIGSEGV), so it is not just a debug problem.

Couldn't reproduce on 10.2-10.4.



 Comments   
Comment by Olivier Bertrand [ 2020-12-17 ]

I cannot reproduce it on my machine.

Comment by Elena Stepanova [ 2020-12-17 ]

Try valgrind (build with -DCMAKE_BUILD_TYPE=Debug -DWITH_VALGRIND=YES and run MTR with --valgrind option), maybe it will have more luck.

10.5 valgrind 6bb3949e

==67371== Thread 14:
==67371== Conditional jump or move depends on uninitialised value(s)
==67371==    at 0x118D2CDE: GetTypeID(char const*) (mycat.cc:124)
==67371==    by 0x118B74BB: ha_connect::GetRealType(ha_table_option_struct*) (ha_connect.cc:1056)
==67371==    by 0x118C224B: ha_connect::check_privileges(THD*, ha_table_option_struct*, char const*, bool) (ha_connect.cc:4479)
==67371==    by 0x118C3DCE: ha_connect::delete_or_rename_table(char const*, char const*) (ha_connect.cc:5216)
==67371==    by 0x118C3F75: ha_connect::delete_table(char const*) (ha_connect.cc:5245)
==67371==    by 0xD9C3CA: hton_drop_table(handlerton*, char const*) (handler.cc:564)
==67371==    by 0xDA1AEA: ha_delete_table(THD*, handlerton*, char const*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, bool) (handler.cc:2770)
==67371==    by 0xB17490: quick_rm_table(THD*, handlerton*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, unsigned int, char const*) (sql_table.cc:2884)
==67371==    by 0xB2F4C8: mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool) (sql_table.cc:11014)
==67371==    by 0xBDB167: Sql_cmd_alter_table::execute(THD*) (sql_alter.cc:539)
==67371==    by 0xA2A8DF: mysql_execute_command(THD*) (sql_parse.cc:6006)
==67371==    by 0xA30D6B: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:8042)
==67371==    by 0xA1CBFC: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1872)
==67371==    by 0xA1B3F0: do_command(THD*) (sql_parse.cc:1353)
==67371==    by 0xBCFA34: do_handle_one_connection(CONNECT*, bool) (sql_connect.cc:1410)
==67371==    by 0xBCF797: handle_one_connection (sql_connect.cc:1312)
==67371==    by 0x114AE80: pfs_spawn_thread (pfs.cc:2201)
==67371==    by 0x4C31608: start_thread (pthread_create.c:477)
==67371==    by 0x50C0292: clone (clone.S:95)
==67371== Invalid read of size 8
==67371==    at 0x118B74C9: ha_connect::GetRealType(ha_table_option_struct*) (ha_connect.cc:1059)
==67371==    by 0x118C224B: ha_connect::check_privileges(THD*, ha_table_option_struct*, char const*, bool) (ha_connect.cc:4479)
==67371==    by 0x118C3DCE: ha_connect::delete_or_rename_table(char const*, char const*) (ha_connect.cc:5216)
==67371==    by 0x118C3F75: ha_connect::delete_table(char const*) (ha_connect.cc:5245)
==67371==    by 0xD9C3CA: hton_drop_table(handlerton*, char const*) (handler.cc:564)
==67371==    by 0xDA1AEA: ha_delete_table(THD*, handlerton*, char const*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, bool) (handler.cc:2770)
==67371==    by 0xB17490: quick_rm_table(THD*, handlerton*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, unsigned int, char const*) (sql_table.cc:2884)
==67371==    by 0xB2F4C8: mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool) (sql_table.cc:11014)
==67371==    by 0xBDB167: Sql_cmd_alter_table::execute(THD*) (sql_alter.cc:539)
==67371==    by 0xA2A8DF: mysql_execute_command(THD*) (sql_parse.cc:6006)
==67371==    by 0xA30D6B: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:8042)
==67371==    by 0xA1CBFC: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1872)
==67371==    by 0xA1B3F0: do_command(THD*) (sql_parse.cc:1353)
==67371==    by 0xBCFA34: do_handle_one_connection(CONNECT*, bool) (sql_connect.cc:1410)
==67371==    by 0xBCF797: handle_one_connection (sql_connect.cc:1312)
==67371==    by 0x114AE80: pfs_spawn_thread (pfs.cc:2201)
==67371==    by 0x4C31608: start_thread (pthread_create.c:477)
==67371==    by 0x50C0292: clone (clone.S:95)
==67371==  Address 0x117416c8 is 8 bytes before a block of size 56 alloc'd
==67371==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==67371==    by 0x1742322: my_malloc (my_malloc.c:88)
==67371==    by 0x1732029: alloc_root (my_alloc.c:190)
==67371==    by 0xB6B19F: TABLE_SHARE::init_from_binary_frm_image(THD*, bool, unsigned char const*, unsigned long, unsigned char const*, unsigned long) (table.cc:3234)
==67371==    by 0xB6294D: open_table_def(THD*, TABLE_SHARE*, unsigned int) (table.cc:714)
==67371==    by 0x118C3D50: ha_connect::delete_or_rename_table(char const*, char const*) (ha_connect.cc:5211)
==67371==    by 0x118C3F75: ha_connect::delete_table(char const*) (ha_connect.cc:5245)
==67371==    by 0xD9C3CA: hton_drop_table(handlerton*, char const*) (handler.cc:564)
==67371==    by 0xDA1AEA: ha_delete_table(THD*, handlerton*, char const*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, bool) (handler.cc:2770)
==67371==    by 0xB17490: quick_rm_table(THD*, handlerton*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, unsigned int, char const*) (sql_table.cc:2884)
==67371==    by 0xB2F4C8: mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool) (sql_table.cc:11014)
==67371==    by 0xBDB167: Sql_cmd_alter_table::execute(THD*) (sql_alter.cc:539)
==67371==    by 0xA2A8DF: mysql_execute_command(THD*) (sql_parse.cc:6006)
==67371==    by 0xA30D6B: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:8042)
==67371==    by 0xA1CBFC: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1872)
==67371==    by 0xA1B3F0: do_command(THD*) (sql_parse.cc:1353)
==67371==    by 0xBCFA34: do_handle_one_connection(CONNECT*, bool) (sql_connect.cc:1410)
==67371==    by 0xBCF797: handle_one_connection (sql_connect.cc:1312)
==67371==    by 0x114AE80: pfs_spawn_thread (pfs.cc:2201)
==67371==    by 0x4C31608: start_thread (pthread_create.c:477)
==67371== Conditional jump or move depends on uninitialised value(s)
==67371==    at 0x118D2CDE: GetTypeID(char const*) (mycat.cc:124)
==67371==    by 0x118B74BB: ha_connect::GetRealType(ha_table_option_struct*) (ha_connect.cc:1056)
==67371==    by 0x118C3DF7: ha_connect::delete_or_rename_table(char const*, char const*) (ha_connect.cc:5219)
==67371==    by 0x118C3F75: ha_connect::delete_table(char const*) (ha_connect.cc:5245)
==67371==    by 0xD9C3CA: hton_drop_table(handlerton*, char const*) (handler.cc:564)
==67371==    by 0xDA1AEA: ha_delete_table(THD*, handlerton*, char const*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, bool) (handler.cc:2770)
==67371==    by 0xB17490: quick_rm_table(THD*, handlerton*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, unsigned int, char const*) (sql_table.cc:2884)
==67371==    by 0xB2F4C8: mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool) (sql_table.cc:11014)
==67371==    by 0xBDB167: Sql_cmd_alter_table::execute(THD*) (sql_alter.cc:539)
==67371==    by 0xA2A8DF: mysql_execute_command(THD*) (sql_parse.cc:6006)
==67371==    by 0xA30D6B: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:8042)
==67371==    by 0xA1CBFC: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1872)
==67371==    by 0xA1B3F0: do_command(THD*) (sql_parse.cc:1353)
==67371==    by 0xBCFA34: do_handle_one_connection(CONNECT*, bool) (sql_connect.cc:1410)
==67371==    by 0xBCF797: handle_one_connection (sql_connect.cc:1312)
==67371==    by 0x114AE80: pfs_spawn_thread (pfs.cc:2201)
==67371==    by 0x4C31608: start_thread (pthread_create.c:477)
==67371==    by 0x50C0292: clone (clone.S:95)
==67371== Invalid read of size 8
==67371==    at 0x118B74C9: ha_connect::GetRealType(ha_table_option_struct*) (ha_connect.cc:1059)
==67371==    by 0x118C3DF7: ha_connect::delete_or_rename_table(char const*, char const*) (ha_connect.cc:5219)
==67371==    by 0x118C3F75: ha_connect::delete_table(char const*) (ha_connect.cc:5245)
==67371==    by 0xD9C3CA: hton_drop_table(handlerton*, char const*) (handler.cc:564)
==67371==    by 0xDA1AEA: ha_delete_table(THD*, handlerton*, char const*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, bool) (handler.cc:2770)
==67371==    by 0xB17490: quick_rm_table(THD*, handlerton*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, unsigned int, char const*) (sql_table.cc:2884)
==67371==    by 0xB2F4C8: mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool) (sql_table.cc:11014)
==67371==    by 0xBDB167: Sql_cmd_alter_table::execute(THD*) (sql_alter.cc:539)
==67371==    by 0xA2A8DF: mysql_execute_command(THD*) (sql_parse.cc:6006)
==67371==    by 0xA30D6B: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:8042)
==67371==    by 0xA1CBFC: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1872)
==67371==    by 0xA1B3F0: do_command(THD*) (sql_parse.cc:1353)
==67371==    by 0xBCFA34: do_handle_one_connection(CONNECT*, bool) (sql_connect.cc:1410)
==67371==    by 0xBCF797: handle_one_connection (sql_connect.cc:1312)
==67371==    by 0x114AE80: pfs_spawn_thread (pfs.cc:2201)
==67371==    by 0x4C31608: start_thread (pthread_create.c:477)
==67371==    by 0x50C0292: clone (clone.S:95)
==67371==  Address 0x117416c8 is 8 bytes before a block of size 56 alloc'd
==67371==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==67371==    by 0x1742322: my_malloc (my_malloc.c:88)
==67371==    by 0x1732029: alloc_root (my_alloc.c:190)
==67371==    by 0xB6B19F: TABLE_SHARE::init_from_binary_frm_image(THD*, bool, unsigned char const*, unsigned long, unsigned char const*, unsigned long) (table.cc:3234)
==67371==    by 0xB6294D: open_table_def(THD*, TABLE_SHARE*, unsigned int) (table.cc:714)
==67371==    by 0x118C3D50: ha_connect::delete_or_rename_table(char const*, char const*) (ha_connect.cc:5211)
==67371==    by 0x118C3F75: ha_connect::delete_table(char const*) (ha_connect.cc:5245)
==67371==    by 0xD9C3CA: hton_drop_table(handlerton*, char const*) (handler.cc:564)
==67371==    by 0xDA1AEA: ha_delete_table(THD*, handlerton*, char const*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, bool) (handler.cc:2770)
==67371==    by 0xB17490: quick_rm_table(THD*, handlerton*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, unsigned int, char const*) (sql_table.cc:2884)
==67371==    by 0xB2F4C8: mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool, bool) (sql_table.cc:11014)
==67371==    by 0xBDB167: Sql_cmd_alter_table::execute(THD*) (sql_alter.cc:539)
==67371==    by 0xA2A8DF: mysql_execute_command(THD*) (sql_parse.cc:6006)
==67371==    by 0xA30D6B: mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool) (sql_parse.cc:8042)
==67371==    by 0xA1CBFC: dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool) (sql_parse.cc:1872)
==67371==    by 0xA1B3F0: do_command(THD*) (sql_parse.cc:1353)
==67371==    by 0xBCFA34: do_handle_one_connection(CONNECT*, bool) (sql_connect.cc:1410)
==67371==    by 0xBCF797: handle_one_connection (sql_connect.cc:1312)
==67371==    by 0x114AE80: pfs_spawn_thread (pfs.cc:2201)
==67371==    by 0x4C31608: start_thread (pthread_create.c:477)
^ Found warnings in /data/bld/10.5-valgrind-nightly/mysql-test/var/log/mysqld.1.err
ok

Comment by Olivier Bertrand [ 2020-12-18 ]

Well, perhaps some compilers do not tolerate uninitialized variables even when they always get a value before use.
So you can try this fix:

In ha_connect.cc, initialize type on the first line of the GetRealType function, for instance with:

TABTYPE type= TAB_UNDEF;

(or even NULL)

I cannot test whether this works, since I am not able to reproduce the crash. Valgrind is a Linux tool; I don't think it works on Windows.

Comment by Anel Husakovic [ 2020-12-18 ]

I will take this bug bertrandop, thanks elenst.

Comment by Anel Husakovic [ 2021-03-03 ]

The reason why it works in <10.5 is the change in mysql_alter_table(), where the idea is, after dropping the temporary (backup) .frm, to drop the old table too.
The change was added with commit 043a3a0176.

Looking into int ha_connect::delete_or_rename_table(const char *name, const char *to), we can see that the table share is first allocated using alloc_table_share() (in table.cc), which does not use any .frm file.
After that the table is opened with open_table_def() (in table.cc), which calls share->init_from_binary_frm_image(thd, false, buf, frmlen);.
Here the .frm file that is used comes from the new handlerton, in this case InnoDB.
Let's see what happens with the table share and its attribute ha_table_option_struct option_struct in the case of InnoDB, Aria and MyISAM.
The breakpoint is set at hton_drop_table, since I cannot go directly to ha_connect::delete_table, mysql_alter_table and quick_rm_table.

ARIA:
In mysql_alter_table(), when ALTER succeeds (the line above the culprit commit's change), there is the first call of quick_rm_table(), where the backup file (.frm) is deleted.
Right before that we have these files:

anel@anel:~/mariadb/builds/10.5/mysql-test/var/mysqld.1/data/test$ ls
 db.opt  '#sql-backup-1d54-3.frm'   t.dos   t.frm   t.MAD   t.MAI

After quick_rm_table() is called the first time (without engine_changed), the backup .frm is dropped:

anel@anel:~/mariadb/builds/10.5/mysql-test/var/mysqld.1/data/test$ ls
db.opt  t.dos  t.frm  t.MAD  t.MAI

After quick_rm_table() is called the second time (with engine_changed), hton_drop_table is called and we end up in
ha_connect::delete_or_rename_table, where we need to check the table share.
The share is first replaced in alloc_table_share(), with option_struct set to NULL.

Old value = (TABLE_SHARE *) 0x7fffc4000ca0
New value = (TABLE_SHARE *) 0x7fffc403f350
(gdb) p share->option_struct      
+p share->option_struct
$5 = (ha_table_option_struct *) 0x0

Here we should also put a breakpoint in parse_engine_table_options() (in sql/create_options.cc), since we want to see how the handlerton changes starting from init_from_binary_frm_image().
The handlerton used is ha_maria_* (Aria), which is OK.
Note that parse_option_list() is called, which uses share->option_struct and share->option_list.

if (parse_option_list(thd, ht, &share->option_struct, &share->option_list,
                      ht->table_options, TRUE, root))
 
(gdb) p &share->option_struct
+p &share->option_struct
$18 = (ha_table_option_struct **) 0x7fffc403f5d0
(gdb) p share->option_struct
+p share->option_struct
$17 = (ha_table_option_struct *) 0x0
(gdb) p share->option_list
+p share->option_list
$12 = (engine_option_value *) 0x0
 
(gdb) p ht->table_options
+p ht->table_options
$23 = (ha_create_table_option *) 0x0

Note: ht->table_options = NULL
In parse_option_list() (in sql/create_options.cc), since rules and *option_list are both NULL, option_struct is not changed: nothing is allocated for it.

(gdb) p ((ha_table_option_struct **)option_struct_arg)
+p ((ha_table_option_struct **)option_struct_arg)
$20 = (ha_table_option_struct **) 0x7fffc403f5d0
(gdb) p *((ha_table_option_struct **)option_struct_arg)
+p *((ha_table_option_struct **)option_struct_arg)
$21 = (ha_table_option_struct *) 0x0

As a result, the original table file (t.dos) is still present. So even with engine_changed in mysql_alter_table() and the culprit commit's change, the original table remains.

MYISAM:

Before the first call of quick_rm_table():

anel@anel:~/mariadb/builds/10.5/mysql-test/var/mysqld.1/data/test$ ls
 db.opt  '#sql-backup-1d54-3.frm'   t.dos   t.frm   t.MYD   t.MYI

After the call, '#sql-backup-1d54-3.frm' is deleted.
Table share allocated in ha_connect::delete_or_rename_table():
Old value = (TABLE_SHARE *) 0x7fffc4000ca0
New value = (TABLE_SHARE *) 0x7fffc4040e00
And again, as above, ht->table_options == NULL.
The same situation as with Aria: the original table is still present.

INNODB:
Before the first call of quick_rm_table():

anel@anel:~/mariadb/builds/10.5/mysql-test/var/mysqld.1/data/test$ ls
 db.opt  '#sql-backup-1d54-3.frm'   t.dos   t.frm   t.ibd

After the call, '#sql-backup-1d54-3.frm' is deleted.
Table share:

(gdb) p share
+p share
$42 = (TABLE_SHARE *) 0x7fffc4041800
(gdb) p share->option_struct 
+p share->option_struct
$43 = (ha_table_option_struct *) 0x0
(gdb) p share->option_list  
+p share->option_list
$44 = (engine_option_value *) 0x0

But in parse_engine_table_options() we have a non-NULL ht->table_options. It gets created during ha_create() while creating the .ibd file.
It seems that the innobase plugin sets the handlerton table_options by default in innodb_init() (in ha_innodb.cc), which the MyISAM and Aria storage engines don't do (see, for example, myisam_init()):
innobase_hton->table_options = innodb_table_option_list;
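
The contrast between the engines can be modelled with a small sketch. It is not the server's code: OptionRule, innodb_like_rules and build_option_struct are hypothetical stand-ins, the field offsets are the ones printed by gdb in this comment, and the field sizes are assumptions made only for illustration. The point it tries to show is that an engine without option rules leaves the option struct unallocated, while an engine with rules gets a buffer sized for its own fields.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for ha_create_table_option.
struct OptionRule {
  const char *name;    // nullptr terminates the list
  std::size_t offset;  // offset of the field inside the engine's option struct
  std::size_t size;    // assumed field size, for illustration only
};

// Offsets come from the gdb output in this comment; sizes are assumptions.
static const OptionRule innodb_like_rules[] = {
  {"PAGE_COMPRESSED", 0, 1},
  {"PAGE_COMPRESSION_LEVEL", 8, 8},
  {"ENCRYPTED", 20, 4},
  {"ENCRYPTION_KEY_ID", 24, 8},
  {nullptr, 0, 0}
};

// Model: with no rules nothing is allocated (the Aria/MyISAM case);
// with rules, a zeroed buffer just large enough for those rules is built.
std::vector<unsigned char> build_option_struct(const OptionRule *rules)
{
  std::vector<unsigned char> buf;
  if (rules == nullptr)
    return buf;
  std::size_t size = 0;
  for (const OptionRule *r = rules; r->name != nullptr; ++r) {
    assert(r->size > 0);
    if (r->offset + r->size > size)
      size = r->offset + r->size;
  }
  buf.assign(size, 0);
  return buf;
}
```

Under these assumptions the InnoDB-like buffer covers only 32 bytes, far less than a struct holding CONNECT's many pointer fields would need, so code that interprets it as CONNECT's ha_table_option_struct would read memory that was never set up for it.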

Here is the change of the table share's option_struct and the backtrace:

Old value = (ha_table_option_struct *) 0x0
New value = (ha_table_option_struct *) 0x7fffbc02e688
parse_option_list (thd=0x7fffbc000d08, hton=0x555558204e68, option_struct_arg=0x7fffbc03f5d0, option_list=0x7fffbc03f5c8, rules=0x55555770b560 <innodb_table_option_list>, suppress_warning=true, root=0x7fffbc03f3c8) at /home/anel/mariadb/10.5/sql/create_options.cc:277
 
+bt
#0  parse_option_list (thd=0x7fffbc000d08, hton=0x555558204e68, option_struct_arg=0x7fffbc03f5d0, option_list=0x7fffbc03f5c8, rules=0x55555770b560 <innodb_table_option_list>, suppress_warning=true, root=0x7fffbc03f3c8) at /home/anel/mariadb/10.5/sql/create_options.cc:277
#1  0x000055555601b4db in parse_engine_table_options (thd=0x7fffbc000d08, ht=0x555558204e68, share=0x7fffbc03f350) at /home/anel/mariadb/10.5/sql/create_options.cc:477
#2  0x0000555555f63a4f in TABLE_SHARE::init_from_binary_frm_image (this=0x7fffbc03f350, thd=0x7fffbc000d08, write=false, frm_image=0x7fffbc03dc28 "\376\001\n\f\022", frm_length=432, par_image=0x0, par_length=0) at /home/anel/mariadb/10.5/sql/table.cc:3181
#3  0x0000555555f5b5b9 in open_table_def (thd=0x7fffbc000d08, share=0x7fffbc03f350, flags=1) at /home/anel/mariadb/10.5/sql/table.cc:714
#4  0x00007fffd6cf05ab in ha_connect::delete_or_rename_table (this=0x7fffbc018850, name=0x7fffec2d44a0 "./test/t", to=0x0) at /home/anel/mariadb/10.5/storage/connect/ha_connect.cc:5245
#5  0x00007fffd6cf07c4 in ha_connect::delete_table (this=0x7fffbc018850, name=0x7fffec2d44a0 "./test/t") at /home/anel/mariadb/10.5/storage/connect/ha_connect.cc:5279
#6  0x00005555561798b1 in hton_drop_table (hton=0x7fffbc0313f8, path=0x7fffec2d44a0 "./test/t") at /home/anel/mariadb/10.5/sql/handler.cc:564
#7  0x000055555617ee2a in ha_delete_table (thd=0x7fffbc000d08, hton=0x7fffbc0313f8, path=0x7fffec2d44a0 "./test/t", db=0x7fffec2d57e0, alias=0x7fffec2d57f0, generate_warning=false) at /home/anel/mariadb/10.5/sql/handler.cc:2771
#8  0x0000555555f12eb8 in quick_rm_table (thd=0x7fffbc000d08, base=0x7fffbc0313f8, db=0x7fffec2d57e0, table_name=0x7fffec2d57f0, flags=4, table_path=0x0) at /home/anel/mariadb/10.5/sql/sql_table.cc:2914
#9  0x0000555555f2a9c0 in mysql_alter_table (thd=0x7fffbc000d08, new_db=0x7fffbc005558, new_name=0x7fffbc005958, create_info=0x7fffec2d63e0, table_list=0x7fffbc016440, alter_info=0x7fffec2d6310, order_num=0, order=0x0, ignore=false, if_exists=false) at /home/anel/mariadb/10.5/sql/sql_table.cc:11042
#10 0x0000555555fcd030 in Sql_cmd_alter_table::execute (this=0x7fffbc016b20, thd=0x7fffbc000d08) at /home/anel/mariadb/10.5/sql/sql_alter.cc:545
#11 0x0000555555e2d507 in mysql_execute_command (thd=0x7fffbc000d08) at /home/anel/mariadb/10.5/sql/sql_parse.cc:6024
#12 0x0000555555e3372c in mysql_parse (thd=0x7fffbc000d08, rawbuf=0x7fffbc016360 "ALTER TABLE t ENGINE InnoDB", length=27, parser_state=0x7fffec2d74d0, is_com_multi=false, is_next_command=false) at /home/anel/mariadb/10.5/sql/sql_parse.cc:8063

Here are table_options and last->value in parse_option_list():

(gdb) p ht->table_options
+p ht->table_options
$47 = (ha_create_table_option *) 0x55555770b560 <innodb_table_option_list>
(gdb) p *ht->table_options
+p *ht->table_options
$48 = {
  type = HA_OPTION_TYPE_BOOL, 
  name = 0x555556e29e3d "PAGE_COMPRESSED", 
  name_length = 15, 
  offset = 0, 
  def_value = 0, 
  min_value = 0, 
  max_value = 0, 
  block_size = 0, 
  values = 0x0, 
  var = 0x55555770b4c0 <mysql_sysvar_compression_default>
}
 
(gdb) p last->value
+p last->value
$16 = {
  str = 0x48008b017b5e3705 <error: Cannot access memory at address 0x48008b017b5e3705>,
  length = 14378259170480117131
}

and it goes through the loop for all table_options of InnoDB:

ha_create_table_option innodb_table_option_list[]=
{
  /* With this option user can enable page compression feature for the
  table */
  HA_TOPTION_SYSVAR("PAGE_COMPRESSED", page_compressed, compression_default),
  /* With this option user can set zip compression level for page
  compression for this table*/
  HA_TOPTION_NUMBER("PAGE_COMPRESSION_LEVEL", page_compression_level, 0, 1, 9, 1),
  /* With this option the user can enable encryption for the table */
  HA_TOPTION_ENUM("ENCRYPTED", encryption, "DEFAULT,YES,NO", 0),
  /* With this option the user defines the key identifier using for the encryption */
  HA_TOPTION_SYSVAR("ENCRYPTION_KEY_ID", encryption_key_id, default_encryption_key_id),
 
  HA_TOPTION_END
};

(gdb) p *opt
+p *opt
$30 = {
  type = HA_OPTION_TYPE_ULL, 
  name = 0x555556e29e4d "PAGE_COMPRESSION_LEVEL", 
  name_length = 22, 
  offset = 8, 
  def_value = 0, 
  min_value = 1, 
  max_value = 9, 
  block_size = 1, 
  values = 0x0, 
  var = 0x0
}
 
(gdb) p *opt
+p *opt
$3 = {
  type = HA_OPTION_TYPE_ENUM, 
  name = 0x555556e29e64 "ENCRYPTED", 
  name_length = 9, 
  offset = 20, 
  def_value = 0, 
  min_value = 0, 
  max_value = 14, 
  block_size = 0, 
  values = 0x555556e29e6e "DEFAULT,YES,NO", 
  var = 0x0
}
 
 
(gdb) p *++opt
+p *++opt
$32 = {
  type = HA_OPTION_TYPE_ULL, 
  name = 0x555556e29e7d "ENCRYPTION_KEY_ID", 
  name_length = 17, 
  offset = 24, 
  def_value = 1, 
  min_value = 1, 
  max_value = 4294967295, 
  block_size = 0, 
  values = 0x0, 
  var = 0x55555770b500 <mysql_sysvar_default_encryption_key_id>
}

marko, regarding the table_options result above: how do we read this, what does it mean, and is it possible to add some workaround in parse_option_list() with respect to SQLCOM_ALTER_TABLE?

After returning to ha_connect::delete_or_rename_table() from open_table_def(), we get this table share option_struct:

(gdb) p *share->option_struct 
+p *share->option_struct
$44 = {
  type = 0xa5a5a5a5a5a5a500 <error: Cannot access memory at address 0xa5a5a5a5a5a5a500>, 
  filename = 0x0, 
  optname = 0xa5a5a5a5 <error: Cannot access memory at address 0xa5a5a5a5>, 
  tabname = 0x1 <error: Cannot access memory at address 0x1>, 
  tablist = 0x8f8f8f8fffffffff <error: Cannot access memory at address 0x8f8f8f8fffffffff>, 
  dbname = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  separator = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  qchar = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  module = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  subtype = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  catfunc = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  srcdef = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  colist = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  filter = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  oplist = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  data_charset = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  http = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  uri = 0x8f8f8f8f8f8f8f8f <error: Cannot access memory at address 0x8f8f8f8f8f8f8f8f>, 
  lrecl = 10344644715844964239, 
  elements = 10344644715844964239, 
  multiple = 10344644715844964239, 
  header = 10344644715844964239, 
  quoted = 10344644715844964239, 
  ending = 10344644715844964239, 
  compressed = 10344644715844964239, 
  mapped = 143, 
  huge = 143, 
  split = 143, 
  readonly = 143, 
  sepindex = 143, 
  zipped = 143
}

I tried various patches to solve this problem, but without effect.
What I don't understand is why the second call of quick_rm_table() with engine_changed is needed if the original table is not deleted (at least in these two cases, MyISAM and Aria; the bug is for InnoDB), monty?
A solution would be to drop that change, or to tweak InnoDB not to populate table_options, if that is possible (marko), or to somehow check for the unallocated part of the table share->option_struct.
I will add need_feedback about this.
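
The last idea, checking before use whether the option struct really belongs to the calling engine, could look roughly like the sketch below. It is only an illustration under stated assumptions: ShareModel and options_if_owned are hypothetical names, the engine is compared by a plain string, and nothing here is taken from a patch attached to this report.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced model of a table share.
struct ShareModel {
  std::string engine;         // engine recorded in the .frm that was parsed
  const void *option_struct;  // options parsed with that engine's rules
};

// Return the option struct only when the share was parsed for the caller's
// engine; otherwise report "no usable options" instead of a foreign struct.
const void *options_if_owned(const ShareModel &share,
                             const std::string &my_engine)
{
  assert(!my_engine.empty());
  if (share.engine != my_engine)
    return nullptr;
  return share.option_struct;
}
```

With such a guard a caller would have to handle the nullptr case explicitly, for example by skipping the removal of the default data file, which is a behaviour change of its own.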

Comment by Marko Mäkelä [ 2021-03-03 ]

Table options were implemented by serg (see MDEV-4022) and @sanja. I do not know how exactly they would work or would be supposed to work in a cross-engine ALTER TABLE.

Comment by Olivier Bertrand [ 2021-03-03 ]

In this context, delete_or_rename_table is called so that the data files of the original table can be removed. Indeed, when the CONNECT table t was created, a t.dos file was made by default in the current directory, and it may contain data if an INSERT statement was used. It is normal that the original table still exists at that point, because it is used to locate the file to erase.

The problem is that when

    bool got_error= open_table_def(thd, share);

is executed, sometimes (not always) the returned share option struct pointer is wrong and, when functions using it are called, an exception is raised, causing the server to crash.

Apparently the bug is that MariaDB 10.5 returns that wrong pointer when other versions do not.

Comment by Anel Husakovic [ 2021-03-05 ]

bertrandop, here is the fix: https://github.com/an3l/server/commit/086b9167bcbea369bf2bae0843e0085123701d29
It is based on commit https://github.com/MariaDB/server/commit/043a3a0176e2#diff-f223b918b8e982bb3edaed26dc567ac653c0cf35f5ca624e2e3b664d4be5d49dR10333, which introduced the problem.
Example of the test with the fix:

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
INSTALL SONAME 'ha_connect';
CREATE TABLE t (f INT) ENGINE=CONNECT;
Warnings:
Warning	1105	No table_type. Will be set to DOS
Warning	1105	No file name. Table will use t.dos
show create table t;
Table	Create Table
t	CREATE TABLE `t` (
  `f` int(11) DEFAULT NULL
) ENGINE=CONNECT DEFAULT CHARSET=latin1
ALTER TABLE t ENGINE InnoDB;
show create table t;
Table	Create Table
t	CREATE TABLE `t` (
  `f` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1
DROP TABLE t;
CREATE TABLE t (f INT) ENGINE=CONNECT;
Warnings:
Warning	1105	No table_type. Will be set to DOS
Warning	1105	No file name. Table will use t.dos
Warning	1105	Default file /dev/shm/var_auto_p2EI/mysqld.1/data/./test/t.dos already exists
show create table t;
Table	Create Table
t	CREATE TABLE `t` (
  `f` int(11) DEFAULT NULL
) ENGINE=CONNECT DEFAULT CHARSET=latin1
ALTER TABLE t ENGINE Aria;
show create table t;
Table	Create Table
t	CREATE TABLE `t` (
  `f` int(11) DEFAULT NULL
) ENGINE=Aria DEFAULT CHARSET=latin1 PAGE_CHECKSUM=1
DROP TABLE t;
CREATE TABLE t (f INT) ENGINE=CONNECT;
Warnings:
Warning	1105	No table_type. Will be set to DOS
Warning	1105	No file name. Table will use t.dos
Warning	1105	Default file /dev/shm/var_auto_p2EI/mysqld.1/data/./test/t.dos already exists
show create table t;
Table	Create Table
t	CREATE TABLE `t` (
  `f` int(11) DEFAULT NULL
) ENGINE=CONNECT DEFAULT CHARSET=latin1
ALTER TABLE t ENGINE MYISAM;
show create table t;
Table	Create Table
t	CREATE TABLE `t` (
  `f` int(11) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1
DROP TABLE t;
UNINSTALL SONAME 'ha_connect';
main.anel 'innodb'                       [ pass ]     16
--------------------------------------------------------------------------
The servers were restarted 0 times
Spent 0.016 of 1 seconds executing testcases
 
Completed: All 1 tests were successful.

However, there are failing tests in other suites with this fix.

Failing test(s): main.log_tables main.mysqldump main.mysql_upgrade main.statistics main.partition_cache_innodb main.mysql_upgrade_view main.column_compression main.merge main.type_bit_innodb main.type_temporal_innodb

I will assign monty to review and suggest how to proceed for 10.5.
An alternative could be to disable the initialization of table_options in innobase (https://github.com/MariaDB/server/blob/10.5/storage/innobase/handler/ha_innodb.cc#L3847) ([~serg])? But with that there are other types of errors.

Comment by Michael Widenius [ 2021-03-10 ]

Answer to Anel's question about the two calls to quick_rm_table():

When doing an alter table from one engine to another, we need two
calls to quick_rm_table().

  • First is to delete just the .frm file of the #sql-backup table. This contains
    a backup of the original table definition.
  • The second call is to delete the original table from the OLD engine.

The difference between before and after "commit 043a3a0176" is
that if the engine is changed in alter table, we don't create
a #sql-backup in the new engine but instead create the table
directly under the final name.

For example, assume you do the following for a MyISAM table:

alter table t1 engine=aria;

The files during alter table in the new code are:

t1.MAI
t1.MAD
t1.frm
t1.MYD
t1.MYI
#sql-backup-4b21-3.frm

In the old code we would have had:
t1.frm
t1.MYD
t1.MYI
#sql-backup-4b21-3.frm
#sql-backup-4b21-3.MAI
#sql-backup-4b21-3.MAD

The new code avoids an extra rename (or move) of #sql-backup-4b21-3.MAI
and #sql-backup-4b21-3.MAD to t1.MAI and t1.MAD.
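
The two file listings above can be reproduced with a small model. This is a sketch, not server code: files_during_alter is a hypothetical helper, and the only inputs are the table name, the backup name and the file extensions quoted in this comment.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Files present while ALTER TABLE ... ENGINE=... is in progress.
// direct_final_name == true models the new code (new engine's files are
// created under the final table name); false models the old code (they are
// created under the #sql-backup name and renamed afterwards).
std::set<std::string> files_during_alter(const std::string &table,
                                         const std::string &backup,
                                         const std::vector<std::string> &old_exts,
                                         const std::vector<std::string> &new_exts,
                                         bool direct_final_name)
{
  assert(!table.empty() && !backup.empty());
  std::set<std::string> files;
  files.insert(table + ".frm");
  files.insert(backup + ".frm");
  for (const std::string &ext : old_exts)
    files.insert(table + "." + ext);   // old engine's data files
  const std::string &new_base = direct_final_name ? table : backup;
  for (const std::string &ext : new_exts)
    files.insert(new_base + "." + ext);  // new engine's data files
  return files;
}
```

In the new scheme t1.frm already describes the new engine while the old engine's data files are still on disk, which is the situation an engine runs into if it re-reads the .frm during drop.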

The problem with connect is that I had not anticipated that the engine would internally, as part of drop table, try to open
the original .frm file.

The fix is to ensure that we are using the old method with connect engine.
This could probably be achieved by adding HA_REUSES_FILE_NAMES as ha_table_flag for connect.

Initial testing suggests that this works. Now running a full test before pushing.
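
As a rough illustration of the idea, the choice between the two naming schemes can be thought of as a flag test. The sketch below is an assumption-laden model: REUSES_FILE_NAMES_MODEL is a stand-in bit and not the real value of HA_REUSES_FILE_NAMES, choose_naming is a hypothetical function, and which handler's flags the server actually consults is not stated in this report.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in bit for illustration; the real flag is defined by the server.
constexpr std::uint64_t REUSES_FILE_NAMES_MODEL = 1ULL << 0;

enum class AlterNaming {
  DirectFinalName,         // new method: create directly under the final name
  TemporaryNameThenRename  // old method: create under a temporary name, rename
};

// Model: the new method is used only when the engine changes and the
// engine does not announce that it reuses file names.
AlterNaming choose_naming(bool engine_changed, std::uint64_t table_flags)
{
  assert((table_flags & ~REUSES_FILE_NAMES_MODEL) == 0);  // only the modelled bit
  if (engine_changed && !(table_flags & REUSES_FILE_NAMES_MODEL))
    return AlterNaming::DirectFinalName;
  return AlterNaming::TemporaryNameThenRename;
}
```

Setting the flag for an engine then keeps that engine on the old path, where the original .frm is still the one found under the table's name when the engine drops its files.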

Comment by Olivier Bertrand [ 2021-03-10 ]

"The problem with connect is that I had not anticipated that the engine would internally, as part of drop table, try to open the original .frm file"
Most CONNECT tables don't need to do anything in that context. Only "inward" tables (the ones for which the data file was not specified at creation and was made by default in the current database) must erase that file. To do this, they don't really need to "open" the .frm, just to know the name of the file to delete. For this they need the Option structure retrieved from the share structure.
The bug above is that when the pointer to the Option structure is retrieved from the share structure, an invalid pointer is returned by version 10.5 (only?), causing the crash when it is used.

Comment by Michael Widenius [ 2021-03-10 ]

Pushed to 10.5

Comment by Anel Husakovic [ 2021-03-11 ]

Pushed with the patch https://github.com/MariaDB/server/commit/1799caa3a1305d21acaa37169e6b14307b4b5f08, thanks monty.

Generated at Thu Feb 08 09:29:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.