[MDEV-14249] Wrong character set info of Query_log_event and the query in Query_log_event constructed by different charsets cause error when slave apply the event. Created: 2017-11-02  Updated: 2017-12-27  Resolved: 2017-12-27

Status: Closed
Project: MariaDB Server
Component/s: Character Sets, Replication
Affects Version/s: 10.1, 10.2
Fix Version/s: 10.2.12

Type: Bug Priority: Major
Reporter: Ze Yang Assignee: Alexander Barkov
Resolution: Fixed Votes: 0
Labels: contribution

Attachments: File new_charset_err_fix.diff    
Sprint: 10.2.12

 Description   

Environment: Linux; system_charset_info is utf8.

Test case:

--disable_warnings
--source include/master-slave.inc
--enable_warnings
 
# View with a Chinese name when character_set_client is not utf8.
 
create table t(c1 int);
SET @@session.character_set_client=gbk;
set @@session.collation_connection=gbk_chinese_ci;
set @@session.collation_server=utf8_general_ci;
create view `收费明细` as select * from t;
drop view `收费明细`;
show tables;
 
--sync_slave_with_master
 
connection slave;
show tables;
 
 
connection master;
drop table t;
 
# memory table
SET @@session.character_set_client=utf8;
set @@session.collation_connection=utf8_general_ci;
set @@session.collation_server=utf8_general_ci;
create table `收费明细表`(c1 int) engine=memory;
create view tv as select * from `收费明细表`;
 
--connection slave
-- source include/stop_slave.inc
 
--let $rpl_server_number= 1
--source include/rpl_restart_server.inc
# Accessing a MEMORY table after a server restart writes 'DELETE FROM tableName' to the binlog.
connection master;
SET @@session.character_set_client=gbk;
set @@session.collation_connection=gbk_chinese_ci;
set @@session.collation_server=utf8_general_ci;
select * from tv;
 
 
 
--connection slave
-- source include/start_slave.inc
connection master;
--sync_slave_with_master
 
# Procedure with a Chinese name when character_set_client is not utf8.
connection master;
delimiter $$;
create procedure 收费明细()
begin
  select 'hello world';
end $$
delimiter ;$$
drop procedure `收费明细`;
 
connection master;
SET @@session.character_set_client=utf8;
set @@session.collation_connection=utf8_general_ci;
set @@session.collation_server=utf8_general_ci;
drop view tv;
drop table `收费明细表`;
 
## Column default value in Chinese.
connection master;
set character_set_client = utf8;
set character_set_connection = utf8;
set character_set_database = utf8;
set character_set_results = utf8;
set character_set_server = utf8;
 
CREATE TABLE `t1` (
  `id` int(11) NOT NULL,
  `orderType` char(6) NOT NULL DEFAULT '已创建',
  PRIMARY KEY (`id`)
);
 
show create table t1;
 
## switch client charset
set character_set_client = latin1;
CREATE TABLE t2 SELECT * FROM t1;
drop table t1;
drop table t2;
--sync_slave_with_master
 
connection slave;
show tables;
 
 
--source include/rpl_end.inc

Slave SQL Error

Last_SQL_Errno 1300
Last_SQL_Error Error 'Invalid gbk character string: '\xE9\x8F\x80\xE6\x83\xB0\xE5\x9E\x82\xE9\x8F\x84\xE5\xBA\xA3\xE7'' on query. Default database: 'test'. Query: 'CREATE ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` SQL SECURITY DEFINER VIEW `鏀惰垂鏄庣粏` AS select * from t'
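
The garbled view name in this error can be reproduced outside the server. The following is a minimal Python sketch, not MariaDB code; it assumes Python's utf-8 and gbk codecs map these particular bytes the same way the server's utf8 and gbk character sets do.

```python
# Illustration only: Python codecs stand in for the server's utf8 and gbk.
view_name = "收费明细"

# Identifier bytes as kept by the parser (system charset, utf8).
utf8_bytes = view_name.encode("utf-8")

# The event is labelled character_set_client=gbk, so the bytes are read as gbk.
misread = utf8_bytes.decode("gbk")
print(misread)

# Converting the misread name to utf8 again yields the byte sequence
# quoted (truncated to 16 bytes) in the "Invalid gbk character string" error.
print(misread.encode("utf-8")[:16].hex(" "))
```

The first print shows the view name as it appears in the slave's Query text, and the second shows the hex bytes listed in Last_SQL_Error.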

Bug analysis:

MariaDB 10.2
1. The query in Query_log_event is built from strings in different charsets.

sql/sql_view.cc:  mysql_create_view 
  /* The parser converts identifiers to system_charset_info, so the table name is in utf8. */
669    append_identifier(thd, &buff, views->table_name,
                      views->table_name_length);
 
/* views->source.str is in character_set_client. */
685     buff.append(views->source.str, views->source.length); 
 
 
log_event.cc:  Query_log_event::Query_log_event(THD* thd_arg, const char* query_arg,...)
 
/* Query_log_event stores the charsets from the thd's session variables, but the view name is in utf8. */
 4037   int2store(charset, thd_arg->variables.character_set_client->number);
 4038   int2store(charset+2, thd_arg->variables.collation_connection->number);
 4039   int2store(charset+4, thd_arg->variables.collation_server->number);
 
Identifiers parsed by the parser are in system_charset_info, while the source string may be in another charset (character_set_client). The view definition built by the server therefore mixes two charsets in one string. The same problem exists for stored procedures.
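
The mixed-charset statement described above can be imitated with plain byte strings. This is a hypothetical Python sketch, not the server's code: the variable names and the view body with a Chinese literal are made up for illustration, and Python's codecs are assumed to behave like the server's utf8 and gbk.

```python
# Illustration only; names and the view body are hypothetical.
name = "收费明细"
body_text = "select '已创建' as c"  # made-up view source with a non-ASCII literal

# The identifier is appended in the system charset (utf8) ...
buff = b"CREATE VIEW `" + name.encode("utf-8") + b"` AS "
# ... while the view source is appended as received, in character_set_client (gbk).
buff += body_text.encode("gbk")


def decodes_as(data: bytes, charset: str) -> bool:
    """Return True if every byte sequence in data is valid in charset."""
    try:
        data.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


# Read as gbk (what the event header claims): the body survives, the name does not.
as_gbk = buff.decode("gbk")
print(as_gbk)

# Read as utf8: the name would be fine, but the gbk body is not valid utf8.
print(decodes_as(buff, "utf-8"))
```

No single charset label describes the whole buffer correctly, which is why recording only character_set_client in the event is not enough.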
 
sql/sp.cc:
 
1281     log_query.set_charset(system_charset_info);
1283     if (!show_create_sp(thd, &log_query,
1284                        sp->m_type,
1285                        (sp->m_explicit_name ? sp->m_db.str : NULL),  //!these identifiers use system charset.
1286                        (sp->m_explicit_name ? sp->m_db.length : 0),
1287                        sp->m_name.str, sp->m_name.length,
1288                        sp->m_params.str, sp->m_params.length,
1289                        retstr.ptr(), retstr.length(),
1290                        sp->m_body.str, sp->m_body.length, //!the body uses character_set_client
1291                        sp->m_chistics, &(thd->lex->definer->user),
1292                        &(thd->lex->definer->host),
1293                        saved_mode))

2. The charset info in Query_log_event is wrong.

The DELETE statement generated by the server:

sql/sql_base.cc
2645 static bool open_table_entry_fini
2667       append_identifier(thd, &query, share->table_name.str,
2668                           share->table_name.length);  //system charset info
    ...
2674       Query_log_event qinfo(thd, query.ptr(), query.length(),
2675                             FALSE, TRUE, TRUE, 0);  //use thd charset info
2676       if (mysql_bin_log.write(&qinfo))
 
 
The CREATE TABLE statement generated by the server:
sql/sql_insert.cc   select_create::binlog_show_create_table
4365 select_create::binlog_show_create_table  //the query uses system charset info

The attached diff only fixes the replication error. With it, character_set_client and collation_connection may differ between master and slave.



 Comments   
Comment by Ze Yang [ 2017-11-17 ]

I have uploaded the new fix diff.

Comment by Alexander Barkov [ 2017-12-22 ]

Thanks for reporting the problem, good analysis and the patch!

Comment by Alexander Barkov [ 2017-12-22 ]

This is a minimal script reproducing the problem with VIEWs:

--disable_warnings
--source include/master-slave.inc
--enable_warnings
 
#
# The test below uses the byte sequence 0xD191,
# which in a utf8 console looks like ё (CYRILLIC SMALL LETTER YO).
# Don't be misled. This sequence is used in latin1 context and
# represents a sequence of two characters:
# U+00D1 LATIN CAPITAL LETTER N WITH TILDE (_latin1 0xD1)
# U+2018 LEFT SINGLE QUOTATION MARK        (_latin1 0x91)
#
 
SET NAMES latin1;
CREATE VIEW `ё` AS SELECT 'ё';
DROP VIEW `ё`;
SHOW TABLES;
 
--sync_slave_with_master
 
--source include/rpl_end.inc

It fails with this error on the slave:

Last_Error      Error 'Unknown table 'test.Ñ‘'' on query. Default database: 'test'. Query: 'DROP VIEW `ё`'
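
The name in this error message can be traced byte by byte. A small Python sketch, assuming cp1252 as a stand-in for MariaDB's latin1 (the test comment above maps 0x91 to U+2018, which matches cp1252 rather than ISO 8859-1):

```python
# Illustration only; cp1252 stands in for MariaDB's latin1.
name_utf8 = "ё".encode("utf-8")  # the two bytes sent by a utf8 console
print(name_utf8.hex())

# With SET NAMES latin1 the server reads those bytes as two latin1 characters.
misread = name_utf8.decode("cp1252")
print(misread)
print([f"U+{ord(ch):04X}" for ch in misread])
```

The two code points printed are the ones listed in the test's comment, and the decoded string is the table name the slave reports as unknown.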

Comment by Alexander Barkov [ 2017-12-22 ]

A similar minimal script reproducing the problem with SP:

--disable_warnings
--source include/master-slave.inc
--enable_warnings
 
#
# The test below uses the byte sequence 0xD191,
# which in a utf8 console looks like ё (CYRILLIC SMALL LETTER YO).
# Don't be misled. This sequence is used in latin1 context and
# represents a sequence of two characters:
# U+00D1 LATIN CAPITAL LETTER N WITH TILDE (_latin1 0xD1)
# U+2018 LEFT SINGLE QUOTATION MARK        (_latin1 0x91)
#
 
SET NAMES latin1;
CREATE PROCEDURE `ё`() SELECT 'ё';
DROP PROCEDURE `ё`;
SHOW TABLES;
 
--sync_slave_with_master
 
--source include/rpl_end.inc

It fails with this error on the slave:

Last_Error      Error 'PROCEDURE test.Ñ‘ does not exist' on query. Default database: 'test'. Query: 'DROP PROCEDURE `ё`'
