[MDEV-12725] select on federated table crashes server
Created: 2017-05-08  Updated: 2020-12-08  Resolved: 2017-08-23

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Federated
Affects Version/s: 10.1, 10.1.23, 10.2
Fix Version/s: 10.1.26, 10.2.8

Type: Bug
Priority: Major
Reporter: Rob
Assignee: Sergei Golubchik
Resolution: Fixed
Votes: 3
Labels: regression
Environment: CentOS 6


Issue Links:
Duplicate
is duplicated by MDEV-13098 [ERROR] mysqld got signal 11 Closed
is duplicated by MDEV-14048 Mariadb crashes with signal 11 when u... Closed
Relates
relates to MDEV-12951 Server crash [mysqld got exception 0x... Closed
relates to MDEV-13163 10.1.24-MariaDB crash report Closed
relates to MDEV-15617 [ERROR] mysqld got signal 11 ; Closed

 Description   

The server crashes with signal 11; the error log is included below.

This has been happening since the upgrade from 10.1.22 to 10.1.23. It has now happened 6 times, each time on a query that selects from both a federated table and a non-federated table.

We have been running this query for more than a year without any problems. The crashes started immediately after the upgrade.

It does not happen every time. I would estimate about once in every ten runs of such a query.

170508  7:08:47 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 10.1.23-MariaDB
key_buffer_size=268435456
read_buffer_size=1048576
max_used_connections=7
max_threads=2002
thread_count=5
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4403338 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f6f3a61f008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f6ef3a71140 thread_stack 0x48400
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0x7fb113b96c8b]
/usr/sbin/mysqld(handle_fatal_signal+0x4d5)[0x7fb1136f1bf5]
/lib64/libpthread.so.0(+0xf7e0)[0x7fb112cf17e0]
/usr/sbin/mysqld(_ZN17Query_cache_block11headers_lenEv+0x0)[0x7fb113544ff0]
/usr/sbin/mysqld(_ZN17Query_cache_block4dataEv+0x11)[0x7fb113545011]
/usr/sbin/mysqld(_ZN11Query_cache6insertEP3THDP15Query_cache_tlsPKcmj+0x63)[0x7fb1135488b3]
/usr/sbin/mysqld(net_real_write+0x41)[0x7fb1134f5801]
/usr/sbin/mysqld(net_flush+0x23)[0x7fb1134f5dc3]
/usr/sbin/mysqld(net_write_command+0x18a)[0x7fb1134f5f7a]
/usr/sbin/mysqld(cli_advanced_command+0xf7)[0x7fb1136d20c7]
/usr/sbin/mysqld(mysql_send_query+0x31)[0x7fb1136cf7a1]
/usr/sbin/mysqld(mysql_real_query+0x11)[0x7fb1136cf7c1]
/usr/lib64/mysql/plugin/ha_federatedx.so(_ZN19federatedx_io_mysql12actual_queryEPKcj+0x3a)[0x7fb0f70f501a]
/usr/lib64/mysql/plugin/ha_federatedx.so(_ZN19federatedx_io_mysql5queryEPKcj+0x88)[0x7fb0f70f5168]
mysys/stacktrace.c:268(my_print_stacktrace)[0x7fb0f70f497f]
sql/signal_handler.cc:168(handle_fatal_signal)[0x7fb0f70f08f4]
sql/net_serv.cc:609(net_real_write)[0x7fb1135cc436]
/usr/sbin/mysqld(_ZN4JOIN14optimize_innerEv+0x74e)[0x7fb1135ce7fe]
/usr/sbin/mysqld(_ZN4JOIN8optimizeEv+0x37)[0x7fb1135d1047]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x9d)[0x7fb1135d119d]
sql/sql_select.cc:3580(make_join_statistics)[0x7fb1135d4b4d]
sql/sql_select.cc:1366(JOIN::optimize_inner())[0x7fb113574de2]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x5b3a)[0x7fb1135810da]
sql/sql_parse.cc:5921(execute_sqlcom_select)[0x7fb1135843c4]
sql/sql_parse.cc:1490(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x7fb113586e43]
sql/sql_parse.cc:1111(do_command(THD*))[0x7fb1135873c1]
sql/sql_connect.cc:1349(do_handle_one_connection(THD*))[0x7fb1136462cf]
sql/sql_connect.cc:1263(handle_one_connection)[0x7fb113646407]
/lib64/libpthread.so.0(+0x7aa1)[0x7fb112ce9aa1]
/lib64/libc.so.6(clone+0x6d)[0x7fb1111cdbcd]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f6f44c20020): is an invalid pointer
Connection ID (thread ID): 145
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
 
We think the query pointer is invalid, but we will try to print it anyway. 
Query: select count(*) from director.task t where FALSE  or (t.PROCID=(select if(CALCPROCID is null,PROCID,CALCPROCID) from director.proc where PROCID=55) 
                                                                        and t.SUBJECTID='533ecae8ea68c' and t.SUBJECTTPID=13) or (t.PROCID=(select if(CALCPROCID is null,PROCID,CALCPROCID) from director.proc where PROCID=247) 
                                                                        and t.SUBJECTID='533ecae8ea68c' and t.SUBJECTTPID=13) or (t.PROCID=(select if(CALCPROCID is null,PROCID,CALCPROCID) from director.proc where PROCID=57) 
                                                                        and t.SUBJECTID='533ecae8ea68c' and t.SUBJECTTPID=13)
 
170508 07:08:47 mysqld_safe Number of processes running now: 0
170508 07:08:47 mysqld_safe mysqld restarted



 Comments   
Comment by Rob [ 2017-05-08 ]

We are investigating further, and so far all server crashes are related to joins across federated and non-federated tables.

Comment by Elena Stepanova [ 2017-05-08 ]

Could you please paste the stack trace from the non-federated crash?
I believe this is a manifestation of bug MDEV-12673, but it is not easy to confirm from the Federated crash.
Alternatively, maybe you could provide a data dump that would allow us to reproduce the problem with Federated.

Comment by Rob [ 2017-05-24 ]

Hi Elena,

Thanks for that.

Below is the error that I thought was not related to federated tables, because the query it printed did not use the federated engine. Looking at the stack trace, though, it still seems to involve the federated engine.

In the meantime I have made my application connect directly to the other server and split some queries in two. I have not had any further problems since then.

Let me know if I can still help.

Rob

170516  0:01:51 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
 
Server version: 10.1.23-MariaDB
key_buffer_size=268435456
read_buffer_size=1048576
max_used_connections=170
max_threads=2002
thread_count=170
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4403338 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x7f233fdc3008
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f5912aea140 thread_stack 0x48400
/usr/sbin/mysqld(my_print_stacktrace+0x2b)[0x7f651476ac8b]
/usr/sbin/mysqld(handle_fatal_signal+0x4d5)[0x7f65142c5bf5]
/lib64/libpthread.so.0(+0xf7e0)[0x7f65138c57e0]
/usr/sbin/mysqld(_ZN17Query_cache_block11headers_lenEv+0x0)[0x7f6514118ff0]
/usr/sbin/mysqld(_ZN17Query_cache_block4dataEv+0x11)[0x7f6514119011]
/usr/sbin/mysqld(_ZN11Query_cache6insertEP3THDP15Query_cache_tlsPKcmj+0x63)[0x7f651411c8b3]
/usr/sbin/mysqld(net_real_write+0x41)[0x7f65140c9801]
/usr/sbin/mysqld(net_flush+0x23)[0x7f65140c9dc3]
/usr/sbin/mysqld(net_write_command+0x18a)[0x7f65140c9f7a]
/usr/sbin/mysqld(cli_advanced_command+0xf7)[0x7f65142a60c7]
/usr/sbin/mysqld(mysql_close_slow_part+0x54)[0x7f65142a3684]
/usr/sbin/mysqld(mysql_close+0x1a)[0x7f65142a36ba]
/usr/lib64/mysql/plugin/ha_federatedx.so(_ZN19federatedx_io_mysqlD2Ev+0x23)[0x7f64f7cf4ce3]
/usr/lib64/mysql/plugin/ha_federatedx.so(_ZN14federatedx_txn5closeEP18st_fedrated_server+0x69)[0x7f64f7cf3c29]
/usr/lib64/mysql/plugin/ha_federatedx.so(+0xaa9b)[0x7f64f7ceda9b]
/usr/lib64/mysql/plugin/ha_federatedx.so(+0xac50)[0x7f64f7cedc50]
/usr/lib64/mysql/plugin/ha_federatedx.so(_ZN13ha_federatedx5closeEv+0x98)[0x7f64f7cf0c18]
/usr/sbin/mysqld(_Z8closefrmP5TABLEb+0x38)[0x7f65141efc58]
/usr/sbin/mysqld(_Z18intern_close_tableP5TABLE+0x36)[0x7f651410b146]
mysys/stacktrace.c:268(my_print_stacktrace)[0x7f6514278b3b]
sql/sql_cache.cc:850(Query_cache_block::data())[0x7f651410da72]
sql/sql_string.h:312(String::free())[0x7f65141118f2]
sql/sql_base.cc:5258(open_and_lock_tables(THD*, DDL_options_st const&, TABLE_LIST*, bool, unsigned int, Prelocking_strategy*))[0x7f6514111c64]
/usr/sbin/mysqld(+0x452d83)[0x7f6514148d83]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x5b3a)[0x7f65141550da]
sql/sql_parse.cc:5841(execute_sqlcom_select)[0x7f65141583c4]
sql/sql_parse.cc:1490(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x7f651415ae43]
sql/sql_parse.cc:1111(do_command(THD*))[0x7f651415b3c1]
sql/sql_connect.cc:1349(do_handle_one_connection(THD*))[0x7f651421a2cf]
sql/sql_connect.cc:1263(handle_one_connection)[0x7f651421a407]
/lib64/libpthread.so.0(+0x7aa1)[0x7f65138bdaa1]
/lib64/libc.so.6(clone+0x6d)[0x7f6511da1bcd]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x7f234d1c7020): is an invalid pointer
Connection ID (thread ID): 6520
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=off
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
 
We think the query pointer is invalid, but we will try to print it anyway.
Query: 
170516 00:01:52 mysqld_safe Number of processes running now: 0
170516 00:01:52 mysqld_safe mysqld restarted
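
Whether a crash involves the federated engine can be read from the backtrace rather than from the printed query, because each glibc-style frame names the binary or shared object it belongs to. The following is a minimal Python sketch, not part of the original report, that lists the modules appearing in such a backtrace and checks for a given plugin file. The function names `modules_in_backtrace` and `involves_plugin` are hypothetical helpers introduced for illustration; the sample frames are copied from the log above.

```python
import re

# Matches glibc backtrace lines such as
#   /usr/sbin/mysqld(net_flush+0x23)[0x7f65140c9dc3]
# The symbol part may be empty, as in /lib64/libpthread.so.0(+0x7aa1)[...].
FRAME_RE = re.compile(
    r"^(?P<module>/\S+?)\((?P<symbol>[^+)]*)\+0x[0-9a-f]+\)\[0x[0-9a-f]+\]$"
)


def modules_in_backtrace(lines):
    """Return the distinct module paths, in order of first appearance."""
    seen = []
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match and match.group("module") not in seen:
            seen.append(match.group("module"))
    return seen


def involves_plugin(lines, plugin_file):
    """True if any frame belongs to a module whose file name is plugin_file."""
    return any(
        path.endswith("/" + plugin_file) for path in modules_in_backtrace(lines)
    )


# A few frames copied from the 2017-05-16 crash log. The last line uses the
# resolved "file:line(function)" form and is deliberately ignored by the regex.
SAMPLE = """\
/usr/sbin/mysqld(net_flush+0x23)[0x7f65140c9dc3]
/usr/sbin/mysqld(mysql_close+0x1a)[0x7f65142a36ba]
/usr/lib64/mysql/plugin/ha_federatedx.so(_ZN13ha_federatedx5closeEv+0x98)[0x7f64f7cf0c18]
/lib64/libpthread.so.0(+0x7aa1)[0x7f65138bdaa1]
sql/sql_parse.cc:1111(do_command(THD*))[0x7f651415b3c1]
""".splitlines()

if __name__ == "__main__":
    print(modules_in_backtrace(SAMPLE))
    print(involves_plugin(SAMPLE, "ha_federatedx.so"))
```

For the sample frames, the check for `ha_federatedx.so` succeeds, which matches the observation that the federated engine is on the stack even though the printed query is empty.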

Comment by Elena Stepanova [ 2017-06-02 ]

Rob,

Could you please paste the output of SHOW CREATE TABLE director.task and SHOW INDEX IN director.task, and also the same for the underlying non-federated table that it points at?

Better still would be to upload a data dump, but I realize that might not be possible if it is production data. If you are willing to do so, you can upload it to ftp.askmonty.org/private; only MariaDB developers will have access to it.

Please also attach your cnf file(s).

Thanks.

Comment by Rob [ 2017-06-02 ]

SHOW CREATE TABLE director.task and SHOW INDEX IN director.task, and also the same for the underlying non-federated table that it points at?

CREATE TABLE `task` (
  `PROCID` int(10) unsigned NOT NULL,
  `SUBJECTTPID` int(10) unsigned NOT NULL COMMENT 'Universe',
  `SUBJECTID` varchar(55) NOT NULL COMMENT 'Universe',
  `DT` date NOT NULL,
  `PRIORITY` smallint(5) unsigned NOT NULL,
  `INSERTDT` datetime DEFAULT NULL,
  `STARTDT` datetime DEFAULT NULL,
  `TASKCOUNT` smallint(5) unsigned NOT NULL DEFAULT '1',
  `STATUS` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `HISTORIC` tinyint(1) unsigned NOT NULL DEFAULT '0',
  `FORSUBJECTTPID` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'STP to calc for',
  `SERVERID` tinyint(1) unsigned DEFAULT NULL,
  PRIMARY KEY (`PROCID`,`SUBJECTTPID`,`SUBJECTID`,`HISTORIC`,`FORSUBJECTTPID`) USING BTREE,
  KEY `Prio` (`PRIORITY`),
  KEY `task_update` (`SERVERID`,`STATUS`,`PROCID`)
) ENGINE=FEDERATED DEFAULT CHARSET=utf8 CONNECTION='dispatch/task'

+-------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name    | Seq_in_index | Column_name    | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| task  |          0 | PRIMARY     |            1 | PROCID         | NULL      |        NULL |     NULL | NULL   |      | REMOTE     |         |               |
| task  |          0 | PRIMARY     |            2 | SUBJECTTPID    | NULL      |        NULL |     NULL | NULL   |      | REMOTE     |         |               |
| task  |          0 | PRIMARY     |            3 | SUBJECTID      | NULL      |        NULL |     NULL | NULL   |      | REMOTE     |         |               |
| task  |          0 | PRIMARY     |            4 | HISTORIC       | NULL      |        NULL |     NULL | NULL   |      | REMOTE     |         |               |
| task  |          0 | PRIMARY     |            5 | FORSUBJECTTPID | NULL      |        NULL |     NULL | NULL   |      | REMOTE     |         |               |
| task  |          1 | Prio        |            1 | PRIORITY       | NULL      |        NULL |     NULL | NULL   |      | REMOTE     |         |               |
| task  |          1 | task_update |            1 | SERVERID       | NULL      |        NULL |     NULL | NULL   | YES  | REMOTE     |         |               |
| task  |          1 | task_update |            2 | STATUS         | NULL      |        NULL |     NULL | NULL   |      | REMOTE     |         |               |
| task  |          1 | task_update |            3 | PROCID         | NULL      |        NULL |     NULL | NULL   |      | REMOTE     |         |               |
+-------+------------+-------------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

Comment by Rob [ 2017-06-02 ]

[client]
port            = 3306
socket          = /var/lib/mysql/mysql.sock
 
[mysql]
default-character-set=utf8
no-auto-rehash
#net_write_timeout=120
 
[mysqld]
tmpdir=/tmp # /data/tmp if the linux drive is too small
datadir=/data/mysql
port            = 3306
socket          = /var/lib/mysql/mysql.sock
skip-external-locking
key_buffer_size = 256M
max_allowed_packet = 1G
table_open_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size= 100M
lower_case_table_names=1
default_storage_engine=InnoDB
#default_time_zone=Etc/UTC #May only be added once timezone tables are loaded
tmp_table_size = 4096M
max_heap_table_size = 4096M
 
# replication
server-id = 58                                  # the server number ctc-30 means 30
 
 
slave_compressed_protocol = 1
 
# on master
log-bin=mysql-bin
log-bin=/data2/mysql/mysql-bin
expire_logs_days=10                                     # Make sure the number of days times daily log creation is less than 70% of available disk space
innodb_flush_log_at_trx_commit=1
sync_binlog=1
 
# for master master
auto_increment_increment=2
auto_increment_offset=2 # should be different on the other master
 
# on slave
replicate-wild-ignore-table=temp.%
replicate-ignore-db = mysql
replicate-ignore-table=director.task
replicate-ignore-table=director.tasklog
relay-log=ctc-relay-bin                # on slave only - note: if copying from another slave, make sure it is the same as on the existing slave
relay-log=ctc-relay-bin                # on slave only - note: if copying from another slave, make sure it is the same as on the existing slave
relay-log-index=ctc-relay-bin.index    # on slave only - note: if copying from another slave, make sure it is the same as on the existing slave
slave_parallel_threads=16
slave-parallel-mode=optimistic
 
#to ensure that users with problems do not kill other users interaction
max_connections=2000 # number of directors times max_user_connections
max_user_connections=250
 
#for fulltext search
ft_min_word_len=2
 
# Innodb parameters
innodb_buffer_pool_size = 240G #80% of physical memory
innodb_buffer_pool_instances=64
innodb_log_file_size = 80G
innodb_file_per_table
innodb_lock_wait_timeout=100
# Calculate index stats more often
innodb_stats_sample_pages=100
############For disk based servers
# innodb_log_buffer_size=4M
# Number of CPU's*2
innodb_thread_concurrency = 16
############For fusion io
innodb_log_buffer_size=1G
#innodb_thread_concurrency=0
#innodb_read_ahead=0
#innodb_read_io_threads=16
#innodb_write_io_threads=16
#innodb_adaptive_checkpoint=keep_average
#innodb_flush_method=O_DIRECT
#innodb_io_capacity=1700 # to match IOPs of the FusionIO device
#Innodb_flush_neighbor_pages=0
########
 
# Innodb plugin and allow compressed tables
ignore-builtin-innodb
plugin-load=innodb=ha_innodb.so;innodb_trx=ha_innodb.so;innodb_locks=ha_innodb.so;innodb_lock_waits=ha_innodb.so;innodb_cmp=ha_innodb.so;innodb_cmp_reset=ha_innodb.so;innodb_cmpmem=ha_innodb.so;innodb_cmpmem_reset=ha_innodb.so
innodb_file_format=barracuda
innodb_strict_mode=on
 
#logs
log_output=TABLE #,FILE
slow_query_log
long_query_time=10
#slow_query_log_file=/var/log/mysql-slow-queries.log
log-error = /var/log/mysqlerror.log
# log-queries-not-using-indexes #proper on test servers
 
#federated
[mysqldump]
quick
max_allowed_packet = 1G
 
aria_pagecache_buffer_size=128M
 
[isamchk]
key_buffer_size = 64k
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M
 
[myisamchk]
key_buffer_size = 64k
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M
 
[mysqlhotcopy]
interactive-timeout

Comment by Rob [ 2017-06-02 ]

Hope that helps.

I cannot provide the data. What I can say is that it is a very volatile table, with typically 5 million records created and deleted per day.

Thanks
Rob

Comment by marceloffffff [ 2017-06-22 ]

Hi all,

I have 10 machines running MariaDB, and after updating to the new version I hit the same problem described in this issue.

The problem only happens when a database uses FederatedX.

A solution to this problem is very important for my environment.

Comment by Elena Stepanova [ 2017-06-25 ]

MTR test case

SET GLOBAL query_cache_size= 16*1024*1024;
SET GLOBAL query_cache_type= 1;
INSTALL SONAME 'ha_federatedx';
 
CREATE TABLE t1 (i INT);
eval
CREATE TABLE t2 (i INT) ENGINE=FEDERATED 
  CONNECTION="mysql://root@localhost:$MASTER_MYPORT/test/t1";
--error ER_ILLEGAL_HA
ALTER TABLE t2 DISABLE KEYS;
eval 
CREATE TABLE t3 (i INT) ENGINE=FEDERATED 
  CONNECTION="mysql://root@localhost:$MASTER_MYPORT/test/t1";

10.1 b76b69cd5fe634d8ddb9406aa2c82ef2a375b4d8

#3  <signal handler called>
#4  0x000055a2c7dfc251 in QUERY_PROFILE::new_status (this=0x8f8f8f8f8f8f8f8f, status_arg=0x55a2c85c76da "Waiting for query cache lock", function_arg=0x55a2c85dca1b <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", file_arg=0x55a2c85da160 "/data/src/10.1/sql/sql_cache.cc", line_arg=603) at /data/src/10.1/sql/sql_profile.cc:312
#5  0x000055a2c7c088fc in PROFILING::status_change (this=0x7fa2b8ed54c8, status_arg=0x55a2c85c76da "Waiting for query cache lock", function_arg=0x55a2c85dca1b <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", file_arg=0x55a2c85da160 "/data/src/10.1/sql/sql_cache.cc", line_arg=603) at /data/src/10.1/sql/sql_profile.h:312
#6  0x000055a2c7c08ecc in THD::enter_stage (this=0x7fa2b8ed2070, stage=0x55a2c8d5edb0 <stage_waiting_for_query_cache_lock>, calling_func=0x55a2c85dca1b <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", calling_file=0x55a2c85da160 "/data/src/10.1/sql/sql_cache.cc", calling_line=603) at /data/src/10.1/sql/sql_class.h:2042
#7  0x000055a2c7c839d5 in set_thd_stage_info (thd_arg=0x7fa2b8ed2070, new_stage=0x55a2c8d5edb0 <stage_waiting_for_query_cache_lock>, old_stage=0x7fa2c2fb9878, calling_func=0x55a2c85dca1b <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", calling_file=0x55a2c85da160 "/data/src/10.1/sql/sql_cache.cc", calling_line=603) at /data/src/10.1/sql/sql_class.cc:557
#8  0x000055a2c7c81dae in Query_cache_wait_state::Query_cache_wait_state (this=0x7fa2c2fb9870, thd=0x7fa2b8ed2070, func=0x55a2c85dca1b <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", file=0x55a2c85da160 "/data/src/10.1/sql/sql_cache.cc", line=603) at /data/src/10.1/sql/sql_cache.cc:432
#9  0x000055a2c7c744d8 in Query_cache::try_lock (this=0x55a2c8f5a280 <query_cache>, thd=0x7fa2b8ed2070, mode=Query_cache::WAIT) at /data/src/10.1/sql/sql_cache.cc:603
#10 0x000055a2c7c754d5 in Query_cache::insert (this=0x55a2c8f5a280 <query_cache>, thd=0x7fa2b8ed2070, query_cache_tls=0x7fa2b8ed2328, packet=0x7fa2b71ce070 "\001", length=5, pkt_nr=1) at /data/src/10.1/sql/sql_cache.cc:1082
#11 0x000055a2c7c75405 in query_cache_insert (thd_arg=0x7fa2b8ed2070, packet=0x7fa2b71ce070 "\001", length=5, pkt_nr=1) at /data/src/10.1/sql/sql_cache.cc:1057
#12 0x000055a2c7c0a1cc in net_real_write (net=0x7fa2b7126228, packet=0x7fa2b71ce070 "\001", len=5) at /data/src/10.1/sql/net_serv.cc:606
#13 0x000055a2c7c09afb in net_flush (net=0x7fa2b7126228) at /data/src/10.1/sql/net_serv.cc:363
#14 0x000055a2c7c09f10 in net_write_command (net=0x7fa2b7126228, command=1 '\001', header=0x0, head_len=0, packet=0x0, len=0) at /data/src/10.1/sql/net_serv.cc:501
#15 0x000055a2c7ec4781 in cli_advanced_command (mysql=0x7fa2b7126228, command=COM_QUIT, header=0x0, header_length=0, arg=0x0, arg_length=0, skip_check=1 '\001', stmt=0x0) at /data/src/10.1/sql-common/client.c:701
#16 0x000055a2c7ecba5a in mysql_close_slow_part (mysql=0x7fa2b7126228) at /data/src/10.1/sql-common/client.c:3949
#17 0x000055a2c7ecbad3 in mysql_close (mysql=0x7fa2b7126228) at /data/src/10.1/sql-common/client.c:3961
#18 0x00007fa2ba1f5963 in federatedx_io_mysql::~federatedx_io_mysql (this=0x7fa2b71261f8, __in_chrg=<optimized out>) at /data/src/10.1/storage/federatedx/federatedx_io_mysql.cc:152
#19 0x00007fa2ba1f59ca in federatedx_io_mysql::~federatedx_io_mysql (this=0x7fa2b71261f8, __in_chrg=<optimized out>) at /data/src/10.1/storage/federatedx/federatedx_io_mysql.cc:156
#20 0x00007fa2ba1f42d9 in federatedx_txn::close (this=0x7fa2ba3fe380 <zero_txn>, server=0x7fa2b71260b0) at /data/src/10.1/storage/federatedx/federatedx_txn.cc:86
#21 0x00007fa2ba1ecdf7 in free_server (txn=0x7fa2ba3fe380 <zero_txn>, server=0x7fa2b71260b0) at /data/src/10.1/storage/federatedx/ha_federatedx.cc:1662
#22 0x00007fa2ba1ecfeb in free_share (txn=0x7fa2ba3fe380 <zero_txn>, share=0x7fa2b71e8c88) at /data/src/10.1/storage/federatedx/ha_federatedx.cc:1701
#23 0x00007fa2ba1ed500 in ha_federatedx::close (this=0x7fa2b70af888) at /data/src/10.1/storage/federatedx/ha_federatedx.cc:1831
#24 0x000055a2c7f078fd in handler::ha_close (this=0x7fa2b70af888) at /data/src/10.1/sql/handler.cc:2566
#25 0x000055a2c7dba07d in closefrm (table=0x7fa2b7081c70, free_share=true) at /data/src/10.1/sql/table.cc:3054
#26 0x000055a2c7c5caf9 in intern_close_table (table=0x7fa2b7081c70) at /data/src/10.1/sql/sql_base.cc:354
#27 0x000055a2c7e89767 in tc_purge (mark_flushed=true) at /data/src/10.1/sql/table_cache.cc:204
#28 0x000055a2c7c5cf0f in close_cached_tables (thd=0x0, tables=0x0, wait_for_refresh=false, timeout=31536000) at /data/src/10.1/sql/sql_base.cc:485
#29 0x000055a2c7e8a0de in tdc_start_shutdown () at /data/src/10.1/sql/table_cache.cc:460
#30 0x000055a2c7bf9c68 in clean_up (print_message=true) at /data/src/10.1/sql/mysqld.cc:2119
#31 0x000055a2c7bf988e in unireg_end () at /data/src/10.1/sql/mysqld.cc:2001
#32 0x000055a2c7bf97a5 in kill_server (sig_ptr=0x0) at /data/src/10.1/sql/mysqld.cc:1929
#33 0x000055a2c7bf97cd in kill_server_thread (arg=0x7fa2c30e21d0) at /data/src/10.1/sql/mysqld.cc:1952
#34 0x000055a2c81b1f56 in pfs_spawn_thread (arg=0x7fa2b682c0f0) at /data/src/10.1/storage/perfschema/pfs.cc:1860
#35 0x00007fa2c2ccd494 in start_thread (arg=0x7fa2c2fbab00) at pthread_create.c:333
#36 0x00007fa2c0e1893f in clone () from /lib/x86_64-linux-gnu/libc.so.6

10.1 9ed325efc17e78f4c98d923c2af92c8b18cec0c5 valgrind build

==25510== Invalid read of size 8
==25510==    at 0x5863A5: Query_cache::insert(THD*, Query_cache_tls*, char const*, unsigned long, unsigned int) (sql_cache.cc:1073)
==25510==    by 0x58634E: query_cache_insert(void*, char const*, unsigned long, unsigned int) (sql_cache.cc:1057)
==25510==    by 0x5195FB: net_real_write (net_serv.cc:606)
==25510==    by 0x518F2A: net_flush (net_serv.cc:363)
==25510==    by 0x51933F: net_write_command (net_serv.cc:501)
==25510==    by 0x7E23F6: cli_advanced_command (client.c:701)
==25510==    by 0x7E968A: mysql_close_slow_part (client.c:3949)
==25510==    by 0x7E9703: mysql_close (client.c:3961)
==25510==    by 0xD07D4D4: federatedx_io_mysql::~federatedx_io_mysql() (federatedx_io_mysql.cc:152)
==25510==    by 0xD07D53B: federatedx_io_mysql::~federatedx_io_mysql() (federatedx_io_mysql.cc:156)
==25510==    by 0xD07B792: federatedx_txn::close(st_fedrated_server*) (federatedx_txn.cc:86)
==25510==    by 0xD074016: free_server(federatedx_txn*, st_fedrated_server*) (ha_federatedx.cc:1662)
==25510==    by 0xD07420A: free_share(federatedx_txn*, st_federatedx_share*) (ha_federatedx.cc:1701)
==25510==    by 0xD07471F: ha_federatedx::close() (ha_federatedx.cc:1831)
==25510==    by 0x826964: handler::ha_close() (handler.cc:2566)
==25510==    by 0x6CF5DA: closefrm(TABLE*, bool) (table.cc:3054)

The problem appeared in 10.1.23 with this revision:

commit 99e1294c1e2ddd0bbd81129f1c0902be31a38f48
Author: Sergei Golubchik <serg@mariadb.org>
Date:   Mon Apr 24 15:39:47 2017 +0200
 
    bugfix: federated/replication did not increment bytes_received status variable
    
    because mysql->net.thd was reset to NULL in mysql_real_connect()
    and thd_increment_bytes_received() didn't do anything.
    
    Fix:
    * set mysql->net.thd to current_thd instread.
    * remove the test for non-null THD from a very often used
      function thd_increment_bytes_received().
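
The stack traces above suggest that, after this change, an internal client connection opened by FederatedX keeps a pointer to the session (THD) that opened it, and that the write path (net_real_write calling query_cache_insert) still uses that pointer when the connection is closed later, possibly after the session is gone. The following toy model in Python only illustrates that lifetime hazard under those assumptions. It is not the server's code, and the class names `Session`, `ClientConnection` and `StaleSessionError` are hypothetical. The `alive` flag stands in for what, in C++, would be a read of freed memory.

```python
class StaleSessionError(RuntimeError):
    """Raised where the real server would read freed memory."""


class Session:
    """Stands in for the server's per-connection THD object."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.cached_packets = []

    def destroy(self):
        # The client disconnected; in C++ the object would be freed here.
        self.alive = False


class ClientConnection:
    """Stands in for an internal client connection opened by a storage engine."""

    def __init__(self, session=None):
        # Assumption: before the cited commit this was effectively None;
        # after it, it is the session that was current when connecting.
        self.session = session

    def write(self, packet):
        # Write-path hook: packets are offered to the attached session's
        # cache. Assumption: the hook is skipped when no session is attached.
        if self.session is not None:
            if not self.session.alive:
                raise StaleSessionError(
                    "write hook used session %r after it was destroyed"
                    % self.session.name
                )
            self.session.cached_packets.append(packet)
        return len(packet)

    def close(self):
        # Closing sends a final one-byte command (COM_QUIT in the trace),
        # so it goes through the same write hook.
        return self.write(b"\x01")


def demo():
    opener = Session("opener")
    conn = ClientConnection(opener)  # connection remembers its opener
    conn.write(b"SELECT 1")          # fine while the opener is alive
    opener.destroy()                 # opener goes away, connection stays cached
    try:
        conn.close()                 # later close, e.g. at shutdown
    except StaleSessionError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(demo())
```

In this model, a connection created with `session=None` closes without error, while one that remembers a destroyed session fails on close. That mirrors the gdb trace, where the write happens during shutdown with a THD pointer whose contents look already freed.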

Comment by Sergei Golubchik [ 2017-08-08 ]

Cannot repeat so far.

Comment by Elena Stepanova [ 2017-08-08 ]

As discussed, still reproducible on 10.1 8e8d42ddf0291b2364fef8e3224e65d596ef4202 – for me, both the crash and the valgrind error, so if the crash isn't happening for you, please try valgrind.

Comment by Estrategy | Support [ 2017-08-23 ]

The server crash still occurs in v10.2.8 when accessing a federated table.

Comment by Sergei Golubchik [ 2017-08-23 ]

EstrategySupport, you need to tell us more about your setup and exactly what kind of "access" makes the federated table crash. If I cannot repeat the crash, I cannot fix it, and Federated doesn't crash in our tests.

Comment by Estrategy | Support [ 2017-08-23 ]

I reported it in issue MDEV-12951.
That issue was marked as a duplicate of this ticket, but the problem has not gone away with the new release.
I ran mysql_upgrade before testing the new release.

Comment by Sergei Golubchik [ 2017-08-23 ]

Thanks, then I'll close this one as fixed, again, and reopen the other one.
