[MDEV-11276] Mulltiple Insert into Federated Table Select From local Table Created: 2016-11-13  Updated: 2016-11-24  Resolved: 2016-11-24

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Federated
Affects Version/s: 10.1.18
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Su, Jun-Ming Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: None
Environment:

Windows 2012 R2 x2
MariaDB 10.1.18 x2


Issue Links:
Duplicate
is duplicated by MDEV-4452 Problem with FederatedX between two l... Closed

 Description   

I have 2 sites, and there is the same database and tables at each site.
I use event to call the procedure that will insert into federated table select from local table from one site, but there is only this error message "ERROR 1296 (HY000): Got error 10000 'Error on remote system: 2006: MySQL server has gone away' from FEDERATED"
And I ran the procedure in cmd, there is also this message for this procedure, but sometime there is error message and the data seem ok to inserted into foreign tables.
What can I do for this case?



 Comments   
Comment by Elena Stepanova [ 2016-11-16 ]

Communication between local and remote servers in Federated is subject to standard connection timeouts. See some more details in MDEV-4452.

Comment by Su, Jun-Ming [ 2016-11-17 ]

Hi, thanks for your respond.

I think it is not the problem for timeouts. Because the query execute use only one second to throw this error (almost instantly), and I use mysqldump and pipe to mysql to restore the same table use 3 mins (include table lock).

original
insert into federated_table select * from local_table where time = ...;

worked solution
mysqldump ... local_database local_table --where="time= ..." | mysql .... federated_database

Comment by Elena Stepanova [ 2016-11-17 ]

sujunmin,

We are not talking about query timeouts, but connection timeouts which happen when the connection is inactive, ones controlled by wait_timeout.

That is, you establish the connection between local and remote server using a federated table; then you don't use the connection for more than wait_timeout seconds; during this time, the connection expires, but obviously you can't see it because you're not using it. Then you're trying to use it again by querying the table, and server informs you that the connection is gone. That's exactly why you're getting the error immediately, within a second, and why you're not getting it when you're doing the dump, because the connection stays busy and it doesn't expire.

Comment by Su, Jun-Ming [ 2016-11-17 ]

Thanks for your response.

This query is executed in an event, not by users. Is it still wait_time issue? What do I change this variable to?

Comment by Elena Stepanova [ 2016-11-17 ]

Yes, wait_timeout is applied anyway.
What is the value now? Don't look in the MySQL client, it can be deceiving, because the value is populated differently for interactive and non-interactive connections; check your cnf files instead. The default value is 28800 (8 hours), but it's often modified in config templates.

What to change the value to depends on your work flow. How long the connection can stay idle? Configure the value to be higher than the maximum expected idle time.

Comment by Su, Jun-Ming [ 2016-11-17 ]

Thanks.

I have checked the value, and it is default 28800.

Comment by Elena Stepanova [ 2016-11-17 ]

Is it possible that there have been no use of the connection for over 8 hours before the problem occurs?

Comment by Su, Jun-Ming [ 2016-11-17 ]

I think it is not possible. Because it is the query executed in a mariadb schedule event (every day at am 3:00) ,not by user client login and execute.

Comment by Elena Stepanova [ 2016-11-17 ]

But if it's executed only once a day at the same time, it is certainly more than 8 hours between the queries, so it will definitely time out. What did you mean by "not possible"?
It does not matter whether it's the client login or not. We are not talking about the connection to your local server, it's the connection between local and remote server that times out. It is established when the federated table is used, and after that it's a normal connection which is subject to normal rules.

Comment by Su, Jun-Ming [ 2016-11-17 ]

So you mean the connection you connect to federated table this time and next time is not longer than wait_timeout, if this time update federated table and next time update federated table difference more than wait_timeout, will throw this error, right?

Comment by Elena Stepanova [ 2016-11-17 ]

That's right. Consider the following example (it's more or less the same as the test in MDEV-4452 does, but I'm intentionally convert it into a manual test with two separate servers, to make it more obvious.

On server 1 (remote server): create a remote table

create table t_remote (a int primary key, b varchar(8));

On server 2 (local server): create a federated table:

create table t_local (a int primary key, b varchar(8)) 
  engine=federated connection='mysql://root@127.0.0.1:3307/test/t_remote';

So far you've just created tables, but did not use them, so there is no connection between the local and remote server. Check it by running show processlist on the server 1 (remote):

MariaDB [test]> show processlist;
+----+------+-----------------+------+---------+------+-------+------------------+----------+
| Id | User | Host            | db   | Command | Time | State | Info             | Progress |
+----+------+-----------------+------+---------+------+-------+------------------+----------+
|  3 | root | localhost:44325 | test | Query   |    0 | init  | show processlist |    0.000 |
+----+------+-----------------+------+---------+------+-------+------------------+----------+
1 row in set (0.00 sec)

Now, set the timeout to a small value on the server 1 (the remote server):

set global wait_timeout = 20;

Select from the federated table on the local server:

MariaDB [test]> select * from t_local;
Empty set (0.00 sec)

While 20 seconds have not passed, check the processlist on the remote server:

MariaDB [test]> show processlist;
+----+------+-----------------+------+---------+------+-------+------------------+----------+
| Id | User | Host            | db   | Command | Time | State | Info             | Progress |
+----+------+-----------------+------+---------+------+-------+------------------+----------+
|  3 | root | localhost:44325 | test | Query   |    0 | init  | show processlist |    0.000 |
|  7 | root | localhost:55683 | test | Sleep   |    3 |       | NULL             |    0.000 |
+----+------+-----------------+------+---------+------+-------+------------------+----------+
2 rows in set (0.00 sec)

As you can see, we now have another connection, it's the connection from Federated.
If you keep running SHOW PROCESSLIST, you'll see how the Time value grows:

MariaDB [test]> show processlist;
+----+------+-----------------+------+---------+------+-------+------------------+----------+
| Id | User | Host            | db   | Command | Time | State | Info             | Progress |
+----+------+-----------------+------+---------+------+-------+------------------+----------+
|  3 | root | localhost:44325 | test | Query   |    0 | init  | show processlist |    0.000 |
|  7 | root | localhost:55683 | test | Sleep   |   15 |       | NULL             |    0.000 |
+----+------+-----------------+------+---------+------+-------+------------------+----------+
2 rows in set (0.00 sec)

But as soon as it exceeds the configured 20 seconds, the connection will disappear:

MariaDB [test]> show processlist;
+----+------+-----------------+------+---------+------+-------+------------------+----------+
| Id | User | Host            | db   | Command | Time | State | Info             | Progress |
+----+------+-----------------+------+---------+------+-------+------------------+----------+
|  3 | root | localhost:44325 | test | Query   |    0 | init  | show processlist |    0.000 |
+----+------+-----------------+------+---------+------+-------+------------------+----------+
1 row in set (0.00 sec)

And now, if you run the query on the local server again, you'll get an error. It can be either "server has gone away" or "error reading communication packets" – all the same, it will fail:

MariaDB [test]> select * from t_local;
ERROR 1158 (08S01): Got an error reading communication packets

That's what bug MDEV-4452 is about.

If you re-run the query, it will re-establish the connection, and the timeout will start count-down again.

Absolute values does not matter. Even if you have the timeout 28800, but you only run the query once a day, you'll get the same problem.

Try to set the timeout to more than a day and see if the problem goes away. Remember to set the GLOBAL value of wait_timeout.

Comment by Su, Jun-Ming [ 2016-11-17 ]

Thanks, you help me a lot, maybe I can do another job (event) for query the federated table per 1 hour to prevent to remote server to drop the connection, right?

Comment by Elena Stepanova [ 2016-11-17 ]

Yes, I suppose it should work too.

Comment by Su, Jun-Ming [ 2016-11-24 ]

Thanks, it works now. Please close this issue.

Comment by Elena Stepanova [ 2016-11-24 ]

Closing as a duplicate of MDEV-4452

Generated at Thu Feb 08 07:48:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.