[MDEV-4452] Problem with FederatedX between two local MariaDB servers Created: 2013-04-29 Updated: 2022-10-31 Resolved: 2022-10-31 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - Federated |
| Affects Version/s: | 5.5.30, 5.5.40, 10.0.14 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Minor |
| Reporter: | Hank Eskin (Inactive) | Assignee: | Sergei Golubchik |
| Resolution: | Not a Bug | Votes: | 1 |
| Labels: | federated | ||
| Environment: |
CentOS 5.0 64-bit on VMWare ESxi 5 |
||
| Issue Links: |
|
||||||||||||||||
| Description |
|
I spoke with Sergei at Percona about this issue. I have two MariaDB 5.5.30 (on CentOS 5) servers set up as master-slave (although that should matter, as this is not a replication issue). Here is my setup. On the slave, I have a federated table linking back to the master. Also on the slave, I have a batch reporting process that inserts records into a local (slave) table which matches the table definition of the table on the master. When the process is complete, I do a bulk insert like this: insert into <federated_table> select * from <local_table>; The records are then loaded into the server table, which then flow back to the slave (as well as other slaves) through normal replication. The problem is that one time in about 20 (about 5% of the time), I receive the following error on the slave: Got error 10000 'Error on remote system: 2006: MySQL server has gone away' from FEDERATED Of course, the master is running fine and never hangs or "goes away". When this error occurs, I have my slave reporting code immediately retry the insert query, and it usually works without failure the second time. The two servers are right next to each other and operate on an internal network (10.0.0.x) with just one HP gigabit switch between them. They are both VMWare VMs running in VMWare ESXi 5. The master server is a development server, so it is not under any load or contention. I'd like to try to find out some more detailed error logging on the slave when this happens with Federated tables or what could be causing this issue. Many thanks. |
| Comments |
| Comment by Hank Eskin (Inactive) [ 2013-04-29 ] | ||||||||||||||||||||||||||
|
Just to be clear, there are four tables: Master: results The reporting process populates "results_local" on the slave. The problematic insert statement on the slave is: insert into results_fed select * from results_local; Hope this clears things up. | ||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2013-05-01 ] | ||||||||||||||||||||||||||
|
Please ignore my previous question if you received it (deleted now), the summary of the report already answers it. | ||||||||||||||||||||||||||
| Comment by Hank Eskin (Inactive) [ 2013-08-16 ] | ||||||||||||||||||||||||||
|
I would like to add that along the same lines, I am also experiencing another similar problem with FEDERATEDX. I have code that deletes all records for a primary key in a federated table (between the same two servers mentioned above), and then inserts new records replacing those deleted records. Again, about 5% of the time, I will get an error "Can't write; duplicate key in table <table>' when I know for sure that there are absolutely no duplicate records in the existing table or the source records. The source records are coming from another table with the same primary key, so there can not be duplicate records. If I try the same insert again, it usually works. So something on the original table side is triggering this error, but the Federated engine can't interpret and transmit the error correctly to the federated table database. Just to eliminate the delete, in my code I now delete the primary key records on the "master" table directly (not through FEDERATED), and then insert into the federated table and the same problem occurs. So the problem is definitely on the federated table insert. | ||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2014-10-17 ] | ||||||||||||||||||||||||||
|
Eureka (18 months too late). Since we are talking about some regular analytic/reporting activity, it is probably scheduled and isn't very frequent. Then, it might be that the batch reporting process which does local inserts runs at such time when it finishes close to the moment when the existing federated connection is about to timeout. Sometimes (like, in 5% cases) the process takes a bit longer and the federated connection actually times out. And then we hit the problem – when the remote connection disappeared, the next INSERT ends with either "server has gone away" or with "error reading communication packets", see the test case below.
| ||||||||||||||||||||||||||
| Comment by Aleksey Midenkov [ 2018-05-03 ] | ||||||||||||||||||||||||||
|
Another use case with UPDATE:
| ||||||||||||||||||||||||||
| Comment by Su, Jun-Ming [ 2018-11-12 ] | ||||||||||||||||||||||||||
|
I read this issue and I only got error Got an error reading communication packets, no MySQL connection flew away message in error log or application log, and once I flush table for foreign tables first, all error will be gone. Please consider this issue If there is something help to solve it, please let me know and I will provide as full as possible. | ||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2022-10-31 ] | ||||||||||||||||||||||||||
|
Assuming the "eureka" comment correctly describes the reason of this old issue, it's not a bug. With sufficiently slow queries FederatedX connection can timeout, and to fix that one needs to increase the timeout accordingly. |