[MDEV-27930] halts on simple update Created: 2022-02-23 Updated: 2022-04-08 Resolved: 2022-04-08 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Affects Version/s: | 10.5.14 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Anton Petin | Assignee: | Unassigned |
| Resolution: | Incomplete | Votes: | 0 |
| Labels: | None | ||
| Environment: |
10.5.14-MariaDB |
||
| Description |
|
Today we found a problem: all our slaves (about 7) halt on the same simple UPDATE statement. We run about 1k such updates per second. The error log contains no errors about broken tables or anything similar. But in the
There is also a very high rate of HDD read IOPS. What do you advise, and how can we solve it? All slaves halt on the same update. |
| Comments |
| Comment by Anton Petin [ 2022-02-24 ] |
|
Also, after investigating: a bug in our application sends an invalid "id" in the UPDATE's WHERE clause. The value is "6650624433", but the ID column is INT, not BIGINT. |
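For reference, a quick check of why this value cannot match any row of an INT column. It only compares the reported value against the documented INT limits and does not touch any table:

```sql
-- Signed INT tops out at 2147483647, unsigned INT at 4294967295.
-- The value sent by the application exceeds both limits.
SELECT 6650624433 > 2147483647 AS exceeds_signed_int,
       6650624433 > 4294967295 AS exceeds_unsigned_int;
```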
| Comment by Daniel Black [ 2022-02-24 ] |
|
Glad you resolved this Anton. Thanks for updating this issue. |
| Comment by Anton Petin [ 2022-02-24 ] |
|
No, as I said, if I run it directly on my slave there is no lag, but via replication it halts. I think something is wrong. |
| Comment by Daniel Black [ 2022-02-24 ] |
|
Oops. The number of row locks makes it look like it's doing a table scan. Is id the primary/unique key of orders_history? binlog_format=ROW might help if it's a single row. Is there a lag if the id value isn't quoted? |
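A minimal sketch of how these questions could be checked on one slave. The actual UPDATE statement is not shown in this report, so the column `status` and its value are hypothetical placeholders; only the table name `orders_history` and the id value come from the thread.

```sql
-- Confirm which index, if any, covers id.
SHOW INDEX FROM orders_history;

-- Compare the plan for an unquoted and a quoted id value.
-- "status = 0" is a hypothetical SET clause, not taken from the report.
EXPLAIN UPDATE orders_history SET status = 0 WHERE id = 6650624433;
EXPLAIN UPDATE orders_history SET status = 0 WHERE id = '6650624433';
```

If the two plans differ in the `type` or `rows` columns, the quoting is likely relevant.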
| Comment by Anton Petin [ 2022-02-24 ] |
|
As I said, if I run this statement directly on the slave, there is no problem. If the statement comes from the master via the binlog, then yes: a full table scan, with the SSD fully loaded. But this key does not exist, so it should be a millisecond check, because ID is the PRIMARY KEY. |
| Comment by Anton Petin [ 2022-02-24 ] |
|
And replication uses the MIXED binlog format. |
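A sketch of how the binlog format could be inspected and, if acceptable for the workload, switched to ROW on the master as suggested earlier. Whether ROW actually avoids the scan in this case is an assumption to be verified, not a confirmed fix.

```sql
-- Show the format in effect globally and for the current session.
SELECT @@global.binlog_format, @@session.binlog_format;

-- Switch to row-based logging (requires sufficient privileges).
-- The global value applies only to sessions opened afterwards.
SET GLOBAL binlog_format = 'ROW';
```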
| Comment by Andrei Elkin [ 2022-02-24 ] |
|
antonp1976: could you please upload the stack traces of the slave threads?
Does the slave resume after those 3,000 or so seconds? (I am guessing it is not a true halt, so it probably just takes a long time to process the replication events?) |
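Stack traces themselves need a debugger attached to the server process, which is outside SQL. As a complement, the SQL-visible state of a slave could be captured roughly as follows, to tell a true halt from slow progress:

```sql
-- Run twice, some minutes apart: if Exec_Master_Log_Pos advances,
-- the slave is slow rather than halted. Seconds_Behind_Master shows the lag.
SHOW SLAVE STATUS;

-- Shows what the slave SQL thread is currently executing and for how long.
SHOW FULL PROCESSLIST;

-- Lists active transactions and the row locks they hold.
SHOW ENGINE INNODB STATUS;
```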
| Comment by Anton Petin [ 2022-02-25 ] |
|
Some slaves: 3k seconds. One slave, which has a RAID sync running at the moment: 15k seconds. |
| Comment by Anton Petin [ 2022-02-25 ] |
|
As I said before, the id in the WHERE clause does not exist. |
| Comment by Sergei Golubchik [ 2022-03-10 ] |
|
What does EXPLAIN UPDATE show if you run it directly? What does ANALYZE UPDATE show? |
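A sketch of the two requested statements. The SET clause is a hypothetical placeholder, since the real UPDATE is not included in this report. Note that ANALYZE executes the statement, so it is wrapped in a transaction and rolled back here, assuming a transactional engine such as InnoDB.

```sql
-- Plan only; nothing is executed.
EXPLAIN UPDATE orders_history SET status = 0 WHERE id = '6650624433';

-- ANALYZE runs the statement and reports actual row counts next to the estimates.
START TRANSACTION;
ANALYZE UPDATE orders_history SET status = 0 WHERE id = '6650624433';
ROLLBACK;
```

A large gap between `rows` and `r_rows` in the ANALYZE output would suggest the optimizer's estimate does not match what is actually read.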