[MDEV-18392] Getting DEADLOCKS frequently on Versioned TABLE Created: 2019-01-28  Updated: 2019-07-24  Resolved: 2019-03-05

Status: Closed
Project: MariaDB Server
Component/s: Data Manipulation - Update, Optimizer, Partitioning, Versioned Tables
Affects Version/s: 10.3.8
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Vivek SACHIDANANDA Assignee: Aleksey Midenkov
Resolution: Not a Bug Votes: 0
Labels: innodb, versioned-table

Attachments: PNG File Screenshot 2019-02-07 at 11.41.27.png    
Issue Links:
Relates
relates to MDEV-18706 ER_LOCK_DEADLOCK on concurrent read a... Stalled

 Description   

I am having trouble to understand the increase in DEADLOCKS for Update transactions on Versioned TABLES. The problem, was not present earlier and then we enabled the versioning and after a while we started to receive many DEADLOCK exceptions.

Here is the TABLE:

CREATE TABLE `sample_table` (
`id` bigint(20) unsigned NOT NULL WITHOUT SYSTEM VERSIONING,
`amount` bigint(20) NOT NULL DEFAULT 0,
`release_date` date DEFAULT NULL,
`details` VARCHAR(64) DEAFULT NULL,
`changed_by_user_id` int(10) unsigned DEFAULT NULL,
`last_changed_at` timestamp NOT NULL DEFAULT current_timestamp() WITHOUT SYSTEM VERSIONING,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME LIMIT 100000
(PARTITION `p0` HISTORY ENGINE = InnoDB,
PARTITION `p1` HISTORY ENGINE = InnoDB,
PARTITION `p2` HISTORY ENGINE = InnoDB,
PARTITION `p3` HISTORY ENGINE = InnoDB,
PARTITION `p4` HISTORY ENGINE = InnoDB,
PARTITION `pcur` CURRENT ENGINE = InnoDB)
I read documents on MariaDB/SQL still cannot pin point on whats causing the deadlock. Here is the log

Deadlock Info : update sample_table set changed_by_user_id=8127,details='sample' where id=815977 *** (1) WAITING FOR THIS LOCK TO BE GRANTED: RECORD LOCKS space id 10235669 page no 94 n bits 192 index PRIMARY of table ebs.sample_table /* Partition p0 */ trx id 1903402610 lock_mode X locks rec but not gap waiting



 Comments   
Comment by Elena Stepanova [ 2019-01-28 ]

One possible reason might be the difference in the UPDATE query plan which can be seen here:

With versioning + partitioning

MariaDB [test]> explain update sample_table set changed_by_user_id=8127,details='sample' where id=100;
+------+-------------+--------------+-------+---------------+---------+---------+------+------+---------------------------+
| id   | select_type | table        | type  | possible_keys | key     | key_len | ref  | rows | Extra                     |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+---------------------------+
|    1 | SIMPLE      | sample_table | range | PRIMARY       | PRIMARY | 8       | NULL |    6 | Using where; Using buffer |
+------+-------------+--------------+-------+---------------+---------+---------+------+------+---------------------------+
1 row in set (0.00 sec)

Without versioning/partitioning

MariaDB [test]> explain update sample_table2 set changed_by_user_id=8127,details='sample' where id=100;
+------+-------------+---------------+-------+---------------+---------+---------+------+------+-------------+
| id   | select_type | table         | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+------+-------------+---------------+-------+---------------+---------+---------+------+------+-------------+
|    1 | SIMPLE      | sample_table2 | range | PRIMARY       | PRIMARY | 8       | NULL |    1 | Using where |
+------+-------------+---------------+-------+---------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)

I'll leave further commenting on it to versioning experts.

Comment by Aleksey Midenkov [ 2019-02-07 ]

Vivek, please supply as much info as you can. Can you reproduce in test environment? How much load do you have?

Comment by Vivek SACHIDANANDA [ 2019-02-07 ]

Hi Aleksey,

We are not able to reproduce it on staging/test because we do not have enough traffic to simulate live!

Also, what I see in the Live DB logs are warnings related to partition being full - "Versioned table `test`.`sample_table `: partition `p4` is full, add more HISTORY partitions". I am not sure, if its due to this it takes more time to swap and deadlocks itself.
In terms of traffic on DB WOPS is as attached! Let me know if you need specific information so that I can provide it for you.

Thanks!

Comment by Aleksey Midenkov [ 2019-02-14 ]

Vivek, you did supply the query from the first transaction, but not from the second one. Can you supply full LATEST DETECTED DEADLOCK info as provided by SHOW engine innodb status?

Can you estimate the increase in deadlocks since System Versioning appeared?

Comment by Vivek SACHIDANANDA [ 2019-02-18 ]

Here is the latest one which has the second query and full logs! The deadlocks have significantly increased to now 15pc! and it seems to rise as and when the data is getting increased in the DB
--------------------------------------------------------------------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-02-18 07:47:29 0x2b94a7b40700

      • (1) TRANSACTION:
        TRANSACTION 1945144252, ACTIVE 0 sec fetching rows
        mysql tables in use 6, locked 6
        LOCK WAIT 8 lock struct(s), heap size 1136, 3 row lock(s)
        MySQL thread id 278004, OS thread handle 47923558876928, query id 517713740 172.XX.XX.XXX root Searching rows for update
        update `sample_table` set `release_date`='2019-02-18',`changed_by_user_id`=null,`last_changed_at`=20190218074729 where `id`=825053
      • (1) WAITING FOR THIS LOCK TO BE GRANTED:
        RECORD LOCKS space id 11244249 page no 83 n bits 208 index PRIMARY of table `ebs`.`sample_table` /* Partition `p0` */ trx id 1945144252 lock_mode X locks rec but not gap waiting
        Record lock, heap no 108 PHYSICAL RECORD: n_fields 29; compact format; info bits 0
        0: len 8; hex 00000000000c96dd; asc ;;
        1: len 7; hex 5c69df52070190; asc \i R ;;
        2: len 6; hex 000000000000; asc ;;
        3: len 7; hex 80000000000000; asc ;;
        4: len 8; hex 8000000000000000; asc ;;
        5: SQL NULL;
        6: SQL NULL;
        7: len 4; hex 5c4912f9; asc \I ;;
        8: len 1; hex 01; asc ;;
        9: len 4; hex 00000f3d; asc =;;
        10: len 4; hex 5c4912f9; asc \I ;;
        11: SQL NULL;
        12: SQL NULL;
        13: SQL NULL;
        14: SQL NULL;
        15: SQL NULL;
        16: len 8; hex 0000000000000000; asc ;;
        17: SQL NULL;
        18: len 8; hex 0000000000000000; asc ;;
        19: SQL NULL;
        20: SQL NULL;
        21: SQL NULL;
        22: SQL NULL;
        23: SQL NULL;
        24: SQL NULL;
        25: len 7; hex 5c4912f90d6667; asc \I fg;;
        26: len 17; hex 726f6f74403137322e33312e38322e3238; asc root@172.XX.XX.28;;
        27: SQL NULL;
        28: SQL NULL;
      • (2) TRANSACTION:
        TRANSACTION 1945144250, ACTIVE 0 sec fetching rows
        mysql tables in use 6, locked 6
        8 lock struct(s), heap size 1136, 32 row lock(s)
        MySQL thread id 277999, OS thread handle 47917468747520, query id 517713730 172.XX.XX.XXX root Searching rows for update
        update `sample_table` set `release_date`='2019-02-18',`changed_by_user_id`=null,`last_changed_at`=20190218074729 where `id`=825053
      • (2) HOLDS THE LOCK(S):
        RECORD LOCKS space id 11244249 page no 83 n bits 208 index PRIMARY of table `ebs`.`sample_table` /* Partition `p0` */ trx id 1945144250 lock_mode X locks rec but not gap
        Record lock, heap no 108 PHYSICAL RECORD: n_fields 29; compact format; info bits 0
        0: len 8; hex 00000000000c96dd; asc ;;
        1: len 7; hex 5c69df52070190; asc \i R ;;
        2: len 6; hex 000000000000; asc ;;
        3: len 7; hex 80000000000000; asc ;;
        4: len 8; hex 8000000000000000; asc ;;
        5: SQL NULL;
        6: SQL NULL;
        7: len 4; hex 5c4912f9; asc \I ;;
        8: len 1; hex 01; asc ;;
        9: len 4; hex 00000f3d; asc =;;
        10: len 4; hex 5c4912f9; asc \I ;;
        11: SQL NULL;
        12: SQL NULL;
        13: SQL NULL;
        14: SQL NULL;
        15: SQL NULL;
        16: len 8; hex 0000000000000000; asc ;;
        17: SQL NULL;
        18: len 8; hex 0000000000000000; asc ;;
        19: SQL NULL;
        20: SQL NULL;
        21: SQL NULL;
        22: SQL NULL;
        23: SQL NULL;
        24: SQL NULL;
        25: len 7; hex 5c4912f90d6667; asc \I fg;;
        26: len 17; hex 726f6f74403137322e33312e38322e3238; asc root@172.XX.XX.28;;
        27: SQL NULL;
        28: SQL NULL;

Record lock, heap no 109 PHYSICAL RECORD: n_fields 29; compact format; info bits 0
0: len 8; hex 00000000000c96dd; asc ;;
1: len 7; hex 5c69df52070572; asc \i R r;;
2: len 6; hex 000000000000; asc ;;
3: len 7; hex 80000000000000; asc ;;
4: len 8; hex 7ffffffffffffe5f; asc _;;
5: SQL NULL;
6: SQL NULL;
7: len 4; hex 5c4912f9; asc \I ;;
8: len 1; hex 01; asc ;;
9: len 4; hex 00000f3d; asc =;;
10: len 4; hex 5c4912f9; asc \I ;;
11: SQL NULL;
12: SQL NULL;
13: SQL NULL;
14: SQL NULL;
15: SQL NULL;
16: len 8; hex 0000000000000000; asc ;;
17: SQL NULL;
18: len 8; hex 0000000000000000; asc ;;
19: SQL NULL;
20: SQL NULL;
21: SQL NULL;
22: SQL NULL;
23: SQL NULL;
24: SQL NULL;
25: len 7; hex 5c69df52070190; asc \i R ;;
26: len 18; hex 726f6f74403137322e33312e37362e313337; asc root@172.XX.XX.28;;
27: SQL NULL;
28: SQL NULL;

Record lock, heap no 110 PHYSICAL RECORD: n_fields 29; compact format; info bits 0
0: len 8; hex 00000000000c96dd; asc ;;
1: len 7; hex 5c6a630d0d52d4; asc \jc R ;;
2: len 6; hex 000000000000; asc ;;
3: len 7; hex 80000000000000; asc ;;
4: len 8; hex 7ffffffffffffe5f; asc _;;
5: SQL NULL;
6: SQL NULL;
7: len 4; hex 5c4912f9; asc \I ;;
8: len 1; hex 00; asc ;;
9: len 4; hex 000005d7; asc ;;
10: SQL NULL;
11: SQL NULL;
12: SQL NULL;
13: SQL NULL;
14: SQL NULL;
15: SQL NULL;
16: len 8; hex 0000000000000000; asc ;;
17: SQL NULL;
18: len 8; hex 0000000000000000; asc ;;
19: SQL NULL;
20: SQL NULL;
21: SQL NULL;
22: SQL NULL;
23: SQL NULL;
24: SQL NULL;
25: len 7; hex 5c69df52070572; asc \i R r;;
26: len 18; hex 726f6f74403137322e33312e37362e313337; asc root@172.XX.XX.XXX;;
27: SQL NULL;
28: SQL NULL;

Record lock, heap no 111 PHYSICAL RECORD: n_fields 29; compact format; info bits 0
0: len 8; hex 00000000000c96dd; asc ;;
1: len 7; hex 5c6a630d0d5e81; asc \jc ^ ;;
2: len 6; hex 000000000000; asc ;;
3: len 7; hex 80000000000000; asc ;;
4: len 8; hex 8000000000000000; asc ;;
5: SQL NULL;
6: len 4; hex 00000f3d; asc =;;
7: len 4; hex 5c6a630d; asc \jc ;;
8: len 1; hex 00; asc ;;
9: len 4; hex 00000f3d; asc =;;
10: SQL NULL;
11: len 30; hex 496e73756666696369656e7420616d6f756e7420746f20636f7665722063; asc Insufficient amount to cover c; (total 39 bytes);
12: len 0; hex ; asc ;;
13: SQL NULL;
14: SQL NULL;
15: SQL NULL;
16: len 8; hex 0000000000000000; asc ;;
17: SQL NULL;
18: len 8; hex 0000000000000000; asc ;;
19: SQL NULL;
20: SQL NULL;
21: SQL NULL;
22: SQL NULL;
23: SQL NULL;
24: len 4; hex 00000f3d; asc =;;
25: len 7; hex 5c6a630d0d52d4; asc \jc R ;;
26: len 18; hex 726f6f74403137322e33312e37362e313337; asc root@172.XX.XX.28;;
27: SQL NULL;
28: SQL NULL;

Record lock, heap no 112 PHYSICAL RECORD: n_fields 29; compact format; info bits 0
0: len 8; hex 00000000000c96dd; asc ;;
1: len 7; hex 5c6a630d0d5f43; asc \jc _C;;
2: len 6; hex 000000000000; asc ;;
3: len 7; hex 80000000000000; asc ;;
4: len 8; hex 8000000000000000; asc ;;
5: SQL NULL;
6: len 4; hex 00000f3d; asc =;;
7: len 4; hex 5c6a630d; asc \jc ;;
8: len 1; hex 00; asc ;;
9: len 4; hex 00000f3d; asc =;;
10: SQL NULL;
11: len 30; hex 496e73756666696369656e7420616d6f756e7420746f20636f7665722063; asc Insufficient amount to cover c; (total 39 bytes);
12: len 0; hex ; asc ;;
13: SQL NULL;
14: SQL NULL;
15: SQL NULL;
16: len 8; hex 0000000000000000; asc ;;
17: SQL NULL;
18: len 8; hex 0000000000000000; asc ;;
19: SQL NULL;
20: SQL NULL;
21: SQL NULL;
22: SQL NULL;
23: SQL NULL;
24: len 4; hex 00000f3d; asc =;;
25: len 7; hex 5c6a630d0d5e81; asc \jc ^ ;;
26: len 18; hex 726f6f74403137322e33312e37362e313337; asc root@172.XX.XX.28;;
27: SQL NULL;
28: SQL NULL;

Record lock, heap no 113 PHYSICAL RECORD: n_fields 29; compact format; info bits 0
0: len 8; hex 00000000000c96dd; asc ;;
1: len 7; hex 5c6a630d0d692c; asc \jc i,;;
2: len 6; hex 000000000000; asc ;;
3: len 7; hex 80000000000000; asc ;;
4: len 8; hex 8000000000000000; asc ;;
5: SQL NULL;
6: len 4; hex 00000f3d; asc =;;
7: len 4; hex 5c6a630d; asc \jc ;;
8: len 1; hex 00; asc ;;
9: len 4; hex 00000f3d; asc =;;
10: SQL NULL;
11: len 30; hex 496e73756666696369656e7420616d6f756e7420746f20636f7665722063; asc Insufficient amount to cover c; (total 39 bytes);
12: len 0; hex ; asc ;;
13: SQL NULL;
14: SQL NULL;
15: SQL NULL;
16: len 8; hex 0000000000000000; asc ;;
17: SQL NULL;
18: len 8; hex 000…

Comment by Aleksey Midenkov [ 2019-02-18 ]

I see that both transactions do update same record (`id`=825053) with the same data. Does that make sense with such duplicate UPDATES? To me this looks like wrong application/balancer behavior. Since versioned UPDATE is at least twice heavier (because it inserts history row), I'd expect increase in deadlocks on concurrent access to same row.

Comment by Vivek SACHIDANANDA [ 2019-02-19 ]

We just have one execute command, I guess internally there are two updates one on main table and the other under partition. If you see, the lock is on the partition table p0. Correct me, if I am wrong or miss something.

Comment by Aleksey Midenkov [ 2019-02-19 ]

Vivek, no, you have 2 (i.e. many) different queries that run same UPDATE on exact same record.

Comment by Aleksey Midenkov [ 2019-03-05 ]

Since many threads concurrently update same record deadlocks are expected. I would recommend to review the application and networking infrastructure and fix incorrect behavior.

Generated at Thu Feb 08 08:43:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.