Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
In TokuDB/PerconaFT, range locking works as follows:
- Each SQL table has a global "Lock Tree" (see locktree.h/.cc, class locktree), which stores all locks currently held by all transactions.
- Besides that, each transaction keeps a list of its own locks in each locktree, in db_txn_struct_i(txn)->lt_map. It is defined as follows:
struct txn_lt_key_ranges {
    toku::locktree *lt;
    toku::range_buffer *buffer;
};

...

// maps a locktree to a buffer of key ranges that are locked.
// it is protected by the txn_mutex, so hot indexing and a client
// thread can concurrently operate on this txn.
toku::omt<txn_lt_key_ranges> lt_map;
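For orientation, here is a minimal self-contained C++ model of this two-level bookkeeping. All names below (Range, LockTree, Txn, note_row_lock) are illustrative stand-ins, not the PerconaFT source:

#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct Range { std::string left, right; };           // a locked key range

struct LockTree {                                    // stands in for toku::locktree
    std::vector<std::pair<int, Range>> locks;        // (txn id, range) held by all txns
};

struct Txn {                                         // stands in for a TokuDB txn
    std::mutex txn_mutex;                            // stands in for txn_mutex
    std::map<LockTree*, std::vector<Range>> lt_map;  // stands in for lt_map
};

// Stands in for db_txn_note_row_lock(): after a lock has been acquired in
// the global lock tree, record it in the transaction's own list. txn_mutex
// guards lt_map against concurrent access from other threads.
void note_row_lock(Txn &txn, LockTree *lt, const Range &r) {
    std::lock_guard<std::mutex> g(txn.txn_mutex);
    txn.lt_map[lt].push_back(r);
}

int main() {
    LockTree lt;
    Txn txn;
    note_row_lock(txn, &lt, Range{"1", "1"});        // point lock on key "1"
}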
Lock escalation joins multiple locks into one in the global lock tree. It then calls the escalation callback (which points to toku_db_txn_escalate_callback()):
void toku_db_txn_escalate_callback(TXNID txnid,
                                   const toku::locktree *lt,
                                   const toku::range_buffer &buffer,
                                   void *extra)
The third parameter is the list of ranges that the transaction owns after the escalation. toku_db_txn_escalate_callback replaces the transaction's list of owned ranges with the provided list.
This way, lock escalation reduces memory usage both in the global lock tree and in each transaction's list of owned locks.
One thing to be careful about is that lock escalation can happen in thread X while the transaction operates in thread Y. So access to db_txn_struct_i(txn)->lt_map (or its equivalent) must be synchronized.
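A minimal sketch of the callback's replace-under-the-mutex behavior, using the same kind of illustrative stand-ins (this is a model, not the real PerconaFT API):

#include <map>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

struct Range { std::string left, right; };
struct LockTree {};
struct Txn {
    std::mutex txn_mutex;
    std::map<LockTree*, std::vector<Range>> lt_map;  // owned ranges per locktree
};

// Stands in for toku_db_txn_escalate_callback(): `escalated` is the complete
// post-escalation list of ranges this txn owns in `lt`, so it replaces the
// old list wholesale rather than merging into it. Taking txn_mutex is what
// allows this to run on the escalating thread (thread X) while the
// transaction itself operates on another thread (thread Y).
void escalate_callback(Txn &txn, LockTree *lt, std::vector<Range> escalated) {
    std::lock_guard<std::mutex> g(txn.txn_mutex);
    txn.lt_map[lt] = std::move(escalated);
}

int main() {
    LockTree lt;
    Txn txn;
    txn.lt_map[&lt] = { {"1","1"}, {"2","2"}, {"3","3"} };  // three point locks
    escalate_callback(txn, &lt, { {"1","3"} });             // replaced by one range
}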
Issue Links
- is part of MDEV-15603 Gap Lock support in MyRocks (Stalled)
...and in TokuDB it is not fully synchronized.
Consider this example. Apply this patch:
https://gist.github.com/spetrunia/b8d3d24acb957e772539af384a36d98a
Start the server and create a table:
) engine=tokudb;
Thread 2: insert something to disable STO (the single-txnid optimization):
Thread 1: acquire 10 locks:
Then start acquiring the 11th lock. Freeze the execution after the lock has been acquired, but before db_txn_note_row_lock has added it into the transaction's list of owned locks:
!touch /tmp/dbg1-range-lock-wait
Thread 2:
!rm /tmp/dbg1-range-lock-wait
b toku::locktree_manager::check_current_lock_constraints
Return true from check_current_lock_constraints, and observe this in the debugger:
toku::locktree::escalate:
  num_extracted = 19          // total ranges
  num_range_buffers = 2       // total transactions that own ranges
toku_db_txn_escalate_callback:
  ranges.buffer._num_ranges = 10   // the list of owned locks has 10 entries
  buffer._num_ranges = 1           // they will be replaced with one
Continue and let Thread 2's insert complete.
Unfreeze Thread 1's INSERT:
b db_txn_note_row_lock
!touch /tmp/dbg1-range-lock-wait-done
and see how this call
size_t old_mem_size = ranges.buffer->total_memory_size();
ranges.buffer->append(left_key, right_key);
will add a lock on point "12"
(gdb) p *left_key
$270 = {data = 0x7ffec40245d0, size = 5, ulen = 0, flags = 0}
(gdb) p *right_key
$271 = {data = 0x7ffec40245d0, size = 5, ulen = 0, flags = 0}
(gdb) x/5cx left_key.data
0x7ffec40245d0: 0x00 0x0c 0x00 0x00 0x00
(gdb) x/5cx right_key.data
0x7ffec40245d0: 0x00 0x0c 0x00 0x00 0x00
into the post-escalation lock list:
(gdb) p *ranges.buffer
$275 = {static MAX_KEY_SIZE = 65536, _arena = {_current_chunk = {buf = 0x7ffec4034d30 "", used = 18, size = 4096}, _other_chunks = 0x0, _n_other_chunks = 0, _size_of_other_chunks = 0, _footprint_of_other_chunks = 0}, _num_ranges = 1}
(gdb) set $ptr= ((char*)0x7ffec4034d30)
(gdb) x/10x ($ptr + sizeof(toku::range_buffer::record_header))
0x7ffec4034d38: 0x00 0x00 0x00 0x00 0x00 0x00 0x5a 0x00
0x7ffec4034d40: 0x00 0x00
(gdb) p 0x5a
$293 = 90
So the race exists: a lock acquired concurrently with escalation gets appended into the transaction's post-escalation lock list, even though the escalated range may already cover it. Currently this only leaves a redundant entry, so it is harmless.
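A sketch of that interleaving, reusing the illustrative stand-ins from the description (this is a model, not the PerconaFT code; the key values are taken loosely from the transcript above):

#include <iostream>
#include <map>
#include <mutex>
#include <string>
#include <vector>

struct Range { std::string left, right; };
struct LockTree {};
struct Txn {
    std::mutex txn_mutex;
    std::map<LockTree*, std::vector<Range>> lt_map;
};

int main() {
    LockTree lt;
    Txn txn;
    txn.lt_map[&lt] = std::vector<Range>(10, Range{"x", "x"});  // 10 locks, values elided

    // Thread Y has just acquired the 11th lock (point "12") in the global
    // lock tree, but has not yet run db_txn_note_row_lock() for it.

    // Thread X: escalation fires inside that window. The global tree merges
    // everything (including point "12") and the callback installs the merged
    // list, replacing the 10 entries with one range.
    {
        std::lock_guard<std::mutex> g(txn.txn_mutex);
        txn.lt_map[&lt] = { Range{"1", "90"} };   // post-escalation list
    }

    // Thread Y resumes: db_txn_note_row_lock() appends point "12" into the
    // already-replaced list. The mutex keeps lt_map itself consistent, but
    // acquire-then-note was not atomic with respect to escalation.
    {
        std::lock_guard<std::mutex> g(txn.txn_mutex);
        txn.lt_map[&lt].push_back(Range{"12", "12"});
    }

    // Result: a redundant second entry. The escalated range already covers
    // point "12", so releasing everything at commit still works, which is
    // why this is currently harmless.
    std::cout << txn.lt_map[&lt].size() << " owned ranges\n";
}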