MariaDB Server / MDEV-18227

MyRocks-Gap-Lock: Lock escalation and updates to transaction's list of owned locks

    Description

      In TokuDB/PerconaFT, range locking works as follows:

      • Each SQL table has a global "Lock Tree" (see locktree.h and locktree.cc, class locktree), which stores all locks currently held by all transactions.
      • In addition, each transaction keeps a list of its own locks in each locktree,
        in db_txn_struct_i(txn)->lt_map.

      It is defined as follows:

      struct txn_lt_key_ranges {
          toku::locktree *lt;
          toku::range_buffer *buffer;
      };
       
      ...
          // maps a locktree to a buffer of key ranges that are locked.
          // it is protected by the txn_mutex, so hot indexing and a client
          // thread can concurrently operate on this txn.
          toku::omt<txn_lt_key_ranges> lt_map;
      

      Lock escalation joins multiple locks into one in the global lock tree. It then calls the escalation callback (which points to toku_db_txn_escalate_callback()).

      void toku_db_txn_escalate_callback(TXNID txnid, 
        const toku::locktree *lt, 
        const toku::range_buffer &buffer, 
        void *extra) 
      

      The third parameter of the function is the list of ranges that the transaction holds after the escalation. toku_db_txn_escalate_callback replaces the transaction's list of owned ranges with the provided list.
      This way, lock escalation reduces memory usage both in the global lock tree and in each transaction's list of owned locks.
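The effect of escalation described above can be sketched in a few lines. This is a simplified illustration, not the PerconaFT implementation: `Range`, `OwnedLock`, `escalate` and `per_txn` are hypothetical names, keys are plain integers, and the merge rule (join runs of adjacent locks that belong to the same transaction) is an assumption made for the example.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the PerconaFT types.
struct Range { int left; int right; };
struct OwnedLock { uint64_t txnid; Range range; };

// Merge each run of adjacent locks owned by the same transaction into one
// covering range. `locks` must be sorted by range.left and non-overlapping,
// which is how a global lock tree would enumerate them.
std::vector<OwnedLock> escalate(const std::vector<OwnedLock> &locks) {
    std::vector<OwnedLock> out;
    for (const OwnedLock &l : locks) {
        if (!out.empty() && out.back().txnid == l.txnid) {
            out.back().range.right = l.range.right;  // extend the current run
        } else {
            out.push_back(l);                        // start a new run
        }
    }
    return out;
}

// Group the escalated locks by transaction. Each per-transaction vector plays
// the role of the range list handed to the escalation callback.
std::map<uint64_t, std::vector<Range>> per_txn(const std::vector<OwnedLock> &locks) {
    std::map<uint64_t, std::vector<Range>> result;
    for (const OwnedLock &l : locks) {
        result[l.txnid].push_back(l.range);
    }
    return result;
}
```

With ten point locks at 0, 10, ..., 90 owned by one transaction and one point lock at 1000000 owned by another, `escalate` leaves two entries in the global list, and the first transaction's owned list shrinks from ten ranges to the single range [0, 90].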

      One thing to be careful about: lock escalation can happen in thread X while the transaction is running in thread Y.

      So, access to db_txn_struct_i(txn)->lt_map (or its equivalent) must be synchronized.
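A minimal sketch of such synchronization, assuming a per-transaction mutex in the spirit of the txn_mutex mentioned in the source comment. `OwnedRanges`, `note_lock` and `replace_after_escalation` are hypothetical names used only for this illustration; they are not the TokuDB functions.

```cpp
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

struct Range { int left; int right; };

// Hypothetical per-transaction list of owned ranges, guarded by one mutex.
class OwnedRanges {
public:
    // Called by the transaction's own thread after it has acquired a lock.
    void note_lock(Range r) {
        std::lock_guard<std::mutex> guard(mutex_);
        ranges_.push_back(r);
    }

    // Called by whichever thread runs lock escalation.
    void replace_after_escalation(std::vector<Range> escalated) {
        std::lock_guard<std::mutex> guard(mutex_);
        ranges_ = std::move(escalated);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return ranges_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<Range> ranges_;
};
```

The mutex makes each individual list operation safe. It does not, by itself, close the window between acquiring a lock in the global lock tree and recording it in the transaction's list: an escalation that runs inside that window replaces the list first, and the pending `note_lock` call is then applied to the already-escalated list.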

          Activity

            psergei Sergei Petrunia added a comment - - edited

            ...and in TokuDB it is not fully synchronized.

            Consider this example. Apply this patch:
            https://gist.github.com/spetrunia/b8d3d24acb957e772539af384a36d98a

            Start the server.

            create table ten(a int);
            insert into ten values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
            create table t10 (
              pk int not null primary key,
              a int
            ) engine=tokudb;
            

            Thread 2: insert something to disable STO:

            begin;
            insert into t10 values (1000*1000, 100500);
            

            Thread 1: acquire 10 locks:

            begin;
            insert into t10 select a*10, a*10 from ten;
            

            Start acquiring the 11th lock. Freeze the execution after we have got the lock, but before db_txn_note_row_lock has added it to the transaction's list of owned locks.

            !touch /tmp/dbg1-range-lock-wait
            insert into t10 values (12,12); -- this hangs
            

            Thread 2:

            !rm /tmp/dbg1-range-lock-wait
            

            In the debugger, set a breakpoint: b toku::locktree_manager::check_current_lock_constraints

            insert into t10 values (1000*1000+1, 100500);
            

            Return true from check_current_lock_constraints and observe this in the debugger:

            toku::locktree::escalate
              num_extracted=19  // total ranges 
              num_range_buffers= 2 // total transactions that own ranges
             
            toku_db_txn_escalate_callback 
              ranges.buffer._num_ranges=10 // the list of owned locks has 10 entries
              buffer._num_ranges=1  //  they will be replaced with one 
            

            Continue and let Thread 2's insert complete.
            Then unfreeze Thread 1's INSERT:

            b db_txn_note_row_lock
            !touch /tmp/dbg1-range-lock-wait-done
            

            and see how this call

                // add a new lock range to this txn's row lock buffer
                size_t old_mem_size = ranges.buffer->total_memory_size();
                ranges.buffer->append(left_key, right_key);
            

            will add a lock on point "12"

            (gdb) p *left_key
              $270 = {data = 0x7ffec40245d0, size = 5, ulen = 0, flags = 0}
            (gdb) p *right_key
              $271 = {data = 0x7ffec40245d0, size = 5, ulen = 0, flags = 0}
            (gdb) x/5cx left_key.data
              0x7ffec40245d0:	0x00	0x0c	0x00	0x00	0x00
            (gdb) x/5cx right_key.data
              0x7ffec40245d0:	0x00	0x0c	0x00	0x00	0x00
            

            into the post-escalation lock list:

            (gdb) p *ranges.buffer
              $275 = {static MAX_KEY_SIZE = 65536, _arena = {_current_chunk = {buf = 0x7ffec4034d30 "", used = 18, size = 4096}, _other_chunks = 0x0, _n_other_chunks = 0, _size_of_other_chunks = 0, _footprint_of_other_chunks = 0}, _num_ranges = 1}
             
            (gdb) set $ptr= ((char*)0x7ffec4034d30)
             
            (gdb) x/10x ($ptr + sizeof(toku::range_buffer::record_header))
              0x7ffec4034d38:	0x00	0x00	0x00	0x00	0x00	0x00	0x5a	0x00
              0x7ffec4034d40:	0x00	0x00
            (gdb) p 0x5a
              $293 = 90
            

            So, this race exists, but it is currently harmless.
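The gdb dumps above can be decoded with a small helper. The key layout used here (one leading byte followed by a 4-byte little-endian integer) is inferred from the dumps, not taken from the TokuDB sources, and `decode_pk` and `covers` are hypothetical names. Reading "harmless" as "the appended point range lies inside the escalated range" is likewise an interpretation; the comment itself does not give the reason.

```cpp
#include <cstdint>

// Assumed key layout, inferred from the dumps: byte 0 is a prefix byte,
// bytes 1..4 hold the integer primary key in little-endian order.
int32_t decode_pk(const unsigned char *key) {
    uint32_t value = 0;
    for (int i = 0; i < 4; i++) {
        value |= static_cast<uint32_t>(key[1 + i]) << (8 * i);
    }
    return static_cast<int32_t>(value);
}

struct Range { int32_t left; int32_t right; };

// True when `inner` lies entirely within `outer`.
bool covers(Range outer, Range inner) {
    return outer.left <= inner.left && inner.right <= outer.right;
}
```

Under that assumption, the bytes 0x00 0x0c 0x00 0x00 0x00 decode to 12, and the ten bytes that follow the record header decode to the pair (0, 90). The post-escalation list therefore holds the range [0, 90], and the late append adds the point [12, 12], which that range already covers.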

