Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-33004

innodb.cursor-restore-locking test fails

Details

    Description

      https://buildbot.mariadb.org/#/builders/572/builds/4845/steps/8/logs/stdio:

      innodb.cursor-restore-locking 'innodb'   w30 [ fail ]
              Test ended at 2023-12-08 09:32:32
      CURRENT_TEST: innodb.cursor-restore-locking
      mysqltest: At line 71: query 'reap' failed: 1205: Lock wait timeout exceeded; try restarting transaction
      

      Attachments

        Activity

          The bug is well reproducible and can be recorded with rr.

          The initial state before the bug:
          1. "START TRANSACTION WITH CONSISTENT SNAPSHOT" - is active, has not been rolled back yet
          2. "DELETE FROM t WHERE b = 20 # trx_1" - is committed
          3. "INSERT INTO t VALUES(10, 20) # trx_2" - hanging on "ib_after_row_insert" sync point, waiting for "first_ins_cont" signal
          4. "DELETE FROM t WHERE b = 20 # trx_3" - blocked on delete-marked by trx_1 record, waiting for trx_2
          5. connection "default" is waiting on 'now WAIT_FOR row_purge_del_mark_finished'

          The bug scenario is the following:

          1. trx_purge() is invoked by purge coordinator thread, trx_purge_fetch_next_rec() returns NULL because "START TRANSACTION WITH CONSISTENT SNAPSHOT" is still active and the condition (purge_sys.tail.trx_no >= purge_sys.low_limit_no()) is true. trx_purge_attach_undo_recs() and trx_purge() return 0 to srv_do_purge(), see the following call stack:

          ▾ TrxUndoRsegsIterator::set_next                                                                                     
            ▾ trx_purge_choose_next_log                                                   
              ▾ trx_purge_get_next_rec                                                    
                ▾ trx_purge_fetch_next_rec                                                
                  ▾ trx_purge_attach_undo_recs                                            
                    ▾ trx_purge                                                           
                      ▾ srv_do_purge                                                      
                        ▾ purge_coordinator_callback_low                                  
                          • purge_coordinator_callback 
          

          2. Consider the following code in srv_do_purge():

                  do {                                                                    
                          ...                                                             
                          /* Take a snapshot of the history list before purge. */         
                          if (!(rseg_history_len = trx_sys.rseg_history_len)) {           
                                  ...                                          
                          }                                                               
                                                                                          
                          n_pages_purged = trx_purge(n_use_threads);                      
                          ...                                                             
                  } while (n_pages_purged > 0 && !purge_sys.paused()                      
                           && !srv_purge_should_exit(rseg_history_len)); 
                  return(rseg_history_len);
          

          So, trx_purge() purges nothing because there is transaction with view, which can see the changes, made by trx_1 and trx2, i.e. delete-marked and inserted records. trx_sys.rseg_history_len equal to 2. srv_do_purge() returns 2 despite it purged nothing.

          3. Consider the following code in purge_coordinator_callback_low():

            do                                                                                                                 
            {                                                                                                                  
              ...                                                                                                              
              purge_state.m_history_length= srv_do_purge(&n_total_purged);                                                     
              ...                                                                                                              
              /* If last purge batch processed less than 1 page and there is                                                   
              still work to do, delay the next batch by 10ms. Unless                                                           
              someone added work and woke us up. */                                       
              if (n_total_purged == 0)                                                                                         
              {                                                                           
                ...                                                                                                            
                if (!woken_during_purge)                                                                                       
                {                                                                                                              
                  /* Delay next purge round*/                                                                                  
                  purge_coordinator_timer->set_time(10, 0);                                                                    
                  return;                                                                 
                }                                                                                                              
              }                                                                                                                
            }                                                                                                                  
            while ((purge_sys.enabled() && !purge_sys.paused()) ||                                                             
                   !srv_purge_should_exit(trx_sys.rseg_history_len));    
          

          After the above code execution purge_state.m_history_length is equal to 2, the timer is set to 10 milliseconds and purge_coordinator_callback() is quitted.

          4. When 10 milliseconds is expired, purge_coordinator_timer_callback() is invoked. Take a look the following code in purge_coordinator_timer_callback():

            if (purge_state.m_history_length < 5000 &&                                    
                purge_state.m_history_length == trx_sys.rseg_history_len)                 
              /* No new records were added since wait started.                            
              Simply wait for new records. The magic number 5000 is an                    
              approximation for the case where we have cached UNDO                        
              log records which prevent truncate of the UNDO segments.*/                  
              return;                                                                     
            srv_wake_purge_thread_if_not_active();  
          

          So, purge_state.m_history_length was set to 2 on step 3. It was set to the value of trx_sys.rseg_history_len. The (purge_state.m_history_length == trx_sys.rseg_history_len) condition is true, that's why purge_coordinator_timer_callback() does not awake purge coordinator thread. Every further purge_coordinator_timer_callback() will not awake purge coordinator thread because the above condition stays true.

          The overall logic is the following. Save history length before suspending, and if the history length was not changed while purge coordinator thread was suspended, then just do nothing. But it does not take into account that history can be not purged before coordinator thread is suspended.

          vlad.lesin Vladislav Lesin added a comment - The bug is well reproducible and can be recorded with rr. The initial state before the bug: 1. "START TRANSACTION WITH CONSISTENT SNAPSHOT" - is active, has not been rolled back yet 2. "DELETE FROM t WHERE b = 20 # trx_1" - is committed 3. "INSERT INTO t VALUES(10, 20) # trx_2" - hanging on "ib_after_row_insert" sync point, waiting for "first_ins_cont" signal 4. "DELETE FROM t WHERE b = 20 # trx_3" - blocked on delete-marked by trx_1 record, waiting for trx_2 5. connection "default" is waiting on 'now WAIT_FOR row_purge_del_mark_finished' The bug scenario is the following: 1. trx_purge() is invoked by purge coordinator thread, trx_purge_fetch_next_rec() returns NULL because "START TRANSACTION WITH CONSISTENT SNAPSHOT" is still active and the condition (purge_sys.tail.trx_no >= purge_sys.low_limit_no()) is true. trx_purge_attach_undo_recs() and trx_purge() return 0 to srv_do_purge(), see the following call stack: ▾ TrxUndoRsegsIterator::set_next ▾ trx_purge_choose_next_log ▾ trx_purge_get_next_rec ▾ trx_purge_fetch_next_rec ▾ trx_purge_attach_undo_recs ▾ trx_purge ▾ srv_do_purge ▾ purge_coordinator_callback_low • purge_coordinator_callback 2. Consider the following code in srv_do_purge(): do { ... /* Take a snapshot of the history list before purge. */ if (!(rseg_history_len = trx_sys.rseg_history_len)) { ... } n_pages_purged = trx_purge(n_use_threads); ... } while (n_pages_purged > 0 && !purge_sys.paused() && !srv_purge_should_exit(rseg_history_len)); return (rseg_history_len); So, trx_purge() purges nothing because there is transaction with view, which can see the changes, made by trx_1 and trx2, i.e. delete-marked and inserted records. trx_sys.rseg_history_len equal to 2. srv_do_purge() returns 2 despite it purged nothing. 3. Consider the following code in purge_coordinator_callback_low(): do { ... purge_state.m_history_length= srv_do_purge(&n_total_purged); ... /* If last purge batch processed less than 1 page and there is still work to do, delay the next batch by 10ms. Unless someone added work and woke us up. */ if (n_total_purged == 0 ) { ... if (!woken_during_purge) { /* Delay next purge round*/ purge_coordinator_timer->set_time( 10 , 0 ); return ; } } } while ((purge_sys.enabled() && !purge_sys.paused()) || !srv_purge_should_exit(trx_sys.rseg_history_len)); After the above code execution purge_state.m_history_length is equal to 2, the timer is set to 10 milliseconds and purge_coordinator_callback() is quitted. 4. When 10 milliseconds is expired, purge_coordinator_timer_callback() is invoked. Take a look the following code in purge_coordinator_timer_callback(): if (purge_state.m_history_length < 5000 && purge_state.m_history_length == trx_sys.rseg_history_len) /* No new records were added since wait started. Simply wait for new records. The magic number 5000 is an approximation for the case where we have cached UNDO log records which prevent truncate of the UNDO segments.*/ return ; srv_wake_purge_thread_if_not_active(); So, purge_state.m_history_length was set to 2 on step 3. It was set to the value of trx_sys.rseg_history_len. The (purge_state.m_history_length == trx_sys.rseg_history_len) condition is true, that's why purge_coordinator_timer_callback() does not awake purge coordinator thread. Every further purge_coordinator_timer_callback() will not awake purge coordinator thread because the above condition stays true. The overall logic is the following. Save history length before suspending, and if the history length was not changed while purge coordinator thread was suspended, then just do nothing. But it does not take into account that history can be not purged before coordinator thread is suspended.

          Some useful notes from Marko:

          marko
          In 10.6 717a32154cc6de81d1a0d3a820928b78aa7b1140 it was refactored:

          +      /* The magic number 5000 is an approximation for the case where we have
          +      cached undo log records which prevent truncate of rollback segments. */
          +      wakeup= history_size &&
          +        (history_size >= 5000 ||
          +         history_size != trx_sys.history_size_approx());
          

          marko
          and in 44689eb7d85b05df9ffbc631735447ba96c92e41 (MDEV-32050) this code was removed:

          -  if (wakeup)
          -    purge_coordinator_timer->set_time(10, 0);
          

          marko
          It might be easiest to remove this test from 10.5 (but not 10.6).

          vlad.lesin Vladislav Lesin added a comment - Some useful notes from Marko: marko In 10.6 717a32154cc6de81d1a0d3a820928b78aa7b1140 it was refactored: + /* The magic number 5000 is an approximation for the case where we have + cached undo log records which prevent truncate of rollback segments. */ + wakeup= history_size && + (history_size >= 5000 || + history_size != trx_sys.history_size_approx()); marko and in 44689eb7d85b05df9ffbc631735447ba96c92e41 ( MDEV-32050 ) this code was removed: - if (wakeup) - purge_coordinator_timer->set_time(10, 0); marko It might be easiest to remove this test from 10.5 (but not 10.6).

          This was fixed by removing the test innodb.cursor-restore-locking from 10.5 only. The test will remain in 10.6 and later releases, where it is stable.

          marko Marko Mäkelä added a comment - This was fixed by removing the test innodb.cursor-restore-locking from 10.5 only. The test will remain in 10.6 and later releases, where it is stable.

          People

            vlad.lesin Vladislav Lesin
            vlad.lesin Vladislav Lesin
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.