MDEV-30658: lock_row_lock_current_waits counter in information_schema.innodb_metrics may become negative

Details

    Description

      Consider the following test case involving three sessions. In the first one, run this:

      MariaDB [test]> CREATE TABLE `test` (`col` varchar(10) ) engine=InnoDB;
      Query OK, 0 rows affected (0.106 sec)
       
      MariaDB [test]> insert into test values ('a');
      Query OK, 1 row affected (0.038 sec)
       
      MariaDB [test]> select name, count,time_enabled from information_schema.innodb_metrics where name ='lock_row_lock_current_waits'; 
      +-----------------------------+-------+---------------------+
      | name                        | count | time_enabled        |
      +-----------------------------+-------+---------------------+
      | lock_row_lock_current_waits |     0 | 2023-02-14 18:31:50 |
      +-----------------------------+-------+---------------------+
      1 row in set (0.001 sec)
      

      Now, in the second one, update the only row in the InnoDB table we created:

      MariaDB [test]> begin;update test set col='c' where col='a';
      Query OK, 0 rows affected (0.000 sec)
       
      Query OK, 1 row affected (0.003 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
      

      In the third one, run a similar update; it will hang for up to 50 seconds (the default innodb_lock_wait_timeout):

      MariaDB [test]> begin;update test set col='b' where col='a';
      

      Now in the first session we can see that the counter has increased. Reset it and wait a bit:

      MariaDB [test]> select name, count,time_enabled from information_schema.innodb_metrics where name ='lock_row_lock_current_waits'; 
      +-----------------------------+-------+---------------------+
      | name                        | count | time_enabled        |
      +-----------------------------+-------+---------------------+
      | lock_row_lock_current_waits |     1 | 2023-02-14 18:31:50 |
      +-----------------------------+-------+---------------------+
      1 row in set (0.001 sec)
       
      MariaDB [test]> set global innodb_monitor_disable='lock_row_lock_current_waits';
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> set global innodb_monitor_reset_all='lock_row_lock_current_waits'; 
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> set global innodb_monitor_enable='lock_row_lock_current_waits';
      Query OK, 0 rows affected (0.000 sec)
       
      MariaDB [test]> select name, count,time_enabled from information_schema.innodb_metrics where name ='lock_row_lock_current_waits';
      +-----------------------------+-------+---------------------+
      | name                        | count | time_enabled        |
      +-----------------------------+-------+---------------------+
      | lock_row_lock_current_waits |     0 | 2023-02-15 18:19:42 |
      +-----------------------------+-------+---------------------+
      1 row in set (0.001 sec)
      

      When we get this error message in the third session:

      Query OK, 0 rows affected (0.000 sec)
       
      ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
      MariaDB [test]> 
      

      in the first one we get a negative value of the counter:

      MariaDB [test]> select name, count,time_enabled from information_schema.innodb_metrics where name ='lock_row_lock_current_waits';
      +-----------------------------+-------+---------------------+
      | name                        | count | time_enabled        |
      +-----------------------------+-------+---------------------+
      | lock_row_lock_current_waits |    -1 | 2023-02-15 18:19:42 |
      +-----------------------------+-------+---------------------+
      1 row in set (0.001 sec)
      

      This is a bug; similar bugs have been reported (and some of them fixed) many times for MySQL:

      https://bugs.mysql.com/bug.php?id=73114
      https://bugs.mysql.com/bug.php?id=75966
      https://bugs.mysql.com/bug.php?id=81099
      https://bugs.mysql.com/bug.php?id=79703
      ...

      Activity

            valerii Valerii Kravchuk added a comment:

            This was reported to MySQL too:

            https://bugs.mysql.com/bug.php?id=110101

            marko Marko Mäkelä added a comment:

            As far as I remember, there are three types of these counters (sketched below):

            • totally racey ones, incremented and decremented without atomic memory operations or mutex protection
            • ones that mirror another counter (typically one whose internal name starts with MONITOR_OVLD_) that may or may not be updated in a race-free way
            • ones that use atomic increments and decrements
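
            A minimal sketch of the three styles, with purely illustrative names (this is not MariaDB code):

            	#include <atomic>

            	unsigned long racey_counter;   // 1) plain ++/--: updates can be lost
            	unsigned long mirrored_source; // 2) a MONITOR_OVLD_* monitor merely
            	                               //    copies a value like this on read
            	std::atomic<unsigned long> atomic_counter; // 3) never loses updates

            	void on_lock_wait_begin()
            	{
            	  ++racey_counter;  // may race with a concurrent --racey_counter
            	  atomic_counter.fetch_add(1, std::memory_order_relaxed); // safe
            	}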

            The lock_row_lock_current_waits counter in MariaDB 10.6 is MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT. It reflects the following:

            	/* innodb_row_lock_current_waits */
            	case MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT:
            		// dirty read without lock_sys.wait_mutex
            		value = lock_sys.get_wait_pending();
            		break;
            

            Updates of the counter are protected by lock_sys.wait_mutex, but we are performing a dirty read here: even if we took the mutex, the value could be stale by the time it is returned. I do not think that a negative return value can be caused by that.
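
            A minimal sketch of that pattern, with hypothetical names (not the real lock_sys_t): updates are serialized by a mutex, while the monitor read deliberately skips it.

            	#include <atomic>
            	#include <cstdint>
            	#include <mutex>

            	struct lock_sys_sketch
            	{
            	  std::mutex wait_mutex;                 // serializes the +1/-1 updates
            	  std::atomic<uint32_t> wait_pending{0}; // written only under wait_mutex

            	  void wait_start()
            	  {
            	    std::lock_guard<std::mutex> g(wait_mutex);
            	    wait_pending.store(wait_pending.load() + 1);
            	  }

            	  void wait_end()
            	  {
            	    std::lock_guard<std::mutex> g(wait_mutex);
            	    wait_pending.store(wait_pending.load() - 1);
            	  }

            	  // Dirty read: the result may be stale the moment it is returned,
            	  // but since every +1 is paired with a -1 under the mutex, the
            	  // stored value itself never goes below zero.
            	  uint32_t get_wait_pending() const
            	  { return wait_pending.load(std::memory_order_relaxed); }
            	};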

            In 10.5, the code is a little different:

            	/* innodb_row_lock_current_waits */
            	case MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT:
            		value = srv_stats.n_lock_wait_current_count;
            		break;
            

            This is a sharded atomic counter:

            	/** Number of threads currently waiting on database locks */
            	MY_ALIGNED(CACHE_LINE_SIZE) Atomic_counter<ulint>
            				n_lock_wait_current_count;
            
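            One common way to build such a counter is to split it into cache-line-aligned shards that are summed on read. A generic sketch of the idea (not the actual Atomic_counter / ib_counter_t code):

            	#include <atomic>
            	#include <cstddef>

            	template <typename T, size_t N = 64>
            	class sharded_counter
            	{
            	  struct alignas(64) shard { std::atomic<T> v{0}; };
            	  shard shards[N]; // one shard per cache line to avoid contention
            	public:
            	  void add(size_t thread_id, T n)
            	  { shards[thread_id % N].v.fetch_add(n, std::memory_order_relaxed); }

            	  void sub(size_t thread_id, T n)
            	  { shards[thread_id % N].v.fetch_sub(n, std::memory_order_relaxed); }

            	  // The sum is not a consistent snapshot, but no update is ever
            	  // lost: once all waiters have finished, it reads exactly zero.
            	  T read() const
            	  {
            	    T total= 0;
            	    for (const shard &s : shards)
            	      total+= s.v.load(std::memory_order_relaxed);
            	    return total;
            	  }
            	};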

            In neither case should the negative value be caused by an unsafe read of the counter. Something in the InnoDB transactional lock subsystem is causing the counter value to become corrupted.

            vlad.lesin Vladislav Lesin added a comment (edited):

            The following analysis is based on 10.5. The stable test case for 10.5 is n_lock_wait_current_count.test. We also need to add a debug sync point for 10.5 that is already present in 10.6:

            --- a/storage/innobase/lock/lock0wait.cc
            +++ b/storage/innobase/lock/lock0wait.cc
            @@ -331,6 +331,7 @@ lock_wait_suspend_thread(
                            thd_wait_begin(trx->mysql_thd, THD_WAIT_TABLE_LOCK);
                    }
             
            +       DEBUG_SYNC_C("lock_wait_before_suspend");
                    os_event_wait(slot->event);
                    DBUG_EXECUTE_IF("small_sleep_after_lock_wait",
                            {
            

            Apart from srv_stats.n_lock_wait_current_count, which Marko noted above, there is also the innodb_counter_value array. It contains the values that are then displayed in the information schema, and srv_mon_process_existing_counter() fills its elements.

            There are two ways to fill an element of the innodb_counter_value array in srv_mon_process_existing_counter(), depending on the MONITOR_DISPLAY_CURRENT flag in monitor_info->monitor_type (see the sketch after this list):
            1) MONITOR_SET() - "Directly set a monitor counter's value."
            2) MONITOR_SET_DIFF() - "Set monitor value to the difference of value and mon_start_value compensated by mon_last_value if accumulated value is required."
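
            Simplified, the dispatch at the end of srv_mon_process_existing_counter() looks roughly like this:

            	if (monitor_info->monitor_type & MONITOR_DISPLAY_CURRENT) {
            		MONITOR_SET(monitor, value);      /* report the value as-is */
            	} else {
            		MONITOR_SET_DIFF(monitor, value); /* report the delta since
            						  enable/reset */
            	}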

            For MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT, MONITOR_DISPLAY_CURRENT is not set for some reason, so MONITOR_SET_DIFF() is used. The value is computed by the following formula:

            	/** Set monitor value to the difference of value and mon_start_value
            	compensated by mon_last_value if accumulated value is required. */
            	#define MONITOR_SET_DIFF(monitor, value)                        \
            		MONITOR_SET_UPD_MAX_ONLY(monitor, ((value)              \
            		- MONITOR_VALUE_RESET(monitor)                          \
            		- MONITOR_FIELD(monitor, mon_start_value)               \
            		+ MONITOR_FIELD(monitor, mon_last_value)))
            

            At the moment the value is set to -1, the innodb_counter_value array element contains:

            {mon_start_time = 1695982916, mon_stop_time = 0, mon_reset_time = 0, mon_value = 0, mon_max_value = 0, mon_min_value = 9223372036854775807,
              mon_value_reset = 0, mon_max_value_start = 0, mon_min_value_start = 9223372036854775807, mon_start_value = 1, mon_last_value = 0,
              mon_status = MONITOR_STARTED}
            

            So, according to the formula above, we have:

            (value = 0) - (mon_value_reset = 0) - (mon_start_value = 1) + (mon_last_value = 0) = -1
            

            mon_start_value is set during execution of "set global innodb_monitor_disable='lock_row_lock_current_waits'".

            The formula above computes something that is meant to be accumulated. But the srv_stats.n_lock_wait_current_count counter is not an accumulating counter: it is incremented just before a thread is suspended and decremented just after the thread wakes up.

            The fix I propose is to set MONITOR_DISPLAY_CURRENT for this monitor.
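
            A sketch of what that change could look like, assuming the monitor is defined in the innodb_counter_info[] array in srv0mon.cc like its siblings (the exact flags in the real definition may differ):

            	/* lock_row_lock_current_waits: MONITOR_DISPLAY_CURRENT added so
            	that the counter is reported via MONITOR_SET() */
            	{"lock_row_lock_current_waits", "lock",
            	 "Number of row locks currently being waited for"
            	 " (innodb_row_lock_current_waits)",
            	 static_cast<monitor_type_t>(
            	 MONITOR_EXISTING | MONITOR_DEFAULT_ON | MONITOR_DISPLAY_CURRENT),
            	 MONITOR_DEFAULT_START, MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT},

            With MONITOR_DISPLAY_CURRENT set, srv_mon_process_existing_counter() takes the MONITOR_SET() branch, so a disable/enable cycle can no longer leave a stale mon_start_value in the MONITOR_SET_DIFF() subtraction.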


            marko Marko Mäkelä added a comment:

            Thank you, the fix looks good to me.

            People

              Assignee: vlad.lesin Vladislav Lesin
              Reporter: valerii Valerii Kravchuk
              Votes: 1
              Watchers: 5
