[MDEV-30658] lock_row_lock_current_waits counter in information_schema.innodb_metrics may become negative Created: 2023-02-15  Updated: 2023-11-28  Resolved: 2023-10-05

Status: Closed
Project: MariaDB Server
Component/s: Information Schema, Locking, Storage Engine - InnoDB
Affects Version/s: 10.5.17, 10.6.12
Fix Version/s: 10.4.32, 10.5.23, 10.6.16, 10.10.7, 10.11.6, 11.0.4, 11.1.3

Type: Bug Priority: Critical
Reporter: Valerii Kravchuk Assignee: Vladislav Lesin
Resolution: Fixed Votes: 1
Labels: None

Attachments: File n_lock_wait_current_count.test    
Issue Links:
Problem/Incident
causes MDEV-32554 mon_lock_wait_current_count test caus... Closed

 Description   

Consider the following test case involving three sessions. In the first one, run this:

MariaDB [test]> CREATE TABLE `test` (`col` varchar(10) ) engine=InnoDB;
Query OK, 0 rows affected (0.106 sec)
 
MariaDB [test]> insert into test values ('a');
Query OK, 1 row affected (0.038 sec)
 
MariaDB [test]> select name, count,time_enabled from information_schema.innodb_metrics where name ='lock_row_lock_current_waits'; 
+-----------------------------+-------+---------------------+
| name                        | count | time_enabled        |
+-----------------------------+-------+---------------------+
| lock_row_lock_current_waits |     0 | 2023-02-14 18:31:50 |
+-----------------------------+-------+---------------------+
1 row in set (0.001 sec)

Now, in the second session, update the only row of the InnoDB table just created:

MariaDB [test]> begin;update test set col='c' where col='a';
Query OK, 0 rows affected (0.000 sec)
 
Query OK, 1 row affected (0.003 sec)
Rows matched: 1  Changed: 1  Warnings: 0

In the third session, run a similar update; it will hang waiting for 50 seconds (with the default innodb_lock_wait_timeout):

MariaDB [test]> begin;update test set col='b' where col='a';

Now in the first session we can see that the counter increased; reset it and wait a bit:

MariaDB [test]> select name, count,time_enabled from information_schema.innodb_metrics where name ='lock_row_lock_current_waits'; 
+-----------------------------+-------+---------------------+
| name                        | count | time_enabled        |
+-----------------------------+-------+---------------------+
| lock_row_lock_current_waits |     1 | 2023-02-14 18:31:50 |
+-----------------------------+-------+---------------------+
1 row in set (0.001 sec)
 
MariaDB [test]> set global innodb_monitor_disable='lock_row_lock_current_waits';
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> set global innodb_monitor_reset_all='lock_row_lock_current_waits'; 
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> set global innodb_monitor_enable='lock_row_lock_current_waits';
Query OK, 0 rows affected (0.000 sec)
 
MariaDB [test]> select name, count,time_enabled from information_schema.innodb_metrics where name ='lock_row_lock_current_waits';
+-----------------------------+-------+---------------------+
| name                        | count | time_enabled        |
+-----------------------------+-------+---------------------+
| lock_row_lock_current_waits |     0 | 2023-02-15 18:19:42 |
+-----------------------------+-------+---------------------+
1 row in set (0.001 sec)

Once we get this error message in the third session:

Query OK, 0 rows affected (0.000 sec)
 
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction
MariaDB [test]> 

in the first session we see a negative value of the counter:

MariaDB [test]> select name, count,time_enabled from information_schema.innodb_metrics where name ='lock_row_lock_current_waits';
+-----------------------------+-------+---------------------+
| name                        | count | time_enabled        |
+-----------------------------+-------+---------------------+
| lock_row_lock_current_waits |    -1 | 2023-02-15 18:19:42 |
+-----------------------------+-------+---------------------+
1 row in set (0.001 sec)

This is a bug; similar bugs have been reported (and some fixed) many times for MySQL:

https://bugs.mysql.com/bug.php?id=73114
https://bugs.mysql.com/bug.php?id=75966
https://bugs.mysql.com/bug.php?id=81099
https://bugs.mysql.com/bug.php?id=79703
...



 Comments   
Comment by Valerii Kravchuk [ 2023-02-16 ]

This was reported to MySQL too:

https://bugs.mysql.com/bug.php?id=110101

Comment by Marko Mäkelä [ 2023-02-27 ]

As far as I remember, there are three types of these counters:

  • totally racy ones, incremented and decremented without atomic memory operations or mutex protection
  • ones that mirror another counter (typically carrying an internal name starting with MONITOR_OVLD_) that may or may not be updated in a race-free way
  • ones that use atomic increments and decrements

The lock_row_lock_current_waits counter in MariaDB 10.6 is MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT. It reflects the following:

	/* innodb_row_lock_current_waits */
	case MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT:
		// dirty read without lock_sys.wait_mutex
		value = lock_sys.get_wait_pending();
		break;

Updates of the counter are protected by lock_sys.wait_mutex, but we are performing a dirty read here, because the value may be stale by the time the caller uses it anyway. I do not think that a negative return value can be caused by that.

In 10.5, the code is a little different:

	/* innodb_row_lock_current_waits */
	case MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT:
		value = srv_stats.n_lock_wait_current_count;
		break;

This is a sharded atomic counter:

	/** Number of threads currently waiting on database locks */
	MY_ALIGNED(CACHE_LINE_SIZE) Atomic_counter<ulint>
				n_lock_wait_current_count;

In neither case should the negative value be caused by an unsafe read of the counter. Something in the InnoDB transactional lock subsystem is causing the counter value to become corrupted.

Comment by Vladislav Lesin [ 2023-10-03 ]

The following analysis is based on 10.5. The stable test case for 10.5 is the attached n_lock_wait_current_count.test. We also need to add a debug sync point to 10.5 that is already present in 10.6:

--- a/storage/innobase/lock/lock0wait.cc
+++ b/storage/innobase/lock/lock0wait.cc
@@ -331,6 +331,7 @@ lock_wait_suspend_thread(
                thd_wait_begin(trx->mysql_thd, THD_WAIT_TABLE_LOCK);
        }
 
+       DEBUG_SYNC_C("lock_wait_before_suspend");
        os_event_wait(slot->event);
        DBUG_EXECUTE_IF("small_sleep_after_lock_wait",
                {

Apart from srv_stats.n_lock_wait_current_count, as Marko noted above, there is also the innodb_counter_value array. It contains the values that are then displayed in the information schema, and srv_mon_process_existing_counter() fills the elements of the array.

There are two ways to fill an element of the innodb_counter_value array in srv_mon_process_existing_counter(), depending on the MONITOR_DISPLAY_CURRENT flag in monitor_info->monitor_type:
1) MONITOR_SET() - "Directly set a monitor counter's value."
2) MONITOR_SET_DIFF() - "Set monitor value to the difference of value and mon_start_value compensated by mon_last_value if accumulated value is required."

For MONITOR_OVLD_ROW_LOCK_CURRENT_WAIT, MONITOR_DISPLAY_CURRENT is not set for some reason, so MONITOR_SET_DIFF() is used. The value is computed by the following formula:

/** Set monitor value to the difference of value and mon_start_value            
compensated by mon_last_value if accumulated value is required. */              
#define MONITOR_SET_DIFF(monitor, value)                                \       
        MONITOR_SET_UPD_MAX_ONLY(monitor, ((value)                      \       
        - MONITOR_VALUE_RESET(monitor)                                  \       
        - MONITOR_FIELD(monitor, mon_start_value)                       \       
        + MONITOR_FIELD(monitor, mon_last_value)))

At the moment the value is set to -1, the innodb_counter_value array element has the following contents:

{mon_start_time = 1695982916, mon_stop_time = 0, mon_reset_time = 0, mon_value = 0, mon_max_value = 0, mon_min_value = 9223372036854775807,
  mon_value_reset = 0, mon_max_value_start = 0, mon_min_value_start = 9223372036854775807, mon_start_value = 1, mon_last_value = 0,
  mon_status = MONITOR_STARTED}

So, according to the formula above, we have:

(value = 0) - (mon_value_reset = 0) - (mon_start_value = 1) + (mon_last_value = 0) = -1

mon_start_value is set during "set global innodb_monitor_disable='lock_row_lock_current_waits'" execution.

The above formula computes something that is supposed to accumulate. But the srv_stats.n_lock_wait_current_count counter is not an accumulating counter: it is incremented just before a thread suspends and decremented just after the thread wakes up.

The fix I propose is to set MONITOR_DISPLAY_CURRENT for this monitor.

Comment by Marko Mäkelä [ 2023-10-04 ]

Thank you, the fix looks good to me.

Generated at Thu Feb 08 10:17:55 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.