[MDEV-12674] Innodb_row_lock_current_waits has overflow Created: 2017-05-03  Updated: 2020-08-25  Resolved: 2017-05-12

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 10.0, 10.1, 10.1.20, 10.2
Fix Version/s: 10.1.24, 10.0.31, 10.2.7

Type: Bug Priority: Major
Reporter: Chris Calender (Inactive) Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Problem/Incident
causes MDEV-13009 10.1.24 does not compile on architect... Closed
causes MDEV-13705 10.0.32 does not compile on architect... Closed

 Description   

Innodb_row_lock_current_waits appears to have an overflow.

MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'Innodb_row_lock_current_waits';
+-------------------------------+----------------------+
| Variable_name                 | Value                |
+-------------------------------+----------------------+
| Innodb_row_lock_current_waits | 18446744073709551613 |
+-------------------------------+----------------------+
1 row in set (0.00 sec)
 
MariaDB [(none)]> SHOW SESSION STATUS LIKE 'Innodb_row_lock_current_waits';
+-------------------------------+----------------------+
| Variable_name                 | Value                |
+-------------------------------+----------------------+
| Innodb_row_lock_current_waits | 18446744073709551613 |
+-------------------------------+----------------------+
1 row in set (0.00 sec)

I don't know how to reproduce it, but I have access to the system showing this (and it is currently still in this state).
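
For reference, 18446744073709551613 is 2^64 - 3, i.e. the value -3 interpreted as an unsigned 64-bit integer, which is what an unsigned counter shows after it has been decremented three more times than it was incremented. A minimal illustration (not MariaDB code):

#include <cstdint>
#include <cstdio>

int main()
{
	/* Illustration only: an unsigned 64-bit counter decremented
	three more times than it was incremented wraps around. */
	uint64_t waits = 0;
	waits -= 3;
	std::printf("%llu\n", static_cast<unsigned long long>(waits));
	/* prints 18446744073709551613 */
	return 0;
}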



 Comments   
Comment by Marko Mäkelä [ 2017-05-11 ]

The underlying counter is srv_stats.n_lock_wait_current_count, which is only modified by the function lock_wait_suspend_thread(). The incrementing is protected by lock_sys->wait_mutex, but the decrementing does not appear to be protected by anything. This mismatch could allow the counter to be corrupted when a lock wait terminates at roughly the same time as a wait on (possibly another) lock begins.
I think that both operations should be protected by atomics and that the incrementing can be done without holding any mutex.
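
For illustration, a minimal sketch of the kind of change described above (hypothetical names, not the actual srv_stats code): both the increment and the decrement become atomic read-modify-write operations, so an ending wait can no longer race with a starting one, and neither side needs to hold lock_sys->wait_mutex for the counter's sake.

#include <atomic>
#include <cstdint>

/* Hypothetical stand-in for the single-slot lock wait counter. */
struct lock_wait_counter
{
	std::atomic<int64_t>	count{0};

	/* a thread starts waiting for a record lock */
	void begin_wait() { count.fetch_add(1, std::memory_order_relaxed); }

	/* the wait ends; being atomic, this cannot lose a concurrent
	increment the way an unprotected "count--" can */
	void end_wait() { count.fetch_sub(1, std::memory_order_relaxed); }

	/* SHOW STATUS only needs an approximate snapshot */
	int64_t current() const { return count.load(std::memory_order_relaxed); }
};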

Comment by Marko Mäkelä [ 2017-05-11 ]

The inaccuracy seems to be by design:

/** Class for using fuzzy counters. The counter is not protected by any
mutex and the results are not guaranteed to be 100% accurate but close
enough. Creates an array of counters and separates each element by the
CACHE_LINE_SIZE bytes */
template <
	typename Type,
	int N = IB_N_SLOTS,
	template<typename, int> class Indexer = default_indexer_t>
class ib_counter_t

That said, much of the time this template is instantiated with N=1, which appears to boil down to a single counter that is padded to two cache line sizes:

typedef ib_counter_t<ulint, 64> ulint_ctr_64_t;
typedef ib_counter_t<lsn_t, 1, single_indexer_t> lsn_ctr_1_t;
typedef ib_counter_t<ulint, 1, single_indexer_t> ulint_ctr_1_t;
typedef ib_counter_t<lint, 1, single_indexer_t> lint_ctr_1_t;
typedef ib_counter_t<int64_t, 1, single_indexer_t> int64_ctr_1_t;
typedef ib_counter_t<int64_t, IB_N_SLOTS> int64_counter_t;
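
As a rough sketch of what the multi-slot instantiations amount to (a simplified illustration, not the real ib_counter_t), each slot sits on its own cache line and is updated without any synchronization, so the total that a reader sums up is only approximately correct:

#include <cstddef>

/* Simplified "fuzzy" counter: N unsynchronized slots, each padded
to a cache line to avoid false sharing between CPUs. */
template <typename Type, int N = 64>
class fuzzy_counter
{
	struct alignas(64) slot { Type count = 0; };
	slot	m_slots[N];

public:
	/* index would normally be derived from the thread or CPU id;
	the update is neither atomic nor mutex-protected */
	void add(std::size_t index, Type n) { m_slots[index % N].count += n; }

	/* readers sum all slots and may miss in-flight updates */
	Type total() const
	{
		Type	sum = 0;
		for (const slot& s : m_slots) sum += s.count;
		return sum;
	}
};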

It seems reasonable to use atomics for the various _ctr_1_t counters when the performance impact is low. When atomic memory operations are not available, I think it would be more acceptable to have some overflows than to risk significantly reducing performance with some mutex operations.

The only multi-slot counter types are ulint_ctr_64_t (used by a number of srv_stats counters) and int64_counter_t (used by rw_lock_stats). For rw_lock_stats we do not need to care about accuracy. For srv_stats, the following counters are inaccurate by design. Judging by their names, they are only ever incremented, never decremented, so losing a few increments should be acceptable:

ulint_ctr_64_t          page_compression_saved;
ulint_ctr_64_t          index_pages_written;
ulint_ctr_64_t          non_index_pages_written;
ulint_ctr_64_t          pages_page_compressed;
ulint_ctr_64_t          page_compressed_trim_op;
ulint_ctr_64_t          pages_page_decompressed;
ulint_ctr_64_t          pages_page_compression_error;
ulint_ctr_64_t          pages_encrypted;
ulint_ctr_64_t          pages_decrypted;
ulint_ctr_64_t		n_rows_read;
ulint_ctr_64_t		n_rows_updated;
ulint_ctr_64_t		n_rows_deleted;
ulint_ctr_64_t		n_rows_inserted;
ulint_ctr_64_t		n_system_rows_read;
ulint_ctr_64_t		n_system_rows_updated;
ulint_ctr_64_t		n_system_rows_deleted;
ulint_ctr_64_t		n_system_rows_inserted;
ulint_ctr_64_t		n_sec_rec_cluster_reads;
ulint_ctr_64_t		n_sec_rec_cluster_reads_avoided;
ulint_ctr_64_t		page0_read;
ulint_ctr_64_t		n_key_requests;
ulint_ctr_64_t		n_log_scrubs;
ulint_ctr_64_t		key_rotation_list_length;

Comment by Marko Mäkelä [ 2017-05-11 ]

Only the following counters are both incremented and decremented, and thus we probably should use atomic memory access for updating them:

	/** Number of threads currently waiting on database locks */
	lint_ctr_1_t		n_lock_wait_current_count;
	/** Number of writes being done to the log files */
	lint_ctr_1_t		os_log_pending_writes;

Comment by Marko Mäkelä [ 2017-05-11 ]

srv_stats.os_log_pending_writes should be fine, because it is protected by log_sys->mutex (10.0, 10.1) or log_sys->write_mutex (10.2).

Similarly, fil_n_pending_log_flushes and fil_n_pending_tablespace_flushes are protected by fil_system->mutex.
Finally, MONITOR_OS_PENDING_READS, MONITOR_OS_PENDING_WRITES and their predecessors before MDEV-12534 are safe.

So, the only overflow-prone ‘pending’ counter is Innodb_row_lock_current_waits.

Comment by Marko Mäkelä [ 2017-05-11 ]

bb-10.0-marko
bb-10.1-marko
bb-10.2-marko
There is no test case. I was not even able to repeat the overflow.
This bug could require a very long run with an application with very high record lock contention to reproduce.
Besides fixing the overflow, these changes could also slightly improve the performance of the counters.

Comment by Jan Lindström (Inactive) [ 2017-05-12 ]

ok to push.
