MariaDB Server
MDEV-12674

Innodb_row_lock_current_waits has overflow

Details

    Description

      Innodb_row_lock_current_waits appears to have an overflow.

      MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'Innodb_row_lock_current_waits';
      +-------------------------------+----------------------+
      | Variable_name                 | Value                |
      +-------------------------------+----------------------+
      | Innodb_row_lock_current_waits | 18446744073709551613 |
      +-------------------------------+----------------------+
      1 row in set (0.00 sec)
       
      MariaDB [(none)]> SHOW SESSION STATUS LIKE 'Innodb_row_lock_current_waits';
      +-------------------------------+----------------------+
      | Variable_name                 | Value                |
      +-------------------------------+----------------------+
      | Innodb_row_lock_current_waits | 18446744073709551613 |
      +-------------------------------+----------------------+
      1 row in set (0.00 sec)
      

      I don't know how to reproduce it, but I have access to the system showing this (and it is currently still in this state).


          Activity

            The underlying counter is srv_stats.n_lock_wait_current_count, which is only modified by the function lock_wait_suspend_thread(). The increment is protected by lock_sys->wait_mutex, but the decrement does not appear to be protected by anything. This mismatch could corrupt the counter when one lock wait ends at roughly the same time as a wait on (possibly another) lock begins.
            I think that both operations should be performed with atomic memory operations, and that the increment could then be done without holding any mutex.

            marko Marko Mäkelä added a comment

            The inaccuracy seems to be by design:

            /** Class for using fuzzy counters. The counter is not protected by any
            mutex and the results are not guaranteed to be 100% accurate but close
            enough. Creates an array of counters and separates each element by the
            CACHE_LINE_SIZE bytes */
            template <
            	typename Type,
            	int N = IB_N_SLOTS,
            	template<typename, int> class Indexer = default_indexer_t>
            class ib_counter_t
            

            That said, much of the time this template is instantiated with N=1, which appears to boil down to a single counter that is padded to two cache line sizes:

            typedef ib_counter_t<ulint, 64> ulint_ctr_64_t;
            typedef ib_counter_t<lsn_t, 1, single_indexer_t> lsn_ctr_1_t;
            typedef ib_counter_t<ulint, 1, single_indexer_t> ulint_ctr_1_t;
            typedef ib_counter_t<lint, 1, single_indexer_t> lint_ctr_1_t;
            typedef ib_counter_t<int64_t, 1, single_indexer_t> int64_ctr_1_t;
            typedef ib_counter_t<int64_t, IB_N_SLOTS> int64_counter_t;
            

            It seems reasonable to use atomics for the various _ctr_1_t counters when the performance impact is low. When atomic memory operations are not available, I think it would be more acceptable to have some overflows than to risk significantly reducing performance with some mutex operations.

            The only multi-slot counter types are ulint_ctr_64_t (used by a number of srv_stats counters) and int64_counter_t (used by rw_lock_stats). For the rw_lock_stats we should not care about accuracy. For srv_stats, the following counters are inaccurate by design. Based on the names of these counters, they are only ever incremented, never decremented, and it would seem to be acceptable to lose a few increments:

            ulint_ctr_64_t          page_compression_saved;
            ulint_ctr_64_t          index_pages_written;
            ulint_ctr_64_t          non_index_pages_written;
            ulint_ctr_64_t          pages_page_compressed;
            ulint_ctr_64_t          page_compressed_trim_op;
            ulint_ctr_64_t          pages_page_decompressed;
            ulint_ctr_64_t          pages_page_compression_error;
            ulint_ctr_64_t          pages_encrypted;
            ulint_ctr_64_t          pages_decrypted;
            ulint_ctr_64_t          n_rows_read;
            ulint_ctr_64_t          n_rows_updated;
            ulint_ctr_64_t          n_rows_deleted;
            ulint_ctr_64_t          n_rows_inserted;
            ulint_ctr_64_t          n_system_rows_read;
            ulint_ctr_64_t          n_system_rows_updated;
            ulint_ctr_64_t          n_system_rows_deleted;
            ulint_ctr_64_t          n_system_rows_inserted;
            ulint_ctr_64_t          n_sec_rec_cluster_reads;
            ulint_ctr_64_t          n_sec_rec_cluster_reads_avoided;
            ulint_ctr_64_t          page0_read;
            ulint_ctr_64_t          n_key_requests;
            ulint_ctr_64_t          n_log_scrubs;
            ulint_ctr_64_t          key_rotation_list_length;
            

            marko Marko Mäkelä added a comment

            Only the following counters are both incremented and decremented, and thus we probably should use atomic memory access for updating them:

            	/** Number of threads currently waiting on database locks */
            	lint_ctr_1_t		n_lock_wait_current_count;
            	/** Number of writes being done to the log files */
            	lint_ctr_1_t		os_log_pending_writes;
            

            marko Marko Mäkelä added a comment

            srv_stats.os_log_pending_writes should be fine, because it is protected by log_sys->mutex (10.0, 10.1) or log_sys->write_mutex (10.2).

            Similarly, fil_n_pending_log_flushes and fil_n_pending_tablespace_flushes are protected by fil_system->mutex.
            Finally, MONITOR_OS_PENDING_READS, MONITOR_OS_PENDING_WRITES and their predecessors before MDEV-12534 are safe.

            So, the only overflow-prone ‘pending’ counter is Innodb_row_lock_current_waits.

            marko Marko Mäkelä added a comment
            marko Marko Mäkelä added a comment (edited)

            The fixes are in the following branches:
            bb-10.0-marko
            bb-10.1-marko
            bb-10.2-marko
            There is no test case. I was not even able to repeat the overflow.
            This bug could require a very long run with an application with very high record lock contention to reproduce.
            These fixes should not only fix the overflow; they could also slightly improve the performance of the counters.


            ok to push.

            jplindst Jan Lindström (Inactive) added a comment

            People

              marko Marko Mäkelä
              ccalender Chris Calender (Inactive)