MariaDB Server / MDEV-12674

Innodb_row_lock_current_waits has overflow

Details

    Description

      Innodb_row_lock_current_waits appears to have an overflow.

      {code:sql}
      MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'Innodb_row_lock_current_waits';
      +-------------------------------+----------------------+
      | Variable_name                 | Value                |
      +-------------------------------+----------------------+
      | Innodb_row_lock_current_waits | 18446744073709551613 |
      +-------------------------------+----------------------+
      1 row in set (0.00 sec)

      MariaDB [(none)]> SHOW SESSION STATUS LIKE 'Innodb_row_lock_current_waits';
      +-------------------------------+----------------------+
      | Variable_name                 | Value                |
      +-------------------------------+----------------------+
      | Innodb_row_lock_current_waits | 18446744073709551613 |
      +-------------------------------+----------------------+
      1 row in set (0.00 sec)
      {code}
      

      I don't know how to reproduce it, but I have access to the system showing this (and it is currently still in this state).
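
      The reported value equals 2^64 - 3 (18446744073709551616 - 3 = 18446744073709551613), which is consistent with an unsigned 64-bit counter that has been decremented three more times than it was incremented. A minimal standalone demonstration of that wraparound:

      {code:cpp}
      #include <cstdint>
      #include <cstdio>

      int main() {
          // Decrementing an unsigned 64-bit counter below zero wraps around:
          // 0 - 3 == 2^64 - 3 == 18446744073709551613.
          uint64_t waits = 0;
          waits -= 3;
          std::printf("%llu\n", static_cast<unsigned long long>(waits));
          return 0;
      }
      {code}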

Activity

ccalender Chris Calender (Inactive) created issue -
elenst Elena Stepanova made changes -
Component/s Storage Engine - InnoDB
Fix Version/s 10.1
Assignee Marko Mäkelä

marko Marko Mäkelä made changes -
Affects Version/s 10.0, 10.1, 10.2
Fix Version/s 10.0, 10.2
Status Open → In Progress

marko Marko Mäkelä added a comment -

The underlying counter is srv_stats.n_lock_wait_current_count, which is only modified by the function lock_wait_suspend_thread(). The increment is protected by lock_sys->wait_mutex, but the decrement does not appear to be protected by anything. This mismatch could corrupt the counter when one lock wait terminates at roughly the same time as a wait on (possibly another) lock begins. I think that both operations should be protected by atomics, and that the increment can then be done without holding any mutex.
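
A minimal sketch of that suggestion, assuming C++11 std::atomic; the function names here are illustrative, not the server's actual code:

{code:cpp}
#include <atomic>
#include <cstdint>

// Illustrative stand-in for srv_stats.n_lock_wait_current_count.
std::atomic<int64_t> n_lock_wait_current_count{0};

// A thread starts waiting for a row lock: no mutex is needed, because
// the atomic read-modify-write cannot lose a concurrent update.
void lock_wait_begin() {
	n_lock_wait_current_count.fetch_add(1, std::memory_order_relaxed);
}

// The wait ends: the matching atomic decrement is equally race-free.
void lock_wait_end() {
	n_lock_wait_current_count.fetch_sub(1, std::memory_order_relaxed);
}
{code}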

marko Marko Mäkelä added a comment -

The inaccuracy seems to be by design:

{code:cpp}
/** Class for using fuzzy counters. The counter is not protected by any
mutex and the results are not guaranteed to be 100% accurate but close
enough. Creates an array of counters and separates each element by the
CACHE_LINE_SIZE bytes */
template <
	typename Type,
	int N = IB_N_SLOTS,
	template<typename, int> class Indexer = default_indexer_t>
class ib_counter_t
{code}

That said, much of the time this template is instantiated with N=1, which appears to boil down to a single counter that is padded to two cache line sizes:

{code:cpp}
typedef ib_counter_t<ulint, 64> ulint_ctr_64_t;
typedef ib_counter_t<lsn_t, 1, single_indexer_t> lsn_ctr_1_t;
typedef ib_counter_t<ulint, 1, single_indexer_t> ulint_ctr_1_t;
typedef ib_counter_t<lint, 1, single_indexer_t> lint_ctr_1_t;
typedef ib_counter_t<int64_t, 1, single_indexer_t> int64_ctr_1_t;
typedef ib_counter_t<int64_t, IB_N_SLOTS> int64_counter_t;
{code}

It seems reasonable to use atomics for the various _ctr_1_t counters when the performance impact is low. When atomic memory operations are not available, I think it would be more acceptable to have some overflows than to risk significantly reducing performance with mutex operations.

The only multi-slot counter types are ulint_ctr_64_t (used by a number of srv_stats counters) and int64_counter_t (used by rw_lock_stats). For rw_lock_stats we should not care about accuracy. For srv_stats, the following counters are inaccurate by design. Based on the names of these counters, they are only ever incremented, never decremented, and it would seem to be acceptable to lose a few increments:

{code:cpp}
ulint_ctr_64_t	page_compression_saved;
ulint_ctr_64_t	index_pages_written;
ulint_ctr_64_t	non_index_pages_written;
ulint_ctr_64_t	pages_page_compressed;
ulint_ctr_64_t	page_compressed_trim_op;
ulint_ctr_64_t	pages_page_decompressed;
ulint_ctr_64_t	pages_page_compression_error;
ulint_ctr_64_t	pages_encrypted;
ulint_ctr_64_t	pages_decrypted;
ulint_ctr_64_t	n_rows_read;
ulint_ctr_64_t	n_rows_updated;
ulint_ctr_64_t	n_rows_deleted;
ulint_ctr_64_t	n_rows_inserted;
ulint_ctr_64_t	n_system_rows_read;
ulint_ctr_64_t	n_system_rows_updated;
ulint_ctr_64_t	n_system_rows_deleted;
ulint_ctr_64_t	n_system_rows_inserted;
ulint_ctr_64_t	n_sec_rec_cluster_reads;
ulint_ctr_64_t	n_sec_rec_cluster_reads_avoided;
ulint_ctr_64_t	page0_read;
ulint_ctr_64_t	n_key_requests;
ulint_ctr_64_t	n_log_scrubs;
ulint_ctr_64_t	key_rotation_list_length;
{code}
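
To illustrate the fuzzy-counter design quoted above: a sharded counter spreads updates over per-cache-line slots indexed by thread, and a reader sums all slots. The sketch below is a simplification of ib_counter_t, not the actual implementation; it uses relaxed atomics per slot, whereas the real template uses plain loads and stores, which is exactly why its results are only "close enough":

{code:cpp}
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>

constexpr std::size_t kCacheLine = 64;  // stand-in for CACHE_LINE_SIZE
constexpr std::size_t kSlots = 64;      // stand-in for IB_N_SLOTS

// Each slot occupies its own cache line to avoid false sharing.
struct alignas(kCacheLine) Slot {
	std::atomic<unsigned long> value{0};
};

class sharded_counter_t {
	Slot slots_[kSlots];

	// Pick a slot based on the calling thread, like default_indexer_t.
	static std::size_t slot_index() {
		return std::hash<std::thread::id>()(std::this_thread::get_id()) % kSlots;
	}

public:
	void add(unsigned long n) {
		slots_[slot_index()].value.fetch_add(n, std::memory_order_relaxed);
	}

	// The total may race with concurrent writers; it is approximate.
	unsigned long total() const {
		unsigned long sum = 0;
		for (const Slot& s : slots_)
			sum += s.value.load(std::memory_order_relaxed);
		return sum;
	}
};
{code}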

marko Marko Mäkelä added a comment -

Only the following counters are both incremented and decremented, and thus we probably should use atomic memory access for updating them:

{code:cpp}
/** Number of threads currently waiting on database locks */
lint_ctr_1_t	n_lock_wait_current_count;
/** Number of writes being done to the log files */
lint_ctr_1_t	os_log_pending_writes;
{code}

marko Marko Mäkelä added a comment -

srv_stats.os_log_pending_writes should be fine, because it is protected by log_sys->mutex (10.0, 10.1) or log_sys->write_mutex (10.2). Similarly, fil_n_pending_log_flushes and fil_n_pending_tablespace_flushes are protected by fil_system->mutex. Finally, MONITOR_OS_PENDING_READS, MONITOR_OS_PENDING_WRITES and their predecessors before MDEV-12534 are safe.

So, the only overflow-prone ‘pending’ counter is Innodb_row_lock_current_waits.
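
For contrast, a counter whose increment and decrement both happen under the same mutex cannot lose updates. A minimal sketch of that pattern, with names as illustrative stand-ins for log_sys->mutex and srv_stats.os_log_pending_writes:

{code:cpp}
#include <mutex>

std::mutex log_mutex;            // stand-in for log_sys->mutex
long os_log_pending_writes = 0;  // a plain counter is fine here

void log_write_begin() {
	std::lock_guard<std::mutex> guard(log_mutex);
	++os_log_pending_writes;     // serialized by the mutex
}

void log_write_end() {
	std::lock_guard<std::mutex> guard(log_mutex);
	--os_log_pending_writes;     // same mutex, so no lost updates
}
{code}
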
marko Marko Mäkelä added a comment - edited

The fix is in the following branches:

bb-10.0-marko
bb-10.1-marko
bb-10.2-marko

There is no test case; I was not even able to repeat the overflow. Reproducing this bug could require a very long run of an application with very high record lock contention. These fixes should not only fix the overflow; they could also slightly improve the performance of the counters.
marko Marko Mäkelä made changes -
Assignee Marko Mäkelä → Jan Lindström
Status In Progress → In Review
jplindst Jan Lindström (Inactive) added a comment -

ok to push.
jplindst Jan Lindström (Inactive) made changes -
Assignee Jan Lindström → Marko Mäkelä
Status In Review → Stalled

marko Marko Mäkelä made changes -
Fix Version/s 10.0.31, 10.1.24, 10.2.7
Resolution Fixed
Status Stalled → Closed

People

Assignee: Marko Mäkelä
Reporter: Chris Calender (Inactive)
