Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17003

service_manager_extend_timeout() being called too often

Details

    Description

      The function srv_do_purge() is invoking service_manager_extend_timeout() without any rate limit, wasting CPU in both the MariaDB server and systemd.

      The call would only make sense during a slow shutdown SET GLOBAL innodb_fast_shutdown=0. Even then, it would make sense to rate-limit the output, and write an accumulated count only every N seconds. The purpose of the call is to prevent systemd from killing the server due to seeming hang, and to inform the user about the progress. It would be even more useful if there was an indication of work that remains to be done.

      Also other calls to service_manager_extend_timeout() should be appropriately rate-limited and possibly disabled outside startup or shutdown.

      danblack, as the contributor of MDEV-14705, do you know the default timeout before any EXTEND_TIMEOUT_USEC message was sent to systemd?

      Attachments

        Issue Links

          Activity

            danblack Daniel Black added a comment - Default time out is https://www.freedesktop.org/software/systemd/man/systemd.service.html#TimeoutStopSec= which falls back to https://www.freedesktop.org/software/systemd/man/systemd-system.conf.html#DefaultTimeoutStartSec= (90s).

            I checked all the InnoDB calls to service_manager_extend_timeout(), and in addition to the one in srv_purge_should_exit(), which actually looks like it should be executed on a slow shutdown only, there are calls in buf_dump() and log_write_up_to() that could be less frequent:

            diff --git a/storage/xtradb/buf/buf0dump.cc b/storage/xtradb/buf/buf0dump.cc
            index 90358d34b04..e9168d9f5d5 100644
            --- a/storage/xtradb/buf/buf0dump.cc
            +++ b/storage/xtradb/buf/buf0dump.cc
            @@ -1,7 +1,7 @@
             /*****************************************************************************
             
             Copyright (c) 2011, 2017, Oracle and/or its affiliates. All Rights Reserved.
            -Copyright (c) 2017, MariaDB Corporation. All Rights Reserved.
            +Copyright (c) 2017, 2018, MariaDB Corporation.
             
             This program is free software; you can redistribute it and/or modify it under
             the terms of the GNU General Public License as published by the Free Software
            @@ -334,7 +334,7 @@ buf_dump(
             					i + 1, srv_buf_pool_instances,
             					j + 1, n_pages);
             			}
            -			if ( (j % 1024) == 0) {
            +			if (SHUTTING_DOWN() && !(j % 1024)) {
             				service_manager_extend_timeout(INNODB_EXTEND_TIMEOUT_INTERVAL,
             					"Dumping buffer pool "
             					ULINTPF "/" ULINTPF ", "
            diff --git a/storage/xtradb/log/log0log.cc b/storage/xtradb/log/log0log.cc
            index 1420f5a3a12..c4dc89dc013 100644
            --- a/storage/xtradb/log/log0log.cc
            +++ b/storage/xtradb/log/log0log.cc
            @@ -1561,12 +1561,6 @@ log_write_up_to(
             		return;
             	}
             
            -	if (srv_shutdown_state != SRV_SHUTDOWN_NONE) {
            -		service_manager_extend_timeout(INNODB_EXTEND_TIMEOUT_INTERVAL,
            -					       "log write up to: " LSN_PF,
            -					       lsn);
            -	}
            -
             loop:
             	ut_ad(++loop_count < 100);
             
            @@ -1679,6 +1673,12 @@ log_write_up_to(
             	log_sys->buf_free += OS_FILE_LOG_BLOCK_SIZE;
             	log_sys->write_end_offset = log_sys->buf_free;
             
            +	if (UNIV_UNLIKELY(srv_shutdown_state != SRV_SHUTDOWN_NONE)) {
            +		service_manager_extend_timeout(INNODB_EXTEND_TIMEOUT_INTERVAL,
            +					       "log write up to: " LSN_PF,
            +					       lsn);
            +	}
            +
             	group = UT_LIST_GET_FIRST(log_sys->log_groups);
             
             	/* Do the write to the log files */
            

            Next, I am trying to test if srv_purge_should_exit() could emit the output outside a slow shutdown SET GLOBAL innodb_fast_shutdown=0. I am also trying to see how that output could be rate-limited.

            marko Marko Mäkelä added a comment - I checked all the InnoDB calls to service_manager_extend_timeout() , and in addition to the one in srv_purge_should_exit() , which actually looks like it should be executed on a slow shutdown only, there are calls in buf_dump() and log_write_up_to() that could be less frequent: diff --git a/storage/xtradb/buf/buf0dump.cc b/storage/xtradb/buf/buf0dump.cc index 90358d34b04..e9168d9f5d5 100644 --- a/storage/xtradb/buf/buf0dump.cc +++ b/storage/xtradb/buf/buf0dump.cc @@ -1,7 +1,7 @@ /***************************************************************************** Copyright (c) 2011, 2017, Oracle and/or its affiliates. All Rights Reserved. -Copyright (c) 2017, MariaDB Corporation. All Rights Reserved. +Copyright (c) 2017, 2018, MariaDB Corporation. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software @@ -334,7 +334,7 @@ buf_dump( i + 1, srv_buf_pool_instances, j + 1, n_pages); } - if ( (j % 1024) == 0) { + if (SHUTTING_DOWN() && !(j % 1024)) { service_manager_extend_timeout(INNODB_EXTEND_TIMEOUT_INTERVAL, "Dumping buffer pool " ULINTPF "/" ULINTPF ", " diff --git a/storage/xtradb/log/log0log.cc b/storage/xtradb/log/log0log.cc index 1420f5a3a12..c4dc89dc013 100644 --- a/storage/xtradb/log/log0log.cc +++ b/storage/xtradb/log/log0log.cc @@ -1561,12 +1561,6 @@ log_write_up_to( return; } - if (srv_shutdown_state != SRV_SHUTDOWN_NONE) { - service_manager_extend_timeout(INNODB_EXTEND_TIMEOUT_INTERVAL, - "log write up to: " LSN_PF, - lsn); - } - loop: ut_ad(++loop_count < 100); @@ -1679,6 +1673,12 @@ log_write_up_to( log_sys->buf_free += OS_FILE_LOG_BLOCK_SIZE; log_sys->write_end_offset = log_sys->buf_free; + if (UNIV_UNLIKELY(srv_shutdown_state != SRV_SHUTDOWN_NONE)) { + service_manager_extend_timeout(INNODB_EXTEND_TIMEOUT_INTERVAL, + "log write up to: " LSN_PF, + lsn); + } + group = UT_LIST_GET_FIRST(log_sys->log_groups); /* Do the write to the log files */ Next, I am trying to test if srv_purge_should_exit() could emit the output outside a slow shutdown SET GLOBAL innodb_fast_shutdown=0 . I am also trying to see how that output could be rate-limited.

            It turns out that in 10.3.6, as part of this merge the call to service_manager_extend_timeout() was moved to srv_do_purge(), and it is being executed also outside a slow shutdown (innodb_fast_shutdown=0).
            Currently, my fix is only present in 10.1 and merged to 10.2. When it is merged to 10.3 (before the 10.3.10 release), we will have to review and test this carefully.
            Thanks to yctn for pointing this out!

            marko Marko Mäkelä added a comment - It turns out that in 10.3.6, as part of this merge the call to service_manager_extend_timeout() was moved to srv_do_purge() , and it is being executed also outside a slow shutdown ( innodb_fast_shutdown=0 ). Currently, my fix is only present in 10.1 and merged to 10.2. When it is merged to 10.3 (before the 10.3.10 release), we will have to review and test this carefully. Thanks to yctn for pointing this out!

            People

              marko Marko Mäkelä
              marko Marko Mäkelä
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.