Details
Type: Task
Status: Closed
Priority: Critical
Resolution: Fixed
10.3.6-1
Description
MDEV-9202 and MDEV-8509 show cases where the systemd timeout isn't sufficient to perform initialization/shut-down.
Since https://github.com/systemd/systemd/commit/a327431bd168b2f327f3cd422379e213c643f2a5, released in systemd v236, a Type=notify service can advise the systemd service manager that it is still working, to avoid the service timing out.
Sending EXTEND_TIMEOUT_USEC= to an older systemd version has no effect, so the change is backward compatible.
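As a rough illustration of the mechanism described above, the following sketch builds an EXTEND_TIMEOUT_USEC= message and sends it as a datagram to the socket named by the NOTIFY_SOCKET environment variable, which is how the systemd notify protocol works without linking libsystemd. The function names extend_timeout_message() and notify_service_manager() are hypothetical and are not taken from the MariaDB source; the actual patch may well use sd_notify() instead.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

// Hypothetical helper: build the notify payload asking the service manager
// for `usec` more microseconds, optionally with a human-readable STATUS= line.
std::string extend_timeout_message(unsigned long long usec,
                                   const std::string& status)
{
  std::string msg = "EXTEND_TIMEOUT_USEC=" + std::to_string(usec);
  if (!status.empty())
    msg += "\nSTATUS=" + status;
  return msg;
}

// Hypothetical helper: send `msg` to the socket named by $NOTIFY_SOCKET.
// Returns false when not started by systemd (variable unset) or on any error.
bool notify_service_manager(const std::string& msg)
{
  const char* path = std::getenv("NOTIFY_SOCKET");
  if (!path || (path[0] != '/' && path[0] != '@') || path[1] == '\0')
    return false;

  sockaddr_un addr;
  std::memset(&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;
  const size_t len = std::strlen(path);
  if (len >= sizeof addr.sun_path)
    return false;
  std::memcpy(addr.sun_path, path, len);
  if (path[0] == '@')
    addr.sun_path[0] = '\0';  // a leading '@' denotes an abstract socket

  const int fd = socket(AF_UNIX, SOCK_DGRAM | SOCK_CLOEXEC, 0);
  if (fd < 0)
    return false;
  const socklen_t addrlen =
      static_cast<socklen_t>(offsetof(sockaddr_un, sun_path) + len);
  const ssize_t n = sendto(fd, msg.data(), msg.size(), MSG_NOSIGNAL,
                           reinterpret_cast<const sockaddr*>(&addr), addrlen);
  close(fd);
  return n == static_cast<ssize_t>(msg.size());
}
```

A caller in a long-running phase could invoke notify_service_manager(extend_timeout_message(30000000ULL, "recovering")) periodically. Outside systemd the call simply returns false, so no special-casing is needed.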
This needs to be included in (feel free to correct/extend):
- buffer pool dump - buf_dump (storage/innobase/buf/buf0dump.cc)
- redo log recovery - log_group_read_log_seg (storage/innobase/log/log0log.cc)
- undo(?) recovery - recv_recover_page_func (storage/innobase/log/log0recv.cc)
- change buffer?
- merge buffer?
I was planning to generalize the 15-second interval of recv_sys_t->report() and use a define INNODB_REPORT_INTERVAL (include/univ.i?) as the basis for this form of watchdog. Each notify message would extend the timeout by INNODB_REPORT_INTERVAL * 2 as an acceptable margin.
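To make the proposed interval logic concrete, here is a minimal sketch of a rate-limited watchdog. It assumes INNODB_REPORT_INTERVAL is 15 seconds (the existing report interval mentioned above); the struct progress_watchdog and its due() method are hypothetical names for illustration, not the code that was eventually committed.

```cpp
#include <ctime>

// Assumed value: the 15 seconds currently used by recv_sys_t->report().
static const unsigned INNODB_REPORT_INTERVAL = 15;  // seconds

// Hypothetical helper that decides when a long-running loop should report
// progress and how large a timeout extension it should request.
struct progress_watchdog
{
  bool   reported = false;
  time_t last_report = 0;

  // Returns the extension to request in microseconds (twice the report
  // interval, as the margin proposed above), or 0 if no report is due yet.
  unsigned long long due(time_t now)
  {
    if (reported &&
        now - last_report < static_cast<time_t>(INNODB_REPORT_INTERVAL))
      return 0;
    reported = true;
    last_report = now;
    return 2ULL * INNODB_REPORT_INTERVAL * 1000000ULL;
  }
};
```

A loop such as buffer pool loading or redo log apply would call due(time(nullptr)) on each iteration and, on a non-zero result, send that value as EXTEND_TIMEOUT_USEC= together with its progress message. Because each message grants twice the interval, one delayed report does not let the service time out.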
Anywhere else or other suggestions marko, jplindst?
- galera SST scripts - donor and recipient
Any other server/engine slow points to account for?
Target 10.3 and then look at a backport?
Attachments
Issue Links
- causes
  - MDEV-16149 Failing assertion: node->modification_counter == node->flush_counter with innodb_flush_method=O_DSYNC (Closed)
  - MDEV-16150 Mariadb 10.3.6 Failing assertion on Docker (Closed)
  - MDEV-17003 service_manager_extend_timeout() being called too often (Closed)
- relates to
  - MDEV-11027 InnoDB log recovery is too noisy (Closed)
  - MDEV-12323 Rollback progress log messages during crash recovery are intermixed with unrelated log messages (Closed)
  - MDEV-12352 InnoDB shutdown should not be blocked by a large transaction rollback (Closed)
  - MDEV-15554 InnoDB page_cleaner shutdown sometimes hangs (Closed)
  - MDEV-15832 With innodb_fast_shutdown=3, skip the rollback of connected transactions (Closed)
  - MDEV-17571 Make systemd timeout behavior more compatible with long Galera SSTs (Closed)
  - MDEV-17934 Make systemd timeout behavior more compatible with longer Galera recovery times (Closed)
  - MDEV-18224 MTR's internal check of the test case 'innodb.recovery_shutdown' failed due to extra #sql-ib*.ibd files (Closed)
  - MDEV-9202 Systemd timeout is not sufficient for larger servers (Closed)
  - MDEV-11035 Restore removed disallow-writes for Galera (Closed)
  - MDEV-15606 Galera can't perform SST in 10.2.13 if systemd in use due to timeout at startup (Closed)
  - MDEV-15607 mysqld crashed few after node is being joined with sst (Closed)
  - MDEV-17571 Make systemd timeout behavior more compatible with long Galera SSTs (Closed)
Activity
Fix Version/s | 10.2 [ 14601 ] | |
Fix Version/s | 10.1 [ 16100 ] |
Labels | contribution foundation patch |
Assignee | Marko Mäkelä [ marko ] |
Priority | Major [ 3 ] | Critical [ 2 ] |
Sprint | 10.1.32 [ 235 ] |
Status | Open [ 1 ] | In Progress [ 3 ] |
Sprint | 10.1.32 [ 235 ] | 10.3.6 [ 237 ] |
issue.field.resolutiondate | 2018-04-06 07:27:36.0 | 2018-04-06 07:27:36.08 |
Fix Version/s | 10.1.33 [ 22909 ] | |
Fix Version/s | 10.2.15 [ 23006 ] | |
Fix Version/s | 10.3.6 [ 23003 ] | |
Fix Version/s | 10.2 [ 14601 ] | |
Fix Version/s | 10.1 [ 16100 ] | |
Fix Version/s | 10.3 [ 22126 ] | |
Resolution | Fixed [ 1 ] | |
Status | In Progress [ 3 ] | Closed [ 6 ] |
Workflow | MariaDB v3 [ 84505 ] | MariaDB v4 [ 133422 ] |
Overdue PR.