[MDEV-24609] innodb_io_capacity can exceed innodb_io_capacity_max Created: 2021-01-18  Updated: 2021-01-19  Resolved: 2021-01-19

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB, Variables
Affects Version/s: 10.0.15, 10.1.10, 10.2.0, 10.3.0, 10.4.0, 10.5.0
Fix Version/s: 10.2.37, 10.3.28, 10.4.18, 10.5.9

Type: Bug Priority: Major
Reporter: Roel Van de Paar Assignee: Marko Mäkelä
Resolution: Fixed Votes: 0
Labels: debug, regression

Issue Links:
Problem/Incident
is caused by MDEV-7035 Remove innodb_io_capacity setting dep... Closed

 Description   

SET GLOBAL innodb_adaptive_flushing_lwm=0.0;
CREATE TABLE t (c DOUBLE) ENGINE=InnoDB;
SET GLOBAL innodb_io_capacity=18446744073709551615;
SELECT SLEEP (3);

Leads to:

10.6.0 9118fd360a3da0bba521caf2a35c424968235ac4 (Debug)

mysqld: /test/10.6_dbg/storage/innobase/buf/buf0flu.cc:1890: ulint af_get_pct_for_lsn(lsn_t): Assertion `srv_max_io_capacity >= srv_io_capacity' failed.

10.6.0 9118fd360a3da0bba521caf2a35c424968235ac4 (Debug)

Core was generated by `/test/MD010121-mariadb-10.6.0-linux-x86_64-dbg/bin/mysqld --no-defaults --core-'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6)
    at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
[Current thread is 1 (Thread 0x14f13c652700 (LWP 1553990))]
(gdb) bt
#0  __pthread_kill (threadid=<optimized out>, signo=signo@entry=6) at ../sysdeps/unix/sysv/linux/pthread_kill.c:56
#1  0x000055ee02c440d7 in my_write_core (sig=sig@entry=6) at /test/10.6_dbg/mysys/stacktrace.c:424
#2  0x000055ee023d8ab1 in handle_fatal_signal (sig=6) at /test/10.6_dbg/sql/signal_handler.cc:330
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x000014f145c21859 in __GI_abort () at abort.c:79
#6  0x000014f145c21729 in __assert_fail_base (fmt=0x14f145db7588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x55ee03064238 "srv_max_io_capacity >= srv_io_capacity", file=0x55ee03063238 "/test/10.6_dbg/storage/innobase/buf/buf0flu.cc", line=1890, function=<optimized out>) at assert.c:92
#7  0x000014f145c32f36 in __GI___assert_fail (assertion=assertion@entry=0x55ee03064238 "srv_max_io_capacity >= srv_io_capacity", file=file@entry=0x55ee03063238 "/test/10.6_dbg/storage/innobase/buf/buf0flu.cc", line=line@entry=1890, function=function@entry=0x55ee03064260 "ulint af_get_pct_for_lsn(lsn_t)") at assert.c:101
#8  0x000055ee02ac2529 in af_get_pct_for_lsn (age=1616) at /test/10.6_dbg/storage/innobase/buf/buf0flu.cc:1890
#9  page_cleaner_flush_pages_recommendation (dirty_pct=0.21037000371241182, dirty_blocks=17, oldest_lsn=43230, last_pages_in=0) at /test/10.6_dbg/storage/innobase/buf/buf0flu.cc:1921
#10 buf_flush_page_cleaner () at /test/10.6_dbg/storage/innobase/buf/buf0flu.cc:2170
#11 0x000014f14612f609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#12 0x000014f145d1e293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Bug confirmed present in:
MariaDB: 10.5.9 (dbg), 10.6.0 (dbg)

Bug (or feature/syntax) confirmed not present in:
MariaDB: 10.2.37 (dbg), 10.2.37 (opt), 10.3.28 (dbg), 10.3.28 (opt), 10.4.18 (dbg), 10.4.18 (opt), 10.5.9 (opt), 10.6.0 (opt)
MySQL: 5.5.62 (dbg), 5.5.62 (opt), 5.6.50 (dbg), 5.6.50 (opt), 5.7.32 (dbg), 5.7.32 (opt), 8.0.22 (dbg), 8.0.22 (opt)

10.4 does not crash at all on the same test case.

There is also this:

10.4.18 3454b5cf35a61e8f6cfab376638520dee4a50609 (Debug)

10.4.18>SET GLOBAL innodb_io_capacity=18446744073709551615;
Query OK, 0 rows affected, 2 warnings (0.000 sec)
 
10.4.18>SHOW WARNINGS;
+---------+------+------------------------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                                    |
+---------+------+------------------------------------------------------------------------------------------------------------+
| Warning | 1210 | Setting innodb_io_capacity to 18446744073709551615 higher than innodb_io_capacity_max 18446744073709551614 |
| Warning | 1210 | Setting innodb_max_io_capacity to 18446744073709551614                                                     |
+---------+------+------------------------------------------------------------------------------------------------------------+
2 rows in set (0.000 sec)
 
10.4.18>SELECT @@GLOBAL.innodb_max_io_capacity;
ERROR 1193 (HY000): Unknown system variable 'innodb_max_io_capacity'
10.4.18>SELECT @@GLOBAL.innodb_io_capacity;
+-----------------------------+
| @@GLOBAL.innodb_io_capacity |
+-----------------------------+
|        18446744073709551615 |
+-----------------------------+
1 row in set (0.000 sec)

I am not sure whether that is a separate off-by-one bug, a minor functionality bug, or neither.
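The value 18446744073709551614 in the second warning is what unsigned wrap-around would produce if the requested value were simply doubled. A minimal sketch of that arithmetic follows, assuming a 64-bit unsigned type; `doubled_unchecked` and `kMax` are hypothetical names for illustration, not server identifiers.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest 64-bit unsigned value, i.e. 18446744073709551615.
constexpr std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();

// Hypothetical stand-in for an unchecked `in_val * 2`.
// Unsigned multiplication wraps modulo 2^64 instead of failing.
constexpr std::uint64_t doubled_unchecked(std::uint64_t in_val) {
    return in_val * 2;
}

// (2^64 - 1) * 2 = 2^65 - 2, which is 2^64 - 2 modulo 2^64.
static_assert(doubled_unchecked(kMax) == 18446744073709551614ULL,
              "doubling the maximum wraps to maximum - 1");
static_assert(doubled_unchecked(kMax) < kMax,
              "the doubled value ends up below the input");
```

Under this reading the value one below the maximum comes from the wrap-around rather than from a separate off-by-one.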

Also see MDEV-7035.



 Comments   
Comment by Marko Mäkelä [ 2021-01-18 ]

Roel, I cannot repeat the assertion failure on my system, possibly because the version that you tested predates the MDEV-24537 fix.
I did try to compensate for that, but the latest 10.5 still did not crash for me:

--source include/have_innodb.inc
SET GLOBAL innodb_adaptive_flushing_lwm=0.0;
SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.000001;
CREATE TABLE t (c DOUBLE) ENGINE=InnoDB;
SET GLOBAL innodb_io_capacity=18446744073709551615;
SHOW WARNINGS;
SELECT @@innodb_io_capacity;
SELECT @@innodb_io_capacity_max;
SELECT SLEEP (3);

That said, can you please test the following fix?

diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index dd52fee3f9b..3da5840b357 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -17151,12 +17151,17 @@ innodb_io_capacity_update(
 				    " higher than innodb_io_capacity_max %lu",
 				    in_val, srv_max_io_capacity);
 
-		srv_max_io_capacity = in_val * 2;
+		if (in_val >= ((~0UL) & (~0UL >> 1))) {
+			in_val = srv_max_io_capacity;
+		} else {
+			srv_max_io_capacity = in_val * 2;
 
-		push_warning_printf(thd, Sql_condition::WARN_LEVEL_WARN,
-				    ER_WRONG_ARGUMENTS,
-				    "Setting innodb_max_io_capacity to %lu",
-				    srv_max_io_capacity);
+			push_warning_printf(thd,
+					    Sql_condition::WARN_LEVEL_WARN,
+					    ER_WRONG_ARGUMENTS,
+					    "Setting innodb_max_io_capacity"
+					    " to %lu", srv_max_io_capacity);
+		}
 	}
 
 	srv_io_capacity = in_val;

This seems to be caused by MDEV-7035.
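The behaviour of the patch above can be sketched outside the server as follows. This is an illustration only: `IoCapacity` and `update_first_proposal` are hypothetical names, the 64-bit type is an assumption, and the outer `in_val > current_max` check is inferred from the warning text rather than shown in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pair standing in for srv_io_capacity and srv_max_io_capacity.
struct IoCapacity {
    std::uint64_t io;
    std::uint64_t max;
};

// Sketch of the first proposed patch: a request at or above half the
// value range is clamped down to the current maximum; a smaller request
// raises the maximum to twice the requested value.
inline IoCapacity update_first_proposal(std::uint64_t in_val,
                                        std::uint64_t current_max) {
    // Same expression shape as in the diff; it evaluates to 2^63 - 1 here.
    const std::uint64_t threshold = (~0ULL) & (~0ULL >> 1);
    if (in_val > current_max) {
        if (in_val >= threshold) {
            in_val = current_max;      // clamp the request
        } else {
            current_max = in_val * 2;  // cannot wrap below the threshold
        }
    }
    return IoCapacity{in_val, current_max};
}
```

With a current maximum of 2000, a request of 18446744073709551615 would end up as 2000 under this variant, whereas a request of 5000 would raise the maximum to 10000.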

Comment by Roel Van de Paar [ 2021-01-19 ]

Testing with the latest version plus an alternative patch, as per discussion with Marko on Slack:

diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index 0dea402b32b..739d9ed1d64 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -17532,7 +17532,8 @@ innodb_io_capacity_update(
 				    " higher than innodb_io_capacity_max %lu",
 				    in_val, srv_max_io_capacity);
 
-		srv_max_io_capacity = in_val * 2;
+		srv_max_io_capacity = in_val >= ~0UL / 2
+			? in_val : in_val * 2;
 
 		push_warning_printf(thd, Sql_condition::WARN_LEVEL_WARN,
 				    ER_WRONG_ARGUMENTS,

Comment by Marko Mäkelä [ 2021-01-19 ]

I think that this is simply caused by MDEV-7035, and demonstrated by a debug assertion that was added much later. We could apply an even simpler fix to prevent the maximum value (ulong)-1 from being converted to one smaller ((ulong)-2) by the multiplication:

diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index 0dea402b32b..739d9ed1d64 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -17532,7 +17532,8 @@ innodb_io_capacity_update(
 				    " higher than innodb_io_capacity_max %lu",
 				    in_val, srv_max_io_capacity);
 
-		srv_max_io_capacity = in_val * 2;
+		srv_max_io_capacity = in_val >= ~0UL / 2
+			? in_val : in_val * 2;
 
 		push_warning_printf(thd, Sql_condition::WARN_LEVEL_WARN,
 				    ER_WRONG_ARGUMENTS,
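The ternary in this patch can be checked in isolation at the boundary values. The sketch below assumes a 64-bit unsigned type; `next_io_capacity_max` and `kMax` are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

constexpr std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();

// Same expression shape as the patch: double the requested value only
// when it is below half the range, otherwise reuse it unchanged.
constexpr std::uint64_t next_io_capacity_max(std::uint64_t in_val) {
    return in_val >= kMax / 2 ? in_val : in_val * 2;
}

// Small values are doubled as before.
static_assert(next_io_capacity_max(1000) == 2000, "ordinary doubling");
// The largest value that is still doubled does not wrap.
static_assert(next_io_capacity_max(kMax / 2 - 1) == kMax - 3, "no wrap");
// At and above the threshold the result equals the input.
static_assert(next_io_capacity_max(kMax / 2) == kMax / 2, "threshold");
static_assert(next_io_capacity_max(kMax) == kMax, "maximum is preserved");
```

In each case the result is at least as large as the input, which is the relation that the `srv_max_io_capacity >= srv_io_capacity` assertion checks.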

Comment by Roel Van de Paar [ 2021-01-19 ]

Confirmed that the issue is not reproducible against later versions (tested 9a08fcbf60567992971262ececee8d8429c20756); as per Marko, this is due to the MDEV-24537 fix. The previously used version was 9118fd360a3da0bba521caf2a35c424968235ac4, which I am now testing with the fix.

Comment by Roel Van de Paar [ 2021-01-19 ]

Also confirmed that the test case by @marko triggers the same issue against a non-patched build (at the CLI).

10.6.0 9118fd360a3da0bba521caf2a35c424968235ac4 (Debug)

10.6.0>SET GLOBAL innodb_adaptive_flushing_lwm=0.0;
Query OK, 0 rows affected (0.000 sec)
 
10.6.0>SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.000001;
Query OK, 0 rows affected (0.000 sec)
 
10.6.0>CREATE TABLE t (c DOUBLE) ENGINE=InnoDB;
Query OK, 0 rows affected (0.013 sec)
 
10.6.0>SET GLOBAL innodb_io_capacity=18446744073709551615;
Query OK, 0 rows affected, 2 warnings (0.000 sec)
 
10.6.0>SHOW WARNINGS;
+---------+------+--------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                    |
+---------+------+--------------------------------------------------------------------------------------------+
| Warning | 1210 | Setting innodb_io_capacity to 18446744073709551615 higher than innodb_io_capacity_max 2000 |
| Warning | 1210 | Setting innodb_max_io_capacity to 18446744073709551614                                     |
+---------+------+--------------------------------------------------------------------------------------------+
2 rows in set (0.000 sec)
 
10.6.0>SELECT @@innodb_io_capacity;
+----------------------+
| @@innodb_io_capacity |
+----------------------+
| 18446744073709551615 |
+----------------------+
1 row in set (0.000 sec)
 
10.6.0>SELECT @@innodb_io_capacity_max;
+--------------------------+
| @@innodb_io_capacity_max |
+--------------------------+
|     18446744073709551614 |
+--------------------------+
1 row in set (0.000 sec)
 
10.6.0>SELECT SLEEP (3);
ERROR 2013 (HY000): Lost connection to MySQL server during query

Comment by Roel Van de Paar [ 2021-01-19 ]

The shorter (last) patch is confirmed to be working. Tested with both test cases against the same version.

10.6.0 9118fd360a3da0bba521caf2a35c424968235ac4 (Debug)

10.6.0>SET GLOBAL innodb_adaptive_flushing_lwm=0.0;
Query OK, 0 rows affected (0.000 sec)
 
10.6.0>CREATE TABLE t (c DOUBLE) ENGINE=InnoDB;
Query OK, 0 rows affected (0.011 sec)
 
10.6.0>SET GLOBAL innodb_io_capacity=18446744073709551615;
Query OK, 0 rows affected, 2 warnings (0.000 sec)
 
10.6.0>SELECT SLEEP (3);
 
+-----------+
| SLEEP (3) |
+-----------+
|         0 |
+-----------+
1 row in set (3.000 sec)
 
10.6.0>

10.6.0 9118fd360a3da0bba521caf2a35c424968235ac4 (Debug)

10.6.0>SET GLOBAL innodb_adaptive_flushing_lwm=0.0;
Query OK, 0 rows affected (0.000 sec)
 
10.6.0>SET GLOBAL innodb_max_dirty_pages_pct_lwm=0.000001;
Query OK, 0 rows affected (0.000 sec)
 
10.6.0>CREATE TABLE t (c DOUBLE) ENGINE=InnoDB;
Query OK, 0 rows affected (0.009 sec)
 
10.6.0>SET GLOBAL innodb_io_capacity=18446744073709551615;
Query OK, 0 rows affected, 2 warnings (0.000 sec)
 
10.6.0>SHOW WARNINGS;
+---------+------+--------------------------------------------------------------------------------------------+
| Level   | Code | Message                                                                                    |
+---------+------+--------------------------------------------------------------------------------------------+
| Warning | 1210 | Setting innodb_io_capacity to 18446744073709551615 higher than innodb_io_capacity_max 2000 |
| Warning | 1210 | Setting innodb_max_io_capacity to 18446744073709551615                                     |
+---------+------+--------------------------------------------------------------------------------------------+
2 rows in set (0.000 sec)
 
10.6.0>SELECT @@innodb_io_capacity;
+----------------------+
| @@innodb_io_capacity |
+----------------------+
| 18446744073709551615 |
+----------------------+
1 row in set (0.000 sec)
 
10.6.0>SELECT @@innodb_io_capacity_max;
+--------------------------+
| @@innodb_io_capacity_max |
+--------------------------+
|     18446744073709551615 |
+--------------------------+
1 row in set (0.000 sec)
 
10.6.0>SELECT SLEEP (3);
 
+-----------+
| SLEEP (3) |
+-----------+
|         0 |
+-----------+
1 row in set (3.000 sec)
 
10.6.0>

Comment by Marko Mäkelä [ 2021-01-19 ]

If innodb_io_capacity is requested to be larger than innodb_io_capacity_max and more than half the maximum value of ulong, we must not try to compute innodb_io_capacity_max=2*innodb_io_capacity because that may lead to innodb_io_capacity_max being less than innodb_io_capacity.

(It makes little sense for the parameter to be that big. Even with 32-bit uint, it will take a while until some I/O system can write more than innodb_page_size<<32 bytes per second. But I will not touch that in this bug fix.)
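To put the innodb_page_size<<32 figure into numbers, here is a small worked example. It assumes the default 16 KiB page size and reads the capacity as pages written per second, as the arithmetic above implies; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Bytes per second when 2^32 pages of the given size are written each second.
constexpr std::uint64_t bytes_per_second(std::uint64_t page_size) {
    return page_size << 32;
}

// Convert a byte count to whole tebibytes (1 TiB = 2^40 bytes).
constexpr std::uint64_t to_tebibytes(std::uint64_t bytes) {
    return bytes >> 40;
}

// 16 KiB pages: 2^14 * 2^32 = 2^46 bytes, i.e. 64 TiB per second.
static_assert(bytes_per_second(16384) == (1ULL << 46), "2^46 bytes");
static_assert(to_tebibytes(bytes_per_second(16384)) == 64, "64 TiB/s");
// 4 KiB pages would still mean 16 TiB per second.
static_assert(to_tebibytes(bytes_per_second(4096)) == 16, "16 TiB/s");
```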
