MariaDB Server / MDEV-36149

UBSAN in X is outside the range of representable values of type 'unsigned long' | page_cleaner_flush_pages_recommendation

Details

    Description

      --source include/have_innodb.inc 
       
      CREATE OR REPLACE TABLE t (a INT) ENGINE=INNODB;
      SELECT @@innodb_io_capacity;
      SELECT @@innodb_io_capacity_max;
      SET GLOBAL innodb_io_capacity=18446744073709551615;
      SET GLOBAL innodb_max_dirty_pages_pct=1;
      CREATE OR REPLACE TABLE t (a INT) ENGINE=INNODB;
      CREATE OR REPLACE TABLE t (a INT) ENGINE=INNODB;
      CREATE OR REPLACE TABLE t (a INT) ENGINE=INNODB;
      CREATE OR REPLACE TABLE t (a INT) ENGINE=INNODB;
      CREATE OR REPLACE TABLE t (a INT) ENGINE=INNODB;
      CREATE OR REPLACE TABLE t (a INT) ENGINE=INNODB;
      

      Leads to:

      CS 11.8.1 6f1161aa34cbb178b00fc24cbc46e2e0e2af767a (Optimized, UBASAN, Clang) Build 24/02/2025

      /test/11.8_opt_san/storage/innobase/buf/buf0flu.cc:2302:19: runtime error: 1.85291e+19 is outside the range of representable values of type 'unsigned long'
          #0 0x557d0b95b4b3 in page_cleaner_flush_pages_recommendation(unsigned long, unsigned long, double, unsigned long, double) /test/11.8_opt_san/storage/innobase/buf/buf0flu.cc:2302:19
          #1 0x557d0b95b4b3 in buf_flush_page_cleaner() /test/11.8_opt_san/storage/innobase/buf/buf0flu.cc:2619:18
          #2 0x14ffc48ecdb3 in execute_native_thread_routine /build/gcc-14-ig5ci0/gcc-14-14.2.0/build/x86_64-linux-gnu/libstdc++-v3/src/c++11/../../../../../src/libstdc++-v3/src/c++11/thread.cc:104:18
          #3 0x557d0930695c in asan_thread_start(void*) asan_interceptors.cpp.o
          #4 0x14ffc449caa3 in start_thread nptl/pthread_create.c:447:8
          #5 0x14ffc4529c3b in clone3 misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
       
      SUMMARY: UndefinedBehaviorSanitizer: float-cast-overflow /test/11.8_opt_san/storage/innobase/buf/buf0flu.cc:2302:19 
      

      Setup:

      Compiled with a recent version of Clang (I used Clang 18.1.3) with LLVM 18. Ubuntu instructions:
        # Note: It is strongly recommended to uninstall all old Clang & LLVM packages (ref  dpkg --list | grep -iE 'clang|llvm'  and use  apt purge  and  dpkg --purge  to remove the packages), before following these steps
           # Note: llvm-17-linker-tools installs /usr/lib/llvm-17/lib/LLVMgold.so, which is needed for compilation, and LLVMgold.so is no longer included in LLVM 18
           sudo apt install clang llvm-18 llvm-18-linker-tools llvm-18-runtime llvm-18-tools llvm-18-dev libstdc++-14-dev llvm-dev llvm-17-linker-tools
           sudo ln -s /usr/lib/llvm-17/lib/LLVMgold.so /usr/lib/llvm-18/lib/LLVMgold.so
      Compiled with: "-DCMAKE_C_COMPILER=/usr/bin/clang -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DCMAKE_C{,XX}_FLAGS='-march=native -mtune=native'" and:
          -DWITH_ASAN=ON -DWITH_ASAN_SCOPE=ON -DWITH_UBSAN=ON -DWSREP_LIB_WITH_ASAN=ON
      Set before execution:
          export UBSAN_OPTIONS=print_stacktrace=1:report_error_type=1   # And you may also want to suppress UBSAN startup issues using 'suppressions=UBSAN.filter' in UBSAN_OPTIONS. For an example of UBSAN.filter, which includes current startup issues, see: https://github.com/mariadb-corporation/mariadb-qa/blob/master/UBSAN.filter
      

      Bug confirmed present in:
      MariaDB: 11.4.6 (opt), 11.8.1 (opt), 12.0.0 (opt)

      Bug (or feature/syntax) confirmed not present in:
      MariaDB: 10.11.12 (dbg), 10.11.12 (opt), 12.0.0 (dbg)

          Activity

            The statement on which the error is reported is as follows:

            n_pages = ulint(dirty_pct * double(srv_io_capacity));

            I think that we should consider making innodb_io_capacity a 32-bit parameter. There should be no way to write more than 4 billion pages per second.

            marko Marko Mäkelä added a comment
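            Note (a minimal standalone sketch, not from the report or the patch): per the C++ rules for floating-integral conversions, casting a double to an integer type is undefined behaviour when the truncated value cannot be represented in the destination type, which UBSAN reports as float-cast-overflow. Also, double(~0ULL) rounds up to 2^64, so any dirty_pct above 1.0 pushes the product out of range:

            #include <cstdio>

            int main() {
              // double cannot represent 2^64 - 1 exactly; it rounds up to 2^64.
              unsigned long long cap = 18446744073709551615ULL;  // the reported setting
              double dcap = double(cap);
              std::printf("%.1f\n", dcap);                       // 18446744073709551616.0

              // Converting any double >= 2^64 (or any negative value) to a 64-bit
              // unsigned integer is undefined behaviour; UBSAN flags it as
              // float-cast-overflow. The statement above does exactly that once
              // dirty_pct exceeds 1.0:
              // unsigned long n_pages = (unsigned long)(1.01 * dcap);  // would be UB
              return 0;
            }

            With a 32-bit innodb_io_capacity, the product stays far below 2^64 even for dirty_pct values well above 1.0.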

            ramesh, can you please provide an rr replay trace of this? I tried compiling 10.11 and 11.4 with GCC 11.4.2 and clang 19.1.7, but I could not reproduce this. I think that this should be reproducible already in 10.5 or 10.6, but will likely depend on the exact timing. I find it a little strange if 1 * 18446744073709551615 is being evaluated as 1.85291e+19, with only 2 correct significant digits.

            marko Marko Mäkelä added a comment

            Was able to reproduce on 12.0 opt

            CS 12.0.0 c92add291e636c797e6d6ddca605905541b2a441 (Optimized, UBASAN, Clang) Build 15/02/2025

            /test/12.0_opt_san/storage/innobase/buf/buf0flu.cc:2302:19: runtime error: 1.85291e+19 is outside the range of representable values of type 'unsigned long'
            2025-02-25 15:51:22 0 [Note] /test/UBASAN_MD150225-mariadb-12.0.0-linux-x86_64-opt/bin/mariadbd (initiated by: root[root] @ localhost []): Normal shutdown
                #0 0x55ad64c205a3 in page_cleaner_flush_pages_recommendation(unsigned long, unsigned long, double, unsigned long, double) /test/12.0_opt_san/storage/innobase/buf/buf0flu.cc:2302:19
                #1 0x55ad64c205a3 in buf_flush_page_cleaner() /test/12.0_opt_san/storage/innobase/buf/buf0flu.cc:2619:18
                #2 0x151628eeabb3 in execute_native_thread_routine /build/gcc-14-OQFzmN/gcc-14-14-20240412/build/x86_64-linux-gnu/libstdc++-v3/src/c++11/../../../../../src/libstdc++-v3/src/c++11/thread.cc:104:18
                #3 0x55ad625cc99c in asan_thread_start(void*) asan_interceptors.cpp.o
                #4 0x151628a9ca93 in start_thread nptl/pthread_create.c:447:8
                #5 0x151628b29c3b in clone3 misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
             
            SUMMARY: UndefinedBehaviorSanitizer: float-cast-overflow /test/12.0_opt_san/storage/innobase/buf/buf0flu.cc:2302:19 
            

            If it helps:

            $ cat BUILD_CMD_CMAKE
            cmake . -DCMAKE_C_COMPILER=/usr/bin/clang -DCMAKE_CXX_COMPILER=/usr/bin/clang++ -DWITH_SSL=bundled -DBUILD_CONFIG=mysql_release -DWITH_TOKUDB=0 -DWITH_JEMALLOC=no -DFEATURE_SET=community -DDEBUG_EXTNAME=OFF -DWITH_EMBEDDED_SERVER=0 -DENABLE_DOWNLOADS=1 -DDOWNLOAD_BOOST=1 -DWITH_BOOST=/tmp/boost_552181 -DENABLED_LOCAL_INFILE=1 -DENABLE_DTRACE=0 -DWITH_{SAFEMALLOC,NUMA}=OFF -DWITH_UNIT_TESTS=OFF -DCONC_WITH_{UNITTEST,SSL}=OFF -DPLUGIN_PERFSCHEMA=NO -DWITH_DBUG_TRACE=OFF -DWITH_ZLIB=bundled -DWITH_ROCKSDB=1 -DWITH_PAM=ON -DWITH_MARIABACKUP=0 -DFORCE_INSOURCE_BUILD=1 -DWITH_ASAN=ON -DWITH_ASAN_SCOPE=ON -DWITH_UBSAN=ON -DWSREP_LIB_WITH_ASAN=ON -DCMAKE_C{,XX}_FLAGS='-O2 -march=native -mtune=native' -DMYSQL_MAINTAINER_MODE=OFF -DWARNING_AS_ERROR='' -DCMAKE_BUILD_TYPE=RelWithDebInfo
            $ clang --version
            Ubuntu clang version 18.1.3 (1)
            

            Roel Roel Van de Paar added a comment

            marko rr trace saved on the galapq server, trace location: /test/12.0/mariadb-12.0.0-linux-x86_64/rr

            ramesh Ramesh Sivaraman added a comment

            We couldn’t produce a working rr replay trace of this. I am able to reproduce this under GDB on the executable produced by ramesh, in the same environment. I assume that having a suitably slow storage is a prerequisite for hitting this.

            Thread 11 "page_cleaner" hit Breakpoint 1, 0x0000559e3d7d1370 in __ubsan::ScopedReport::ScopedReport(__ubsan::ReportOptions, __ubsan::Location, __ubsan::ErrorType) ()
            (gdb) backtrace
            #0  0x0000559e3d7d1370 in __ubsan::ScopedReport::ScopedReport(__ubsan::ReportOptions, __ubsan::Location, __ubsan::ErrorType) ()
            #1  0x0000559e3d7d3eeb in handleFloatCastOverflow(void*, unsigned long, __ubsan::ReportOptions) ()
            #2  0x0000559e3d7d3dbe in __ubsan_handle_float_cast_overflow ()
            #3  0x0000559e3fde9fc4 in page_cleaner_flush_pages_recommendation (last_pages_in=0, oldest_lsn=47629, pct_lwm=<optimized out>, dirty_blocks=81, dirty_pct=<optimized out>) at /test/12.0/storage/innobase/buf/buf0flu.cc:2302
            #4  buf_flush_page_cleaner () at /test/12.0/storage/innobase/buf/buf0flu.cc:2619
            

            The value of dirty_pct is not available in any stack frame, probably because it is in a SIMD register.

               0x0000559e3fde8a65 <+6189>:	vmulsd %xmm1,%xmm0,%xmm1
               0x0000559e3fde8a69 <+6193>:	vucomisd 0xbb296f(%rip),%xmm1        # 0x559e4099b3e0
               0x0000559e3fde8a71 <+6201>:	jbe    0x559e3fde9fae <_ZL22buf_flush_page_cleanerv+11638>
               0x0000559e3fde8a77 <+6207>:	vmovsd 0xbb2969(%rip),%xmm0        # 0x559e4099b3e8
               0x0000559e3fde8a7f <+6215>:	vucomisd %xmm1,%xmm0
               0x0000559e3fde8a83 <+6219>:	jbe    0x559e3fde9fae <_ZL22buf_flush_page_cleanerv+11638>
            

            The conditional jump target is what would report the UBSAN error. I am not sure if we have the correct values of the SIMD registers at this point of execution:

            (gdb) info register xmm0
            xmm0           {v8_bfloat16 = {0x0, 0x0, 0x0, 0x0, 0x9fc4, 0x3fde, 0x559e, 0x0}, v8_half = {0x0, 0x0, 0x0, 0x0, 0x9fc4, 0x3fde, 0x559e, 0x0}, v4_float = {0x0, 0x0, 0x3fde9fc4, 0x559e}, v2_double = {0x0, 0x559e3fde9fc4}, v16_int8 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xc4, 0x9f, 0xde, 0x3f, 0x9e, 0x55, 0x0, 0x0}, v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x9fc4, 0x3fde, 0x559e, 0x0}, v4_int32 = {0x0, 0x0, 0x3fde9fc4, 0x559e}, v2_int64 = {0x0, 0x559e3fde9fc4}, uint128 = 0x559e3fde9fc40000000000000000}
            (gdb) info register xmm1
            xmm1           {v8_bfloat16 = {0x8fe, 0x0, 0x13, 0x0, 0xfcb0, 0x313f, 0x1531, 0x0}, v8_half = {0x8fe, 0x0, 0x13, 0x0, 0xfcb0, 0x313f, 0x1531, 0x0}, v4_float = {0x8fe, 0x13, 0x313ffcb0, 0x1531}, v2_double = {0x13000008fe, 0x1531313ffcb0}, v16_int8 = {0xfe, 0x8, 0x0, 0x0, 0x13, 0x0, 0x0, 0x0, 0xb0, 0xfc, 0x3f, 0x31, 0x31, 0x15, 0x0, 0x0}, v8_int16 = {0x8fe, 0x0, 0x13, 0x0, 0xfcb0, 0x313f, 0x1531, 0x0}, v4_int32 = {0x8fe, 0x13, 0x313ffcb0, 0x1531}, v2_int64 = {0x13000008fe, 0x1531313ffcb0}, uint128 = 0x1531313ffcb000000013000008fe}
            (gdb) print $xmm0
            $3 = {v8_bfloat16 = {0, 0, 0, 0, -8.301e-20, 1.734, 2.172e+13, 0}, v8_half = {0, 0, 0, 0, -0.0075836, 1.9668, 89.875, 0}, v4_float = {0, 0, 1.73925066, 3.07136597e-41}, v2_double = {0, 4.6510433164642935e-310}, v16_int8 = {0, 0, 0, 0, 0, 0, 0, 0, -60, -97, -34, 63, -98, 85, 0, 0}, v8_int16 = {0, 0, 0, 0, -24636, 16350, 21918, 0}, v4_int32 = {0, 0, 1071554500, 21918}, v2_int64 = {0, 94138164748228}, uint128 = 1736542632679268283177023410536448}
            (gdb) print $xmm1
            $4 = {v8_bfloat16 = {1.529e-33, 0, 1.745e-39, 0, -7.311e+36, 2.779e-09, 3.574e-26, 0}, v8_half = {0.00015235, 0, 1.1325e-06, 0, -nan(0xb0), 0.16394, 0.0012674, 0}, v4_float = {3.22578906e-42, 2.66246708e-44, 2.79377943e-09, 7.60204417e-42}, v2_double = {4.0317921165679291e-313, 1.1512235401086014e-310}, v16_int8 = {-2, 8, 0, 0, 19, 0, 0, 0, -80, -4, 63, 49, 49, 21, 0, 0}, v8_int16 = {2302, 0, 19, 0, -848, 12607, 5425, 0}, v4_int32 = {2302, 19, 826277040, 5425}, v2_int64 = {81604380926, 23301023857840}, uint128 = 429828023760974893715186530650366}
            

            I see that info register does not display any value in floating point, but print does. Unfortunately, none of these values match the error message.

            It would be helpful if this could be reproduced with a non-SIMD, non-ASAN executable. I can’t suggest a specific option; as far as I understand, x86-64 always implies SSE2, so some instructions that use the XMM register file will always be present. Possibly, omitting -march=native might help.

            marko Marko Mäkelä added a comment
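            To help read the disassembly above: Clang's UBSAN typically brackets the conversion with two comparisons against compile-time double constants (the exclusive bounds of the destination range) and branches to the diagnostic handler when the value falls outside them. A rough standalone rendering, assuming the two vucomisd constants are -1.0 and 2^64 (not verified against this particular binary); checked_cast and report_float_cast_overflow are hypothetical stand-ins:

            #include <cstdio>

            // Hypothetical stand-in for __ubsan_handle_float_cast_overflow().
            static void report_float_cast_overflow(double v) {
              std::printf("%g is outside the range of representable values of type 'unsigned long'\n", v);
            }

            // Rough shape of the check emitted around ulint(dirty_pct * double(srv_io_capacity)).
            static unsigned long checked_cast(double v) {
              // The conversion is defined only if, after truncation, the value fits in
              // unsigned long: on LP64 that means v > -1.0 and v < 2^64.
              if (!(v > -1.0 && v < 18446744073709551616.0)) {
                report_float_cast_overflow(v);
                return 0;  // performing the real conversion here would be undefined behaviour
              }
              return static_cast<unsigned long>(v);
            }

            int main() {
              checked_cast(1.85291e19);  // reports, as in the trace above
              return 0;
            }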

            I don’t see why this would only affect 11.4 and later versions. The logic has not been changed since MariaDB Server 10.6.

            GDB won’t cooperate when a floating-point value is stored in a SIMD register, but I could imagine that the value of dirty_pct could be as follows in another debugging session where I repeated this:

            (gdb) p buf_pool.LRU.count
            $1 = 343
            (gdb) p buf_pool.free.count
            $2 = 7721
            (gdb) p buf_pool.flush_list.count
            $3 = 81
            (gdb) p 81*100.0/(343+7721)
            $4 = 1.0044642857142858
            

            In page_cleaner_flush_pages_recommendation() there is some logic that looks questionable:

            	const double max_pct = srv_max_buf_pool_modified_pct;
             
            	if (!prev_lsn || !pct_for_lsn) {
            		prev_time = curr_time;
            		prev_lsn = cur_lsn;
            		if (max_pct > 0.0) {
            			dirty_pct /= max_pct;
            		}
             
            		n_pages = ulint(dirty_pct * double(srv_io_capacity));
            

            We have max_pct=1, and dividing by 1 is a no-op. So, it looks like we could try to multiply ~0ULL by something larger than 1, and clearly the result would not fit in 64 bits. In the Description, we seem to have had dirty_pct=1.00446, which is close to what I observed in this debugging session.

            I think that we should not only fix the limits of dirty_pct, but also change the type of innodb_io_capacity and related parameters to be INT UNSIGNED (always 32 bits). With the default innodb_page_size=16k and innodb_doublewrite=ON, the maximum write rate that can be represented by a 32-bit unsigned integer would be 2^(32+14+1) = 2⁴⁷ bytes per second, or 128 TiB/s. It will take a while before we reach such write speeds. I believe that a 16-bit innodb_io_capacity would be insufficient already today: 2^(16+14+1) = 2 GiB/s is on the slow side even for consumer-grade NVMe SSDs.

            marko Marko Mäkelä added a comment
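            As a quick cross-check of the numbers above (a standalone sketch using only the values from this debugging session), multiplying that dirty_pct by the configured srv_io_capacity reproduces the exact value in the UBSAN message from the Description:

            #include <cstdio>

            int main() {
              double dirty_pct = 81 * 100.0 / (343 + 7721);  // 1.0044642857142858, as above
              double io_capacity = 18446744073709551615.0;   // srv_io_capacity = ~0UL
              // Prints 1.85291e+19, matching the reported float-cast-overflow value
              // and well above ULONG_MAX ~= 1.84467e+19.
              std::printf("%g\n", dirty_pct * io_capacity);
              return 0;
            }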

            I could repeat the issue with 10.11 using the test scenario described.

            2313│                 if (max_pct > 0.0) {
            2314│                         dirty_pct /= max_pct;
            2315│                 }
            2316│
            2317│                 n_pages = ulint(dirty_pct * double(srv_io_capacity));
            2318│                 if (n_pages < dirty_blocks) {
            2319│                         n_pages= std::min<ulint>(srv_io_capacity, dirty_blocks);
            2320│                 }
            

            (gdb) p srv_io_capacity
            $10 = 18446744073709551615
            (gdb) p dirty_pct
            $11 = 1.0168650793650793
            (gdb) p dirty_pct * srv_io_capacity
            $13 = 1.875784987653997e+19
            

            After the assignment

            (gdb) p n_pages
            $12 = 0
            

            Now, it should be fine to exceed innodb_io_capacity at times during the calculation; specifically, MDEV-24369 introduced logic for aggressive flushing when the dirty page percentage in the buffer pool exceeds innodb_max_dirty_pages_pct. Based on this, we could set our target to a multiple of innodb_io_capacity once we go beyond the dirty page threshold.

            1. I agree with marko that we should prevent setting io_capacity to unrealistic values and define a practical limit for it. Pull-3857 limits innodb_io_capacity_max and innodb_io_capacity to the maximum of a 4-byte unsigned integer, i.e. 4294967295 (2^32-1). For a 16k page size this limit translates to 64 TiB/s of write I/O, which looks sufficient.

            IMHO, the above patch should be sufficient to address the current issue. Hi ramesh, could you please test the patch and confirm that it addresses the issue adequately?

            This would also require a documentation change to update the maximum limit for two InnoDB configuration variables. It is not expected to affect any user, as the limits are beyond any practical value.

            https://mariadb.com/kb/en/innodb-system-variables/#innodb_io_capacity
            Range: 100 to 4294967295 (2^32-1)
            https://mariadb.com/kb/en/innodb-system-variables/#innodb_io_capacity_max
            Range: 100 to 4294967295 (2^32-1)
            

            2. marko While it is possible that we could improve page_cleaner_flush_pages_recommendation() with respect to how aggressively the flush target is set, that should only be done based on a good amount of performance/I/O testing. Changing anything here would affect the current flushing behaviour. Since MDEV-24369 has been present since 10.5.9, it is probably best to attempt such changes as an improvement project in the latest version, to avoid impact on GA versions. It looks out of scope for the current bug fix.

            debarun Debarun Banerjee added a comment

            debarun The test case passes without any issues in the bug-fix branch. I have also initiated a new test run to verify the patch; no issues have been found so far. I will let you know if the test run hits any issues related to this.

            ramesh Ramesh Sivaraman added a comment

            The proposed fix clearly addresses the problem on 64-bit systems. On 32-bit targets, which we still support, we would seem to need something conceptually similar to C++26 std::saturate_cast.

            marko Marko Mäkelä added a comment
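            For illustration, something along these lines could serve as a pre-C++26 stand-in (a hypothetical sketch, not the actual patch; saturate_from_double is an invented helper). On a 32-bit target, ulint is 32 bits, so even a capped innodb_io_capacity multiplied by a dirty_pct above 1.0 could still overflow without such a clamp:

            #include <cmath>
            #include <cstdint>
            #include <limits>
            #include <type_traits>

            // Saturating double -> unsigned integer conversion, conceptually similar to
            // C++26 std::saturate_cast (hypothetical helper, illustrative only).
            template <typename U,
                      typename = std::enable_if_t<std::is_unsigned<U>::value>>
            U saturate_from_double(double v) {
              // 2^digits is exactly representable as a double and is the exclusive
              // upper bound of U's range, avoiding rounding trouble with U's max value.
              const double upper = std::ldexp(1.0, std::numeric_limits<U>::digits);
              if (!(v > 0.0)) return 0;                              // negatives and NaN
              if (v >= upper) return std::numeric_limits<U>::max();  // clamp instead of UB
              return static_cast<U>(v);
            }

            int main() {
              // e.g. with a 32-bit ulint: 1.0045 * 4294967295 exceeds UINT32_MAX, so clamp.
              return saturate_from_double<std::uint32_t>(1.0045 * 4294967295.0)
                         == std::numeric_limits<std::uint32_t>::max()
                     ? 0 : 1;
            }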

            This MDEV requires a documentation change, as it restricts the maximum value of two InnoDB configuration variables. It is not expected to affect any user, as the limits are beyond any practical value.

            https://mariadb.com/kb/en/innodb-system-variables/#innodb_io_capacity

            Range: 100 to 4294967295 (2^32-1)
            

            https://mariadb.com/kb/en/innodb-system-variables/#innodb_io_capacity_max

            Range: 100 to 4294967295 (2^32-1)
            

            debarun Debarun Banerjee added a comment

            People

              debarun Debarun Banerjee
              ramesh Ramesh Sivaraman
              Votes: 0
              Watchers: 4

