Details
Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 10.6.0, 10.6.1
None
Description
This crash is almost 100% reproducible on MariaDB server 10.6 as of 7e1ec1550ceff29a983bf799622d97b73b79ce43 compiled with -DWITH_URING=yes.
I run the sysbench-tpcc (https://github.com/Percona-Lab/sysbench-tpcc) prepare step on a 40-core machine as
./tpcc.lua --mysql-host=yang04g --mysql-user=sbtest --mysql-password=sbtest --mysql-db=sbtest --time=1200 --threads=56 --report-interval=1 --tables=10 --scale=100 --use_fk=0 --mysql_table_options='DEFAULT CHARSET=utf8mb4' prepare
against a similar 40-core machine running the MariaDB server. After several minutes of the workload, the server crashes.
Backtrace:
10.6 7e1ec1550ceff29a983bf799622d97b73b79ce43
#0 0x00007f2e109d8aa1 in pthread_kill () from /lib64/libpthread.so.0
#1 0x000055af0b0902c7 in my_write_core (sig=<optimized out>) at /root/krizhanovsky/server/mysys/stacktrace.c:424
#2 0x000055af0abd3610 in handle_fatal_signal (sig=6) at /root/krizhanovsky/server/sql/signal_handler.cc:343
#3 <signal handler called>
#4 0x00007f2e10634387 in raise () from /lib64/libc.so.6
#5 0x00007f2e10635a78 in abort () from /lib64/libc.so.6
#6 0x000055af0a8da889 in ut_dbg_assertion_failed (expr=expr@entry=0x55af0b2a86a7 "cb->m_err == DB_SUCCESS",
    file=file@entry=0x55af0b2a8a10 "/root/krizhanovsky/server/storage/innobase/os/os0file.cc", line=line@entry=3843)
    at /root/krizhanovsky/server/storage/innobase/ut/ut0dbg.cc:60
#7 0x000055af0a8c3fe0 in io_callback (cb=<optimized out>) at /root/krizhanovsky/server/storage/innobase/os/os0file.cc:3843
#8 io_callback (cb=<optimized out>) at /root/krizhanovsky/server/storage/innobase/os/os0file.cc:3841
#9 0x000055af0b035668 in tpool::task_group::execute (this=0x55af0c8f27d0, t=0x55af0c917c78) at /root/krizhanovsky/server/tpool/task_group.cc:55
#10 0x000055af0b0345af in tpool::thread_pool_generic::worker_main (this=0x55af0c816320, thread_var=0x55af0c823fc0) at /root/krizhanovsky/server/tpool/tpool_generic.cc:550
#11 0x000055af0b0f7cff in execute_native_thread_routine ()
#12 0x00007f2e109d3ea5 in start_thread () from /lib64/libpthread.so.0
#13 0x00007f2e106fc8dd in clone () from /lib64/libc.so.6
The following patch
--- a/tpool/aio_liburing.cc
+++ b/tpool/aio_liburing.cc
@@ -152,6 +152,9 @@ class aio_uring final : public tpool::aio
         if (res < 0)
         {
           iocb->m_err= -res;
+          my_printf_error(ER_UNKNOWN_ERROR,
+                          "io_uring_cqe_get_data() returned %d\n",
+                          ME_ERROR_LOG | ME_FATAL, res);
           iocb->m_ret_len= 0;
         }
         else
produces the line
2021-05-23 11:07:09 0 [ERROR] mariadbd: io_uring_cqe_get_data() returned -11
in the error log. The value -11 is -EAGAIN.
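One way a completion handler could avoid passing -EAGAIN on to a callback that asserts success is to retry the request synchronously. The following is a minimal sketch under that assumption, not the tpool code and not necessarily how this issue was fixed. The names `iocb_sketch`, `read_synchronously` and `complete_read` are hypothetical, and `res` merely imitates the result field of an io_uring completion (bytes transferred, or a negative errno).

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical stand-in for an AIO control block. Only m_err and m_ret_len
// mirror fields visible in the patch above; the rest is assumed.
struct iocb_sketch
{
  int fd;
  void *buf;
  size_t len;
  off_t offset;
  int m_err;
  size_t m_ret_len;
};

// Fallback path: repeat the read with a blocking pread(), retrying on EINTR.
static void read_synchronously(iocb_sketch *cb)
{
  ssize_t n;
  do
    n= pread(cb->fd, cb->buf, cb->len, cb->offset);
  while (n < 0 && errno == EINTR);

  if (n < 0)
  {
    cb->m_err= errno;
    cb->m_ret_len= 0;
  }
  else
  {
    cb->m_err= 0;
    cb->m_ret_len= static_cast<size_t>(n);
  }
}

// Translate a completion result into m_err / m_ret_len.
// -EAGAIN is treated as "try again" instead of as a hard failure.
static void complete_read(iocb_sketch *cb, int res)
{
  assert(cb != nullptr);
  if (res == -EAGAIN)
  {
    read_synchronously(cb);
    return;
  }
  if (res < 0)
  {
    cb->m_err= -res;
    cb->m_ret_len= 0;
    // Log the symbolic meaning next to the raw number.
    fprintf(stderr, "read completed with %d (%s)\n", res, strerror(-res));
    return;
  }
  cb->m_err= 0;
  cb->m_ret_len= static_cast<size_t>(res);
}
```

With this shape, only errors other than -EAGAIN reach the caller as a nonzero `m_err`. A blocking retry is the simplest option; resubmitting the request to the ring would be an alternative, at the cost of handling a full submission queue.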
Issue Links
- is caused by
  MDEV-24883 add io_uring support for tpool (Closed)
Activity
Field | Original Value | New Value |
---|---|---|
Issue Type | Task [ 3 ] | Bug [ 1 ] |
Link | This issue is caused by |
Fix Version/s | 10.6 [ 24028 ] | |
Affects Version/s | 10.6.1 [ 24437 ] | |
Affects Version/s | 10.6.0 [ 24431 ] | |
Assignee | Eugene Kosov [ kevg ] |
Description | Plain-text version of the description above | Same text with {code:sh}, {noformat} and {code:diff} markup added |
Priority | Major [ 3 ] | Blocker [ 1 ] |
Summary | Assertion failure on bad io_uring_cqe_get_data() return code | Assertion failure on io_uring_cqe_get_data() returning -EAGAIN |
Assignee | Eugene Kosov [ kevg ] | Marko Mäkelä [ marko ] |
issue.field.resolutiondate | 2021-06-14 10:32:40.0 | 2021-06-14 10:32:40.993 |
Fix Version/s | 10.6.2 [ 25800 ] | |
Fix Version/s | 10.6 [ 24028 ] | |
Resolution | Fixed [ 1 ] | |
Status | Open [ 1 ] | Closed [ 6 ] |
Workflow | MariaDB v3 [ 122087 ] | MariaDB v4 [ 159328 ] |