Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
10.1(EOL)
-
None
-
CentOS x86_64 6.8, Amazon AWS c4.4xl instance
-
10.1.18
Description
I have Galera Mariadb cluster, 6 instances, one instance where I've made sql dump import crashed with "mysqld got signal 11". The reason I suppose - /var/log/ partition where all mysqld logs and slow queries logs are set was 100% filled because of slow_query_log enabled during dump import, yes - out of disk space for log files is bad, but definitely should not cause mysqld crash.
Here is disk sized during MySQL crash:
Filesystem Size Used Avail Use% Mounted on
|
/dev/xvda1 9.9G 9.4G 0 100% /
|
tmpfs 32G 0 32G 0% /dev/shm
|
/dev/mapper/vg0-tmp 52G 11G 41G 20% /tmp
|
/dev/md127 2.0T 974G 975G 50% /var/lib/mysql
|
Here is mysqld.log output during crash:
2016-08-06 20:19:31 139895589890816 [Note] InnoDB: Online DDL : Completed
|
2016-08-07 7:46:18 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456
|
7
|
2016-08-07 7:46:19 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0
|
2016-08-07 7:46:23 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
|
2016-08-07 7:46:27 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456
|
7
|
2016-08-07 7:46:28 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0
|
2016-08-07 7:46:32 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
|
2016-08-07 7:47:01 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.92:456
|
7
|
2016-08-07 7:47:02 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to e1ff2a6f (tcp://10.10.1.92:4567), attempt 0
|
2016-08-07 7:47:07 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to 73a721b1 (tcp://10.10.1.221:4567), attempt 0
|
2016-08-07 7:47:22 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
|
2016-08-07 7:47:32 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456
|
7
|
2016-08-07 7:47:33 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0
|
2016-08-07 7:48:05 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
|
2016-08-07 7:48:47 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.221:45
|
67
|
2016-08-07 7:48:48 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to 73a721b1 (tcp://10.10.1.221:4567), attempt 0
|
2016-08-07 7:48:51 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
|
2016-08-07 7:49:29 139891662277376 [ERROR] Error writing file '/var/log/mysql/slow-queries.log' (errno: 1 "Operation not permitted")
|
160807 7:49:30 [ERROR] mysqld got signal 11 ;
|
This could be because you hit a bug. It is also possible that this binary
|
or one of the libraries it was linked against is corrupt, improperly built,
|
or misconfigured. This error can also be caused by malfunctioning hardware.
|
|
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
|
|
We will try our best to scrape up some info that will hopefully help
|
diagnose the problem, but since we have already160807 07:49:34 mysqld_safe Number of processes running now: 0
|
160807 07:49:34 mysqld_safe WSREP: not restarting wsrep node automatically
|
160807 07:49:34 mysqld_safe mysqld from pid file /var/lib/mysql/db-east.edoctrina.org.pid ended
|
Attachments
Issue Links
- duplicates
-
MDEV-10827 Mariadb crash on out of disk space during select into outfile
-
- Closed
-
Activity
Field | Original Value | New Value |
---|---|---|
Status | Open [ 1 ] | Confirmed [ 10101 ] |
Description |
I have Galera Mariadb cluster, 6 instances, one instance where I've made sql dump import crashed with "mysqld got signal 11". The reason I suppose - /var/log/ partition where all mysqld logs and slow queries logs are set was 100% filled because of slow_query_log enabled during dump import, yes - out of disk space for log files is bad, but definitely should not cause mysqld crash.
Here is disk sized during MySQL crash: Filesystem Size Used Avail Use% Mounted on /dev/xvda1 9.9G 9.4G 0 100% / tmpfs 32G 0 32G 0% /dev/shm /dev/mapper/vg0-tmp 52G 11G 41G 20% /tmp /dev/md127 2.0T 974G 975G 50% /var/lib/mysql Here is mysqld.log output during crash: 2016-08-06 20:19:31 139895589890816 [Note] InnoDB: Online DDL : Completed 2016-08-07 7:46:18 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456 7 2016-08-07 7:46:19 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0 2016-08-07 7:46:23 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off 2016-08-07 7:46:27 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456 7 2016-08-07 7:46:28 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0 2016-08-07 7:46:32 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off 2016-08-07 7:47:01 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.92:456 7 2016-08-07 7:47:02 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to e1ff2a6f (tcp://10.10.1.92:4567), attempt 0 2016-08-07 7:47:07 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to 73a721b1 (tcp://10.10.1.221:4567), attempt 0 2016-08-07 7:47:22 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off 2016-08-07 7:47:32 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456 7 2016-08-07 7:47:33 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0 2016-08-07 7:48:05 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off 2016-08-07 7:48:47 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.221:45 67 2016-08-07 7:48:48 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to 73a721b1 (tcp://10.10.1.221:4567), attempt 0 2016-08-07 7:48:51 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off 2016-08-07 7:49:29 139891662277376 [ERROR] Error writing file '/var/log/mysql/slow-queries.log' (errno: 1 "Operation not permitted") 160807 7:49:30 [ERROR] mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already160807 07:49:34 mysqld_safe Number of processes running now: 0 160807 07:49:34 mysqld_safe WSREP: not restarting wsrep node automatically 160807 07:49:34 mysqld_safe mysqld from pid file /var/lib/mysql/db-east.edoctrina.org.pid ended |
I have Galera Mariadb cluster, 6 instances, one instance where I've made sql dump import crashed with "mysqld got signal 11". The reason I suppose - /var/log/ partition where all mysqld logs and slow queries logs are set was 100% filled because of slow_query_log enabled during dump import, yes - out of disk space for log files is bad, but definitely should not cause mysqld crash.
Here is disk sized during MySQL crash: {noformat} Filesystem Size Used Avail Use% Mounted on /dev/xvda1 9.9G 9.4G 0 100% / tmpfs 32G 0 32G 0% /dev/shm /dev/mapper/vg0-tmp 52G 11G 41G 20% /tmp /dev/md127 2.0T 974G 975G 50% /var/lib/mysql {noformat} Here is mysqld.log output during crash: {noformat} 2016-08-06 20:19:31 139895589890816 [Note] InnoDB: Online DDL : Completed 2016-08-07 7:46:18 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456 7 2016-08-07 7:46:19 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0 2016-08-07 7:46:23 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off 2016-08-07 7:46:27 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456 7 2016-08-07 7:46:28 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0 2016-08-07 7:46:32 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off 2016-08-07 7:47:01 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.92:456 7 2016-08-07 7:47:02 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to e1ff2a6f (tcp://10.10.1.92:4567), attempt 0 2016-08-07 7:47:07 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to 73a721b1 (tcp://10.10.1.221:4567), attempt 0 2016-08-07 7:47:22 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off 2016-08-07 7:47:32 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456 7 2016-08-07 7:47:33 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0 2016-08-07 7:48:05 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off 2016-08-07 7:48:47 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.221:45 67 2016-08-07 7:48:48 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to 73a721b1 (tcp://10.10.1.221:4567), attempt 0 2016-08-07 7:48:51 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off {noformat} {noformat} 2016-08-07 7:49:29 139891662277376 [ERROR] Error writing file '/var/log/mysql/slow-queries.log' (errno: 1 "Operation not permitted") 160807 7:49:30 [ERROR] mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. To report this bug, see https://mariadb.com/kb/en/reporting-bugs We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already160807 07:49:34 mysqld_safe Number of processes running now: 0 160807 07:49:34 mysqld_safe WSREP: not restarting wsrep node automatically 160807 07:49:34 mysqld_safe mysqld from pid file /var/lib/mysql/db-east.edoctrina.org.pid ended {noformat} |
Component/s | OTHER [ 10125 ] | |
Component/s | Galera [ 10124 ] | |
Component/s | Storage Engine - InnoDB [ 10129 ] | |
Component/s | wsrep [ 11500 ] |
Fix Version/s | 10.1 [ 16100 ] | |
Affects Version/s | 10.1 [ 16100 ] | |
Affects Version/s | 10.1.16 [ 22019 ] | |
Assignee | Sergei Golubchik [ serg ] |
Link |
This issue duplicates |
Sprint | 10.1.18 [ 98 ] |
Status | Confirmed [ 10101 ] | In Progress [ 3 ] |
Status | In Progress [ 3 ] | Stalled [ 10000 ] |
Status | Stalled [ 10000 ] | In Progress [ 3 ] |
Status | In Progress [ 3 ] | Stalled [ 10000 ] |
Fix Version/s | 10.1.18 [ 22110 ] | |
Fix Version/s | 10.1 [ 16100 ] | |
Resolution | Fixed [ 1 ] | |
Status | Stalled [ 10000 ] | Closed [ 6 ] |
Workflow | MariaDB v3 [ 76590 ] | MariaDB v4 [ 150720 ] |
Thanks for the report.
It is not related to Galera, reproducible on a regular 10.1 server (and 10.2).
Stack trace from a release build (10.1.16):
#2 <signal handler called>
#3 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:116
#4 0x00007f01a0f45ff8 in memcpy (__len=18446744073709551406, __src=0x7f01a1648e60, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:52
#5 _my_b_write (info=0x7f019e44ff38, Buffer=0x7f01a1648e60 "# User@Host: ", Count=13) at /home/buildbot/buildbot/build/mysys/mf_iocache.c:508
#6 0x00007f01a0b6b70c in my_b_write (Count=13, Buffer=0x7f01a1648e60 "# User@Host: ", info=0x7f019e44ff38) at /home/buildbot/buildbot/build/include/my_sys.h:544
#7 MYSQL_QUERY_LOG::write (this=this@entry=0x7f019e44fc28, thd=thd@entry=0x7f019e7f2008, current_time=1470606593, user_host=user_host@entry=0x7f01a1649250 "root[root] @ localhost [::1]", user_host_len=<optimized out>, query_utime=192, lock_utime=lock_utime@entry=0, is_command=is_command@entry=false, sql_text=sql_text@entry=0x7f01734c3020 "select sleep(0.0001)", sql_text_len=sql_text_len@entry=20) at /home/buildbot/buildbot/build/sql/log.cc:2991
#8 0x00007f01a0b6bc6d in Log_to_file_event_handler::log_slow (this=0x7f019e44f800, thd=0x7f019e7f2008, current_time=..., user_host=0x7f01a1649250 "root[root] @ localhost [::1]", user_host_len=<optimized out>, query_utime=<optimized out>, lock_utime=0, is_command=false, sql_text=0x7f01734c3020 "select sleep(0.0001)", sql_text_len=20) at /home/buildbot/buildbot/build/sql/log.cc:1037
#9 0x00007f01a0b659e5 in LOGGER::slow_log_print (this=this@entry=0x7f01a1874e60 <logger>, thd=thd@entry=0x7f019e7f2008, query=0x7f01734c3020 "select sleep(0.0001)", query_length=20, current_utime=<optimized out>) at /home/buildbot/buildbot/build/sql/log.cc:1328
#10 0x00007f01a0b6c7ec in slow_log_print (thd=thd@entry=0x7f019e7f2008, query=<optimized out>, query_length=<optimized out>, current_utime=<optimized out>) at /home/buildbot/buildbot/build/sql/log.cc:6362
#11 0x00007f01a09449b8 in log_slow_statement (thd=thd@entry=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_parse.cc:2032
#12 0x00007f01a095428d in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f019e7f2008, packet=<optimized out>, packet_length=2709889216, packet_length@entry=20) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1953
#13 0x00007f01a095668b in do_command (thd=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1107
#14 0x00007f01a0a12a0f in do_handle_one_connection (thd_arg=thd_arg@entry=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1350
#15 0x00007f01a0a12b67 in handle_one_connection (arg=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1262
#16 0x00007f019f87f0a4 in start_thread (arg=0x7f01a164ab00) at pthread_create.c:309
#17 0x00007f019efa887d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
Assertion failure on a debug build, a8c2f68fb8c7915317b12236c21a35bdd41d2092
2016-08-08 0:44:25 140423426104064 [ERROR] Error writing file '/home/elenst/11/f1.slow' (errno: 1 "Operation not permitted")
mysqld: /data/src/10.1/mysys/mf_iocache.c:507: _my_b_write: Assertion `Count >= rest_length' failed.
160808 0:44:25 [ERROR] mysqld got signal 6 ;
#7 0x00007fb6dcfad312 in __GI___assert_fail (assertion=0x7fb6e02a8edf "Count >= rest_length", file=0x7fb6e02a8cf0 "/data/src/10.1/mysys/mf_iocache.c", line=507, function=0x7fb6e02a93cb <__PRETTY_FUNCTION__.11459> "_my_b_write") at assert.c:101
#8 0x00007fb6e001ea55 in _my_b_write (info=0x7fb6dc47b4b8, Buffer=0x7fb6e06e3e50 "# User@Host: ", Count=13) at /data/src/10.1/mysys/mf_iocache.c:507
#9 0x00007fb6dfabb7f1 in my_b_write (info=0x7fb6dc47b4b8, Buffer=0x7fb6e06e3e50 "# User@Host: ", Count=13) at /data/src/10.1/include/my_sys.h:544
#10 0x00007fb6dfac2bdb in MYSQL_QUERY_LOG::write (this=0x7fb6dc47b128, thd=0x7fb6b8e71070, current_time=1470606265, user_host=0x7fb6e06e4260 "root[root] @ localhost [::1]", user_host_len=28, query_utime=206, lock_utime=0, is_command=false, sql_text=0x7fb6b004c088 "select sleep(0.0001)", sql_text_len=20) at /data/src/10.1/sql/log.cc:2991
#11 0x00007fb6dfabe054 in Log_to_file_event_handler::log_slow (this=0x7fb6dc47ac00, thd=0x7fb6b8e71070, current_time=..., user_host=0x7fb6e06e4260 "root[root] @ localhost [::1]", user_host_len=28, query_utime=206, lock_utime=0, is_command=false, sql_text=0x7fb6b004c088 "select sleep(0.0001)", sql_text_len=20) at /data/src/10.1/sql/log.cc:1037
#12 0x00007fb6dfabeafc in LOGGER::slow_log_print (this=0x7fb6e0a61f80 <logger>, thd=0x7fb6b8e71070, query=0x7fb6b004c088 "select sleep(0.0001)", query_length=20, current_utime=719255821243) at /data/src/10.1/sql/log.cc:1328
#13 0x00007fb6dfacbb8c in slow_log_print (thd=0x7fb6b8e71070, query=0x7fb6b004c088 "select sleep(0.0001)", query_length=20, current_utime=719255821243) at /data/src/10.1/sql/log.cc:6362
#14 0x00007fb6df7b0f55 in log_slow_statement (thd=0x7fb6b8e71070) at /data/src/10.1/sql/sql_parse.cc:2032
#15 0x00007fb6df7b0bae in dispatch_command (command=COM_QUERY, thd=0x7fb6b8e71070, packet=0x7fb6b93c7071 "", packet_length=20) at /data/src/10.1/sql/sql_parse.cc:1953
#16 0x00007fb6df7ae21b in do_command (thd=0x7fb6b8e71070) at /data/src/10.1/sql/sql_parse.cc:1107
#17 0x00007fb6df8e3e45 in do_handle_one_connection (thd_arg=0x7fb6b8e71070) at /data/src/10.1/sql/sql_connect.cc:1350
#18 0x00007fb6df8e3ba9 in handle_one_connection (arg=0x7fb6b8e71070) at /data/src/10.1/sql/sql_connect.cc:1262
#19 0x00007fb6deeaf0a4 in start_thread (arg=0x7fb6e06e5b00) at pthread_create.c:309
#20 0x00007fb6dd06787d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
5.5, 10.0 and MySQL 5.6 (at least debug versions) do not crash.