Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-10508

Mariadb crash on out of disk space during dump import

Details

    • Bug
    • Status: Closed (View Workflow)
    • Critical
    • Resolution: Fixed
    • 10.1(EOL)
    • 10.1.18
    • OTHER
    • None
    • CentOS x86_64 6.8, Amazon AWS c4.4xl instance
    • 10.1.18

    Description

      I have Galera Mariadb cluster, 6 instances, one instance where I've made sql dump import crashed with "mysqld got signal 11". The reason I suppose - /var/log/ partition where all mysqld logs and slow queries logs are set was 100% filled because of slow_query_log enabled during dump import, yes - out of disk space for log files is bad, but definitely should not cause mysqld crash.

      Here is disk sized during MySQL crash:

      Filesystem           Size  Used Avail Use% Mounted on
      /dev/xvda1           9.9G  9.4G     0 100% /         
      tmpfs                 32G     0   32G   0% /dev/shm  
      /dev/mapper/vg0-tmp   52G   11G   41G  20% /tmp      
      /dev/md127           2.0T  974G  975G  50% /var/lib/mysql
      

      Here is mysqld.log output during crash:

      2016-08-06 20:19:31 139895589890816 [Note] InnoDB: Online DDL : Completed
      2016-08-07  7:46:18 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456
      7
      2016-08-07  7:46:19 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0
      2016-08-07  7:46:23 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2016-08-07  7:46:27 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456
      7
      2016-08-07  7:46:28 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0
      2016-08-07  7:46:32 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2016-08-07  7:47:01 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.92:456
      7
      2016-08-07  7:47:02 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to e1ff2a6f (tcp://10.10.1.92:4567), attempt 0
      2016-08-07  7:47:07 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to 73a721b1 (tcp://10.10.1.221:4567), attempt 0
      2016-08-07  7:47:22 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2016-08-07  7:47:32 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.93:456
      7
      2016-08-07  7:47:33 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to c3b6c958 (tcp://10.10.1.93:4567), attempt 0
      2016-08-07  7:48:05 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
      2016-08-07  7:48:47 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers: tcp://10.10.1.221:45
      67
      2016-08-07  7:48:48 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') reconnecting to 73a721b1 (tcp://10.10.1.221:4567), attempt 0
      2016-08-07  7:48:51 139945511483136 [Note] WSREP: (46cc19ec, 'tcp://0.0.0.0:4567') turning message relay requesting off
      

      2016-08-07  7:49:29 139891662277376 [ERROR] Error writing file '/var/log/mysql/slow-queries.log' (errno: 1 "Operation not permitted")
      160807  7:49:30 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already160807 07:49:34 mysqld_safe Number of processes running now: 0
      160807 07:49:34 mysqld_safe WSREP: not restarting wsrep node automatically
      160807 07:49:34 mysqld_safe mysqld from pid file /var/lib/mysql/db-east.edoctrina.org.pid ended
      

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - - edited

            Thanks for the report.

            It is not related to Galera, reproducible on a regular 10.1 server (and 10.2).

            Stack trace from a release build (10.1.16):

            #2  <signal handler called>
            #3  __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:116
            #4  0x00007f01a0f45ff8 in memcpy (__len=18446744073709551406, __src=0x7f01a1648e60, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:52
            #5  _my_b_write (info=0x7f019e44ff38, Buffer=0x7f01a1648e60 "# User@Host: ", Count=13) at /home/buildbot/buildbot/build/mysys/mf_iocache.c:508
            #6  0x00007f01a0b6b70c in my_b_write (Count=13, Buffer=0x7f01a1648e60 "# User@Host: ", info=0x7f019e44ff38) at /home/buildbot/buildbot/build/include/my_sys.h:544
            #7  MYSQL_QUERY_LOG::write (this=this@entry=0x7f019e44fc28, thd=thd@entry=0x7f019e7f2008, current_time=1470606593, user_host=user_host@entry=0x7f01a1649250 "root[root] @ localhost [::1]", user_host_len=<optimized out>, query_utime=192, lock_utime=lock_utime@entry=0, is_command=is_command@entry=false, sql_text=sql_text@entry=0x7f01734c3020 "select sleep(0.0001)", sql_text_len=sql_text_len@entry=20) at /home/buildbot/buildbot/build/sql/log.cc:2991
            #8  0x00007f01a0b6bc6d in Log_to_file_event_handler::log_slow (this=0x7f019e44f800, thd=0x7f019e7f2008, current_time=..., user_host=0x7f01a1649250 "root[root] @ localhost [::1]", user_host_len=<optimized out>, query_utime=<optimized out>, lock_utime=0, is_command=false, sql_text=0x7f01734c3020 "select sleep(0.0001)", sql_text_len=20) at /home/buildbot/buildbot/build/sql/log.cc:1037
            #9  0x00007f01a0b659e5 in LOGGER::slow_log_print (this=this@entry=0x7f01a1874e60 <logger>, thd=thd@entry=0x7f019e7f2008, query=0x7f01734c3020 "select sleep(0.0001)", query_length=20, current_utime=<optimized out>) at /home/buildbot/buildbot/build/sql/log.cc:1328
            #10 0x00007f01a0b6c7ec in slow_log_print (thd=thd@entry=0x7f019e7f2008, query=<optimized out>, query_length=<optimized out>, current_utime=<optimized out>) at /home/buildbot/buildbot/build/sql/log.cc:6362
            #11 0x00007f01a09449b8 in log_slow_statement (thd=thd@entry=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_parse.cc:2032
            #12 0x00007f01a095428d in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f019e7f2008, packet=<optimized out>, packet_length=2709889216, packet_length@entry=20) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1953
            #13 0x00007f01a095668b in do_command (thd=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1107
            #14 0x00007f01a0a12a0f in do_handle_one_connection (thd_arg=thd_arg@entry=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1350
            #15 0x00007f01a0a12b67 in handle_one_connection (arg=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1262
            #16 0x00007f019f87f0a4 in start_thread (arg=0x7f01a164ab00) at pthread_create.c:309
            #17 0x00007f019efa887d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
            

            Assertion failure on a debug build, a8c2f68fb8c7915317b12236c21a35bdd41d2092

            2016-08-08  0:44:25 140423426104064 [ERROR] Error writing file '/home/elenst/11/f1.slow' (errno: 1 "Operation not permitted")
            mysqld: /data/src/10.1/mysys/mf_iocache.c:507: _my_b_write: Assertion `Count >= rest_length' failed.
            160808  0:44:25 [ERROR] mysqld got signal 6 ;
             
            #7  0x00007fb6dcfad312 in __GI___assert_fail (assertion=0x7fb6e02a8edf "Count >= rest_length", file=0x7fb6e02a8cf0 "/data/src/10.1/mysys/mf_iocache.c", line=507, function=0x7fb6e02a93cb <__PRETTY_FUNCTION__.11459> "_my_b_write") at assert.c:101
            #8  0x00007fb6e001ea55 in _my_b_write (info=0x7fb6dc47b4b8, Buffer=0x7fb6e06e3e50 "# User@Host: ", Count=13) at /data/src/10.1/mysys/mf_iocache.c:507
            #9  0x00007fb6dfabb7f1 in my_b_write (info=0x7fb6dc47b4b8, Buffer=0x7fb6e06e3e50 "# User@Host: ", Count=13) at /data/src/10.1/include/my_sys.h:544
            #10 0x00007fb6dfac2bdb in MYSQL_QUERY_LOG::write (this=0x7fb6dc47b128, thd=0x7fb6b8e71070, current_time=1470606265, user_host=0x7fb6e06e4260 "root[root] @ localhost [::1]", user_host_len=28, query_utime=206, lock_utime=0, is_command=false, sql_text=0x7fb6b004c088 "select sleep(0.0001)", sql_text_len=20) at /data/src/10.1/sql/log.cc:2991
            #11 0x00007fb6dfabe054 in Log_to_file_event_handler::log_slow (this=0x7fb6dc47ac00, thd=0x7fb6b8e71070, current_time=..., user_host=0x7fb6e06e4260 "root[root] @ localhost [::1]", user_host_len=28, query_utime=206, lock_utime=0, is_command=false, sql_text=0x7fb6b004c088 "select sleep(0.0001)", sql_text_len=20) at /data/src/10.1/sql/log.cc:1037
            #12 0x00007fb6dfabeafc in LOGGER::slow_log_print (this=0x7fb6e0a61f80 <logger>, thd=0x7fb6b8e71070, query=0x7fb6b004c088 "select sleep(0.0001)", query_length=20, current_utime=719255821243) at /data/src/10.1/sql/log.cc:1328
            #13 0x00007fb6dfacbb8c in slow_log_print (thd=0x7fb6b8e71070, query=0x7fb6b004c088 "select sleep(0.0001)", query_length=20, current_utime=719255821243) at /data/src/10.1/sql/log.cc:6362
            #14 0x00007fb6df7b0f55 in log_slow_statement (thd=0x7fb6b8e71070) at /data/src/10.1/sql/sql_parse.cc:2032
            #15 0x00007fb6df7b0bae in dispatch_command (command=COM_QUERY, thd=0x7fb6b8e71070, packet=0x7fb6b93c7071 "", packet_length=20) at /data/src/10.1/sql/sql_parse.cc:1953
            #16 0x00007fb6df7ae21b in do_command (thd=0x7fb6b8e71070) at /data/src/10.1/sql/sql_parse.cc:1107
            #17 0x00007fb6df8e3e45 in do_handle_one_connection (thd_arg=0x7fb6b8e71070) at /data/src/10.1/sql/sql_connect.cc:1350
            #18 0x00007fb6df8e3ba9 in handle_one_connection (arg=0x7fb6b8e71070) at /data/src/10.1/sql/sql_connect.cc:1262
            #19 0x00007fb6deeaf0a4 in start_thread (arg=0x7fb6e06e5b00) at pthread_create.c:309
            #20 0x00007fb6dd06787d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
            

            5.5, 10.0 and MySQL 5.6 (at least debug versions) do not crash.

            elenst Elena Stepanova added a comment - - edited Thanks for the report. It is not related to Galera, reproducible on a regular 10.1 server (and 10.2). Stack trace from a release build (10.1.16): #2 <signal handler called> #3 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:116 #4 0x00007f01a0f45ff8 in memcpy (__len=18446744073709551406, __src=0x7f01a1648e60, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:52 #5 _my_b_write (info=0x7f019e44ff38, Buffer=0x7f01a1648e60 "# User@Host: ", Count=13) at /home/buildbot/buildbot/build/mysys/mf_iocache.c:508 #6 0x00007f01a0b6b70c in my_b_write (Count=13, Buffer=0x7f01a1648e60 "# User@Host: ", info=0x7f019e44ff38) at /home/buildbot/buildbot/build/include/my_sys.h:544 #7 MYSQL_QUERY_LOG::write (this=this@entry=0x7f019e44fc28, thd=thd@entry=0x7f019e7f2008, current_time=1470606593, user_host=user_host@entry=0x7f01a1649250 "root[root] @ localhost [::1]", user_host_len=<optimized out>, query_utime=192, lock_utime=lock_utime@entry=0, is_command=is_command@entry=false, sql_text=sql_text@entry=0x7f01734c3020 "select sleep(0.0001)", sql_text_len=sql_text_len@entry=20) at /home/buildbot/buildbot/build/sql/log.cc:2991 #8 0x00007f01a0b6bc6d in Log_to_file_event_handler::log_slow (this=0x7f019e44f800, thd=0x7f019e7f2008, current_time=..., user_host=0x7f01a1649250 "root[root] @ localhost [::1]", user_host_len=<optimized out>, query_utime=<optimized out>, lock_utime=0, is_command=false, sql_text=0x7f01734c3020 "select sleep(0.0001)", sql_text_len=20) at /home/buildbot/buildbot/build/sql/log.cc:1037 #9 0x00007f01a0b659e5 in LOGGER::slow_log_print (this=this@entry=0x7f01a1874e60 <logger>, thd=thd@entry=0x7f019e7f2008, query=0x7f01734c3020 "select sleep(0.0001)", query_length=20, current_utime=<optimized out>) at /home/buildbot/buildbot/build/sql/log.cc:1328 #10 0x00007f01a0b6c7ec in slow_log_print (thd=thd@entry=0x7f019e7f2008, query=<optimized out>, query_length=<optimized out>, current_utime=<optimized out>) at /home/buildbot/buildbot/build/sql/log.cc:6362 #11 0x00007f01a09449b8 in log_slow_statement (thd=thd@entry=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_parse.cc:2032 #12 0x00007f01a095428d in dispatch_command (command=command@entry=COM_QUERY, thd=thd@entry=0x7f019e7f2008, packet=<optimized out>, packet_length=2709889216, packet_length@entry=20) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1953 #13 0x00007f01a095668b in do_command (thd=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_parse.cc:1107 #14 0x00007f01a0a12a0f in do_handle_one_connection (thd_arg=thd_arg@entry=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1350 #15 0x00007f01a0a12b67 in handle_one_connection (arg=0x7f019e7f2008) at /home/buildbot/buildbot/build/sql/sql_connect.cc:1262 #16 0x00007f019f87f0a4 in start_thread (arg=0x7f01a164ab00) at pthread_create.c:309 #17 0x00007f019efa887d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 Assertion failure on a debug build, a8c2f68fb8c7915317b12236c21a35bdd41d2092 2016-08-08 0:44:25 140423426104064 [ERROR] Error writing file '/home/elenst/11/f1.slow' (errno: 1 "Operation not permitted") mysqld: /data/src/10.1/mysys/mf_iocache.c:507: _my_b_write: Assertion `Count >= rest_length' failed. 160808 0:44:25 [ERROR] mysqld got signal 6 ;   #7 0x00007fb6dcfad312 in __GI___assert_fail (assertion=0x7fb6e02a8edf "Count >= rest_length", file=0x7fb6e02a8cf0 "/data/src/10.1/mysys/mf_iocache.c", line=507, function=0x7fb6e02a93cb <__PRETTY_FUNCTION__.11459> "_my_b_write") at assert.c:101 #8 0x00007fb6e001ea55 in _my_b_write (info=0x7fb6dc47b4b8, Buffer=0x7fb6e06e3e50 "# User@Host: ", Count=13) at /data/src/10.1/mysys/mf_iocache.c:507 #9 0x00007fb6dfabb7f1 in my_b_write (info=0x7fb6dc47b4b8, Buffer=0x7fb6e06e3e50 "# User@Host: ", Count=13) at /data/src/10.1/include/my_sys.h:544 #10 0x00007fb6dfac2bdb in MYSQL_QUERY_LOG::write (this=0x7fb6dc47b128, thd=0x7fb6b8e71070, current_time=1470606265, user_host=0x7fb6e06e4260 "root[root] @ localhost [::1]", user_host_len=28, query_utime=206, lock_utime=0, is_command=false, sql_text=0x7fb6b004c088 "select sleep(0.0001)", sql_text_len=20) at /data/src/10.1/sql/log.cc:2991 #11 0x00007fb6dfabe054 in Log_to_file_event_handler::log_slow (this=0x7fb6dc47ac00, thd=0x7fb6b8e71070, current_time=..., user_host=0x7fb6e06e4260 "root[root] @ localhost [::1]", user_host_len=28, query_utime=206, lock_utime=0, is_command=false, sql_text=0x7fb6b004c088 "select sleep(0.0001)", sql_text_len=20) at /data/src/10.1/sql/log.cc:1037 #12 0x00007fb6dfabeafc in LOGGER::slow_log_print (this=0x7fb6e0a61f80 <logger>, thd=0x7fb6b8e71070, query=0x7fb6b004c088 "select sleep(0.0001)", query_length=20, current_utime=719255821243) at /data/src/10.1/sql/log.cc:1328 #13 0x00007fb6dfacbb8c in slow_log_print (thd=0x7fb6b8e71070, query=0x7fb6b004c088 "select sleep(0.0001)", query_length=20, current_utime=719255821243) at /data/src/10.1/sql/log.cc:6362 #14 0x00007fb6df7b0f55 in log_slow_statement (thd=0x7fb6b8e71070) at /data/src/10.1/sql/sql_parse.cc:2032 #15 0x00007fb6df7b0bae in dispatch_command (command=COM_QUERY, thd=0x7fb6b8e71070, packet=0x7fb6b93c7071 "", packet_length=20) at /data/src/10.1/sql/sql_parse.cc:1953 #16 0x00007fb6df7ae21b in do_command (thd=0x7fb6b8e71070) at /data/src/10.1/sql/sql_parse.cc:1107 #17 0x00007fb6df8e3e45 in do_handle_one_connection (thd_arg=0x7fb6b8e71070) at /data/src/10.1/sql/sql_connect.cc:1350 #18 0x00007fb6df8e3ba9 in handle_one_connection (arg=0x7fb6b8e71070) at /data/src/10.1/sql/sql_connect.cc:1262 #19 0x00007fb6deeaf0a4 in start_thread (arg=0x7fb6e06e5b00) at pthread_create.c:309 #20 0x00007fb6dd06787d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111 5.5, 10.0 and MySQL 5.6 (at least debug versions) do not crash.

            I wasn't able to repeat it on 10.1. I get the error “Error writing file ... (errno: 1 "Operation not permitted")” but no crash.

            yes, I've noticed incorrect errno, I'll fix it

            serg Sergei Golubchik added a comment - I wasn't able to repeat it on 10.1. I get the error “Error writing file ... (errno: 1 "Operation not permitted")” but no crash. yes, I've noticed incorrect errno, I'll fix it

            serg,

            Your test, but slightly modified. This way it works for me (triggers the assertion failure).

            --source include/have_sequence.inc
            create table t1 (a int, b int) engine=memory;
            insert t1 select seq, seq+1 from seq_1_to_1000;
            set global general_log=0;
            set global log_queries_not_using_indexes=1;
            set debug_dbug='+d,simulate_file_write_error';
            --disable_result_log
            --let $run= 50
            while ($run)
            {
                select * from t1 where a>10;
                dec $run;
            }
            --enable_result_log
            set debug_dbug='';
            set global general_log=1;
            drop table t1;
            

            elenst Elena Stepanova added a comment - serg , Your test, but slightly modified. This way it works for me (triggers the assertion failure). --source include/have_sequence.inc create table t1 (a int , b int ) engine=memory; insert t1 select seq, seq+1 from seq_1_to_1000; set global general_log=0; set global log_queries_not_using_indexes=1; set debug_dbug= '+d,simulate_file_write_error' ; --disable_result_log --let $run= 50 while ($run) { select * from t1 where a>10; dec $run; } --enable_result_log set debug_dbug= '' ; set global general_log=1; drop table t1;

            People

              serg Sergei Golubchik
              kpvmaria Kaidalov Pavel
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.