MariaDB Server / MDEV-8325

Server crash in pagecache_fwrite after killing query consuming a lot of tmp disk space

Details

    Description

      Killing a long-running query caused the MariaDB server to crash:

      | 7552649 | ttahon          | 10.10.0.55:49241  | PRODUCTION | Killed  |   56220 | Copying to tmp table on disk | SELECT
                 PROD_PRODUITS.LIBELLE
       
                AS libelle,
             PROD_COMMANDES.D |    0.000 |

      The above query created a very large temporary file, which filled the disk:

      150616 20:13:41 [Warning] mysqld: Disk is full writing '/tmp/#sql_2741_0.MAD' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
      150616 20:13:41 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
      150616 20:14:41 [Warning] mysqld: Disk is full writing '/tmp/#sql_2741_0.MAD' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
      150616 20:14:41 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
      150616 20:15:41 [Warning] mysqld: Disk is full writing '/tmp/#sql_2741_0.MAD' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
      150616 20:15:41 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
      150616 20:16:26 [Warning] mysqld: Disk is full writing '/tmp/#sql_2741_1.MAD' (Errcode: 28 "No space left on device"). Waiting for someone to free space... (Expect up to 60 secs delay for server to continue after freeing disk space)
      150616 20:16:26 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
      150616 20:25:41 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
      150616 20:26:26 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
      150616 20:35:41 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
      150616 20:36:26 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
      150616 20:45:41 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs
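
      For anyone triaging a similar log, the warnings above can be summarised per temporary file. The following is only a rough sketch: the function name disk_full_files and the regular expression are illustrative assumptions written against the message format shown in this report, not part of any MariaDB tooling, and the sample lines are shortened copies of the warnings above.

```python
import re
from collections import OrderedDict

# Assumed pattern, derived only from the warning lines quoted in this report.
DISK_FULL = re.compile(
    r"^(?P<date>\d{6}) (?P<time>\d{1,2}:\d{2}:\d{2}) \[Warning\] mysqld: "
    r"Disk is full writing '(?P<path>[^']+)' \(Errcode: (?P<errcode>\d+)"
)


def disk_full_files(lines):
    """Map each file named in a disk-full warning to (first_seen, last_seen, count)."""
    seen = OrderedDict()
    for line in lines:
        m = DISK_FULL.match(line.strip())
        if m is None:
            continue  # e.g. the "Retry in 60 secs" lines carry no file name
        stamp = f"{m['date']} {m['time']}"
        first, _, count = seen.get(m["path"], (stamp, stamp, 0))
        seen[m["path"]] = (first, stamp, count + 1)
    return seen


# Shortened copies of the warnings above (text after the error code is cut).
sample = [
    "150616 20:13:41 [Warning] mysqld: Disk is full writing '/tmp/#sql_2741_0.MAD' (Errcode: 28",
    "150616 20:13:41 [Warning] mysqld: Retry in 60 secs. Message reprinted in 600 secs",
    "150616 20:14:41 [Warning] mysqld: Disk is full writing '/tmp/#sql_2741_0.MAD' (Errcode: 28",
    "150616 20:15:41 [Warning] mysqld: Disk is full writing '/tmp/#sql_2741_0.MAD' (Errcode: 28",
    "150616 20:16:26 [Warning] mysqld: Disk is full writing '/tmp/#sql_2741_1.MAD' (Errcode: 28",
]

summary = disk_full_files(sample)
for path, (first, last, count) in summary.items():
    print(path, first, last, count)
```

      On the sample this reports two temporary files, '#sql_2741_0.MAD' and '#sql_2741_1.MAD', which matches the two file names visible in the warnings.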

      We then killed the long-running query:

      MariaDB [PRODUCTION]> kill 7552649;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [PRODUCTION]> show processlist;
      +---------+-----------------+-------------------+------------+---------+---------+------------------------------+------------------------------------------------------------------------------------------------------+----------+
      | Id      | User            | Host              | db         | Command | Time    | State                        | Info                                                                                                 | Progress |
      +---------+-----------------+-------------------+------------+---------+---------+------------------------------+------------------------------------------------------------------------------------------------------+----------+
      |       1 | system user     |                   | NULL       | Sleep   | 3698081 | wsrep aborter idle           | NULL                                                                                                 |    0.000 |
      |       2 | system user     |                   | NULL       | Sleep   |       0 | closing tables               | NULL                                                                                                 |    0.000 |
      |       4 | event_scheduler | localhost         | NULL       | Daemon  | 3698078 | Waiting on empty queue       | NULL                                                                                                 |    0.000 |
      | 7552649 | ttahon          | 10.10.0.55:49241  | PRODUCTION | Killed  |   56220 | Copying to tmp table on disk | SELECT
                 PROD_PRODUITS.LIBELLE
       
                AS libelle,
             PROD_COMMANDES.D |    0.000 |
      | 7562009 | user4qlickview  | 10.10.0.55:42983  | PRODUCTION | Query   |   50335 | converting HEAP to Aria      | SELECT distinct
              ID_PROD_ITEM as 'ID PROD ITEM',
              PAGES

      A few minutes later the server crashed:

      150617 10:15:42 [ERROR] mysqld got signal 11 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see http://kb.askmonty.org/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.
       
      Server version: 10.0.14-MariaDB-1~precise-wsrep-log
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=15
      max_threads=502
      thread_count=10
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 5346411 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0x7f0adfa93008
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f11c1557da0 thread_stack 0x48000
      /usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb9025b]
      /usr/sbin/mysqld(handle_fatal_signal+0x398)[0x741088]
      /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7f11c12dfcb0]
       
      Trying to get some variables.
       
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f0ac2c8b020): is an invalid pointer
      Connection ID (thread ID): 7552649
      Status: KILL_CONNECTION
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on
       
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      150617 10:15:43 mysqld_safe Number of processes running now: 0
      150617 10:15:43 mysqld_safe WSREP: not restarting wsrep node automatically
      150617 10:15:43 mysqld_safe mysqld from pid file /data/mysql/data/ifactory-sart-db-node-03.pid ended
      150617 10:20:54 mysqld_safe Starting mysqld daemon with databases from /data/mysql/data
      150617 10:20:54 mysqld_safe WSREP: Running position recovery with --log_error='/data/mysql/data/wsrep_recovery.l0XA1P' --pid-file='/data/mysql/data/ifactory-sart-db-node-03-recover.pid'
      150617 10:21:14 mysqld_safe WSREP: Recovered position 10f20163-668b-11e4-bfda-7ef6da2a3a6b:408209687
      150617 10:21:14 [Note] WSREP: wsrep_start_position var submitted: '10f20163-668b-11e4-bfda-7ef6da2a3a6b:408209687'
      150617 10:21:14 [Note] WSREP: Read nil XID from storage engines, skipping position init
      150617 10:21:14 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/galera/libgalera_smm.so'
      150617 10:21:14 [Note] WSREP: wsrep_load(): Galera 25.3.5-wheezy(rXXXX) by Codership Oy <info@codership.com> loaded successfully.
      150617 10:21:14 [Note] WSREP: CRC-32C: using hardware acceleration.
      150617 10:21:14 [Note] WSREP: Found saved state: 10f20163-668b-11e4-bfda-7ef6da2a3a6b:-1
      150617 10:21:14 [Note] WSREP: Passing config to GCS: base_host = 10.10.19.3; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /data/mysql/data/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /data/mysql/data//galera.cache; gcache.page_size = 128M; gcache.size = 10G; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; proton
      150617 10:21:14 [Note] WSREP: Service thread queue flushed.
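
      The crash report names the connection that was running when signal 11 arrived, so it can be checked against the id passed to KILL. A minimal sketch follows, assuming the "Connection ID (thread ID)" and "Status" lines look exactly as they do in the report above; the helper name crashed_connection is hypothetical.

```python
import re


def crashed_connection(report):
    """Return (connection_id, status) from a mysqld crash report, or None if absent."""
    conn = re.search(r"^\s*Connection ID \(thread ID\): (\d+)\s*$", report, re.M)
    status = re.search(r"^\s*Status: (\S+)\s*$", report, re.M)
    if conn is None or status is None:
        return None
    return int(conn.group(1)), status.group(1)


# Lines copied from the crash report above.
report = """\
Query (0x7f0ac2c8b020): is an invalid pointer
Connection ID (thread ID): 7552649
Status: KILL_CONNECTION
"""

killed_id = 7552649  # the id given to KILL in the session shown earlier
result = crashed_connection(report)
print(result)
print(result is not None and result[0] == killed_id)
```

      Here the crashing connection is the same id that was killed, and its status is KILL_CONNECTION, which is consistent with the crash happening while the killed query was being torn down.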
       

            People

              monty Michael Widenius
              aftab.khan aftab khan