Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-18455

Crash On Table structure Change on very large table - With filesystem not having enough space, rendering the table (and server) unusable

Details

    Description

      While trying to ALTER a very large table (23M records), The server , and all galera linked servers crash. The peer server can never recover and and I have to do a manual rsync to get them to restart.

      2019-02-02 22:52:09 0x7f4960194700  InnoDB: Assertion failure in file /home/buildbot/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.3.12/storage/innobase/btr/btr0pcur.cc line 461
      InnoDB: Failing assertion: btr_page_get_prev(next_page, mtr) == btr_pcur_get_block(cursor)->page.id.page_no()
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
      InnoDB: about forcing recovery.
      190202 22:52:09 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see https://mariadb.com/kb/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed,
      something is definitely wrong and this may fail.
       
      Server version: 10.3.12-MariaDB-log
      key_buffer_size=134217728
      read_buffer_size=131072
      max_used_connections=2
      max_threads=153
      thread_count=10
      It is possible that mysqld could use up to
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 467389 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x7f491c000ac8
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x7f4960193d30 thread_stack 0x49000
      /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x55d09923dd1e]
      /usr/sbin/mysqld(handle_fatal_signal+0x357)[0x55d098cded97]
      sigaction.c:0(__restore_rt)[0x7f4975ac55d0]
      :0(__GI_raise)[0x7f4973d99207]
      :0(__GI_abort)[0x7f4973d9a8f8]
      ut/ut0rbt.cc:218(rbt_tree_add_child(ib_rbt_t const*, ib_rbt_bound_t*, ib_rbt_node_t*) [clone .isra.4] [clone .part.5])[0x55d098a29946]
      btr/btr0pcur.cc:459(btr_pcur_move_to_next_page(btr_pcur_t*, mtr_t*))[0x55d09902b152]
      include/btr0pcur.ic:340(btr_pcur_move_to_next(btr_pcur_t*, mtr_t*))[0x55d098a27c6b]
      row/row0sel.cc:5589(row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long))[0x55d098f94201]
      handler/ha_innodb.cc:9537(ha_innobase::general_fetch(unsigned char*, unsigned int, unsigned int))[0x55d098ebced8]
      sql/handler.cc:2765(handler::ha_rnd_next(unsigned char*))[0x55d098ce3bc7]
      sql/records.cc:481(rr_sequential(READ_RECORD*))[0x55d098e0785c]
      sql/records.h:73(READ_RECORD::read_record())[0x55d098a1cdd0]
      sql/sql_table.cc:9881(mysql_alter_table(THD*, st_mysql_const_lex_string const*, st_mysql_const_lex_string const*, HA_CREATE_INFO*, TABLE_LIST*, Alter_info*, unsigned int, st_order*, bool))[0x55d098b9aaac]
      sql/sql_alter.cc:497(Sql_cmd_alter_table::execute(THD*))[0x55d098be7db3]
      sql/sql_parse.cc:6285(mysql_execute_command(THD*))[0x55d098b09b05]
      sql/sql_parse.cc:8093(mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d098b1182b]
      sql/sql_parse.cc:7887(wsrep_mysql_parse(THD*, char*, unsigned int, Parser_state*, bool, bool))[0x55d098b121c1]
      sql/sql_parse.cc:1915(dispatch_command(enum_server_command, THD*, char*, unsigned int, bool, bool))[0x55d098b13a43]
      sql/sql_parse.cc:1398(do_command(THD*))[0x55d098b1532e]
      sql/sql_connect.cc:1402(do_handle_one_connection(CONNECT*))[0x55d098be512a]
      sql/sql_connect.cc:1310(handle_one_connection)[0x55d098be524d]
      pthread_create.c:0(start_thread)[0x7f4975abddd5]
      /lib64/libc.so.6(clone+0x6d)[0x7f4973e60ead]
       
      Trying to get some variables.
      Some pointers may be invalid and cause the dump to abort.
      Query (0x7f491c011730): ALTER TABLE contents ADD publisher_id               INT UNSIGNED, ADD cr_holder_id               INT UNSIGNED, ADD INDEX IDX_content_publishers(publisher_id), ADD INDEX IDX_content_copyright(cr_holder_id), ADD CONSTRAINT Refpublishers1691 FOREIGN KEY (publisher_id)     REFERENCES publishers(publisher_id), ADD CONSTRAINT Refcopyright_holders1701 FOREIGN KEY (cr_holder_id)     REFERENCES copyright_holders(cr_holder_id)
      Connection ID (thread ID): 11
      Status: NOT_KILLED
       
      Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_derived=on,split_materialized=on
       
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      2019-02-02 22:52:22 0 [Note] WSREP: Read nil XID from storage engines, skipping position init
      2019-02-02 22:52:22 0 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
      2019-02-02 22:52:22 0 [Note] WSREP: wsrep_load(): Galera 25.3.25(r3836) by Codership Oy <info@codership.com> loaded successfully.
      2019-02-02 22:52:22 0 [Note] WSREP: CRC-32C: using hardware acceleration.
      2019-02-02 22:52:22 0 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1, safe_to_bootstrap: 0
      2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: version: 2, UUID: 6e8ceebd-277d-11e9-9b83-13112382c60b, offset: -1
      2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer initial scan...  0.0% (        0/134217768 bytes) complete.
      2019-02-02 22:52:22 0 [Note] WSREP: Attempt to reuse the same seqno: 1. New ptr = 0x7f1087734ce8, seqno_g: 1, seqno_d: 0, size: 920, ctx: 0x56110d49a548, flags: 1. store: 1, cs: 00000000 00000000 00000000 00000000, previous ptr = 0x7f107fb99528, seqno_g: 1, seqno_d: 0, size: 936, ctx: 0x56110d49a548, flags: 1. store: 1, cs: 00000000 00000000 00000000 00000000
      2019-02-02 22:52:22 0 [Note] WSREP: Contents differ. Discarding both.
      2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer initial scan...100.0% (134217768/134217768 bytes) complete.
      2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: found gapless sequence 2-2
      2019-02-02 22:52:22 0 [Note] WSREP: Recovering GCache ring buffer: discarding seqnos 1-1
      2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...  0.0% (  0/936 bytes) complete.
      2019-02-02 22:52:22 0 [Note] WSREP: GCache::RingBuffer unused buffers scan...100.0% (936/936 bytes) complete.
      2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): found 0/1 locked buffers
      2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::recover(): used space: 936/134217728
      2019-02-02 22:52:22 0 [Note] WSREP: Passing config to GCS: base_dir = /var/lib/mysql/; base_host = 192.168.50.216; base_port = 4567; cert.log_conflicts = no; cert.optimistic_pa = yes; debug = no; evs.auto_evict = 0; evs.delay_margin = PT1S; evs.delayed_keep_period = PT30S; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 3; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.recover = yes; gcache.size = 128M; gcomm.thread_prio = ; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT
      2019-02-02 22:52:22 0 [Note] WSREP: GCache history reset: 6e8ceebd-277d-11e9-9b83-13112382c60b:2 -> 00000000-0000-0000-0000-000000000000:-1
      2019-02-02 22:52:22 0 [Note] WSREP: GCache DEBUG: RingBuffer::seqno_reset(): full reset
      2019-02-02 22:52:22 0 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
      2019-02-02 22:52:22 0 [Note] WSREP: wsrep_sst_grab()
      2019-02-02 22:52:22 0 [Note] WSREP: Start replication
      2019-02-02 22:52:22 0 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
      2019-02-02 22:52:22 0 [Note] WSREP: protonet asio version 0
      2019-02-02 22:52:22 0 [Note] WSREP: Using CRC-32C for message checksums.
      2019-02-02 22:52:22 0 [Note] WSREP: backend: asio
      2019-02-02 22:52:22 0 [Note] WSREP: gcomm thread scheduling priority set to other:0
      2019-02-02 22:52:22 0 [Note] WSREP: restore pc from disk successfully
      2019-02-02 22:52:22 0 [Note] WSREP: GMCast version 0
      2019-02-02 22:52:22 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
      2019-02-02 22:52:22 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
      2019-02-02 22:52:22 0 [Note] WSREP: EVS version 0
      2019-02-02 22:52:22 0 [Note] WSREP: gcomm: connecting to group '9jr DB Cluster', peer '192.168.50.30:,192.168.50.65:,192.168.50.213:,192.168.50.214:,192.168.50.225:,192.168.50.55:'
      2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.213:4567 timed out, no messages seen in PT3S
      2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.225:4567 timed out, no messages seen in PT3S
      2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.55:4567 timed out, no messages seen in PT3S
      2019-02-02 22:52:25 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.65:4567 timed out, no messages seen in PT3S
      2019-02-02 22:52:25 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
      2019-02-02 22:52:25 0 [Note] WSREP: view(view_id(NON_PRIM,f766f12e,4) memb {
              f766f12e,0
      } joined {
      } left {
      } partitioned {
      })
      2019-02-02 22:52:25 0 [Note] WSREP: gcomm: connected
      2019-02-02 22:52:25 0 [Note] WSREP: Changing maximum packet size to 64500, resulting msg size: 32636
      2019-02-02 22:52:25 0 [Note] WSREP: Shifting CLOSED -> OPEN (TO: 0)
      2019-02-02 22:52:25 0 [Note] WSREP: Opened channel '9jr DB Cluster'
      2019-02-02 22:52:25 0 [Note] WSREP: Waiting for SST to complete.
      2019-02-02 22:52:25 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
      2019-02-02 22:52:25 0 [Note] WSREP: Flow-control interval: [16, 16]
      2019-02-02 22:52:25 0 [Note] WSREP: Trying to continue unpaused monitor
      2019-02-02 22:52:25 0 [Note] WSREP: Received NON-PRIMARY.
      2019-02-02 22:52:25 2 [Note] WSREP: New cluster view: global state: :-1, view# -1: non-Primary, number of nodes: 1, my index: 0, protocol version -1
      2019-02-02 22:52:25 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2019-02-02 22:52:26 0 [Warning] WSREP: last inactive check more than PT1.5S ago (PT3.50278S), skipping check
      2019-02-02 22:52:29 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection to peer 00000000 with addr tcp://192.168.50.213:4567 timed out, no messages seen in PT3S
      2019-02-02 22:52:32 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') connection established to 119c0d79 tcp://192.168.50.30:4567
      2019-02-02 22:52:32 0 [Note] WSREP: (f766f12e, 'tcp://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
      2019-02-02 22:52:32 0 [Note] WSREP: declaring 119c0d79 at tcp://192.168.50.30:4567 stable
      2019-02-02 22:52:32 0 [Warning] WSREP: no nodes coming from prim view, prim not possible
      2019-02-02 22:52:32 0 [Note] WSREP: view(view_id(NON_PRIM,119c0d79,5) memb {
              119c0d79,0
              f766f12e,0
      } joined {
      } left {
      } partitioned {
      })
      2019-02-02 22:52:32 0 [Note] WSREP: promote to primary component
      2019-02-02 22:52:32 0 [Note] WSREP: view(view_id(PRIM,119c0d79,5) memb {
              119c0d79,0
              f766f12e,0
      } joined {
      } left {
      } partitioned {
      })
      2019-02-02 22:52:32 0 [Note] WSREP: save pc into disk
      2019-02-02 22:52:32 0 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 1, memb_num = 2
      2019-02-02 22:52:32 0 [Note] WSREP: Flow-control interval: [23, 23]
      2019-02-02 22:52:32 0 [Note] WSREP: Trying to continue unpaused monitor
      2019-02-02 22:52:32 0 [Note] WSREP: Received NON-PRIMARY.
      2019-02-02 22:52:32 2 [Note] WSREP: New cluster view: global state: :-1, view# -1: non-Primary, number of nodes: 2, my index: 1, protocol version -1
      2019-02-02 22:52:32 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notification.
      2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.213:4567
      2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.214:4567
      2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.225:4567
      2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr proto entry 0x7f1070001240
      2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.55:4567
      2019-02-02 22:52:32 0 [Note] WSREP: discarding pending addr without UUID: tcp://192.168.50.65:4567
      2019-02-02 22:52:32 0 [Note] WSREP: clear restored view
      2019-02-02 22:52:32 0 [Warning] WSREP: non weight changing install in S_PRIM: pcmsg{ type=INSTALL, seq=0, flags= 2, node_map {  119c0d79,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,119c0d79,2),to_seq=-1,weight=1,segment=0
              f766f12e,prim=0,un=0,last_seq=4294967295,last_prim=view_id(NON_PRIM,119c0d79,2),to_seq=-1,weight=1,segment=0
      }}
      2019-02-02 22:52:32 0 [Note] WSREP: New COMPONENT: primary = yes, bootstrap = yes, my_idx = 1, memb_num = 2
      2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: Waiting for state UUID.
      2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: sent state msg: 47e64989-2780-11e9-a986-ef5ad4566067
      2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: got state msg: 47e64989-2780-11e9-a986-ef5ad4566067 from 0 (kvm.amar.com)
      2019-02-02 22:52:32 0 [Note] WSREP: STATE EXCHANGE: got state msg: 47e64989-2780-11e9-a986-ef5ad4566067 from 1 (db.9jr.com)
      2019-02-02 22:52:32 0 [Warning] WSREP: Quorum: No node with complete state:
       
              Version      : 4
              Flags        : 0x5
              Protocols    : 0 / 9 / 3
              State        : NON-PRIMARY
              Desync count : 0
              Prim state   : NON-PRIMARY
              Prim UUID    : 00000000-0000-0000-0000-000000000000
              Prim  seqno  : -1
              First seqno  : -1
              Last  seqno  : -1
              Prim JOINED  : 0
              State UUID   : 47e64989-2780-11e9-a986-ef5ad4566067
              Group UUID   : 00000000-0000-0000-0000-000000000000
              Name         : 'kvm.amar.com'
              Incoming addr: '192.168.50.30:3306'
       
              Version      : 4
              Flags        : 0x4
              Protocols    : 0 / 9 / 3
              State        : NON-PRIMARY
              Desync count : 0
              Prim state   : NON-PRIMARY
              Prim UUID    : 00000000-0000-0000-0000-000000000000
              Prim  seqno  : -1
              First seqno  : -1
              Last  seqno  : -1
              Prim JOINED  : 0
              State UUID   : 47e64989-2780-11e9-a986-ef5ad4566067
              Group UUID   : 00000000-0000-0000-0000-000000000000
              Name         : 'db.9jr.com'
              Incoming addr: '192.168.50.216:3306'
       
      2019-02-02 22:52:32 0 [Warning] WSREP: No re-merged primary component found.
      2019-02-02 22:52:32 0 [Note] WSREP: Bootstrapped primary 00000000-0000-0000-0000-000000000000 found: 2.
      2019-02-02 22:52:32 0 [Note] WSREP: Quorum results:
              version    = 4,
              component  = PRIMARY,
              conf_id    = -1,
              members    = 2/2 (joined/total),
              act_id     = -1,
              last_appl. = -1,
              protocols  = 0/9/3 (gcs/repl/appl),
              group UUID = 00000000-0000-0000-0000-000000000000
      2019-02-02 22:52:32 0 [Note] WSREP: Flow-control interval: [23, 23]
      2019-02-02 22:52:32 0 [Note] WSREP: Trying to continue unpaused monitor
      2019-02-02 22:52:32 0 [Note] WSREP: Restored state OPEN -> JOINED (-1)
      2019-02-02 22:52:32 2 [Note] WSREP: New cluster view: global state: :-1, view# 0: Primary, number of nodes: 2, my index: 1, protocol version 3
      2019-02-02 22:52:32 0 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
      2019-02-02 22:52:32 0 [ERROR] Aborting
       
      Error in my_thread_global_end(): 1 threads didn't exit
      

      Attachments

        Issue Links

          Activity

            ADD FOREIGN KEY would normally use ALGORITHM=COPY. Only with SET foreign_key_checks=OFF it could be an instantaneous operation.

            The rbt_tree_add_child at the top of the stack trace is definitely wrong, but the rest of the stack trace looks plausible. I see that Galera is enabled; I wonder if that is playing a role for the error handling.

            I believe that we should try to repeat this with ALGORITHM=COPY on a nearly full file system.

            marko Marko Mäkelä added a comment - ADD FOREIGN KEY would normally use ALGORITHM=COPY . Only with SET foreign_key_checks=OFF it could be an instantaneous operation. The rbt_tree_add_child at the top of the stack trace is definitely wrong, but the rest of the stack trace looks plausible. I see that Galera is enabled; I wonder if that is playing a role for the error handling. I believe that we should try to repeat this with ALGORITHM=COPY on a nearly full file system.
            laxamar Jacques Amar added a comment -

            " I see that Galera is enabled;"

            Interestingly (or not!) both replicated galera servers had the same exact issue with this particular table (they were corrupted). The servers didn't crash, but the table was unusable. I tried an rsync over to my crashed server and it did start, but the table never recovered. Had to resort to a full wipe and restore from backup ... I'm guessing the InnoDB table was damaged internally.

            laxamar Jacques Amar added a comment - " I see that Galera is enabled;" Interestingly (or not!) both replicated galera servers had the same exact issue with this particular table (they were corrupted). The servers didn't crash, but the table was unusable. I tried an rsync over to my crashed server and it did start, but the table never recovered. Had to resort to a full wipe and restore from backup ... I'm guessing the InnoDB table was damaged internally.

            Got identical crash today (Oracle Linux 7.7, MariaDB 10.4.12), only that I have plently of space and wasnt altering any metadata. Furthermore, this corruption has propagated to all of the galera members except one, triggering cluster crash (except one node). (Should I report this as a separate bug ? Assertion looks identically).

            Apr 27 23:24:33 sso-vrn-db02 mysqld: 2020-04-27 23:24:33 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notif
            ication.
            Apr 27 23:24:34 sso-vrn-db02 mysqld: 2020-04-27 23:24:34 0 [Note] WSREP: (e85c1d07, 'tcp://0.0.0.0:4567') turning messag
            e relay requesting off
            Apr 27 23:27:08 sso-vrn-db02 mysqld: 2020-04-27 23:27:08 995 [Warning] Aborted connection 995 to db: 'unconnected' user:
             'monitor' host: '188.186.154.181' (Got timeout reading communication packets)
            Apr 28 00:40:20 sso-vrn-db02 mysqld: 2020-04-28 00:40:20 0x7f5ec8153700  InnoDB: Assertion failure in file /home/buildbo
            t/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.4.12/storage/innobase/btr/btr0pcur.cc line 495
            Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: Failing assertion: btr_page_get_prev(next_page) == btr_pcur_get_block(curso
            r)->page.id.page_no()
            Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: We intentionally generate a memory trap.
            Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: Submit a detailed bug report to https://jira.mariadb.org/
            Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: If you get repeated assertion failures or crashes, even
            Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: immediately after the mysqld startup, there may be
            Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: corruption in the InnoDB tablespace. Please refer to
            Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/
            Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: about forcing recovery.
            Apr 28 00:40:20 sso-vrn-db02 mysqld: 200428  0:40:20 [ERROR] mysqld got signal 6 ;
            Apr 28 00:40:20 sso-vrn-db02 mysqld: This could be because you hit a bug. It is also possible that this binary
            Apr 28 00:40:20 sso-vrn-db02 mysqld: or one of the libraries it was linked against is corrupt, improperly built,
            Apr 28 00:40:20 sso-vrn-db02 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware.
            Apr 28 00:40:20 sso-vrn-db02 mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs
            Apr 28 00:40:20 sso-vrn-db02 mysqld: We will try our best to scrape up some info that will hopefully help
            Apr 28 00:40:20 sso-vrn-db02 mysqld: diagnose the problem, but since we have already crashed,
            Apr 28 00:40:20 sso-vrn-db02 mysqld: something is definitely wrong and this may fail.
            Apr 28 00:40:20 sso-vrn-db02 mysqld: Server version: 10.4.12-MariaDB
            Apr 28 00:40:20 sso-vrn-db02 mysqld: key_buffer_size=16777216
            Apr 28 00:40:20 sso-vrn-db02 mysqld: read_buffer_size=8388608
            Apr 28 00:40:20 sso-vrn-db02 mysqld: max_used_connections=143
            Apr 28 00:40:20 sso-vrn-db02 mysqld: max_threads=1026
            Apr 28 00:40:20 sso-vrn-db02 mysqld: thread_count=151
            Apr 28 00:40:20 sso-vrn-db02 mysqld: It is possible that mysqld could use up to
            Apr 28 00:40:20 sso-vrn-db02 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8684712 K  by
            tes of memory
            Apr 28 00:40:20 sso-vrn-db02 mysqld: Hope that's ok; if not, decrease some variables in the equation.
            Apr 28 00:40:20 sso-vrn-db02 mysqld: Thread pointer: 0x7f5eb4111188
            Apr 28 00:40:20 sso-vrn-db02 mysqld: Attempting backtrace. You can use the following information to find out
            Apr 28 00:40:20 sso-vrn-db02 mysqld: where mysqld died. If you see no messages after this, something went
            Apr 28 00:40:20 sso-vrn-db02 mysqld: terribly wrong...
            Apr 28 00:40:20 sso-vrn-db02 mysqld: stack_bottom = 0x7f5ec8152cf0 thread_stack 0x49000
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x56469e7fda1e]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x56469e291d8f]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /lib64/libpthread.so.0(+0xf630)[0x7f626e72e630]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /lib64/libc.so.6(gsignal+0x37)[0x7f626c9fe377]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /lib64/libc.so.6(abort+0x148)[0x7f626c9ffa68]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0x5b18be)[0x56469df858be]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0xbdbc9a)[0x56469e5afc9a]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0xb4ecbc)[0x56469e522cbc]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0xa6cd80)[0x56469e440d80]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x10f)[0x56469e29688f]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x1c)[0x56469e3c584c]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x24f)[0x56469e0b843f]
            Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0xb9c)[0x56469e0da7fc]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_ZN4JOIN4execEv+0x33)[0x56469e0daa53]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_jP8st_orderS9_
            S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x186)[0x56469e0d8d56]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x1d7)[0x56469e0d98c7
            ]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0x59f3d4)[0x56469df733d4]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x467d)[0x56469e07f4fd]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x24b)[0x56469e0844eb]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0x6b0d99)[0x56469e084d99]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x2b06)[0x5646
            9e088296]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x11c)[0x56469e088a4c]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x1fa)[0x56469e16732a]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(handle_one_connection+0x3d)[0x56469e16740d]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /lib64/libpthread.so.0(+0x7ea5)[0x7f626e726ea5]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: /lib64/libc.so.6(clone+0x6d)[0x7f626cac68cd]
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Trying to get some variables.
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Some pointers may be invalid and cause the dump to abort.
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Query (0x7f5eb4217eb0): select autolockno0_.id as id1_7_, autolockno0_.sended_at as
             sended_a2_7_, autolockno0_.status as status3_7_, autolockno0_.type as type4_7_, autolockno0_.user_id as user_id5_7_ fro
            m AUTO_LOCK_NOTIFICATION autolockno0_ where autolockno0_.status='PREPARE' for update
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Connection ID (thread ID): 1211
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Status: NOT_KILLED
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,ind
            ex_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,der
            ived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_mat
            ch_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_
            with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buff
            er_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_deriv
            ed=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on
            Apr 28 00:40:21 sso-vrn-db02 mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
            Apr 28 00:40:21 sso-vrn-db02 mysqld: information that should help you find out what is causing the crash.
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Writing a core file...
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Working directory at /var/lib/mysql
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Resource Limits:
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Limit                     Soft Limit           Hard Limit           Units
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max cpu time              unlimited            unlimited            seconds
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max file size             unlimited            unlimited            bytes
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max data size             unlimited            unlimited            bytes
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max stack size            8388608              unlimited            bytes
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max core file size        0                    unlimited            bytes
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max resident set          unlimited            unlimited            bytes
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max processes             63374                63374                processes
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max open files            16364                16364                files
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max locked memory         65536                65536                bytes
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max address space         unlimited            unlimited            bytes
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max file locks            unlimited            unlimited            locks
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max pending signals       63374                63374                signals
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max msgqueue size         819200               819200               bytes
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max nice priority         0                    0
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max realtime priority     0                    0
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Max realtime timeout      unlimited            unlimited            us
            Apr 28 00:40:21 sso-vrn-db02 mysqld: Core pattern: core
            

            emz Eugene Z. Zheganin added a comment - Got identical crash today (Oracle Linux 7.7, MariaDB 10.4.12), only that I have plently of space and wasnt altering any metadata. Furthermore, this corruption has propagated to all of the galera members except one, triggering cluster crash (except one node). (Should I report this as a separate bug ? Assertion looks identically). Apr 27 23:24:33 sso-vrn-db02 mysqld: 2020-04-27 23:24:33 2 [Note] WSREP: wsrep_notify_cmd is not defined, skipping notif ication. Apr 27 23:24:34 sso-vrn-db02 mysqld: 2020-04-27 23:24:34 0 [Note] WSREP: (e85c1d07, 'tcp://0.0.0.0:4567') turning messag e relay requesting off Apr 27 23:27:08 sso-vrn-db02 mysqld: 2020-04-27 23:27:08 995 [Warning] Aborted connection 995 to db: 'unconnected' user: 'monitor' host: '188.186.154.181' (Got timeout reading communication packets) Apr 28 00:40:20 sso-vrn-db02 mysqld: 2020-04-28 00:40:20 0x7f5ec8153700 InnoDB: Assertion failure in file /home/buildbo t/buildbot/padding_for_CPACK_RPM_BUILD_SOURCE_DIRS_PREFIX/mariadb-10.4.12/storage/innobase/btr/btr0pcur.cc line 495 Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: Failing assertion: btr_page_get_prev(next_page) == btr_pcur_get_block(curso r)->page.id.page_no() Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: We intentionally generate a memory trap. Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: Submit a detailed bug report to https://jira.mariadb.org/ Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: If you get repeated assertion failures or crashes, even Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: immediately after the mysqld startup, there may be Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: corruption in the InnoDB tablespace. Please refer to Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: https://mariadb.com/kb/en/library/innodb-recovery-modes/ Apr 28 00:40:20 sso-vrn-db02 mysqld: InnoDB: about forcing recovery. Apr 28 00:40:20 sso-vrn-db02 mysqld: 200428 0:40:20 [ERROR] mysqld got signal 6 ; Apr 28 00:40:20 sso-vrn-db02 mysqld: This could be because you hit a bug. It is also possible that this binary Apr 28 00:40:20 sso-vrn-db02 mysqld: or one of the libraries it was linked against is corrupt, improperly built, Apr 28 00:40:20 sso-vrn-db02 mysqld: or misconfigured. This error can also be caused by malfunctioning hardware. Apr 28 00:40:20 sso-vrn-db02 mysqld: To report this bug, see https://mariadb.com/kb/en/reporting-bugs Apr 28 00:40:20 sso-vrn-db02 mysqld: We will try our best to scrape up some info that will hopefully help Apr 28 00:40:20 sso-vrn-db02 mysqld: diagnose the problem, but since we have already crashed, Apr 28 00:40:20 sso-vrn-db02 mysqld: something is definitely wrong and this may fail. Apr 28 00:40:20 sso-vrn-db02 mysqld: Server version: 10.4.12-MariaDB Apr 28 00:40:20 sso-vrn-db02 mysqld: key_buffer_size=16777216 Apr 28 00:40:20 sso-vrn-db02 mysqld: read_buffer_size=8388608 Apr 28 00:40:20 sso-vrn-db02 mysqld: max_used_connections=143 Apr 28 00:40:20 sso-vrn-db02 mysqld: max_threads=1026 Apr 28 00:40:20 sso-vrn-db02 mysqld: thread_count=151 Apr 28 00:40:20 sso-vrn-db02 mysqld: It is possible that mysqld could use up to Apr 28 00:40:20 sso-vrn-db02 mysqld: key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8684712 K by tes of memory Apr 28 00:40:20 sso-vrn-db02 mysqld: Hope that's ok; if not, decrease some variables in the equation. Apr 28 00:40:20 sso-vrn-db02 mysqld: Thread pointer: 0x7f5eb4111188 Apr 28 00:40:20 sso-vrn-db02 mysqld: Attempting backtrace. You can use the following information to find out Apr 28 00:40:20 sso-vrn-db02 mysqld: where mysqld died. If you see no messages after this, something went Apr 28 00:40:20 sso-vrn-db02 mysqld: terribly wrong... Apr 28 00:40:20 sso-vrn-db02 mysqld: stack_bottom = 0x7f5ec8152cf0 thread_stack 0x49000 Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x56469e7fda1e] Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x30f)[0x56469e291d8f] Apr 28 00:40:20 sso-vrn-db02 mysqld: /lib64/libpthread.so.0(+0xf630)[0x7f626e72e630] Apr 28 00:40:20 sso-vrn-db02 mysqld: /lib64/libc.so.6(gsignal+0x37)[0x7f626c9fe377] Apr 28 00:40:20 sso-vrn-db02 mysqld: /lib64/libc.so.6(abort+0x148)[0x7f626c9ffa68] Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0x5b18be)[0x56469df858be] Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0xbdbc9a)[0x56469e5afc9a] Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0xb4ecbc)[0x56469e522cbc] Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0xa6cd80)[0x56469e440d80] Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_ZN7handler11ha_rnd_nextEPh+0x10f)[0x56469e29688f] Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z13rr_sequentialP11READ_RECORD+0x1c)[0x56469e3c584c] Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x24f)[0x56469e0b843f] Apr 28 00:40:20 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_ZN4JOIN10exec_innerEv+0xb9c)[0x56469e0da7fc] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_ZN4JOIN4execEv+0x33)[0x56469e0daa53] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z12mysql_selectP3THDP10TABLE_LISTjR4ListI4ItemEPS4_jP8st_orderS9_ S7_S9_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x186)[0x56469e0d8d56] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x1d7)[0x56469e0d98c7 ] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0x59f3d4)[0x56469df733d4] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x467d)[0x56469e07f4fd] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_statebb+0x24b)[0x56469e0844eb] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(+0x6b0d99)[0x56469e084d99] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcjbb+0x2b06)[0x5646 9e088296] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x11c)[0x56469e088a4c] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP7CONNECT+0x1fa)[0x56469e16732a] Apr 28 00:40:21 sso-vrn-db02 mysqld: /usr/sbin/mysqld(handle_one_connection+0x3d)[0x56469e16740d] Apr 28 00:40:21 sso-vrn-db02 mysqld: /lib64/libpthread.so.0(+0x7ea5)[0x7f626e726ea5] Apr 28 00:40:21 sso-vrn-db02 mysqld: /lib64/libc.so.6(clone+0x6d)[0x7f626cac68cd] Apr 28 00:40:21 sso-vrn-db02 mysqld: Trying to get some variables. Apr 28 00:40:21 sso-vrn-db02 mysqld: Some pointers may be invalid and cause the dump to abort. Apr 28 00:40:21 sso-vrn-db02 mysqld: Query (0x7f5eb4217eb0): select autolockno0_.id as id1_7_, autolockno0_.sended_at as sended_a2_7_, autolockno0_.status as status3_7_, autolockno0_.type as type4_7_, autolockno0_.user_id as user_id5_7_ fro m AUTO_LOCK_NOTIFICATION autolockno0_ where autolockno0_.status='PREPARE' for update Apr 28 00:40:21 sso-vrn-db02 mysqld: Connection ID (thread ID): 1211 Apr 28 00:40:21 sso-vrn-db02 mysqld: Status: NOT_KILLED Apr 28 00:40:21 sso-vrn-db02 mysqld: Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,ind ex_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,der ived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_mat ch_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_ with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buff er_size=on,table_elimination=on,extended_keys=on,exists_to_in=on,orderby_uses_equalities=on,condition_pushdown_for_deriv ed=on,split_materialized=on,condition_pushdown_for_subquery=on,rowid_filter=on,condition_pushdown_from_having=on Apr 28 00:40:21 sso-vrn-db02 mysqld: The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains Apr 28 00:40:21 sso-vrn-db02 mysqld: information that should help you find out what is causing the crash. Apr 28 00:40:21 sso-vrn-db02 mysqld: Writing a core file... Apr 28 00:40:21 sso-vrn-db02 mysqld: Working directory at /var/lib/mysql Apr 28 00:40:21 sso-vrn-db02 mysqld: Resource Limits: Apr 28 00:40:21 sso-vrn-db02 mysqld: Limit Soft Limit Hard Limit Units Apr 28 00:40:21 sso-vrn-db02 mysqld: Max cpu time unlimited unlimited seconds Apr 28 00:40:21 sso-vrn-db02 mysqld: Max file size unlimited unlimited bytes Apr 28 00:40:21 sso-vrn-db02 mysqld: Max data size unlimited unlimited bytes Apr 28 00:40:21 sso-vrn-db02 mysqld: Max stack size 8388608 unlimited bytes Apr 28 00:40:21 sso-vrn-db02 mysqld: Max core file size 0 unlimited bytes Apr 28 00:40:21 sso-vrn-db02 mysqld: Max resident set unlimited unlimited bytes Apr 28 00:40:21 sso-vrn-db02 mysqld: Max processes 63374 63374 processes Apr 28 00:40:21 sso-vrn-db02 mysqld: Max open files 16364 16364 files Apr 28 00:40:21 sso-vrn-db02 mysqld: Max locked memory 65536 65536 bytes Apr 28 00:40:21 sso-vrn-db02 mysqld: Max address space unlimited unlimited bytes Apr 28 00:40:21 sso-vrn-db02 mysqld: Max file locks unlimited unlimited locks Apr 28 00:40:21 sso-vrn-db02 mysqld: Max pending signals 63374 63374 signals Apr 28 00:40:21 sso-vrn-db02 mysqld: Max msgqueue size 819200 819200 bytes Apr 28 00:40:21 sso-vrn-db02 mysqld: Max nice priority 0 0 Apr 28 00:40:21 sso-vrn-db02 mysqld: Max realtime priority 0 0 Apr 28 00:40:21 sso-vrn-db02 mysqld: Max realtime timeout unlimited unlimited us Apr 28 00:40:21 sso-vrn-db02 mysqld: Core pattern: core
            marko Marko Mäkelä added a comment - - edited

            emz, an ALTER TABLE with ALGORITHM=COPY is similar to a INSERT…SELECT operation that copies an entire table. What you are experiencing might share the same root cause.

            For the out-of-space handling, we have an open bug MDEV-13680, but I do not think that we are hitting it here.

            This corruption is occurring in the B-trees, which are below the transaction layer, which Galera is interfacing with. I would think that Galera is an unlikely culprit.

            I do not remember our support customers encountering this problem. I am happy to fix this, provided that someone creates a reasonably repeatable test case.

            I have a couple of culprit candidates: InnoDB adaptive hash index and buffer pool resizing. We disabled the adaptive hash index by default in 10.5 (MDEV-20487). We have found several race conditions related to buffer pool resizing, one of them being MDEV-23693. Could these have been involved?

            marko Marko Mäkelä added a comment - - edited emz , an ALTER TABLE with ALGORITHM=COPY is similar to a INSERT…SELECT operation that copies an entire table. What you are experiencing might share the same root cause. For the out-of-space handling, we have an open bug MDEV-13680 , but I do not think that we are hitting it here. This corruption is occurring in the B-trees, which are below the transaction layer, which Galera is interfacing with. I would think that Galera is an unlikely culprit. I do not remember our support customers encountering this problem. I am happy to fix this, provided that someone creates a reasonably repeatable test case. I have a couple of culprit candidates: InnoDB adaptive hash index and buffer pool resizing. We disabled the adaptive hash index by default in 10.5 ( MDEV-20487 ). We have found several race conditions related to buffer pool resizing, one of them being MDEV-23693 . Could these have been involved?

            I can able to repeat the user given stack trace in 10.4.12 CS.

            #0  rec_get_offsets_func (rec=rec@entry=0x0, index=index@entry=0x5d0da2cba0e0, offsets=offsets@entry=0x7c28904861e0, leaf=leaf@entry=true, 
                n_fields=n_fields@entry=1, heap=heap@entry=0x7c28904861a8) at /home/thiru/source_file/server/10.4/storage/innobase/rem/rem0rec.cc:822
            #1  0x00005d0d9d83328b in btr_search_update_hash_on_insert (cursor=cursor@entry=0x7c2890486570, ahi_latch=0x5d0da280e070)
                at /home/thiru/source_file/server/10.4/storage/innobase/btr/btr0sea.cc:166
            #2  0x00005d0d9d821af5 in btr_cur_pessimistic_insert (flags=flags@entry=3, cursor=cursor@entry=0x7c2890486570, 
                offsets=offsets@entry=0x7c2890486560, heap=heap@entry=0x7c2890486558, entry=entry@entry=0x5d0da2c81a90, rec=rec@entry=0x7c2890486568, 
                big_rec=0x7c2890486550, n_ext=<optimized out>, thr=0x5d0da2cca6e0, mtr=0x7c2890486cb0)
                at /home/thiru/source_file/server/10.4/storage/innobase/btr/btr0cur.cc:3841
            #3  0x00005d0d9d79b81d in row_ins_clust_index_entry_low (flags=flags@entry=3, mode=<optimized out>, mode@entry=33, 
                index=index@entry=0x5d0da2cba0e0, n_uniq=n_uniq@entry=0, entry=entry@entry=0x5d0da2c81a90, n_ext=n_ext@entry=0, thr=0x5d0da2cca6e0)
                at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:2780
            #4  0x00005d0d9d79faf7 in row_ins_clust_index_entry (index=0x5d0da2cba0e0, entry=0x5d0da2c81a90, thr=thr@entry=0x5d0da2cca6e0, 
                n_ext=n_ext@entry=0) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3251
            #5  0x00005d0d9d7a1866 in row_ins_index_entry (thr=<optimized out>, entry=<optimized out>, index=<optimized out>)
                at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3359
            #6  row_ins_index_entry_step (thr=<optimized out>, node=<optimized out>)
                at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3528
            #7  row_ins (thr=<optimized out>, node=<optimized out>) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3664
            #8  row_ins_step (thr=thr@entry=0x5d0da2cca6e0) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3807
            #9  0x00005d0d9d7af2e9 in row_insert_for_mysql (mysql_rec=<optimized out>, prebuilt=0x5d0da2cc9e40, ins_mode=ROW_INS_NORMAL)
                at /home/thiru/source_file/server/10.4/storage/innobase/row/row0mysql.cc:1463
            #10 0x00005d0d9d70b34b in ha_innobase::write_row (this=0x5d0da2cb8230, record=0x5d0da2cb7e48 "\371д\001")
                at /home/thiru/source_file/server/10.4/storage/innobase/handler/ha_innodb.cc:8035
            #11 0x00005d0d9d573b9f in handler::ha_write_row (this=0x5d0da2cb8230, buf=0x5d0da2cb7e48 "\371д\001")
                at /home/thiru/source_file/server/10.4/sql/handler.cc:6675
            #12 0x00005d0d9d400c75 in copy_data_between_tables (thd=thd@entry=0x5d0da2c64758, from=from@entry=0x5d0da2c7dbe8, 
                to=to@entry=0x5d0da2cc8ff8, create=..., ignore=ignore@entry=false, order_num=order_num@entry=0, order=0x0, copied=0x7c2890488ab8, 
                deleted=0x7c2890488ac0, keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x7c289048a3b0)
                at /home/thiru/source_file/server/10.4/sql/sql_table.cc:10807
            

            Error report was handled in 10.6 version by MDEV-13542.
            There was missing error handling function while allocating the page. But that is also fixed in MDEV-34118.
            so closing this bug as fixed.

            thiru Thirunarayanan Balathandayuthapani added a comment - - edited I can able to repeat the user given stack trace in 10.4.12 CS. #0 rec_get_offsets_func (rec=rec@entry=0x0, index=index@entry=0x5d0da2cba0e0, offsets=offsets@entry=0x7c28904861e0, leaf=leaf@entry=true, n_fields=n_fields@entry=1, heap=heap@entry=0x7c28904861a8) at /home/thiru/source_file/server/10.4/storage/innobase/rem/rem0rec.cc:822 #1 0x00005d0d9d83328b in btr_search_update_hash_on_insert (cursor=cursor@entry=0x7c2890486570, ahi_latch=0x5d0da280e070) at /home/thiru/source_file/server/10.4/storage/innobase/btr/btr0sea.cc:166 #2 0x00005d0d9d821af5 in btr_cur_pessimistic_insert (flags=flags@entry=3, cursor=cursor@entry=0x7c2890486570, offsets=offsets@entry=0x7c2890486560, heap=heap@entry=0x7c2890486558, entry=entry@entry=0x5d0da2c81a90, rec=rec@entry=0x7c2890486568, big_rec=0x7c2890486550, n_ext=<optimized out>, thr=0x5d0da2cca6e0, mtr=0x7c2890486cb0) at /home/thiru/source_file/server/10.4/storage/innobase/btr/btr0cur.cc:3841 #3 0x00005d0d9d79b81d in row_ins_clust_index_entry_low (flags=flags@entry=3, mode=<optimized out>, mode@entry=33, index=index@entry=0x5d0da2cba0e0, n_uniq=n_uniq@entry=0, entry=entry@entry=0x5d0da2c81a90, n_ext=n_ext@entry=0, thr=0x5d0da2cca6e0) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:2780 #4 0x00005d0d9d79faf7 in row_ins_clust_index_entry (index=0x5d0da2cba0e0, entry=0x5d0da2c81a90, thr=thr@entry=0x5d0da2cca6e0, n_ext=n_ext@entry=0) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3251 #5 0x00005d0d9d7a1866 in row_ins_index_entry (thr=<optimized out>, entry=<optimized out>, index=<optimized out>) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3359 #6 row_ins_index_entry_step (thr=<optimized out>, node=<optimized out>) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3528 #7 row_ins (thr=<optimized out>, node=<optimized out>) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3664 #8 row_ins_step (thr=thr@entry=0x5d0da2cca6e0) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0ins.cc:3807 #9 0x00005d0d9d7af2e9 in row_insert_for_mysql (mysql_rec=<optimized out>, prebuilt=0x5d0da2cc9e40, ins_mode=ROW_INS_NORMAL) at /home/thiru/source_file/server/10.4/storage/innobase/row/row0mysql.cc:1463 #10 0x00005d0d9d70b34b in ha_innobase::write_row (this=0x5d0da2cb8230, record=0x5d0da2cb7e48 "\371д\001") at /home/thiru/source_file/server/10.4/storage/innobase/handler/ha_innodb.cc:8035 #11 0x00005d0d9d573b9f in handler::ha_write_row (this=0x5d0da2cb8230, buf=0x5d0da2cb7e48 "\371д\001") at /home/thiru/source_file/server/10.4/sql/handler.cc:6675 #12 0x00005d0d9d400c75 in copy_data_between_tables (thd=thd@entry=0x5d0da2c64758, from=from@entry=0x5d0da2c7dbe8, to=to@entry=0x5d0da2cc8ff8, create=..., ignore=ignore@entry=false, order_num=order_num@entry=0, order=0x0, copied=0x7c2890488ab8, deleted=0x7c2890488ac0, keys_onoff=Alter_info::LEAVE_AS_IS, alter_ctx=0x7c289048a3b0) at /home/thiru/source_file/server/10.4/sql/sql_table.cc:10807 Error report was handled in 10.6 version by MDEV-13542 . There was missing error handling function while allocating the page. But that is also fixed in MDEV-34118 . so closing this bug as fixed.

            People

              thiru Thirunarayanan Balathandayuthapani
              laxamar Jacques Amar
              Votes:
              1 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.