[MDEV-17675] Crash whilst deleting row Created: 2018-11-12  Updated: 2019-01-28  Resolved: 2019-01-27

Status: Closed
Project: MariaDB Server
Component/s: Data Manipulation - Delete
Affects Version/s: 10.1.36
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Brendon Abbott Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: need_feedback
Environment:

Galera cluster, 3 nodes. Ubuntu 14.04.
AWS EC2. Each node in same region, but different availability zone.


Attachments: File mysql-syslog.txt.gz    
Issue Links:
Relates
relates to MDEV-17378 mysqld got exception 0xc0000005 Closed

 Description   

The server crashed during normal operation. All writes to the Galera cluster would have been directed at this one node.

The full stack dump is attached; this is the seemingly relevant part:

Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x80f510)[0x7f911133e510]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_ZN7handler13ha_delete_rowEPKh+0xff)[0x7f91110ffccf]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z12mysql_deleteP3THDP10TABLE_LISTP4ItemP10SQL_I_ListI8st_orderEyyP13select_result+0x112c)[0x7f91112206cc]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3bdb)[0x7f9110f693cb]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x321)[0x7f9110f6ee91]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x4407bd)[0x7f9110f6f7bd]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1f9a)[0x7f9110f7211a]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x177)[0x7f9110f73017]



 Comments   
Comment by Elena Stepanova [ 2018-11-12 ]

A slightly bigger relevant part:

10.1.36-MariaDB-1~trusty

Nov  9 13:42:52 ip-10-38-166-188 mysqld: 2018-11-09 13:42:52 7f26fa944b00  InnoDB: Assertion failure in thread 139805389507328 in file rem0rec.cc line 581
Nov  9 13:42:52 ip-10-38-166-188 mysqld: InnoDB: We intentionally generate a memory trap.
 
Nov  9 13:42:53 ip-10-38-166-188 mysqld: *** buffer overflow detected ***: /usr/sbin/mysqld terminated
Nov  9 13:42:53 ip-10-38-166-188 mysqld: ======= Backtrace: =========
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /lib/x86_64-linux-gnu/libc.so.6(+0x7329f)[0x7f910eabe29f]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f910eb5987c]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /lib/x86_64-linux-gnu/libc.so.6(+0x10d750)[0x7f910eb58750]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /lib/x86_64-linux-gnu/libc.so.6(+0x10e7c7)[0x7f910eb597c7]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(my_addr_resolve+0xd0)[0x7f91115e7310]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(my_print_stacktrace+0x1c2)[0x7f91115d35d2]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(handle_fatal_signal+0x305)[0x7f91110f4915]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f910f632330]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f910ea81c37]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f910ea85028]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x8a47db)[0x7f91113d37db]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x8a6def)[0x7f91113d5def]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x80d60e)[0x7f911133c60e]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x8b62b0)[0x7f91113e52b0]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x8b89c3)[0x7f91113e79c3]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x8e5758)[0x7f9111414758]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x8ea224)[0x7f9111419224]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x8ea8b7)[0x7f91114198b7]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x8c98e4)[0x7f91113f88e4]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x80f510)[0x7f911133e510]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_ZN7handler13ha_delete_rowEPKh+0xff)[0x7f91110ffccf]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z12mysql_deleteP3THDP10TABLE_LISTP4ItemP10SQL_I_ListI8st_orderEyyP13select_result+0x112c)[0x7f91112206cc]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x3bdb)[0x7f9110f693cb]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x321)[0x7f9110f6ee91]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x4407bd)[0x7f9110f6f7bd]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1f9a)[0x7f9110f7211a]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z10do_commandP3THD+0x177)[0x7f9110f73017]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(_Z24do_handle_one_connectionP3THD+0x18a)[0x7f911103fb0a]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(handle_one_connection+0x40)[0x7f911103fcb0]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /usr/sbin/mysqld(+0x754a4d)[0x7f9111283a4d]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8184)[0x7f910f62a184]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f910eb4903d]
Nov  9 13:42:53 ip-10-38-166-188 mysqld: ======= Memory map: ========
Nov  9 13:42:53 ip-10-38-166-188 mysqld: 7f26efd50000-7f26efd51000 ---p 00000000 00:00 0 
...
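
Several frames in the backtrace above are unresolved and show only a module and an offset, for example `/usr/sbin/mysqld(+0x8a47db)`. As a minimal sketch (not part of the original report), the following Python pulls the module, symbol, and offset out of each syslog line so the offsets can be passed to a symbolizer such as `addr2line`, assuming a binary and debug symbols that match the crashed build are available. The names `FRAME_RE` and `parse_frame` are hypothetical, and the pattern only assumes the line layout seen in this log.

```python
import re

# Matches glibc-style backtrace frames as they appear in this syslog, e.g.
#   "... mysqld: /usr/sbin/mysqld(+0x8a47db)[0x7f91113d37db]"
FRAME_RE = re.compile(
    r"mysqld: (?P<module>/\S+?)"      # path of the binary or shared library
    r"\((?P<symbol>[^+)]*)"           # mangled symbol name, empty if unresolved
    r"\+(?P<offset>0x[0-9a-f]+)\)"    # offset from the symbol or module start
    r"\[(?P<address>0x[0-9a-f]+)\]"   # absolute runtime address
)


def parse_frame(line):
    """Return the frame fields as a dict, or None if the line is not a frame."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return {
        "module": match["module"],
        "symbol": match["symbol"] or None,  # None marks an unresolved frame
        "offset": int(match["offset"], 16),
        "address": int(match["address"], 16),
    }
```

Frames whose `symbol` is `None` are the ones worth resolving first, since they sit between `ha_delete_row` and the abort.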

Comment by Elena Stepanova [ 2018-11-12 ]

brendon,

Have you had any errors about database corruption (or any errors from mysqld at all) prior to the failure? The attached portion of the log only shows the failure itself.

Comment by Brendon Abbott [ 2018-11-12 ]

There were no errors immediately before the crash. I have looked back through the logs, and the only entries I found are presumably benign:

  • "Got an error reading communication packets"
  • The odd deadlock warning (we have deadlock logging on)

I have spoken to an engineer who knows this system and who had some useful info. The system was upgraded from 10.1.32 to 10.1.36 on Nov 7th. This was done as a rolling upgrade, one node at a time. Each node did an IST to bring itself back up to date. Additionally, at this time the memory allocation for MariaDB was increased.

I couldn't see anything drastic in the logs, although there was a false start: the install process started the server, which couldn't set gcomm.thread_prio=rr:2 because it lacked sufficient privileges.

Comment by Elena Stepanova [ 2018-12-29 ]

The failure seems remarkably similar to MDEV-17378.
As was suggested at some point in MDEV-17378, could you please run innochecksum on all ibd files to see if there is any detectable corruption?
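
A minimal sketch of that check, not an official procedure: it walks a data directory, runs `innochecksum` on every `.ibd` file, and collects the files for which the tool exits with a non-zero status. The helper names, the `/var/lib/mysql` path in the usage note, and the assumption that a non-zero exit status signals a problem are illustrative only. `innochecksum` is generally run while the server on that node is stopped, so in a cluster this would be done one node at a time.

```python
import subprocess
from pathlib import Path


def find_ibd_files(datadir):
    """Return every .ibd file under datadir, sorted for stable output."""
    return sorted(Path(datadir).rglob("*.ibd"))


def check_tablespaces(datadir, tool="innochecksum"):
    """Run the checksum tool on each .ibd file; return the paths that failed."""
    failed = []
    for ibd in find_ibd_files(datadir):
        # Invoke the tool with the file path as its only argument.
        result = subprocess.run([tool, str(ibd)], capture_output=True, text=True)
        if result.returncode != 0:  # assumption: non-zero means a problem
            failed.append(ibd)
            print(f"FAIL {ibd}: {result.stderr.strip()}")
    return failed
```

For example, `check_tablespaces("/var/lib/mysql")` would scan what is assumed here to be the data directory; an empty list means the tool reported no failures.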

Comment by Elena Stepanova [ 2019-01-27 ]

In the absence of a response, closing as a duplicate of MDEV-17378.

Comment by Brendon Abbott [ 2019-01-28 ]

I am happy with you closing this.

Apologies for not getting back to you; I thought I had. Unfortunately, I was not permitted to run the checksum as the system was in lockdown for the holiday period. I suspect any useful info is now lost anyway.
