[MDEV-5673] Crash while parallel dropping multiple tables under heavy load Created: 2014-02-14  Updated: 2014-07-25  Resolved: 2014-07-23

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.5.34, 5.5.35
Fix Version/s: 5.5.39

Type: Bug
Priority: Major
Reporter: Tóth István
Assignee: Jan Lindström (Inactive)
Resolution: Cannot Reproduce
Votes: 0
Labels: None
Environment:

SW: CentOS 6.5 x86_64 + Official MariaDB repo
HW: 2xIntel(R) Xeon(R) CPU E5-2430, 64GB RAM, 4x2GB 7200RPM SATA Disk in RAID10


Attachments: Zip Archive columbo.tarr.hu.err-innodb_stats_update_need_lock=0.zip     File columbo.tarr.hu.err.zip     File server.cnf    

 Description   

When the crash happens, there are two sets of queries running on the server.

One set is populating Business Intelligence aggregate tables, the other is refreshing a development copy of a database, using parallel loading.

The two sets of queries are running on different, independent databases.

The crash does not occur when only one set of queries (e.g. only the db load queries, or only the BI aggregator queries) is running.

The tables that I try to drop are in compressed format.

The system is IO bound when the crash occurs (iostat consistently shows 100% disk utilization), but not CPU bound.

The crash occurs when I try to drop several big (multi gigabyte) tables in parallel, as you can see in the attached error log.



 Comments   
Comment by Tóth István [ 2014-02-20 ]

We have upgraded to 10.0.8, with the same configuration, and haven't experienced the problem since then.

Comment by Elena Stepanova [ 2014-02-20 ]

This is a "long semaphore wait" crash.
Most waits are dict_operation_lock, so my guess is that the contention is around XtraDB statistics. It is logical that it's gone after migrating to 10.0, since 10.0 has InnoDB by default (and even if you switched to XtraDB, the Percona statistics implementation has apparently changed in 5.6).

If you still have the 5.5 instance, please try to set innodb_stats_update_need_lock=0 and see if it helps.

Comment by Tóth István [ 2014-02-21 ]

Thank you. I am restoring 5.5 now, I should have results by Monday.

Comment by Tóth István [ 2014-02-28 ]

Unfortunately, it did not help.

I added

# This group is only read by MariaDB-5.5 servers.
# If you use the same .cnf file for MariaDB of different versions,
# use this group for options that older servers don't understand
[mariadb-5.5]
#MDEV-5673
innodb_stats_update_need_lock=0

to the end of the config file, and I double-checked using SHOW VARIABLES that it does get set to 0, but I still get a similar crash.

I am attaching the new error.log.

Comment by Tóth István [ 2014-02-28 ]

The error log with innodb_stats_update_need_lock=0

Comment by Tóth István [ 2014-03-04 ]

I also tested with MariaDB 5.5.35+InnoDB plugin, and it did not crash either.

Comment by Tóth István [ 2014-03-10 ]

10.0.8 with XtraDB enabled does not exhibit the bug either.

Comment by Elena Stepanova [ 2014-03-10 ]

Thanks for the info.
Did you happen to try 5.5.36?

Comment by Tóth István [ 2014-03-10 ]

No, we settled for 10.0.8 for the time being.
If you believe that 5.5.36 has fixes for it, I can revert, and make another try.

Comment by Elena Stepanova [ 2014-03-10 ]

Thanks.
No, I don't have any particular reason to think that it was fixed in 5.5.36.
I asked because the problem was/is apparently on the upstream XtraDB side, and although I haven't found any similar bug reports in their tracker so far, it could have been fixed in XtraDB along with other changes, in which case it would have disappeared after the pre-release merge into MariaDB 5.5.36.

Comment by Elena Stepanova [ 2014-03-24 ]

Long semaphore waits for Jan's expert reading (logs are attached).

Comment by Tóth István [ 2014-05-16 ]

FYI, I've just had a very similar crash on Oracle MySQL 5.6.17. I can attach the logs for it if they are of interest to you.

Comment by Jan Lindström (Inactive) [ 2014-05-16 ]

Hi,

Why are they running such huge inserts, like:

---TRANSACTION 705FC, ACTIVE 1492 sec
mysql tables in use 9, locked 9
69176 lock struct(s), heap size 6175160, 3804151 row lock(s)

If these inserts can't be split, you need to increase the long semaphore wait threshold to a higher number.

R: Jan
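
As a rough illustration of what splitting such an insert could look like, the sketch below generates one INSERT ... SELECT statement per primary-key range, so that each statement can run and commit as its own, smaller transaction. This is only a sketch under assumptions: the table names `fact_sales` and `raw_sales`, the columns `id` and `amount`, and the helper names are hypothetical and do not come from the reporter's schema, and it assumes the source table has a numeric key that can be ranged over.

```python
from typing import Iterator, List, Tuple


def id_ranges(min_id: int, max_id: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (low, high) id ranges that cover min_id..max_id."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    low = min_id
    while low <= max_id:
        high = min(low + batch_size - 1, max_id)
        yield low, high
        low = high + 1


def batched_statements(min_id: int, max_id: int, batch_size: int) -> List[str]:
    """Build one INSERT ... SELECT per id range.

    The table and column names are hypothetical placeholders.
    """
    template = (
        "INSERT INTO fact_sales (id, amount) "
        "SELECT id, amount FROM raw_sales "
        "WHERE id BETWEEN {low} AND {high}"
    )
    return [
        template.format(low=low, high=high)
        for low, high in id_ranges(min_id, max_id, batch_size)
    ]


if __name__ == "__main__":
    for statement in batched_statements(1, 10, 4):
        print(statement)
```

Each generated statement touches at most `batch_size` source rows, so a single transaction would be expected to hold far fewer row locks than the 3804151 shown above. Whether this is applicable depends on whether the aggregate queries can be partitioned by key at all.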

Comment by Tóth István [ 2014-05-16 ]

These are the queries that transform the raw data to the star schema required by the BI software. It may be possible to rewrite them, but since MySQL 5.5 and MariaDB 10.x do not exhibit the crash under the same circumstances, they are probably triggering a bug specific to MySQL 5.6 (sometimes crashes) and MariaDB 5.5 (crashes every time).

However, according to our tests, the inserts in themselves are not the problem; the crash only happens when they are running in parallel with the DROP TABLE statements.

Comment by Jan Lindström (Inactive) [ 2014-07-23 ]

revno: 4230
committer: Jan Lindström <jplindst@mariadb.org>
branch nick: 5.5
timestamp: Wed 2014-07-23 09:04:59 +0300
message:
MDEV-5673: Crash while parallel dropping multiple tables under heavy load

Improve long semaphore wait output to include all semaphore waits
and try to find out if there is a sequence of waiters.
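
To illustrate the idea of finding "a sequence of waiters", the sketch below follows a chain of threads where each one waits on a semaphore held by the next. It is not the code from this commit (the actual change is in the server's 5.5 branch and is not shown in this ticket). The function name, the thread labels, and the `waits_for` mapping are all hypothetical, and the sketch assumes each thread waits on at most one other thread.

```python
from typing import Dict, List, Optional


def waiter_chain(waits_for: Dict[str, Optional[str]], start: str) -> List[str]:
    """Follow 'thread -> thread it is waiting on' links from start.

    The chain ends at a thread that is not waiting on anyone, or as soon
    as a thread repeats (a cycle). In the cycle case the repeated thread
    appears as the last element of the returned list.
    """
    chain = [start]
    seen = {start}
    current = start
    while True:
        holder = waits_for.get(current)
        if holder is None:
            # current is not blocked: it is the head of the sequence
            return chain
        chain.append(holder)
        if holder in seen:
            # the sequence loops back on itself
            return chain
        seen.add(holder)
        current = holder


if __name__ == "__main__":
    # Hypothetical example: T1 waits on T2, T2 waits on T3, T3 is running.
    example = {"T1": "T2", "T2": "T3", "T3": None}
    print(" -> ".join(waiter_chain(example, "T1")))
```

In a report built this way, the last thread in the chain is the one worth inspecting first, since every thread before it is blocked behind it.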
