MariaDB Server / MDEV-8069

DROP or rebuild of a large table may lock up InnoDB

Details

    • 10.3.1-1

    Description

      DROP DATABASE IF EXISTS executed on a database with more than several thousand tables and several TB of data in it is unreasonably slow.
      A huge amount of writing to disk has been observed too.

      Crash description:
      executed:

      SET unique_checks = 0; SET foreign_key_checks = 0; SET GLOBAL innodb_stats_on_metadata = 0; DROP DATABASE IF EXISTS HUGE_TABLE;

      InnoDB: Error: semaphore wait has lasted > 600 seconds
      InnoDB: We intentionally crash the server, because it appears to be hung.
      2015-04-27 12:21:26 7f07de3ff700  InnoDB: Assertion failure in thread 139671770232576 in file srv0srv.cc line 2196
      InnoDB: We intentionally generate a memory trap.
      InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
      InnoDB: If you get repeated assertion failures or crashes, even
      InnoDB: immediately after the mysqld startup, there may be
      InnoDB: corruption in the InnoDB tablespace. Please refer to
      InnoDB: http://dev.mysql.com/doc/refman/5.6/en/forcing-innodb-recovery.html
      InnoDB: about forcing recovery.
      150427 12:21:26 [ERROR] mysqld got signal 6 ;
      This could be because you hit a bug. It is also possible that this binary
      or one of the libraries it was linked against is corrupt, improperly built,
      or misconfigured. This error can also be caused by malfunctioning hardware.
       
      To report this bug, see http://kb.askmonty.org/en/reporting-bugs
       
      We will try our best to scrape up some info that will hopefully help
      diagnose the problem, but since we have already crashed, 
      something is definitely wrong and this may fail.
       
      Server version: 10.0.16-MariaDB-log
      key_buffer_size=53687091200
      read_buffer_size=131072
      max_used_connections=42
      max_threads=402
      thread_count=21
      It is possible that mysqld could use up to 
      key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 52900205 K  bytes of memory
      Hope that's ok; if not, decrease some variables in the equation.
       
      Thread pointer: 0x0x0
      Attempting backtrace. You can use the following information to find out
      where mysqld died. If you see no messages after this, something went
      terribly wrong...
      stack_bottom = 0x0 thread_stack 0x48000
      /usr/sbin/mysqld(my_print_stacktrace+0x2b)[0xb73d3b]
      /usr/sbin/mysqld(handle_fatal_signal+0x398)[0x726518]
      /lib64/libpthread.so.0(+0xf710)[0x7f5096c43710]
      /lib64/libc.so.6(gsignal+0x35)[0x7f509529f625]
      /lib64/libc.so.6(abort+0x175)[0x7f50952a0e05]
      /usr/sbin/mysqld[0x936f9c]
      /lib64/libpthread.so.0(+0x79d1)[0x7f5096c3b9d1]
      /lib64/libc.so.6(clone+0x6d)[0x7f50953558fd]
      The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
      information that should help you find out what is causing the crash.
      150427 12:48:54 mysqld_safe Number of processes running now: 0
      150427 12:48:54 mysqld_safe mysqld restarted
      150427 12:49:16 [Note] InnoDB: Using mutexes to ref count buffer pool pages
      150427 12:49:16 [Note] InnoDB: The InnoDB memory heap is disabled
      150427 12:49:16 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
      150427 12:49:16 [Note] InnoDB: Memory barrier is not used
      150427 12:49:16 [Note] InnoDB: Compressed tables use zlib 1.2.3
      150427 12:49:16 [Note] InnoDB: Using Linux native AIO
      150427 12:49:16 [Note] InnoDB: Using CPU crc32 instructions
      150427 12:49:16 [Note] InnoDB: Initializing buffer pool, size = 225.0G
      150427 12:49:27 [Note] InnoDB: Completed initialization of buffer pool
      150427 12:49:29 [Note] InnoDB: Highest supported file format is Barracuda.
      150427 12:49:29 [Note] InnoDB: Log scan progressed past the checkpoint lsn 277389716102041
      150427 12:49:29 [Note] InnoDB: Database was not shutdown normally!
      150427 12:49:29 [Note] InnoDB: Starting crash recovery.
      150427 12:49:29 [Note] InnoDB: Reading tablespace information from the .ibd files...
       

      Please do not ignore the feature request for optimizing the DROP DATABASE operation, even though this ticket is marked as a bug.

      Activity

            MySQL Bug #91977 could report the same problem.

            marko Marko Mäkelä added a comment

            There are two issues affecting DROP or rebuild operations of large partitions, tables or databases. (Tables or partitions are internally dropped as part of operations that rebuild the table or partition: TRUNCATE and some forms of OPTIMIZE or ALTER TABLE, sometimes even CREATE INDEX or DROP INDEX.)

            The problem reported in this ticket is that the InnoDB data dictionary cache (dict_sys) is being locked while the data file is being deleted, and deleting a large data file may take a lot of time, especially for a fragmented data file.

            A related problem affects not only dropping or rebuilding entire tables or partitions, but also DROP INDEX operations that are executed in-place:
            MDEV-22456 Dropping the adaptive hash index may cause DDL to lock up InnoDB

            marko Marko Mäkelä added a comment

            MDEV-22456 recently removed one bottleneck. Two remain:

            InnoDB is invoking unlink() to delete the data file while holding some mutexes. This must be fixed as part of this ticket.

            On some file systems (most notably, on Linux), unlink() of a large file can block any concurrent usage of the entire file system. A workaround for this may be implemented in MDEV-18613.

            marko Marko Mäkelä added a comment

            I think that we should try to rely on delete-on-close semantics. That is, copy the handle to the to-be-deleted file, hold it open across the deletion, and close the handle after releasing the dict_sys latches. As far as I understand, file system recovery should guarantee that the file be deleted.

            marko Marko Mäkelä added a comment

            Results of RQG testing
            -----------------------------------
            origin/bb-10.5-kevgs 3b08527d8c0907b06dad4179b009ca76efdd4aad 2020-06-09T04:34:10+03:00 containing MDEV-8069, built with ASAN
            versus
            current 10.5, built with ASAN
            1. Test battery for broad-range coverage:
                  The tree containing MDEV-8069 performed neither better nor worse than current 10.5.
            2. A new test in which concurrent connections fiddle (CREATE OR REPLACE/ALTER/DROP)
                 with one 500000-row table and several 1000-row tables per connection.
                 innodb_fatal_semaphore_wait_threshold values tried were 2 and 200.
                 100 RQG test runs per tree:
                  current 10.5: long semaphore wait hit twice (threshold was 2s)
                 MDEV-8069: long semaphore wait hit four times (threshold was 2s)
                 IMHO this number of test runs does not support the conclusion that MDEV-8069 is clearly better.

            mleich Matthias Leich added a comment

            People

              kevg Eugene Kosov (Inactive)
              ivan.stoykov@skysql.com Stoykov (Inactive)
              Votes: 5
              Watchers: 18

