[MDEV-6032] Slow DELETE Query (> 7000 seconds) on small table (60k rows) Created: 2014-04-07  Updated: 2014-08-23  Resolved: 2014-08-23

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 10.0.11
Fix Version/s: 10.0.11

Type: Bug Priority: Minor
Reporter: Tobias Feldhaus Assignee: Sergei Petrunia
Resolution: Incomplete Votes: 0
Labels: optimizer
Environment:

FreeBSD 9.2-RELEASE-p3 amd64

CPU: Intel(R) Xeon(R) CPU E5-4603 0 @ 2.00GHz (2000.05-MHz K8-class CPU)
FreeBSD/SMP: Multiprocessor System Detected: 32 CPUs

512 GiB Memory

ZFS Filesystem:
8 x SSDs for tank
2 x separate SSDs for ZIL and L2ARC

zpool config:
NAME            STATE   READ WRITE CKSUM
tank            ONLINE     0     0     0
  mirror-0      ONLINE     0     0     0
    gpt/disk0   ONLINE     0     0     0
    gpt/disk1   ONLINE     0     0     0
  mirror-1      ONLINE     0     0     0
    gpt/disk2   ONLINE     0     0     0
    gpt/disk3   ONLINE     0     0     0
  mirror-2      ONLINE     0     0     0
    gpt/disk4   ONLINE     0     0     0
    gpt/disk5   ONLINE     0     0     0
  mirror-3      ONLINE     0     0     0
    gpt/disk6   ONLINE     0     0     0
    gpt/disk7   ONLINE     0     0     0
logs
  mirror-4      ONLINE     0     0     0
    gpt/zil0    ONLINE     0     0     0
    gpt/zil1    ONLINE     0     0     0
cache
  gpt/l2arc0    ONLINE     0     0     0
  gpt/l2arc1    ONLINE     0     0     0



 Description   

Two tables:

aggregated_data.xx_aggregated_ad_revenue (~1,150 rows) and xx_tracker.ad_revenue (~60,000 rows). Running the following DELETE query as a stored procedure on our old machine with Percona Server (GPL), Release rel29.0, Revision 315 (5.5.27-29.0-log) takes between a few seconds and 1-2 minutes; the same query on MariaDB 10 takes more than 7,000 seconds.

-- DB: aggregated_data
-- Command: Query
-- Time: 7103
-- State: updating
DELETE FROM aggregated_data.xx_aggregated_ad_revenue
			WHERE network = 'networkA' AND booking_date IN (
				SELECT CONCAT(SUBSTRING(a.booking_date,-4), SUBSTRING(a.booking_date,3,2), SUBSTRING(a.booking_date,1,2)) AS booking_date
				FROM xx_tracker.ad_revenue a
				JOIN
				(
					SELECT CONCAT(SUBSTRING(booking_date,-4), SUBSTRING(booking_date,3,2), SUBSTRING(booking_date,1,2)) AS booking_date, MAX(created_ts) AS max_ts
					FROM xx_tracker.ad_revenue b
					WHERE network = 'networkA'
					GROUP BY b.booking_date
				) AS i
				ON CONCAT(SUBSTRING(a.booking_date,-4), SUBSTRING(a.booking_date,3,2), SUBSTRING(a.booking_date,1,2)) = i.booking_date AND i.max_ts = a.created_ts
				WHERE a.created_ts >= NOW() - INTERVAL 1 DAY
			)

Even though the same indices exist on Percona and MariaDB (the tables were created from a dump, so the definitions are exactly the same), MariaDB uses a key (EXPLAIN of subselect: http://puu.sh/7Zjuh.png) where Percona does not (EXPLAIN of subselect: http://puu.sh/7ZjuT.png), yet Percona is much faster.
EXPLAIN DELETE for MariaDB 10: http://puu.sh/7ZjDl.png

Indices on aggregated_data.xx_aggregated_ad_revenue:
1. PRIMARY (id, created_date)
2. created_date
3. network

Indices on xx_tracker.ad_revenue:
1. PRIMARY (id, created_date)
2. created_date
3. network

Running the subselect of the DELETE on its own is fast (under 1-2 seconds). The query runs inside a stored procedure and spends 95% of its time in the "Sending data" state.



 Comments   
Comment by Tobias Feldhaus [ 2014-04-07 ]

Dropping the index on the (network) column in xx_tracker.ad_revenue and creating an index on (booking_date, network) fixed the performance problem and led to the following query plan:

http://puu.sh/7Zw3q.png
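The index change described above could be sketched as follows; the index name `network` for the dropped index and the name of the new composite index are assumptions for illustration:

```sql
-- Replace the single-column index on (network) with a composite index
-- on (booking_date, network), as described in the comment above.
ALTER TABLE xx_tracker.ad_revenue
    DROP INDEX network,
    ADD INDEX idx_booking_date_network (booking_date, network);
```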

Comment by Sergei Petrunia [ 2014-04-09 ]

If dropping/creating an index has changed the query plan, this is an indication that the poor query plan was chosen because the storage engine (I guess this was InnoDB?) provided the optimizer with bad estimates.

One may then ask why InnoDB in Percona didn't return bad estimates while InnoDB in MariaDB did. I have no answer for that.

Comment by Sergei Petrunia [ 2014-04-09 ]

A single-table DELETE ... WHERE x IN (...) is unable to make use of subquery optimizations (a missing feature common to MariaDB 5.5+ and MySQL 5.6+).

james_woods, could you try a "fake" multi-table DELETE? Instead of

DELETE FROM aggregated_data.xx_aggregated_ad_revenue WHERE

run

DELETE aggregated_data.xx_aggregated_ad_revenue FROM aggregated_data.xx_aggregated_ad_revenue WHERE

Please note the EXPLAIN first, and then compare execution speed.
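Applied to the query from the description, the suggested multi-table form would look like this (a sketch; the WHERE clause is unchanged from the original single-table DELETE):

```sql
-- "Fake" multi-table DELETE: naming the target table before FROM makes
-- this a multi-table DELETE, which can use subquery optimizations.
DELETE aggregated_data.xx_aggregated_ad_revenue
FROM aggregated_data.xx_aggregated_ad_revenue
WHERE network = 'networkA' AND booking_date IN (
    SELECT CONCAT(SUBSTRING(a.booking_date,-4), SUBSTRING(a.booking_date,3,2), SUBSTRING(a.booking_date,1,2))
    FROM xx_tracker.ad_revenue a
    JOIN (
        SELECT CONCAT(SUBSTRING(booking_date,-4), SUBSTRING(booking_date,3,2), SUBSTRING(booking_date,1,2)) AS booking_date,
               MAX(created_ts) AS max_ts
        FROM xx_tracker.ad_revenue b
        WHERE network = 'networkA'
        GROUP BY b.booking_date
    ) AS i
    ON CONCAT(SUBSTRING(a.booking_date,-4), SUBSTRING(a.booking_date,3,2), SUBSTRING(a.booking_date,1,2)) = i.booking_date
       AND i.max_ts = a.created_ts
    WHERE a.created_ts >= NOW() - INTERVAL 1 DAY
);
```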

Comment by Sergei Petrunia [ 2014-04-09 ]

Another question: is the issue repeatable? If one takes a mysqldump and loads it into a server, do they get a good query plan or a bad one? (Or is there any way to reliably get a bad query plan? If so, that would make it possible to debug.)

Comment by Sergei Petrunia [ 2014-04-09 ]

Yet another thing: one of the subqueries has "GROUP BY b.booking_date". With the current optimizer, a GROUP BY is going to prevent the FROM-subquery from being merged into the upper query. Can you remove the "GROUP BY" part and see what happens?

(This too is unrelated to the question of why the query performed badly before dropping the index and started working better after re-creating it. However, there may be little one can do about that problem, so it is better to work around it.)

Comment by Tobias Feldhaus [ 2014-08-22 ]

Sorry for not coming back to this issue for so long. We ended up optimizing the query and getting rid of the non-date types in the table (so we no longer need to parse varchar fields via SUBSTRING calls). This made the query run in under a minute.
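The fix described above could be sketched like this. Column name `booking_dt` and the DDMMYYYY source layout (implied by the SUBSTRING positions in the original query) are assumptions for illustration:

```sql
-- Hypothetical sketch: store booking_date as a real DATE so no
-- SUBSTRING/CONCAT parsing is needed at query time.
ALTER TABLE xx_tracker.ad_revenue
    ADD COLUMN booking_dt DATE;

-- '%d%m%Y' assumes a DDMMYYYY varchar layout, matching the
-- SUBSTRING(booking_date,1,2) / (3,2) / (-4) calls in the query.
UPDATE xx_tracker.ad_revenue
   SET booking_dt = STR_TO_DATE(booking_date, '%d%m%Y');
```

With a DATE column, the IN-subquery can compare booking dates directly and an index on (booking_dt, network) becomes usable without string manipulation.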

Comment by Sergei Petrunia [ 2014-08-23 ]

Good to know there was a solution. I guess it is no longer possible to get the dataset that demonstrated the problem? In that case it would be hard to analyze whether there was something we could improve in the optimizer, so I have to close this issue.

Generated at Thu Feb 08 07:08:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.