[MDEV-30385] Myisam / aria multithread contention Created: 2023-01-11  Updated: 2023-02-04

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - Aria, Storage Engine - MyISAM
Affects Version/s: 10.5.11, 10.5.16
Fix Version/s: None

Type: Bug Priority: Major
Reporter: Slawomir Pryczek Assignee: Lena Startseva
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Debian / FC


Attachments: File create_tables.sql     PNG File img_run_parallel.png     PNG File img_run_parallel_aria.png     PNG File img_run_parallel_myisam.png     PNG File img_run_single_query.png     File run_parallel.sh     File run_parallel_aria.sh     File run_parallel_myisam.sh     File run_serial.sh     File run_single_query.sh    

 Description   

Hi Guys,
found out some rather unexpected behaviour when trying to optimize some queries. Basically i have ID->LABEL mapping tables and another large statistical innodb table with several billion rows.

When trying to optimize it, I just changed one large query to get data for eg. single month, to 30 queries, running it 10 at a time. Then the data is merged on the client. Works around 10x faster.

Interesting thing is that it only works when all tables are innodb or toku. For myisam and aria when I run 10 threads simultaneously, only the load is ~x8 the normal... the execution time is as slow as if these 10 queries were run in serial not in parallel. For aria results are around 20% better than myisam. If I use innodb it's close to 10 times (1000%) faster.

Unfortunately can't do testing using recent server version now, as the main table is close to 6b rows and over half TB in size. Could build some test case if anyone is interested in looking into this. Question is if myisam/aria are still developed or is it abandonware and we should just use innodb for everything?

foreach ($dates as $date) {
			$data = dbpool::fill_array("SELECT _subnet, user_id, sum(clicks) as clicks, sum(spending) as spending
				FROM stats_creative
				INNER JOIN {$this->tmp_domains} ON domain_id = {$this->tmp_domains}.id
				INNER JOIN {$this->tmp_creative_to_uid} ON creative_id = {$this->tmp_creative_to_uid}.id
				WHERE date IN ($$)
				GROUP BY _subnet, user_id", $date);
		}



 Comments   
Comment by Sergei Golubchik [ 2023-01-17 ]

To start from the question, myisam/aria are supported and maintained, so not abandonware, bugs are being fixed. Not much of the new development is done there, though, but those queries of yours shouldn't be serialized, so it looks like a bug that needs to be fixed.

How your tables are defined, can you paste the result of SHOW CREATE TABLE for them?

Comment by Slawomir Pryczek [ 2023-01-19 ]

Hi Sergei, thanks for looking into this. Great that other engines are still being developed.

Have prepared a test case, so you can just run create_tables SQL and then you can use shell scripts to run the stats in serial and in parallel. That was on different server so characteristics are little different but still similar.

Innodb is much faster (~x5), and is taking advantage of multiple cores (47->11s runtime) when converting just the creative table which has only 50k rows to myisam or aria it gets x2.5 slower (close to 2 minutes) while still eating around 14 CPU cores.

myisam and aria are faster than innodb when run on single core only (~44s).

Parallel tests are using 15 threads at once.

Comment by Slawomir Pryczek [ 2023-01-19 ]

Specs for the numbers:
AMD Ryzen 9 5950X 16-Core Processor
Distrib 10.6.3-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2

Not a huge issue for me and this test was not optimized, so just x5 increase with multiple queries but probably fixing it will improve server multicore perf in more cases...

Generated at Thu Feb 08 10:15:51 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.