[MDEV-20555] Transactional Aria tables are slower than MyISAM on inserts. Created: 2019-09-10  Updated: 2023-03-27  Resolved: 2023-03-27

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Aria
Affects Version/s: 10.1.29, 10.0.38, 10.2.26, 10.1.41, 10.3.15, 10.3.17, 10.4.7
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Juan Assignee: Michael Widenius
Resolution: Won't Fix Votes: 1
Labels: None
Environment:

CentOS 7.5 & SuSE Leap 15.0


Issue Links:
Relates
relates to MDEV-23326 aria TRANSACTIONAL=1 significantly sl... Closed

 Description   

Performing bulk inserts into identical tables using MyISAM, Aria, and InnoDB, the performance of Aria when TRANSACTIONAL=1 (the default) is over 100x slower than MyISAM or InnoDB (single transaction) on 10.4.7.

When TRANSACTIONAL=0, performance is approximately 75% of MyISAM & InnoDB

Setting ROW_FORMAT=FIXED results in a 20% improvement in 10.1, but from 10.3 forward ROW_FORMAT=PAGE loses very little to FIXED.

The problem is that not only does this make Aria with TRANSACTIONAL=1 almost unusable for updates, but since 10.4 uses Aria with TRANSACTIONAL=1 for system tables, system operations in 10.4 are seriously affected. Adding users, for example, is 52% slower in 10.3 than in 10.1, but 511% slower in 10.4.

create table simple (
	id int(10) unsigned not null auto_increment,
	modified timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
	val int(10) unsigned DEFAULT NULL,
	blah varchar(511) DEFAULT NULL,
	PRIMARY KEY (id),
	KEY simple_modified (modified),
	KEY simple_blah100 (blah(100)) 
) ENGINE=
default charset=utf8mb4 collate utf8mb4_unicode_ci;
 
followed by:
for i in {1..1000000};
do 
	echo "insert into simple(val, blah) 
	values( $RANDOM$RANDOM, '$RANDOM$RANDOM$RANDOM$RANDOM$RANDOM$RANDOM$RANDOM$RANDOM$RANDOM$RANDOM');">>popsimp.sql
done
 
cat popsimp.sql | pv -bat | mysql -ABN test
 
MyISAM / 10.1.37
	1.52MiB/s
	
Aria / 10.1.37
	23.6KiB/s
 
MyISAM / 10.3.15
	1.47MiB/s
 
Aria / 10.3.15
	21.8KiB/s
 
Aria / 10.1.37 / ROW_FORMAT=PAGE TRANSACTIONAL=0
	1.03MiB/s
 
Aria / 10.1.37 / ROW_FORMAT=FIXED TRANSACTIONAL=0
	1.20MiB/s
 
InnoDB / 10.3.15
	1.74MiB/s
 
Aria / 10.3.15 / ROW_FORMAT=FIXED TRANSACTIONAL=0
	1.30MiB/s
 
Aria / 10.3.15 / ROW_FORMAT=PAGE TRANSACTIONAL=0
	1.27MiB/s
 
MyISAM / 10.4.7
	2.13MiB/s
 
Aria / 10.4.7
	20.9KiB/s
 
Aria / 10.4.7 / ROW_FORMAT=FIXED TRANSACTIONAL=0
	1.55MiB/s
 
Aria / 10.4.7 / ROW_FORMAT=PAGE TRANSACTIONAL=0
	1.31MiB/s
 
InnoDB / 10.4.7
	1.79MiB/s
 
InnoDB  / 10.4.7 / Single Transaction
	2.17MiB/s
 
~# time for i in {1..1000}; do mysql -Ae "grant all privileges on *.* to 'test'@'127.0.0.1' identified by 'terriblylongpassword-weo48u' with grant option; flush privileges; drop user 'test'@'127.0.0.1';"; done
 
10.1.29:
real	0m3.388s
user	0m2.475s
sys	0m0.463s
 
10.3.15:
real	0m5.177s
user	0m3.647s
sys	0m1.098s
 
10.4.7:
real	0m17.335s
user	0m5.412s
sys	0m3.418s

*All tests performed on 1GB RAM VMs with aria_pagecache_buffer_size=300M, key_buffer_size=300M, and innodb-buffer-pool-size=300M



 Comments   
Comment by Sergei Golubchik [ 2019-09-14 ]

juan.vera, you must have something really misconfigured or broken in your system. While in my first quick test Aria is, indeed, slower than MySQL, it's nowhere near 100x. Still, it is slower, so let's keep this bug open to investigate it.

I've run the following mtr test on 10.4:

create table simple (
        id int(10) unsigned not null auto_increment,
        modified timestamp not null default current_timestamp() on update current_timestamp(),
        val int(10) unsigned default null,
        blah varchar(511) default null,
        primary key (id),
        key simple_modified (modified),
        key simple_blah100 (blah(100)) 
) engine=aria default charset=utf8mb4 collate utf8mb4_unicode_ci transactional=1;
--disable_query_log
--let $a=1000000
--while ($a) {
insert simple(val, blah) values( RAND()*10000000, repeat(rand(), 10));
--dec $a
}
--enable_query_log
drop table simple;

with the opt file

--aria-pagecache-buffer-size=300M
--key-buffer-size=300M
--loose-innodb-buffer-pool-size=300M

Aria, transactional=1 405.590
MyISAM 189.696
Aria, transactional=0 376.955
InnoDB 11814.816

apparently InnoDB isn't tuned for performance in mtr

Comment by Juan [ 2019-09-17 ]

serg Your numbers do not coincide with the tests I ran on all these versions on both VirtualBox VMs and Digital Ocean instances. The instances and configurations are not haphazard. I went ahead & re-ran the tests on 10.3 & 10.4 for your peace of mind. The results are the same as before although there is a clear shift from better MyISAM & InnoDB performance to better Aria T=1 performance which seems to be a function of the paravirtualization interface - I don't know if this has to do with faster memory, but the differences are not as dramatically attenuated as yours:

Type Ver MyISAM Aria T=0 Aria T=1 InnoDB
VB 10.3.17 1.35MB/s 988KB/s 15.4KB/s 1.37MB/s
VB 10.4.7 1.37MB/s 1016KB/s 15.2KB/s 1.37MB/s
DO 10.3.17 733KB/s 545KB/s 79KB/s 752KB/s
DO 10.4.7 389KB/s 498KB/s 89KB/s 695KB/s

The instances are all correct and comparable in terms of I/O, RAM, and CPU: 1GB RAM 1 CPU CentOS 7.5 instances on VirtualBox running on an Ubuntu 19.04 Core i9 32GB RAM NVMe test machine, and 1GB RAM 1 CPU CentOS 7.5 instances on DigitalOcean:

	local VirtualBox VMs (VB) ioping:
		--- / (xfs /dev/dm-0) ioping statistics ---
		9 requests completed in 4.29 ms, 36 KiB read, 2.10 k iops, 8.19 MiB/s
		generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
		min/avg/max/mdev = 358.5 us / 477.1 us / 550.6 us / 55.5 us
 
	DigitaOcean AMS3/1G  (DO)ioping:
		--- / (xfs /dev/vda1) ioping statistics ---
		9 requests completed in 4.50 ms, 36 KiB read, 2 k iops, 7.81 MiB/s
		generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
		min/avg/max/mdev = 397.7 us / 499.9 us / 635.4 us / 77.2 us
 
	VB fio r/w:
		read: IOPS=23.4k, BW=91.5MiB/s (95.9MB/s)(3070MiB/33565msec)
		write: IOPS=7825, BW=30.6MiB/s (32.1MB/s)(1026MiB/33565msec)
	
	DO fio r/w:
		read: IOPS=16.5k, BW=64.4MiB/s (67.5MB/s)(3070MiB/47702msec)
		write: IOPS=5506, BW=21.5MiB/s (22.6MB/s)(1026MiB/47702msec)

and tests are done both with the original testing configuration:

[mysqld]
aria_pagecache_buffer_size=300M
key_buffer_size=300M
innodb-buffer-pool-size=300M
 
innodb-flush-log-at-trx-commit=0
sync-binlog=1000
transaction-isolation=READ-COMMITTED
 
log_bin
log_basename=cluster
binlog_format=ROW
binlog_row_image=MINIMAL
#log_slave_updates=1
expire_logs_days=2
default-storage-engine=innodb
innodb_autoinc_lock_mode=2
innodb_fast_shutdown=0
bind-address=0.0.0.0
server-id=1
gtid-domain-id=10
 
log-error=/var/log/mariadb-err.log
log-warnings=3
 
gtid_strict_mode=0

and with a configuration reflecting only the parameters you listed above:

[mysqld]
aria_pagecache_buffer_size=300M
key_buffer_size=300M
innodb-buffer-pool-size=300M

which produced consistent results with each other and with prior results using either server.cnf:

10.3.17 Transactional=0: 1.30MiB/s
10.3.17 Transactional=1: 15.6KiB/s

10.4.7 Transactional=0: 1.32MiB/s
10.4.7 Transactional=1: 15.6KiB/s

Comment by Daniel Black [ 2020-08-11 ]

Repeating a restructure of MDEV-23326 tests against mysql_tzinfo_to_sql

Alternate table structures created with:

create altername timezone data tables in new database

create database if not exists aria_non_transactional;
use aria_non_transactional;
 
create table time_zone                 like mysql.time_zone                ;
create table time_zone_leap_second     like mysql.time_zone_leap_second    ;
create table time_zone_name            like mysql.time_zone_name           ;
create table time_zone_transition      like mysql.time_zone_transition     ;
create table time_zone_transition_type like mysql.time_zone_transition_type;
 
alter table time_zone                 transactional=0;
alter table time_zone_leap_second     transactional=0;
alter table time_zone_name            transactional=0;
alter table time_zone_transition      transactional=0;
alter table time_zone_transition_type transactional=0;

Includes MDEV-23440 patch to add a transaction wrapper around the insert statements.

default server statup

scripts/mysql_install_db --no-defaults --srcdir=${OLDPWD} --builddir=${PWD} --datadir=/home/dan/datadir/MDEV-20555_test 
sql/mysqld --no-defaults --skip-networking --datadir=/home/dan/datadir/MDEV-20555_test --socket /tmp/s.sock --lc-messages-dir=${PWD}/sql/share --plugin-dir=./storage/rocksdb --verbose

test execution

for db in mysql  aria_non_transactional  innodb  rocksdb myisam; do echo $db ;  time  sql/mysql_tzinfo_to_sql  /usr/share/zoneinfo/  | client/mysql -S /tmp/s.sock $db ; done

Rotational old 5400rpm WD disk mysql_tzinfo_to_sql Performance for storage engines

datadir spec -old disk

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Green
Device Model:     WDC WD40EZRX-00SPEB0
Serial Number:    WD-WCC4E5000UCH
LU WWN Device Id: 5 0014ee 2b568678d
Firmware Version: 80.00A80
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm

MariaDB-10.4.14: Real Time Seconds to complete. mysql_tzinfo_to_sql load

Table engine definition Time to load
Aria T=1 17m44.511s
Aria T=0 0m1.588s
Innodb 0m24.773s
RocksDB 0m4.195s
MyISAM 0m4.195s

postcheck

 for db in mysql  aria_non_transactional  innodb  rocksdb myisam; do echo $db ;  client/mysql -S /tmp/s.sock -e "show create table $db.time_zone" | grep -i engine  ; done
mysql
time_zone	CREATE TABLE `time_zone` (\n  `Time_zone_id` int(10) unsigned NOT NULL AUTO_INCREMENT,\n  `Use_leap_seconds` enum('Y','N') NOT NULL DEFAULT 'N',\n  PRIMARY KEY (`Time_zone_id`)\n) ENGINE=Aria AUTO_INCREMENT=1784 DEFAULT CHARSET=utf8 PAGE_CHECKSUM=1 TRANSACTIONAL=1 COMMENT='Time zones'
aria_non_transactional
time_zone	CREATE TABLE `time_zone` (\n  `Time_zone_id` int(10) unsigned NOT NULL AUTO_INCREMENT,\n  `Use_leap_seconds` enum('Y','N') NOT NULL DEFAULT 'N',\n  PRIMARY KEY (`Time_zone_id`)\n) ENGINE=Aria AUTO_INCREMENT=1784 DEFAULT CHARSET=utf8 PAGE_CHECKSUM=1 TRANSACTIONAL=0 COMMENT='Time zones'
innodb
time_zone	CREATE TABLE `time_zone` (\n  `Time_zone_id` int(10) unsigned NOT NULL AUTO_INCREMENT,\n  `Use_leap_seconds` enum('Y','N') NOT NULL DEFAULT 'N',\n  PRIMARY KEY (`Time_zone_id`)\n) ENGINE=InnoDB AUTO_INCREMENT=1784 DEFAULT CHARSET=utf8 PAGE_CHECKSUM=1 TRANSACTIONAL=1 COMMENT='Time zones'
rocksdb
time_zone	CREATE TABLE `time_zone` (\n  `Time_zone_id` int(10) unsigned NOT NULL AUTO_INCREMENT,\n  `Use_leap_seconds` enum('Y','N') NOT NULL DEFAULT 'N',\n  PRIMARY KEY (`Time_zone_id`)\n) ENGINE=ROCKSDB AUTO_INCREMENT=1784 DEFAULT CHARSET=utf8 PAGE_CHECKSUM=1 TRANSACTIONAL=1 COMMENT='Time zones'
myisam
time_zone	CREATE TABLE `time_zone` (\n  `Time_zone_id` int(10) unsigned NOT NULL AUTO_INCREMENT,\n  `Use_leap_seconds` enum('Y','N') NOT NULL DEFAULT 'N',\n  PRIMARY KEY (`Time_zone_id`)\n) ENGINE=MyISAM AUTO_INCREMENT=1784 DEFAULT CHARSET=utf8 PAGE_CHECKSUM=1 TRANSACTIONAL=0 COMMENT='Time zones'

NVME M.2 SSD mysql_tzinfo_to_sql Performance for storage engines

Model Number:                       Samsung SSD 970 EVO Plus 500GB
Firmware Version:                   1B2QEXM7

Table engine definition Time to load
Aria T=1 0m15.714s
Aria T=0 0m0.813s
Innodb 0m1.634s
Innodb (without MDEV-23440 patch) 0m26.869s
RocksDB 0m1.641s
MyISAM 0m0.786s

So performance is better, but Aria transaction is still very slow as our users noticed - https://github.com/docker-library/mariadb/issues/262

MDEV-23326 show aria T=1 has significant time in biolatency for Sync-Write IO operations. How hard is it to make Aria recognise transactions and perform check-pointing only on COMMIT (or internal buffers being full?)

Comment by Daniel Black [ 2020-08-11 ]

perf record -g  -e syscalls:sys_enter_fdatasync  -a -o /tmp/aria-mysql-tz.perf  -- sleep  15
-   98.78%    98.78%  fd: 0x00000005                                                                                                                                                                         ▒
     fdatasync                                                                                                                                                                                               ▒
     0.46%     0.46%  fd: 0x0000003c                                                                                                                                                                         ▒
     0.33%     0.33%  fd: 0x0000003d                                                                                                                                                                         ▒
     0.13%     0.13%  fd: 0x00000040                                                                                                                                                                         ▒
     0.11%     0.11%  fd: 0x00000041                                                                                                                                                                         ▒
     0.05%     0.05%  fd: 0x0000003e                                                                                                                                                                         ▒
     0.05%     0.05%  fd: 0x0000003f                                                                                                                                                                         ▒
     0.02%     0.02%  fd: 0x00000003                                                                                                                                                                         ▒
     0.02%     0.02%  fd: 0x00000038                                                                                                                                                                         ◆
     0.02%     0.02%  fd: 0x00000039                                                                                                                                                                         ▒
     0.02%     0.02%  fd: 0x0000003a                                                                                                                                                                         ▒
     0.02%     0.02%  fd: 0x0000003b                                                                                                                                                                         ▒
     0.02%     0.02%  fd: 0x00000042    

file descriptor 5 is lrwx------. 1 dan dan 64 Aug 11 18:38 5 -> /home/dan/datadir/MDEV-20555_test/aria_log.00000001

unfortunately no stack trace upwards.

Comment by Daniel Black [ 2020-08-24 ]

Tip from monty, LOCK TABLES ... WRITE will defer the sync until the UNLOCK TABLES and allow a faster insert where there are multiple INSERT statements on the same table(s) in a sequence that need to be updated.

Comment by Daniel Black [ 2020-11-07 ]

The summary is the improvement to performance here needs to reduce the number of fsync for large series of inserts. fsyncs of different hardware sectors may have a benefit per one of the MariaDB Server Fest 2020 talks.

Alternately implementing Aria with transaction rollback and becoming a full MVCC compliant system will allow syncs to happen less often.

Comment by Michael Widenius [ 2023-03-27 ]

This thing can be closed.
If one uses single row transactions, then thing will of course be slow, not much that can be done about that.

Generated at Thu Feb 08 09:00:22 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.