MariaDB Server / MDEV-34703

InnoDB bulk load: high IO, crashes or OOM kill on Linux, as tested with LOAD DATA INFILE


    Description

      Found by accident, while trying to understand why sysbench loads its data faster than
      mariadb-import --parallel. Most likely the problem was already present before 11.6; I did not try.

      TL;DR: for me, bulk load turns out to be 1.5 times slower (on a fast NVMe drive), and more than 15 times more IO-intensive.
      I tried it on Windows, where it is just slow; that case is described below.

      I tried on Linux, hoping that maybe BULK relies on the temp directory being in memory, but got a crash (tmpdir exhausted); when
      I instead allowed tmpfs to grow to 32GB (the whole memory), the server process got killed, probably by the Linux OOM killer.
      If tmpdir points to a persistent disk, the effects on Linux are about the same as on Windows.
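
      To see where these temporary files land, the relevant variables can be checked from a client. A minimal sketch; that the bulk-load merge sort honours innodb_tmpdir the same way online ALTER TABLE does is my assumption:

      -- Where temporary files are written: innodb_tmpdir falls back to
      -- tmpdir when empty (assumption: the bulk-load merge sort uses the
      -- same path as online ALTER TABLE).
      SELECT @@tmpdir, @@innodb_tmpdir;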

      Below is my test case:

      Step 0: Server parameters

      We run the load "in-memory": both the buffer pool and the redo log will be larger than the IBD file.

      [mysqld]
      innodb_log_file_size=20G
      innodb_buffer_pool_size=20G
      innodb_buffer_pool_load_at_startup=OFF
      innodb_buffer_pool_dump_at_shutdown=OFF
      innodb_stats_auto_recalc=0
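
      After the restart, the settings can be double-checked from a client; sizes come back in bytes. A minimal sketch:

      -- Verify the startup settings took effect; 20G = 21474836480 bytes.
      SELECT @@innodb_log_file_size,
             @@innodb_buffer_pool_size,
             @@innodb_stats_auto_recalc;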
      

      Step 1: Using sysbench, load a single table with 50M rows (~11GB ibd file)

      sysbench oltp_read_write --mysql-user=root  --tables=1 --table-size=50000000 --threads=1  prepare
      

      Elapsed time (seconds): 273.673
      Process IO Read Bytes : 14387837038
      Process IO Write Bytes : 19291281550
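
      To cross-check the ~11GB figure from inside the server, the table size can be estimated from information_schema; the numbers there are statistics-based estimates, so this only roughly matches the .ibd file size:

      -- Approximate on-disk size of the freshly loaded table.
      SELECT table_name,
             ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2)
               AS size_gib
      FROM information_schema.TABLES
      WHERE table_schema = 'sbtest' AND table_name = 'sbtest1';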

      Step 2: Back up the sbtest.sbtest1 table

      mariadb -uroot -e "select * into outfile 'C:/work/sbtest1.data' from sbtest.sbtest1"
      

      Elapsed time (seconds): 36.482
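
      A full scan of an InnoDB table usually returns rows in PRIMARY KEY order (it walks the clustered index), but that is not guaranteed. For the sorting discussion in the comments below, a variant that makes the ordering explicit could look like this (a sketch; the output file name is just illustrative):

      -- Same dump, but with the PRIMARY KEY order made explicit.
      SELECT * INTO OUTFILE 'C:/work/sbtest1-sorted.data'
      FROM sbtest.sbtest1 ORDER BY id;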

      Restart the server.

      Step 3: Load this data again, without bulk optimization

      mariadb -uroot -e "truncate sbtest.sbtest1; load data infile 'C:/work/sbtest1.data'  into table sbtest.sbtest1"
      

      Elapsed time (seconds): 240.996
      Process Read Bytes: 9916879781
      Process Write Bytes: 15149246158
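
      A sanity check that both load paths produce identical data (a sketch; CHECKSUM TABLE itself takes a while on 50M rows):

      -- Row count and checksum should match between the bulk and
      -- non-bulk runs.
      SELECT COUNT(*) FROM sbtest.sbtest1;
      CHECKSUM TABLE sbtest.sbtest1;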

      Restart the server again.

      Step 4: Load the same data once more, with bulk optimization (enforced by setting unique_checks=0,foreign_key_checks=0)

      mariadb -uroot -e "truncate sbtest.sbtest1; set unique_checks=0,foreign_key_checks=0; load data infile 'C:/work/sbtest1.data'  into table sbtest.sbtest1"
      

      This is the step that is very slow: it pegs the disk at roughly 1GB/s of reads and writes, and this is where it crashes on Linux if /tmp is in memory.
      While it is slow, CPU usage sits at about 3%, busy with reads and writes to temporary files.

      Elapsed time (seconds): 371.664
      Process Read Bytes: 167433868189
      Process Write Bytes: 170475124430
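
      The OS-level counters can be partially cross-checked from inside the server (a sketch; note that these status variables cover tablespace IO, so the merge-sort temp-file traffic would likely not show up here):

      -- Sample before and after the LOAD and take the difference.
      SHOW GLOBAL STATUS LIKE 'Innodb_data_read';
      SHOW GLOBAL STATUS LIKE 'Innodb_data_written';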

      Compare the execution times and IO usage of Step 3 (without bulk), Step 4 (with bulk), and the sysbench "prepare":

                            Execution time (s)   IO read (GiB)   IO write (GiB)
        Sysbench prepare                   273            13.4            17.97
        LOAD DATA no-bulk                  240            9.24            14.11
        LOAD DATA bulk                     371          155.93           158.77

      Activity

            Marko Mäkelä (marko) commented:

            Did you try to specify a larger innodb_sort_buffer_size to prevent the use of temporary files for merge sorting?

            If the data is typically sorted by PRIMARY KEY, I think that we would benefit from a parameter or statement attribute that specifies this, so that the bulk load could avoid sorting the data, and simply throw an error when unsorted data is encountered. Any secondary indexes would still have to use a sort buffer. For ALTER TABLE and OPTIMIZE TABLE there is some logic in innobase_pk_order_preserved() to skip the unnecessary sorting.

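            For reference, a minimal sketch of how one would try that: innodb_sort_buffer_size is not dynamic, so it has to go into the configuration (64M is the documented maximum, not a recommendation):

            -- Current value (default is 1M); the variable is read-only at
            -- runtime, so a larger buffer must be set in my.cnf, e.g.:
            --   [mysqld]
            --   innodb_sort_buffer_size=64M
            SELECT @@innodb_sort_buffer_size;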

            Marko Mäkelä (marko) commented:

            As noted in MDEV-34832, another bottleneck is AUTO_INCREMENT columns.


            Marko Mäkelä (marko) commented:

            Would providing a mechanism to enable only MDEV-515 but not MDEV-24621 for LOAD statements address both this and MDEV-34719?


            Vladislav Vaintroub (wlad) commented:

            marko, I think it would, for my use case.

            I'm still wondering how bulk is supposed to work with mysqldump-style inserts. Is MDEV-24621 sufficiently well tested on input files of tens of GB?


            Marko Mäkelä (marko) commented:

            Unfortunately, to my knowledge there is no user interface for claiming that the data is already sorted. If we had that, we could skip the merge sorting step and proceed straight to building the clustered index B-tree page by page. An optimization to skip sorting for the PRIMARY KEY is only implemented for ha_innobase::inplace_alter_table() when it is rebuilding the table in such a way that the ordering of records is not changing. This even allows some redefinition of the PRIMARY KEY, such as removing the last columns of a multi-column PRIMARY KEY.


            Vladislav Vaintroub (wlad) commented:

            OK, so in practice this MDEV-24621 would turn into a pessimization, and an attempt to force it by switching autocommit off in mysqldump would end in disaster on any sizable dump, right?


            Marko Mäkelä (marko) commented:

            I think that to improve this, we should do the following:

            • Disable the sorting for the PRIMARY KEY within LOAD DATA, similar to what we do in some forms of ALTER TABLE (innobase_pk_order_preserved()).
            • Ensure that the debug assertion in PageBulk::insertPage() for misordered records will not be tripped.
              • That is, make the bulk insert tolerate unsorted data in some way.
              • Maybe, if unsorted data is encountered, finish the bulk insert part and continue by inserting records one by one.
              • There must not be any extra record comparisons for any other operation than LOAD DATA, and not for any secondary index.
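            To illustrate what such an interface might look like, a purely hypothetical sketch; the variable name is invented for illustration and nothing like it exists today:

            -- HYPOTHETICAL: innodb_assume_sorted_input does not exist; it
            -- stands in for the proposed "data is presorted" attribute.
            SET SESSION innodb_assume_sorted_input=ON;
            SET SESSION unique_checks=0, foreign_key_checks=0;
            LOAD DATA INFILE 'C:/work/sbtest1.data' INTO TABLE sbtest.sbtest1;
            -- The server would skip the merge sort and raise an error at the
            -- first out-of-order PRIMARY KEY value.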

            People

              Assignee: Thirunarayanan Balathandayuthapani (thiru)
              Reporter: Vladislav Vaintroub (wlad)