Details

    Description

      Tests undertaken in PERF-206 show a significant performance improvement of about 30% for sysbench-tpcc when INSERT uses a column default value instead of NULL. The reason is that the row is UPDATEd later, filling in proper values for the columns that were inserted as NULL. This grows the row image, which eventually causes a page split; the page split in turn leads to mutex contention.
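
      As a minimal sketch of the two load patterns (table and column names are hypothetical, not taken from the benchmark):

      ```sql
      -- Hypothetical table; the payload column has a fixed-width default.
      CREATE TABLE orders (
        id     INT PRIMARY KEY,
        status CHAR(10) DEFAULT 'PENDING'
      ) ENGINE=InnoDB;

      -- Pattern 1: insert NULL, fill in the value later.
      -- In ROW_FORMAT=COMPACT the NULL occupies only a null-bitmap bit,
      -- so the later UPDATE grows the row image and can trigger a page split.
      INSERT INTO orders (id, status) VALUES (1, NULL);
      UPDATE orders SET status = 'SHIPPED' WHERE id = 1;

      -- Pattern 2: insert the default value up front.
      -- The row is allocated at (close to) its final size, so the later
      -- UPDATE does not need to grow the row.
      INSERT INTO orders (id) VALUES (2);
      UPDATE orders SET status = 'SHIPPED' WHERE id = 2;
      ```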

      The issue has been observed for MySQL as well, as explained here.

      We should review whether the hardcoded page fill factor of 15/16 in InnoDB could be chosen better. Other DBMSs use a lower fill factor, e.g. 70% (PostgreSQL), or make it configurable per table (Oracle DB).
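
      For comparison, PostgreSQL exposes the fill factor as a per-table (and per-index) storage parameter; a minimal sketch:

      ```sql
      -- PostgreSQL: pack heap pages only to 70%, leaving headroom on each
      -- page for updates (the PostgreSQL default fillfactor is 100).
      CREATE TABLE t (id int PRIMARY KEY, payload text) WITH (fillfactor = 70);

      -- The setting can also be changed later; it applies to newly written
      -- pages only, existing pages are repacked by VACUUM FULL or CLUSTER.
      ALTER TABLE t SET (fillfactor = 70);
      ```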

          Activity

            axel Axel Schwenke added a comment - - edited

            I just found a KB entry about a global variable innodb_fill_factor. But it says it's only used for bulk operations. And indeed grepping for fill_factor in the InnoDB code gives only meaningful hits in btr0bulk.cc. What is more: the variable defaults to 100. WTF?

            I think it should also be used for INSERTs at the tail of a table. If such an INSERT would fill the page beyond innodb_fill_factor, the row should go to a new page instead. This would cover most real-world use cases.

            axel Axel Schwenke added a comment -

            This shows the impact of setting innodb_fill_factor=90 instead of leaving it at the default of 100. For INSERT NULL the effect is clearly visible even though it impacts only the preloaded data (and not rows inserted during the benchmark).

            axel Axel Schwenke added a comment - - edited

            The data set size as reported by INFORMATION_SCHEMA for 1000 (10x 100) warehouses has not grown as one would expect:

                           fill_factor=100               fill_factor=90
            INSERT NULL    INSERT DEFAULT   INSERT NULL  INSERT DEFAULT
            72.21          72.94            72.05        72.48

            The variant using DEFAULT is indeed a little bigger, but the difference between a fill factor of 90% vs. 100% is not visible. It should be on the order of 10%.

            This could be a side effect of the measuring method. It uses:

            SELECT SUM(DATA_LENGTH)/1024/1024/1024 
            FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = '${SCHEMA}'
            

            although the KB says DATA_LENGTH is "For InnoDB/XtraDB, the index size, in pages, multiplied by the page size." So if the smaller fill factor resulted in more pages, it should show here.
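
            One way to cross-check the INFORMATION_SCHEMA numbers is to measure the on-disk .ibd files directly; a sketch, assuming innodb_file_per_table=ON (DATADIR and SCHEMA are placeholders for the actual setup):

            ```shell
            # Sum the on-disk sizes of all .ibd files for one schema and
            # report the total in GiB.
            DATADIR=${DATADIR:-/var/lib/mysql}
            SCHEMA=${SCHEMA:-sbtest}
            du -cb "${DATADIR}/${SCHEMA}"/*.ibd 2>/dev/null |
              awk 'END {printf "%.2f GiB\n", $1 / 1024 / 1024 / 1024}'
            ```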


            marko Marko Mäkelä added a comment -

            Since MariaDB Server 10.2 (and MySQL 5.7), there are two code paths for inserting records into index trees. As noted in MDEV-26740, the fill factor is worse in the newer "bulk load" code path that is used by ALTER TABLE and since MDEV-24621 (10.6) also for inserting into an empty table or partition.

            Postgres CREATE TABLE defaults to fillfactor=100 (complete packing). There is a Postgres benchmark that studies the impact of the fill factor.

            axel Axel Schwenke added a comment - - edited

            MDEV-29035 covers several issues:

            1. innodb_fill_factor not having any effect, at least none visible in the number of pages filled. It could be due to how the tables are loaded: sysbench-tpcc loads tables using "normal" inserts (like most real-world apps will), but it always appends rows. I would assume that an insert to the physical end of a table respects innodb_fill_factor.
            2. Inserting NULL into a row and (much) later updating it with the real value results in much worse performance compared to inserting the row with a default value. Growing the size of a row even by 1 byte can make a huge difference. This is what the initial writeup (for MySQL, by DimK) is all about.
            3. The default value of innodb_fill_factor = 100 makes no sense at all. I remembered a value of 15/16, but obviously that was ages ago.
            marko Marko Mäkelä added a comment -

            1. In June 2021, I submitted a Sysbench pull request to enable the use of the MDEV-515 bulk load mechanism (until MDEV-25036 provides a nicer interface). Any real-world application that cares about data load performance should enable that.
            2. Only in ROW_FORMAT=REDUNDANT do SQL NULL values for fixed-length columns allocate the full fixed length. In ROW_FORMAT=COMPACT (the default since MySQL 5.0.3) and later formats, NULL values only consume 1 bit in the null-flag bitmap. Similarly, after instant ADD COLUMN (MDEV-11369), updating instantly added columns from or to their initially specified default values will change the size of the records. This is a design constraint of the file formats, and not anything that could be changed easily.
            3. Indeed, if innodb_fill_factor=100 is the default, it does not seem to have the desired effect, based on the larger file sizes that have been observed in MDEV-26740.

            I would suggest that you test the bulk load mechanism. The scope of this MDEV-29035 could be reduced to ensuring that innodb_fill_factor affects the normal DML code path as well. Maybe we should also introduce a proper index option for the MERGE_THRESHOLD, which was introduced as an index and table comment hack in MySQL 5.7 and MariaDB Server 10.2.
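
            As I understand it, the MDEV-515 bulk load path is taken for the first insert into an empty table or partition when both unique_checks and foreign_key_checks are disabled; a sketch (staging_data is a hypothetical source table):

            ```sql
            -- Enable the bulk load code path for a data load.
            SET unique_checks = 0, foreign_key_checks = 0;
            BEGIN;
            INSERT INTO sbtest1 SELECT * FROM staging_data;
            COMMIT;
            -- Restore the defaults afterwards.
            SET unique_checks = 1, foreign_key_checks = 1;
            ```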

            axel Axel Schwenke added a comment -

            I patched support for bulk loading (MDEV-515) into sysbench. I also added an option to create the main payload (column c = CHAR(120)) with variable length. Running a modified read-write workload with length(c) initially being rand(99,119) and later a constant size of 119 gave me these instructive results:


            With fill_factor=95 all subsequent UPDATEs and INSERTs fit onto the page. The dataset has no need to grow, and performance is much better than with the default fill_factor=100.
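
            For reference, the stock sysbench OLTP table stores the payload in a fixed-width column; the variable-length variant described above might look like this (a sketch only, the actual patched schema is not shown in this issue):

            ```sql
            -- Stock sysbench sbtest table (payload c is CHAR(120)).
            CREATE TABLE sbtest1 (
              id  INT NOT NULL AUTO_INCREMENT,
              k   INT NOT NULL DEFAULT 0,
              c   CHAR(120) NOT NULL DEFAULT '',
              pad CHAR(60) NOT NULL DEFAULT '',
              PRIMARY KEY (id),
              KEY k_1 (k)
            ) ENGINE=InnoDB;

            -- Variable-length variant: load c with rand(99,119) characters,
            -- then UPDATE to the full 119 during the workload. The rows grow
            -- after loading unless the pages keep headroom (fill_factor < 100).
            ```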

            axel Axel Schwenke added a comment -

            Note: OLTP read-write 50:50 = 10 point selects, 4 index updates, 4 non-index updates, 1 delete, 1 insert.


            People

              axel Axel Schwenke
              axel Axel Schwenke