[MDEV-29035] review InnoDB page fill factor Created: 2022-07-05  Updated: 2022-08-16

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Fix Version/s: None

Type: Task Priority: Minor
Reporter: Axel Schwenke Assignee: Axel Schwenke
Resolution: Unresolved Votes: 0
Labels: None

Attachments: PNG File fill_factor_dss.png     PNG File sysbench_medium_comp.png     PNG File sysbench_medium_comp2.png     PNG File sysbench_read_write.png    
Issue Links:
Relates
relates to MDEV-26740 Inplace alter rebuild increases file ... Closed

 Description   

Tests undertaken in PERF-206 show significant performance improvements of 30% for sysbench-tpcc when INSERT uses a column default value instead of NULL. The reason is that the row is UPDATEd later, filling proper values for the columns inserted as NULL. This leads to a growing row image which eventually leads to a page split. The page split in turn leads to mutex contention.

The issue has been observed for MySQL as well, as explained here

We should review if the hardcoded page fill factor of 15/16 for InnoDB can be chosen better. Other DBMS use a lower fill factor of i.e.70% (PostgreSQL) or make it configurable (Oracle DB, configurable per table).



 Comments   
Comment by Axel Schwenke [ 2022-07-05 ]

I just found a KB entry about a global variable innodb_fill_factor. But it says it's only used for bulk operations. And indeed grepping for fill_factor in the InnoDB code gives only meaningful hits in btr0bulk.cc. What is more: the variable defaults to 100. WTF?

I think it should be also used for INSERT to the tail of a table. If such an INSERT would fill the page beyond the innodb_fill_factor, the row should go to a new page instead. This would cover most real world usecases.

Comment by Axel Schwenke [ 2022-07-07 ]

This shows the impact of setting innodb_fill_factor=90 instead of leaving it at the default of 100. For INSERT NULL the effect is clearly visible even though it impacts only the preloaded data (and not rows inserted during the benchmark).

Comment by Axel Schwenke [ 2022-07-07 ]

The data set size as reported by INFORMATION_SCHEMA for 1000 (10x 100) warehouses is not grown as one would expect:

fill_factor=100   fill_factor=90  
INSERT NULL INSERT DEFAULT INSERT NULL INSERT DEFAULT
72,21 72,94 72,05 72,48

The variant using DEFAULT is indeed a little bigger, but the difference between a fill factor of 90% vs. 100% is not visible. It should also be in the order of 10%.

This could be a side effect of the measuring method. It uses:

SELECT SUM(DATA_LENGTH)/1024/1024/1024 
FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = '${SCHEMA}'

although KB says:
For InnoDB/XtraDB, the index size, in pages, multiplied by the page size.
so if the smaller fill factor would result in more pages, it should show here.

Comment by Marko Mäkelä [ 2022-08-02 ]

Since MariaDB Server 10.2 (and MySQL 5.7), there are two code paths for inserting records into index trees. As noted in MDEV-26740, the fill factor is worse in the newer "bulk load" code path that is used by ALTER TABLE and since MDEV-24621 (10.6) also for inserting into an empty table or partition.

Postgres CREATE TABLE defaults to fillfactor=100 (complete packing). There is a Postgres benchmark that studies the impact of the fill factor.

Comment by Axel Schwenke [ 2022-08-02 ]

MDEV-29035 is about many issues:

  1. innodb_fill_factor not having any effect. At least none that would be visible by number of pages filled. It could be due to how the tables are loaded. sysbench-tpcc loads tables using "normal" inserts (like most real world apps will). It however always appends rows. I would assume that an insert to the physical end of a table respects innodb_fill_factor.
  2. when inserting NULL to a row and (much) later updating the row with the real value results in much worse performance compared to inserting the row with a default value. Growing the size of a row even by 1 byte can make a huge difference. This is what the initial writeup (for MySQL, by DimK) is all about.
  3. the default value of innodb_fill_factor = 100 makes no sense at all. I remembered a value of 15/16, but obviously this was ages ago.
Comment by Marko Mäkelä [ 2022-08-02 ]
  1. In June 2021, I submitted a Sysbench pull request to enable the use of the MDEV-515 bulk load mechanism (until MDEV-25036 provides a nicer interface). Any real world applications that care about data load performance should enable that.
  2. Only in ROW_FORMAT=REDUNDANT, SQL NULL values for fixed-length columns will allocate the same fixed length. In ROW_FORMAT=COMPACT (the default since MySQL 5.0.3) and later formats, NULL values only consume 1 bit in the null-flag bitmap. Similarly, after instant ADD COLUMN (MDEV-11369), updating instantly added columns from or to their initially specified default values will change the size of the records. This is a design constraint of the file formats, and not anything that could be changed easily.
  3. Indeed, if innodb_fill_factor=100 is the default, it does not seem to have the desired effect, based on the larger file sizes that have been observed in MDEV-26740.

I would suggest you to test the bulk load mechanism. The scope of this MDEV-29035 could be reduced to ensure that innodb_fill_factor will affect the normal DML code path as well. Maybe we should also introduce a proper index option for the MERGE_THRESHOLD, which was introduced as an index and table comment hack in MySQL 5.7 and MariaDB Server 10.2.

Comment by Axel Schwenke [ 2022-08-16 ]

I patched support for bulk loading (MDEV-515) into sysbench. I also added an option to create the main payload (column c=CHAR(120)) with variable length. Running a modified read-write workload with length(c) initially being rand(99,119) and later on constant size of 119 gave me those intructive results:


With fill_factor=95 all subsequent UPDATEs and INSERTs fit onto the page. The dataset has no need to grow and the performance is much better than with the default fill_factor=100.

Comment by Axel Schwenke [ 2022-08-16 ]

Note: OLTP read-write 50:50 = 10 point selects, 4 index updates, 4 non-index update, 1 delete, 1 insert.

Generated at Thu Feb 08 10:05:23 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.