Details

    Description

      Tests undertaken in PERF-206 show a significant performance improvement of about 30% for sysbench-tpcc when INSERT uses a column default value instead of NULL. The reason is that the row is UPDATEd later, filling in proper values for the columns that were inserted as NULL. This grows the row image, which eventually causes a page split; the page split in turn leads to mutex contention.
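
      As a minimal sketch of the two load patterns (table and column names are hypothetical, not taken from the benchmark):

      ```sql
      -- Hypothetical table; the payload column has a fixed-width default.
      CREATE TABLE orders (
        id     INT PRIMARY KEY,
        status CHAR(10) DEFAULT 'PENDING'
      ) ENGINE=InnoDB;

      -- Pattern 1: insert NULL, fill in the value later.
      -- In ROW_FORMAT=COMPACT the NULL occupies only a null-bitmap bit,
      -- so the later UPDATE grows the row image and can trigger a page split.
      INSERT INTO orders (id, status) VALUES (1, NULL);
      UPDATE orders SET status = 'SHIPPED' WHERE id = 1;

      -- Pattern 2: insert the default value up front.
      -- The row is allocated at (close to) its final size, so the later
      -- UPDATE does not need to grow the row.
      INSERT INTO orders (id) VALUES (2);
      UPDATE orders SET status = 'SHIPPED' WHERE id = 2;
      ```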

      The issue has been observed for MySQL as well, as explained here.

      We should review whether the hardcoded page fill factor of 15/16 in InnoDB could be chosen better. Other DBMSs use a lower fill factor, e.g. 70% (PostgreSQL), or make it configurable per table (Oracle DB).
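
      For comparison, PostgreSQL exposes the fill factor as a per-table (and per-index) storage parameter; a minimal sketch:

      ```sql
      -- PostgreSQL: pack heap pages only to 70%, leaving headroom on each
      -- page for updates (the PostgreSQL default fillfactor is 100).
      CREATE TABLE t (id int PRIMARY KEY, payload text) WITH (fillfactor = 70);

      -- The setting can also be changed later; it applies to newly written
      -- pages only, existing pages are repacked by VACUUM FULL or CLUSTER.
      ALTER TABLE t SET (fillfactor = 70);
      ```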

          Activity

            axel Axel Schwenke added a comment - - edited

            I just found a KB entry about a global variable innodb_fill_factor. But it says it's only used for bulk operations. And indeed grepping for fill_factor in the InnoDB code gives only meaningful hits in btr0bulk.cc. What is more: the variable defaults to 100. WTF?

            I think it should also be used for INSERTs at the tail of a table. If such an INSERT would fill the page beyond innodb_fill_factor, the row should go to a new page instead. This would cover most real-world use cases.

            axel Axel Schwenke added a comment -

            This shows the impact of setting innodb_fill_factor=90 instead of leaving it at the default of 100. For INSERT NULL the effect is clearly visible even though it impacts only the preloaded data (and not rows inserted during the benchmark).

            axel Axel Schwenke added a comment - - edited

            The data set size as reported by INFORMATION_SCHEMA for 1000 (10x 100) warehouses has not grown as one would expect:

                           fill_factor=100               fill_factor=90
            INSERT NULL    INSERT DEFAULT   INSERT NULL  INSERT DEFAULT
            72.21          72.94            72.05        72.48

            The variant using DEFAULT is indeed a little bigger, but the difference between a fill factor of 90% vs. 100% is not visible. It should be on the order of 10%.

            This could be a side effect of the measuring method. It uses:

            SELECT SUM(DATA_LENGTH)/1024/1024/1024 
            FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = '${SCHEMA}'
            

            although the KB says DATA_LENGTH is "For InnoDB/XtraDB, the index size, in pages, multiplied by the page size." So if the smaller fill factor resulted in more pages, it should show here.
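
            One way to cross-check the INFORMATION_SCHEMA numbers is to measure the on-disk .ibd files directly; a sketch, assuming innodb_file_per_table=ON (DATADIR and SCHEMA are placeholders for the actual setup):

            ```shell
            # Sum the on-disk sizes of all .ibd files for one schema and
            # report the total in GiB.
            DATADIR=${DATADIR:-/var/lib/mysql}
            SCHEMA=${SCHEMA:-sbtest}
            du -cb "${DATADIR}/${SCHEMA}"/*.ibd 2>/dev/null |
              awk 'END {printf "%.2f GiB\n", $1 / 1024 / 1024 / 1024}'
            ```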


            marko Marko Mäkelä added a comment -

            Since MariaDB Server 10.2 (and MySQL 5.7), there are two code paths for inserting records into index trees. As noted in MDEV-26740, the fill factor is worse in the newer "bulk load" code path that is used by ALTER TABLE and since MDEV-24621 (10.6) also for inserting into an empty table or partition.

            Postgres CREATE TABLE defaults to fillfactor=100 (complete packing). There is a Postgres benchmark that studies the impact of the fill factor.

            axel Axel Schwenke added a comment - - edited

            MDEV-29035 covers several issues:

            1. innodb_fill_factor not having any effect, at least none visible in the number of pages filled. It could be due to how the tables are loaded: sysbench-tpcc loads tables using "normal" inserts (like most real-world apps will), but it always appends rows. I would assume that an insert to the physical end of a table respects innodb_fill_factor.
            2. Inserting NULL into a row and (much) later updating it with the real value results in much worse performance compared to inserting the row with a default value. Growing the size of a row even by 1 byte can make a huge difference. This is what the initial writeup (for MySQL, by DimK) is all about.
            3. The default value of innodb_fill_factor = 100 makes no sense at all. I remembered a value of 15/16, but obviously that was ages ago.
            marko Marko Mäkelä added a comment -

            1. In June 2021, I submitted a Sysbench pull request to enable the use of the MDEV-515 bulk load mechanism (until MDEV-25036 provides a nicer interface). Any real-world application that cares about data load performance should enable that.
            2. Only in ROW_FORMAT=REDUNDANT do SQL NULL values for fixed-length columns allocate the full fixed length. In ROW_FORMAT=COMPACT (the default since MySQL 5.0.3) and later formats, NULL values only consume 1 bit in the null-flag bitmap. Similarly, after instant ADD COLUMN (MDEV-11369), updating instantly added columns from or to their initially specified default values will change the size of the records. This is a design constraint of the file formats, and not anything that could be changed easily.
            3. Indeed, if innodb_fill_factor=100 is the default, it does not seem to have the desired effect, based on the larger file sizes that have been observed in MDEV-26740.

            I would suggest that you test the bulk load mechanism. The scope of this MDEV-29035 could be reduced to ensuring that innodb_fill_factor affects the normal DML code path as well. Maybe we should also introduce a proper index option for the MERGE_THRESHOLD, which was introduced as an index and table comment hack in MySQL 5.7 and MariaDB Server 10.2.
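
            As I understand it, the MDEV-515 bulk load path is taken for the first insert into an empty table or partition when both unique_checks and foreign_key_checks are disabled; a sketch (staging_data is a hypothetical source table):

            ```sql
            -- Enable the bulk load code path for a data load.
            SET unique_checks = 0, foreign_key_checks = 0;
            BEGIN;
            INSERT INTO sbtest1 SELECT * FROM staging_data;
            COMMIT;
            -- Restore the defaults afterwards.
            SET unique_checks = 1, foreign_key_checks = 1;
            ```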

            axel Axel Schwenke added a comment -

            I patched support for bulk loading (MDEV-515) into sysbench. I also added an option to create the main payload (column c = CHAR(120)) with variable length. Running a modified read-write workload with length(c) initially being rand(99,119) and later a constant size of 119 gave me these instructive results:


            With fill_factor=95 all subsequent UPDATEs and INSERTs fit onto the page. The dataset has no need to grow, and performance is much better than with the default fill_factor=100.
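
            For reference, the stock sysbench OLTP table stores the payload in a fixed-width column; the variable-length variant described above might look like this (a sketch only, the actual patched schema is not shown in this issue):

            ```sql
            -- Stock sysbench sbtest table (payload c is CHAR(120)).
            CREATE TABLE sbtest1 (
              id  INT NOT NULL AUTO_INCREMENT,
              k   INT NOT NULL DEFAULT 0,
              c   CHAR(120) NOT NULL DEFAULT '',
              pad CHAR(60) NOT NULL DEFAULT '',
              PRIMARY KEY (id),
              KEY k_1 (k)
            ) ENGINE=InnoDB;

            -- Variable-length variant: load c with rand(99,119) characters,
            -- then UPDATE to the full 119 during the workload. The rows grow
            -- after loading unless the pages keep headroom (fill_factor < 100).
            ```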

            axel Axel Schwenke added a comment -

            Note: OLTP read-write 50:50 = 10 point selects, 4 index updates, 4 non-index updates, 1 delete, 1 insert.


            People

              axel Axel Schwenke
              axel Axel Schwenke