Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-16796

TRUNCATE TABLE slowdown with innodb_file_per_table=ON

Details

    Description

      The documentation for TRUNCATE TABLE says the following:

      TRUNCATE TABLE is faster than DELETE, because it drops and re-creates a table. With XtraDB/InnoDB, this is faster if the innodb_file_per_table variable is ON (the default since MariaDB 5.5).

      https://mariadb.com/kb/en/library/truncate-table/

      What is the basis for the statement that it should be faster if innodb_file_per_table=ON is set? This does not seem to be true for a relatively small table with only 50k rows.

      For example, let's create a table and populate it with the following:

      CREATE TABLE test_table (
         id int NOT NULL PRIMARY KEY AUTO_INCREMENT, 
         val int
      );
       
      DELIMITER $$
       
      CREATE PROCEDURE prepare_data(num_rows int)
      BEGIN
        DECLARE i INT DEFAULT 0;
       
        WHILE i < num_rows DO
          INSERT INTO test_table (val) VALUES (FLOOR(RAND() * 1000000));
          SET i = i + 1;
        END WHILE;
      END$$
       
      DELIMITER ;
       
      CALL prepare_data(50000);
      

      Then let's run the following statements with both innodb_file_per_table set to ON and OFF:

      SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';
      CREATE TABLE truncate_test AS SELECT * FROM test_table;
      SET profiling = 1;
      TRUNCATE TABLE truncate_test;
      SHOW PROFILES;
      SHOW PROFILE ALL;
      DROP TABLE truncate_test;
      

      Here are the results:

      MariaDB [db1]> SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';
      +-----------------------+-------+
      | Variable_name         | Value |
      +-----------------------+-------+
      | innodb_file_per_table | ON    |
      +-----------------------+-------+
      1 row in set (0.00 sec)
       
      MariaDB [db1]> CREATE TABLE truncate_test AS SELECT * FROM test_table;
      Query OK, 50000 rows affected (0.24 sec)
      Records: 50000  Duplicates: 0  Warnings: 0
       
      MariaDB [db1]> SET profiling = 1;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [db1]> TRUNCATE TABLE truncate_test;
      Query OK, 0 rows affected (0.01 sec)
       
      MariaDB [db1]> SHOW PROFILES;
      +----------+------------+------------------------------+
      | Query_ID | Duration   | Query                        |
      +----------+------------+------------------------------+
      |        1 | 0.01077308 | TRUNCATE TABLE truncate_test |
      +----------+------------+------------------------------+
      1 row in set (0.00 sec)
       
      MariaDB [db1]> SHOW PROFILE ALL;
      +----------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+--------------+-------------+
      | Status               | Duration | CPU_user | CPU_system | Context_voluntary | Context_involuntary | Block_ops_in | Block_ops_out | Messages_sent | Messages_received | Page_faults_major | Page_faults_minor | Swaps | Source_function       | Source_file  | Source_line |
      +----------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+--------------+-------------+
      | starting             | 0.000017 | 0.000004 |   0.000007 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | NULL                  | NULL         |        NULL |
      | Opening tables       | 0.000007 | 0.000002 |   0.000004 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | open_ltable           | sql_base.cc  |        5176 |
      | System lock          | 0.000002 | 0.000001 |   0.000002 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_lock_tables     | lock.cc      |         308 |
      | Table lock           | 0.000002 | 0.000001 |   0.000001 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_lock_tables     | lock.cc      |         313 |
      | Opening tables       | 0.000002 | 0.000000 |   0.000001 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_lock_tables     | lock.cc      |         328 |
      | After opening tables | 0.000012 | 0.000005 |   0.000009 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | open_ltable           | sql_base.cc  |        5245 |
      | closing tables       | 0.000002 | 0.000001 |   0.000001 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | close_thread_tables   | sql_base.cc  |         919 |
      | Unlocking tables     | 0.000015 | 0.000005 |   0.000010 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_unlock_tables   | lock.cc      |         396 |
      | checking permissions | 0.000021 | 0.000008 |   0.000014 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | check_access          | sql_parse.cc |        6154 |
      | Opening tables       | 0.000170 | 0.004876 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | open_tables           | sql_base.cc  |        4532 |
      | After opening tables | 0.000004 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | open_tables           | sql_base.cc  |        4779 |
      | System lock          | 0.000003 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_lock_tables     | lock.cc      |         308 |
      | Table lock           | 0.010468 | 0.000000 |   0.000000 |                21 |                   0 |            8 |           256 |             0 |                 0 |                 0 |                 0 |     0 | mysql_lock_tables     | lock.cc      |         313 |
      | Writing to binlog    | 0.000018 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | write_bin_log         | sql_table.cc |        1969 |
      | query end            | 0.000005 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_execute_command | sql_parse.cc |        5706 |
      | closing tables       | 0.000002 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | close_thread_tables   | sql_base.cc  |         919 |
      | Unlocking tables     | 0.000007 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_unlock_tables   | lock.cc      |         396 |
      | freeing items        | 0.000003 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_parse           | sql_parse.cc |        7466 |
      | updating status      | 0.000008 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | dispatch_command      | sql_parse.cc |        1954 |
      | cleaning up          | 0.000002 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | dispatch_command      | sql_parse.cc |        1973 |
      +----------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+--------------+-------------+
      20 rows in set (0.00 sec)
       
      MariaDB [db1]> DROP TABLE truncate_test;
      Query OK, 0 rows affected (0.01 sec)
      

      MariaDB [db1]> SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';
      +-----------------------+-------+
      | Variable_name         | Value |
      +-----------------------+-------+
      | innodb_file_per_table | OFF   |
      +-----------------------+-------+
      1 row in set (0.00 sec)
       
      MariaDB [db1]> CREATE TABLE truncate_test AS SELECT * FROM test_table;
      Query OK, 50000 rows affected (0.16 sec)
      Records: 50000  Duplicates: 0  Warnings: 0
       
      MariaDB [db1]> SET profiling = 1;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [db1]> TRUNCATE TABLE truncate_test;
      Query OK, 0 rows affected (0.00 sec)
       
      MariaDB [db1]> SHOW PROFILES;
      +----------+------------+------------------------------+
      | Query_ID | Duration   | Query                        |
      +----------+------------+------------------------------+
      |        1 | 0.00474932 | TRUNCATE TABLE truncate_test |
      +----------+------------+------------------------------+
      1 row in set (0.00 sec)
       
      MariaDB [db1]> SHOW PROFILE ALL;
      +----------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+--------------+-------------+
      | Status               | Duration | CPU_user | CPU_system | Context_voluntary | Context_involuntary | Block_ops_in | Block_ops_out | Messages_sent | Messages_received | Page_faults_major | Page_faults_minor | Swaps | Source_function       | Source_file  | Source_line |
      +----------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+--------------+-------------+
      | starting             | 0.000016 | 0.000005 |   0.000006 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | NULL                  | NULL         |        NULL |
      | Opening tables       | 0.000008 | 0.000004 |   0.000004 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | open_ltable           | sql_base.cc  |        5176 |
      | System lock          | 0.000002 | 0.000001 |   0.000001 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_lock_tables     | lock.cc      |         308 |
      | Table lock           | 0.000002 | 0.000001 |   0.000001 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_lock_tables     | lock.cc      |         313 |
      | Opening tables       | 0.000002 | 0.000001 |   0.000001 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_lock_tables     | lock.cc      |         328 |
      | After opening tables | 0.000015 | 0.000008 |   0.000008 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | open_ltable           | sql_base.cc  |        5245 |
      | closing tables       | 0.000002 | 0.000001 |   0.000001 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | close_thread_tables   | sql_base.cc  |         919 |
      | Unlocking tables     | 0.000016 | 0.000008 |   0.000008 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 1 |     0 | mysql_unlock_tables   | lock.cc      |         396 |
      | checking permissions | 0.000020 | 0.000010 |   0.000010 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 1 |     0 | check_access          | sql_parse.cc |        6154 |
      | Opening tables       | 0.000147 | 0.000074 |   0.000075 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | open_tables           | sql_base.cc  |        4532 |
      | After opening tables | 0.000005 | 0.000002 |   0.000001 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | open_tables           | sql_base.cc  |        4779 |
      | System lock          | 0.000002 | 0.000001 |   0.000002 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_lock_tables     | lock.cc      |         308 |
      | Table lock           | 0.004469 | 0.001860 |   0.000000 |                11 |                   1 |            0 |            40 |             0 |                 0 |                 0 |                 5 |     0 | mysql_lock_tables     | lock.cc      |         313 |
      | Writing to binlog    | 0.000017 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | write_bin_log         | sql_table.cc |        1969 |
      | query end            | 0.000004 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_execute_command | sql_parse.cc |        5706 |
      | closing tables       | 0.000003 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | close_thread_tables   | sql_base.cc  |         919 |
      | Unlocking tables     | 0.000008 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_unlock_tables   | lock.cc      |         396 |
      | freeing items        | 0.000003 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | mysql_parse           | sql_parse.cc |        7466 |
      | updating status      | 0.000006 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | dispatch_command      | sql_parse.cc |        1954 |
      | cleaning up          | 0.000002 | 0.000000 |   0.000000 |                 0 |                   0 |            0 |             0 |             0 |                 0 |                 0 |                 0 |     0 | dispatch_command      | sql_parse.cc |        1973 |
      +----------------------+----------+----------+------------+-------------------+---------------------+--------------+---------------+---------------+-------------------+-------------------+-------------------+-------+-----------------------+--------------+-------------+
      20 rows in set (0.00 sec)
       
      MariaDB [db1]> DROP TABLE truncate_test;
      Query OK, 0 rows affected (0.00 sec)
      

      It seems to be consistently faster with innodb_file_per_table=OFF set. The difference seems to be in the Table lock state, and there seems to be a difference in the Block_ops_out value.

      Is this a bug, or is the statement in the documentation incorrect, or is there an exception to this rule for relatively small tables?

      Attachments

        1. adapt_0_per_table_0.svg
          510 kB
        2. adapt_0_per_table_1.svg
          443 kB
        3. mem_adapt_0_per_table_0.svg
          530 kB
        4. mem_adapt_0_per_table_1.svg
          521 kB
        5. mem_adapt_1_per_table_0_10.2_safe_truncate_0.svg
          508 kB
        6. mem_adapt_1_per_table_0_10.2_safe_truncate_1.svg
          391 kB
        7. mem_adapt_1_per_table_0.svg
          417 kB
        8. mem_adapt_1_per_table_1.svg
          534 kB
        9. per_table_0.svg
          330 kB
        10. per_table_1.svg
          422 kB

        Issue Links

          Activity

            GeoffMontee Geoff Montee (Inactive) created issue -

            marko, do you want to provide an "official" comment on this?

            elenst Elena Stepanova added a comment - marko , do you want to provide an "official" comment on this?
            elenst Elena Stepanova made changes -
            Field Original Value New Value
            Assignee Marko Mäkelä [ marko ]
            elenst Elena Stepanova made changes -
            Fix Version/s 10.1 [ 16100 ]

            I created a mysql-test-run test case for this:

            --source include/have_innodb.inc
            --source include/have_sequence.inc
            SET @save_per_table = @@GLOBAL.innodb_file_per_table;
            SET GLOBAL innodb_file_per_table=0;
            CREATE TABLE t0 (id int PRIMARY KEY AUTO_INCREMENT, val int) ENGINE=InnoDB;
            SET GLOBAL innodb_file_per_table=0;
            CREATE TABLE t1 LIKE t0;
            SET GLOBAL innodb_file_per_table = @save_per_table;
             
            insert t0 select null, 0 from seq_1_to_50000;
            insert t1 select null, 0 from seq_1_to_50000;
             
            SET profiling = 1;
            TRUNCATE TABLE t0;
            TRUNCATE TABLE t1;
            SHOW PROFILES;
            SHOW PROFILE ALL;
            DROP TABLE t0, t1;
            

            Running cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo with mysql-test-run --mem I see a small but noticeable difference in favour of innodb_file_per_table=1 (which is contrary to the claim):

            10.1 c886368a3ad89cc87eaf66344837d59afa49db7e

            SHOW PROFILES;
            Query_ID	Duration	Query
            1	0.00185556	TRUNCATE TABLE t0
            2	0.00122334	TRUNCATE TABLE t1
            

            With mysql-test-run -mysqld=-skip-innodb-adaptive-hash-index, the command runs faster (I’d guess because dropping the adaptive hash index for an .ibd file is faster), but a slight difference remains:

            10.1 c886368a3ad89cc87eaf66344837d59afa49db7e

            SHOW PROFILES;
            Query_ID	Duration	Query
            1	0.00036774	TRUNCATE TABLE t0
            2	0.00034451	TRUNCATE TABLE t1
            

            On 10.3 with the MDEV-13564 included, the difference is even bigger:

            10.3 aba5c72be28cbd3028623070b7bf3d7d1e128be1

            SHOW PROFILES;
            Query_ID	Duration	Query
            1	0.03168979	TRUNCATE TABLE t0
            2	0.00156857	TRUNCATE TABLE t1
            

            Again, if the buffer pool scan for dropping the adaptive hash index is removed, the relative difference becomes smaller:

            10.3 aba5c72be28cbd3028623070b7bf3d7d1e128be1

            SHOW PROFILES;
            Query_ID	Duration	Query
            1	0.00143449	TRUNCATE TABLE t0
            2	0.00091162	TRUNCATE TABLE t1
            

            I would like to see more detailed profile output (Linux perf or similar) that proves that TRUNCATE on an .ibd file is slower than for a table that is located in the InnoDB system tablespace, because I am consistently seeing the opposite. Could the file system I/O play a role?

            Note: My numbers are for optimized non-debug builds.

            marko Marko Mäkelä added a comment - I created a mysql-test-run test case for this: --source include/have_innodb.inc --source include/have_sequence.inc SET @save_per_table = @@ GLOBAL .innodb_file_per_table; SET GLOBAL innodb_file_per_table=0; CREATE TABLE t0 (id int PRIMARY KEY AUTO_INCREMENT, val int ) ENGINE=InnoDB; SET GLOBAL innodb_file_per_table=0; CREATE TABLE t1 LIKE t0; SET GLOBAL innodb_file_per_table = @save_per_table;   insert t0 select null , 0 from seq_1_to_50000; insert t1 select null , 0 from seq_1_to_50000;   SET profiling = 1; TRUNCATE TABLE t0; TRUNCATE TABLE t1; SHOW PROFILES; SHOW PROFILE ALL ; DROP TABLE t0, t1; Running cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo with mysql-test-run --mem I see a small but noticeable difference in favour of innodb_file_per_table=1 (which is contrary to the claim): 10.1 c886368a3ad89cc87eaf66344837d59afa49db7e SHOW PROFILES; Query_ID Duration Query 1 0.00185556 TRUNCATE TABLE t0 2 0.00122334 TRUNCATE TABLE t1 With mysql-test-run - mysqld= -skip-innodb-adaptive-hash-index , the command runs faster (I’d guess because dropping the adaptive hash index for an .ibd file is faster), but a slight difference remains: 10.1 c886368a3ad89cc87eaf66344837d59afa49db7e SHOW PROFILES; Query_ID Duration Query 1 0.00036774 TRUNCATE TABLE t0 2 0.00034451 TRUNCATE TABLE t1 On 10.3 with the MDEV-13564 included, the difference is even bigger: 10.3 aba5c72be28cbd3028623070b7bf3d7d1e128be1 SHOW PROFILES; Query_ID Duration Query 1 0.03168979 TRUNCATE TABLE t0 2 0.00156857 TRUNCATE TABLE t1 Again, if the buffer pool scan for dropping the adaptive hash index is removed, the relative difference becomes smaller: 10.3 aba5c72be28cbd3028623070b7bf3d7d1e128be1 SHOW PROFILES; Query_ID Duration Query 1 0.00143449 TRUNCATE TABLE t0 2 0.00091162 TRUNCATE TABLE t1 I would like to see more detailed profile output (Linux perf or similar) that proves that TRUNCATE on an .ibd file is slower than for a table that is located in the InnoDB system tablespace, because I am consistently seeing the opposite. Could the file system I/O play a role? Note: My numbers are for optimized non-debug builds.
            julien.fritsch Julien Fritsch made changes -
            Labels innodb truncate innodb need_feedback truncate

            marko,

            The test case that you shared above sets innodb_file_per_table=0 before creating both t0 and t1. Shouldn't it have set innodb_file_per_table=1 before creating one of those if you wanted to test the difference? Are your test results invalid, or was this just a copy/paste error?

            GeoffMontee Geoff Montee (Inactive) added a comment - marko , The test case that you shared above sets innodb_file_per_table=0 before creating both t0 and t1. Shouldn't it have set innodb_file_per_table=1 before creating one of those if you wanted to test the difference? Are your test results invalid, or was this just a copy/paste error?

            GeoffMontee, my intention was to SET GLOBAL innodb_file_per_table=1 before CREATE TABLE t1. When repeating the (corrected) test today, I do get slower results for t1. It is likely that I accidentally had the tables t0 and t1 swapped in my previous run, and also in the previous test, indeed TRUNCATE is slower for innodb_file_per_table=1 tables:

            10.1 2a576f71c5d3c7aacef564e5b1251f83bde48f51

            SHOW PROFILES;
            Query_ID	Duration	Query
            1	0.00125170	TRUNCATE TABLE t0
            2	0.00195823	TRUNCATE TABLE t1
            

            10.2 2308b9afec559cd8c5717007a7ad6821c374370d innodb_safe_truncate=ON

            SHOW PROFILES;
            Query_ID	Duration	Query
            1	0.00156317	TRUNCATE TABLE t0
            2	0.00237339	TRUNCATE TABLE t1
            

            With innodb_adaptive_hash_index=OFF the difference is bigger:

            10.1 2a576f71c5d3c7aacef564e5b1251f83bde48f51

            SHOW PROFILES;
            Query_ID	Duration	Query
            1	0.00038720	TRUNCATE TABLE t0
            2	0.00105073	TRUNCATE TABLE t1
            

            10.2 2308b9afec559cd8c5717007a7ad6821c374370d innodb_safe_truncate=ON

            SHOW PROFILES;
            Query_ID	Duration	Query
            1	0.00062888	TRUNCATE TABLE t0
            2	0.00156727	TRUNCATE TABLE t1
            

            10.2 2308b9afec559cd8c5717007a7ad6821c374370d innodb_safe_truncate=OFF

            SHOW PROFILES;
            Query_ID	Duration	Query
            1	0.00046643	TRUNCATE TABLE t0
            2	0.00121532	TRUNCATE TABLE t1
            

            The difference between the last two runs show the impact of MDEV-13564 (implementing TRUNCATE as a combination of RENAME, CREATE and DROP).

            marko Marko Mäkelä added a comment - GeoffMontee , my intention was to SET GLOBAL innodb_file_per_table=1 before CREATE TABLE t1 . When repeating the (corrected) test today, I do get slower results for t1 . It is likely that I accidentally had the tables t0 and t1 swapped in my previous run, and also in the previous test, indeed TRUNCATE is slower for innodb_file_per_table=1 tables: 10.1 2a576f71c5d3c7aacef564e5b1251f83bde48f51 SHOW PROFILES; Query_ID Duration Query 1 0.00125170 TRUNCATE TABLE t0 2 0.00195823 TRUNCATE TABLE t1 10.2 2308b9afec559cd8c5717007a7ad6821c374370d innodb_safe_truncate=ON SHOW PROFILES; Query_ID Duration Query 1 0.00156317 TRUNCATE TABLE t0 2 0.00237339 TRUNCATE TABLE t1 With innodb_adaptive_hash_index=OFF the difference is bigger: 10.1 2a576f71c5d3c7aacef564e5b1251f83bde48f51 SHOW PROFILES; Query_ID Duration Query 1 0.00038720 TRUNCATE TABLE t0 2 0.00105073 TRUNCATE TABLE t1 10.2 2308b9afec559cd8c5717007a7ad6821c374370d innodb_safe_truncate=ON SHOW PROFILES; Query_ID Duration Query 1 0.00062888 TRUNCATE TABLE t0 2 0.00156727 TRUNCATE TABLE t1 10.2 2308b9afec559cd8c5717007a7ad6821c374370d innodb_safe_truncate=OFF SHOW PROFILES; Query_ID Duration Query 1 0.00046643 TRUNCATE TABLE t0 2 0.00121532 TRUNCATE TABLE t1 The difference between the last two runs show the impact of MDEV-13564 (implementing TRUNCATE as a combination of RENAME , CREATE and DROP ).
            marko Marko Mäkelä made changes -
            Status Open [ 1 ] Confirmed [ 10101 ]
            marko Marko Mäkelä made changes -
            Labels innodb need_feedback truncate innodb truncate

            Please check with perf record or similar where the difference between innodb_file_per_table=1 and innodb_file_per_table=0 is coming from, with innodb_adaptive_hash_index=0.

            Side note: Why is 10.2 so much slower than 10.1? Could it be because of page_size_t and page_id_t?

            marko Marko Mäkelä added a comment - Please check with perf record or similar where the difference between innodb_file_per_table=1 and innodb_file_per_table=0 is coming from, with innodb_adaptive_hash_index=0 . Side note: Why is 10.2 so much slower than 10.1? Could it be because of page_size_t and page_id_t ?
            marko Marko Mäkelä made changes -
            Assignee Marko Mäkelä [ marko ] Eugene Kosov [ kevg ]
            kevg Eugene Kosov (Inactive) made changes -
            Attachment per_table_0.svg [ 46612 ]
            kevg Eugene Kosov (Inactive) made changes -
            Attachment per_table_1.svg [ 46613 ]
            kevg Eugene Kosov (Inactive) added a comment - - edited

            For 10.1 I used manually started perf record -g -p `pgrep mysqld` just before TRUNCATE. Thame flamegraphs.

            One query is roughly two times faster for me:

            uery_ID	Duration	Query
            1	0.00327482	TRUNCATE TABLE t0
            2	0.00695284	TRUNCATE TABLE t1
            

            And I see a bit similar staff for row_truncate_table_for_mysql(): 34 vs 74 samples.

            per_table_0.svg per_table_1.svg

            Flame graphs contains a lot of irrelevant info. If you open files in browser you may click on different function to zoom like here: http://www.brendangregg.com/FlameGraphs/example-perf.svg

            kevg Eugene Kosov (Inactive) added a comment - - edited For 10.1 I used manually started perf record -g -p `pgrep mysqld` just before TRUNCATE . Thame flamegraphs. One query is roughly two times faster for me: uery_ID Duration Query 1 0.00327482 TRUNCATE TABLE t0 2 0.00695284 TRUNCATE TABLE t1 And I see a bit similar staff for row_truncate_table_for_mysql() : 34 vs 74 samples. per_table_0.svg per_table_1.svg Flame graphs contains a lot of irrelevant info. If you open files in browser you may click on different function to zoom like here: http://www.brendangregg.com/FlameGraphs/example-perf.svg

            One slow thing is a tablespace removal. I mean a system call which unlinks file. Example times: 0.0076 vs 0.0091. But file_per_table=0 is even faster 0.0070. That's a times for seq_1_to_500000.

            kevg Eugene Kosov (Inactive) added a comment - One slow thing is a tablespace removal. I mean a system call which unlinks file. Example times: 0.0076 vs 0.0091. But file_per_table=0 is even faster 0.0070. That's a times for seq_1_to_500000 .
            kevg Eugene Kosov (Inactive) made changes -
            Status Confirmed [ 10101 ] In Progress [ 3 ]

            file_per_table=0 TRUNCATE: iterates all records on disk and removes them from adaptive hash.

            file_per_table=1 TRUNCATE: iterates pages in buffer pool, iterates all records in interesting pages and removes them from adaptive hash, after that it recreates tablespace.

            So, performance of both cases depends on a table size, records size and buffer pool size. I'm not post here my benchmarks but in every case per_table=0 was faster. Even when table is bit and buffer pool is small when per_table=0 iterates a lot and per_table=1 iterates a little the former is slower because of file removal.

            Major note: do not benchmark both cases simultaneously.

            And my results are for 10.1.

            kevg Eugene Kosov (Inactive) added a comment - file_per_table=0 TRUNCATE : iterates all records on disk and removes them from adaptive hash. file_per_table=1 TRUNCATE : iterates pages in buffer pool, iterates all records in interesting pages and removes them from adaptive hash, after that it recreates tablespace. So, performance of both cases depends on a table size, records size and buffer pool size. I'm not post here my benchmarks but in every case per_table=0 was faster. Even when table is bit and buffer pool is small when per_table=0 iterates a lot and per_table=1 iterates a little the former is slower because of file removal. Major note: do not benchmark both cases simultaneously. And my results are for 10.1.
            kevg Eugene Kosov (Inactive) made changes -
            Attachment adapt_0_per_table_0.svg [ 46621 ]
            Attachment adapt_0_per_table_1.svg [ 46622 ]

            Now with innodb_adapt_hash_index=0

            per_table=0 is roughly 0.0081
            per_table=1 is roughly 0.0121

            The first one still iterates index pages.
            The second one still unlinks disk file. And it's still slower. adapt_0_per_table_0.svg adapt_0_per_table_1.svg

            kevg Eugene Kosov (Inactive) added a comment - Now with innodb_adapt_hash_index=0 per_table=0 is roughly 0.0081 per_table=1 is roughly 0.0121 The first one still iterates index pages. The second one still unlinks disk file. And it's still slower. adapt_0_per_table_0.svg adapt_0_per_table_1.svg

            And now for --mem:

            adapt=1 per_table=0
             
            0.0062
            0.0058
            0.0056
             
            adapt=1 per_table=1
             
            0.0153
            0.0154
            0.0160
             
            adapt=0 per_table=0
             
            0.0024
            0.0027
            0.0025
             
            adapt=0 per_table=1
             
            0.0135
            0.0122
            0.0133
            
            

            kevg Eugene Kosov (Inactive) added a comment - And now for --mem : adapt=1 per_table=0   0.0062 0.0058 0.0056   adapt=1 per_table=1   0.0153 0.0154 0.0160   adapt=0 per_table=0   0.0024 0.0027 0.0025   adapt=0 per_table=1   0.0135 0.0122 0.0133

            And flame graphs. Nothing very new. Performance schema code became noticeable with per_table=0.
            mem_adapt_0_per_table_0.svg mem_adapt_0_per_table_1.svg mem_adapt_1_per_table_0.svg mem_adapt_1_per_table_1.svg

            kevg Eugene Kosov (Inactive) added a comment - And flame graphs. Nothing very new. Performance schema code became noticeable with per_table=0 . mem_adapt_0_per_table_0.svg mem_adapt_0_per_table_1.svg mem_adapt_1_per_table_0.svg mem_adapt_1_per_table_1.svg
            kevg Eugene Kosov (Inactive) made changes -
            Attachment mem_adapt_0_per_table_0.svg [ 46623 ]
            Attachment mem_adapt_0_per_table_1.svg [ 46624 ]
            Attachment mem_adapt_1_per_table_0.svg [ 46625 ]
            Attachment mem_adapt_1_per_table_1.svg [ 46626 ]
            marko Marko Mäkelä made changes -
            marko Marko Mäkelä made changes -

            kevg’s flame graphs are very useful at visualizing this.

            My summary of the problems found (or previously known):

            1. If innodb_adaptive_hash_index=ON and any adaptive hash index (AHI) entries exist on a table, table-rebuilding DDL operations (such as TRUNCATE) will be slow, because the AHI will have to be dropped, one record at a time. Before MDEV-16283, we would in some cases scan the buffer pool even if there were no AHI entries for the table in question. It could be a good idea to disable the AHI by default, or even to build without AHI support (option introduced in MDEV-12121).
            2. The operation will not return to the user before the file has been deleted. It is the file deletion that is really dominating the benchmark. This would be addressed in MDEV-8069.
            3. We are unnecessarily writing back garbage pages to the system tablespace. This should be fixed in MDEV-15528.

            The fixes require changing the undo log and redo log formats, so they cannot be done in a GA version.

            marko Marko Mäkelä added a comment - kevg ’s flame graphs are very useful at visualizing this. My summary of the problems found (or previously known): If innodb_adaptive_hash_index=ON and any adaptive hash index (AHI) entries exist on a table, table-rebuilding DDL operations (such as TRUNCATE ) will be slow, because the AHI will have to be dropped, one record at a time. Before MDEV-16283 , we would in some cases scan the buffer pool even if there were no AHI entries for the table in question. It could be a good idea to disable the AHI by default, or even to build without AHI support (option introduced in MDEV-12121 ). The operation will not return to the user before the file has been deleted. It is the file deletion that is really dominating the benchmark. This would be addressed in MDEV-8069 . We are unnecessarily writing back garbage pages to the system tablespace. This should be fixed in MDEV-15528 . The fixes require changing the undo log and redo log formats, so they cannot be done in a GA version.

            Now graphs for 10.2

            safe_truncate=1 is ha_innobase::create + ha_innobase::delete in profile.

            safe_truncate=0 does a little bit more stuff than in 10.1 mem_adapt_1_per_table_0_10.2_safe_truncate_0.svg mem_adapt_1_per_table_0_10.2_safe_truncate_1.svg

            kevg Eugene Kosov (Inactive) added a comment - Now graphs for 10.2 safe_truncate=1 is ha_innobase::create + ha_innobase::delete in profile. safe_truncate=0 does a little bit more stuff than in 10.1 mem_adapt_1_per_table_0_10.2_safe_truncate_0.svg mem_adapt_1_per_table_0_10.2_safe_truncate_1.svg

            And now some benchmark results. They're too variative, I know. But after looking at flame graphs I have an optinion than 10.2 is a little bit slower.

            safe_truncate=1 is significantly slower, indeed.

            mem per_table=0 adaptive=1 10.1
             
            0.0072
            0.0159
            0.0087
            0.0079
            0.0080
             
            mem per_table=0 adaptive=1 10.2 safe_truncate=1
             
            0.0093
            0.0094
            0.0089
             
            mem per_table=0 adaptive=1 10.2 safe_truncate=0
             
            0.0162
            0.0078
            0.0054
            0.0084
            0.0086
            

            kevg Eugene Kosov (Inactive) added a comment - And now some benchmark results. They're too variative, I know. But after looking at flame graphs I have an optinion than 10.2 is a little bit slower. safe_truncate=1 is significantly slower, indeed. mem per_table=0 adaptive=1 10.1   0.0072 0.0159 0.0087 0.0079 0.0080   mem per_table=0 adaptive=1 10.2 safe_truncate=1   0.0093 0.0094 0.0089   mem per_table=0 adaptive=1 10.2 safe_truncate=0   0.0162 0.0078 0.0054 0.0084 0.0086
            kevg Eugene Kosov (Inactive) made changes -

            Nothing more to do with this issue. AHI should be benchmarked to find cases where it brinds performance benefits. And if there aren't many such cases it probably should be removed. Here is the benchmarking task https://jira.mariadb.org/browse/MDEV-17492

            kevg Eugene Kosov (Inactive) added a comment - Nothing more to do with this issue. AHI should be benchmarked to find cases where it brinds performance benefits. And if there aren't many such cases it probably should be removed. Here is the benchmarking task https://jira.mariadb.org/browse/MDEV-17492
            kevg Eugene Kosov (Inactive) made changes -
            Fix Version/s 10.1.38 [ 23209 ]
            Fix Version/s 10.1 [ 16100 ]
            Resolution Won't Fix [ 2 ]
            Status In Progress [ 3 ] Closed [ 6 ]

            Thanks for those thorough benchmarks, kevg.

            marko,

            Thanks for the analysis.

            It's probably a good idea to fix the incorrect statement in the documentation. I've changed it to say the following:

            TRUNCATE TABLE is faster than DELETE, because it drops and re-creates a table. With XtraDB/InnoDB, this is slower if innodb_file_per_table=ON is set (the default since MariaDB 5.5). This is because TRUNCATE TABLE unlinks the underlying tablespace file, which can be an expensive operation. See MDEV-8069 for more details. TRUNCATE TABLE can also perform poorly in cases where the InnoDB buffer pool is very large. In that case, DROP TABLE followed by CREATE TABLE may perform better. In MariaDB 10.2.19 and later, this performance can also be improved by setting innodb_safe_truncate=OFF. See MDEV-9459 for more details. Setting innodb_adaptive_hash_index=OFF can improve performance of TRUNCATE TABLE in general.

            https://mariadb.com/kb/en/library/truncate-table/

            Did I forget or misunderstand anything important?

            Thanks!

            GeoffMontee Geoff Montee (Inactive) added a comment - Thanks for those thorough benchmarks, kevg . marko , Thanks for the analysis. It's probably a good idea to fix the incorrect statement in the documentation. I've changed it to say the following: TRUNCATE TABLE is faster than DELETE, because it drops and re-creates a table. With XtraDB/InnoDB, this is slower if innodb_file_per_table=ON is set (the default since MariaDB 5.5). This is because TRUNCATE TABLE unlinks the underlying tablespace file, which can be an expensive operation. See MDEV-8069 for more details. TRUNCATE TABLE can also perform poorly in cases where the InnoDB buffer pool is very large. In that case, DROP TABLE followed by CREATE TABLE may perform better. In MariaDB 10.2.19 and later, this performance can also be improved by setting innodb_safe_truncate=OFF. See MDEV-9459 for more details. Setting innodb_adaptive_hash_index=OFF can improve performance of TRUNCATE TABLE in general. https://mariadb.com/kb/en/library/truncate-table/ Did I forget or misunderstand anything important? Thanks!

            Hi!. I'll summarise:

            adapt=0 per_file=0:
            Iterate all table pages to mark them as free.

            adapt=0 per_file=1:
            Unlink file.

            adapt=1 per_file=0:
            Iterate all table pages to mark them as free. Also iterate every row to remove it from AHI.

            adapt=1 per_file=1:
            Iterate buffer pool pages to find records and remove them from AHI. Unlink file.

            > TRUNCATE TABLE can also perform poorly in cases where the InnoDB buffer pool is very large.

            That's a bit imprecise. Only adapt=1 per_file=1 case iterates through buffer pool.

            Everything else seems correct.

            Btw, I haven't checked https://mariadb.com/kb/en/library/truncate-table/ because it doesn't work for me today.

            kevg Eugene Kosov (Inactive) added a comment - Hi!. I'll summarise: adapt=0 per_file=0: Iterate all table pages to mark them as free. adapt=0 per_file=1: Unlink file. adapt=1 per_file=0: Iterate all table pages to mark them as free. Also iterate every row to remove it from AHI. adapt=1 per_file=1: Iterate buffer pool pages to find records and remove them from AHI. Unlink file. > TRUNCATE TABLE can also perform poorly in cases where the InnoDB buffer pool is very large. That's a bit imprecise. Only adapt=1 per_file=1 case iterates through buffer pool. Everything else seems correct. Btw, I haven't checked https://mariadb.com/kb/en/library/truncate-table/ because it doesn't work for me today.

            > TRUNCATE TABLE can also perform poorly in cases where the InnoDB buffer pool is very large.

            That's a bit imprecise. Only adapt=1 per_file=1 case iterates through buffer pool.

            The note about TRUNCATE and large buffer pools in the documentation was not originally related to this issue. It was related to MDEV-9459, which was related to the following upstream bug:

            https://bugs.mysql.com/bug.php?id=68184

            After looking more closely at the upstream bug report, I see that it is indeed directly related to the issues with innodb_adaptive_hash_index. Thanks for pointing that out. I've clarified the documentation.

            Btw, I haven't checked https://mariadb.com/kb/en/library/truncate-table/ because it doesn't work for me today.

            The knowledge base site has been having some intermittent issues after the latest web site update.

            GeoffMontee Geoff Montee (Inactive) added a comment - > TRUNCATE TABLE can also perform poorly in cases where the InnoDB buffer pool is very large. That's a bit imprecise. Only adapt=1 per_file=1 case iterates through buffer pool. The note about TRUNCATE and large buffer pools in the documentation was not originally related to this issue. It was related to MDEV-9459 , which was related to the following upstream bug: https://bugs.mysql.com/bug.php?id=68184 After looking more closely at the upstream bug report, I see that it is indeed directly related to the issues with innodb_adaptive_hash_index. Thanks for pointing that out. I've clarified the documentation. Btw, I haven't checked https://mariadb.com/kb/en/library/truncate-table/ because it doesn't work for me today. The knowledge base site has been having some intermittent issues after the latest web site update.
            GeoffMontee Geoff Montee (Inactive) made changes -
            kevg Eugene Kosov (Inactive) made changes -
            Fix Version/s 10.1.38 [ 23209 ]
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            Labels innodb truncate ServiceNow innodb truncate
            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) made changes -
            Labels ServiceNow innodb truncate 76qDvLB8Gju6Hs7nk3VY3EX42G795W5z innodb truncate
            serg Sergei Golubchik made changes -
            Labels 76qDvLB8Gju6Hs7nk3VY3EX42G795W5z innodb truncate innodb truncate
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 88510 ] MariaDB v4 [ 154694 ]
            mariadb-jira-automation Jira Automation (IT) made changes -
            Zendesk Related Tickets 201658 169729
            Zendesk active tickets 201658

            People

              kevg Eugene Kosov (Inactive)
              GeoffMontee Geoff Montee (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.