Details

    Description

      This DB has about 30GB and a lot of tables in many databases:

      [mysql]# find ./ -name "*.ibd" | wc -l
      90219
      

      Start takes more than 5 minutes with heavy IO load made by mysql process:

      2017-09-22 10:07:27 140632756656096 [Note] InnoDB: Highest supported file format is Barracuda.
      2017-09-22 10:13:05 140632756656096 [Note] InnoDB: 128 out of 128 rollback segments are active.
      

      During the startup, strace shows this:

      cat /root/strace | grep open
       
      [pid 29937] open("./deploy_test_b2c15/data_sources.ibd", O_RDONLY) = 1363
      [pid 29937] open("./deploy_test_b2c15/data_sources.ibd", O_RDWR) = 1363
      [pid 29937] open("./deploy_test_b2c15/data_sources_backup.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./deploy_test_b2c15/data_sources_backup.ibd", O_RDONLY) = 4108
      [pid 29937] open("./deploy_test_b2c15/data_sources_backup.ibd", O_RDONLY) = 1364
      [pid 29937] open("./deploy_test_b2c15/data_sources_backup.ibd", O_RDWR) = 1364
      [pid 29937] open("./deploy_test_b2c15/data_sources_brands.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./deploy_test_b2c15/data_sources_brands.ibd", O_RDONLY) = 4108
      [pid 29937] open("./deploy_test_b2c15/data_sources_brands.ibd", O_RDONLY) = 1365
      [pid 29937] open("./deploy_test_b2c15/data_sources_brands.ibd", O_RDWR) = 1365
      [pid 29937] open("./deploy_test_b2c15/data_sources_categories.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./deploy_test_b2c15/data_sources_categories.ibd", O_RDONLY) = 4108
      [pid 29937] open("./deploy_test_b2c15/data_sources_categories.ibd", O_RDONLY) = 1366
      [pid 29937] open("./deploy_test_b2c15/data_sources_categories.ibd", O_RDWR) = 1366
      [pid 29937] open("./deploy_test_b2c15/data_sources_categories_brands.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./deploy_test_b2c15/data_sources_categories_brands.ibd", O_RDONLY) = 4108
      [pid 29937] open("./deploy_test_b2c15/data_sources_categories_brands.ibd", O_RDONLY) = 1367
      [pid 29937] open("./deploy_test_b2c15/data_sources_categories_brands.ibd", O_RDWR) = 1367
      [pid 29937] open("./deploy_test_b2c15/data_sources_img_queue.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./deploy_test_b2c15/data_sources_img_queue.ibd", O_RDONLY) = 4108
      [pid 29937] open("./deploy_test_b2c15/data_sources_img_queue.ibd", O_RDONLY) = 1368
      [pid 29937] open("./deploy_test_b2c15/data_sources_img_queue.ibd", O_RDWR) = 1368
      [pid 29937] open("./deploy_test_b2c15/data_sources_methods.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./deploy_test_b2c15/data_sources_methods.ibd", O_RDONLY) = 4108
      [pid 29937] open("./deploy_test_b2c15/data_sources_methods.ibd", O_RDONLY) = 1369
      [pid 29937] open("./deploy_test_b2c15/data_sources_methods.ibd", O_RDWR) = 1369
      [pid 29937] open("./deploy_test_b2c15/default_delivery_address.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./deploy_test_b2c15/default_delivery_address.ibd", O_RDONLY) = 4108
      [pid 29937] open("./deploy_test_b2c15/default_delivery_address.ibd", O_RDONLY) = 1370
      [pid 29937] open("./deploy_test_b2c15/default_delivery_address.ibd", O_RDWR) = 1370??
      

      Full strace:

      [pid 29937] open("./demo_b2c15/novinky4_kategorie_priv_write.ibd", O_RDONLY) = 3810
      [pid 29937] lseek(3810, 0, SEEK_CUR)    = 0
      [pid 29937] lseek(3810, 0, SEEK_END)    = 131072
      [pid 29937] lseek(3810, 0, SEEK_SET)    = 0
      [pid 29937] pread(3810, "\343t\2[\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\351\31;\355\0\10\0\0\0\0\0\0"..., 16384, 0) = 16384
      [pid 29937] close(3810)                 = 0
      [pid 29937] open("./demo_b2c15/novinky4_kategorie_priv_write.ibd", O_RDWR) = 3810
      [pid 29937] fcntl(3810, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
      [pid 29937] pread(3810, "\343t\2[\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\351\31;\355\0\10\0\0\0\0\0\0"..., 16384, 0) = 16384
      [pid 29937] open("./demo_b2c15/novinky4_soubory.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./demo_b2c15/novinky4_soubory.ibd", O_RDONLY) = 4108
      [pid 29937] fstat(4108, {st_mode=S_IFREG|0660, st_size=114688, ...}) = 0
      [pid 29937] pread(4108, "W\352G\231\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\351\31`/\0\10\0\0\0\0\0\0"..., 65536, 0) = 65536
      [pid 29937] close(4108)                 = 0
      [pid 29937] close(3811)                 = 0
      [pid 29937] open("./demo_b2c15/novinky4_soubory.ibd", O_RDONLY) = 3811
      [pid 29937] lseek(3811, 0, SEEK_CUR)    = 0
      [pid 29937] lseek(3811, 0, SEEK_END)    = 114688
      [pid 29937] lseek(3811, 0, SEEK_SET)    = 0
      [pid 29937] pread(3811, "W\352G\231\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\351\31`/\0\10\0\0\0\0\0\0"..., 16384, 0) = 16384
      [pid 29937] close(3811)                 = 0
      [pid 29937] open("./demo_b2c15/novinky4_soubory.ibd", O_RDWR) = 3811
      [pid 29937] fcntl(3811, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
      [pid 29937] pread(3811, "W\352G\231\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\351\31`/\0\10\0\0\0\0\0\0"..., 16384, 0) = 16384
      [pid 29937] open("./demo_b2c15/novinky5.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./demo_b2c15/novinky5.ibd", O_RDONLY) = 4108
      [pid 29937] fstat(4108, {st_mode=S_IFREG|0660, st_size=212992, ...}) = 0
      [pid 29937] pread(4108, "\6\242(\27\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\351\32\10\352\0\10\0\0\0\0\0\0"..., 65536, 0) = 65536
      [pid 29937] close(4108)                 = 0
      [pid 29937] close(3812)                 = 0
      [pid 29937] open("./demo_b2c15/novinky5.ibd", O_RDONLY) = 3812
      [pid 29937] lseek(3812, 0, SEEK_CUR)    = 0
      [pid 29937] lseek(3812, 0, SEEK_END)    = 212992
      [pid 29937] lseek(3812, 0, SEEK_SET)    = 0
      [pid 29937] pread(3812, "\6\242(\27\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\351\32\10\352\0\10\0\0\0\0\0\0"..., 16384, 0) = 16384
      [pid 29937] close(3812)                 = 0
      [pid 29937] open("./demo_b2c15/novinky5.ibd", O_RDWR) = 3812
      [pid 29937] fcntl(3812, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
      [pid 29937] pread(3812, "\6\242(\27\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\351\32\10\352\0\10\0\0\0\0\0\0"..., 16384, 0) = 16384
      [pid 29937] open("./demo_b2c15/novinky5_formulare.isl", O_RDWR) = -1 ENOENT (No such file or directory)
      [pid 29937] open("./demo_b2c15/novinky5_formulare.ibd", O_RDONLY) = 4108
      [pid 29937] fstat(4108, {st_mode=S_IFREG|0660, st_size=114688, ...}) = 0
      [pid 29937] pread(4108, "\253\270\202\216\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\351\32\323)\0\10\0\0\0\0\0\0"..., 65536, 0) = 65536
      [pid 29937] close(4108)                 = 0
      [pid 29937] close(3813)                 = 0
      

      I found the very same problem (https://jira.mariadb.org/browse/MDEV-12610) closed as solved in 10.2.7. This is 10.2.8 and looks like problem appears again.

      This db was 10.0.xx upgraded to 10.1.xx and finally to 10.2.8. Can make this upgrade this issue?

      Attachments

        Issue Links

          Activity

            In MariaDB 10.2.2, we seem to always open all .ibd files on startup, even when no crash recovery is involved. (When crash recovery is involved, we do this on purpose, as noted in MDEV-9843.)

            This mistake was made in the merge of MySQL 5.7.9 into MariaDB 10.2.2. The assignment validate=true in dict_check_sys_tables() causes InnoDB to read every single .ibd file at startup, even when no crash recovery is needed.

            marko Marko Mäkelä added a comment - In MariaDB 10.2.2, we seem to always open all .ibd files on startup, even when no crash recovery is involved. (When crash recovery is involved, we do this on purpose, as noted in MDEV-9843 .) This mistake was made in the merge of MySQL 5.7.9 into MariaDB 10.2.2 . The assignment validate=true in dict_check_sys_tables() causes InnoDB to read every single .ibd file at startup, even when no crash recovery is needed.

            As far as I can tell, the purpose of the parameter bool validate of fil_ibd_open() is to perform some extra checks in case any inconsistencies were caused by the implementation of DATA DIRECTORY, which was originally introduced in MySQL 5.6 and MariaDB 10.0.

            diff --git a/storage/innobase/dict/dict0load.cc b/storage/innobase/dict/dict0load.cc
            index 15e14b9d9f7..725862944bc 100644
            --- a/storage/innobase/dict/dict0load.cc
            +++ b/storage/innobase/dict/dict0load.cc
            @@ -1471,8 +1471,6 @@ dict_check_sys_tables(
             		char*	filepath = dict_get_first_path(space_id);
             
             		/* Check that the .ibd file exists. */
            -		validate = true; /* Encryption */
            -
             		dberr_t	err = fil_ibd_open(
             			validate,
             			!srv_read_only_mode && srv_log_file_size != 0,
            

            With the above patch, we would skip reading the files in fil_ibd_open() at database startup, but then we would also fail to detect some corruption, because the tests innodb.table_flags and innodb.log_file_name would start to fail.

            I think that we must try to implement some checks in fil_node_open_file() when first_time_open holds, so that we would always validate the data file when we open it for the very first time. That is a little tricky, because we will not have the dict_table_t available in that context.

            marko Marko Mäkelä added a comment - As far as I can tell, the purpose of the parameter bool validate of fil_ibd_open() is to perform some extra checks in case any inconsistencies were caused by the implementation of DATA DIRECTORY , which was originally introduced in MySQL 5.6 and MariaDB 10.0. diff --git a/storage/innobase/dict/dict0load.cc b/storage/innobase/dict/dict0load.cc index 15e14b9d9f7..725862944bc 100644 --- a/storage/innobase/dict/dict0load.cc +++ b/storage/innobase/dict/dict0load.cc @@ -1471,8 +1471,6 @@ dict_check_sys_tables( char* filepath = dict_get_first_path(space_id); /* Check that the .ibd file exists. */ - validate = true; /* Encryption */ - dberr_t err = fil_ibd_open( validate, !srv_read_only_mode && srv_log_file_size != 0, With the above patch, we would skip reading the files in fil_ibd_open() at database startup, but then we would also fail to detect some corruption, because the tests innodb.table_flags and innodb.log_file_name would start to fail. I think that we must try to implement some checks in fil_node_open_file() when first_time_open holds, so that we would always validate the data file when we open it for the very first time. That is a little tricky, because we will not have the dict_table_t available in that context.

            Not a work around, but if you are hitting this issue and on a systemd, you might need to tweak the service settings. If TimeoutStartSec is hit when systemd is trying to start up mariadb, it will kill the process.

            /etc/systemd/system/mariadb.service.d/tweak.conf

            [Service]
            TimeoutStartSec=7200

            My MariaDB startup time is now weighing in at 36 minutes, which by default would hit the TimeoutStartSec threshold making startup impossible.

            Even at 7200 seconds (2 hours), I got into trouble when upgrading from 10.2.11 to 10.2.12. The update process has another tax on restart (which I assume is similar to this recorded issue), that had the restart on upgrade hitting TimeoutStartSec. Not sure what would be an acceptable value for TimeoutStartSec.

            michaelcaplan Michael Caplan added a comment - Not a work around, but if you are hitting this issue and on a systemd, you might need to tweak the service settings. If TimeoutStartSec is hit when systemd is trying to start up mariadb, it will kill the process. /etc/systemd/system/mariadb.service.d/tweak.conf [Service] TimeoutStartSec=7200 My MariaDB startup time is now weighing in at 36 minutes, which by default would hit the TimeoutStartSec threshold making startup impossible. Even at 7200 seconds (2 hours), I got into trouble when upgrading from 10.2.11 to 10.2.12. The update process has another tax on restart (which I assume is similar to this recorded issue), that had the restart on upgrade hitting TimeoutStartSec. Not sure what would be an acceptable value for TimeoutStartSec.

            @marko are there any workarounds for this issue for Maria 10.2? Or is a rollback to 10.1 the best bet?

            michaelcaplan Michael Caplan added a comment - @marko are there any workarounds for this issue for Maria 10.2? Or is a rollback to 10.1 the best bet?

            FYI, my 36 minute start time is for a DB roughly 290GB and roughly 991K tables

            michaelcaplan Michael Caplan added a comment - FYI, my 36 minute start time is for a DB roughly 290GB and roughly 991K tables
            danblack Daniel Black added a comment -

            michaelcaplan FYI - working on getting co-operation between mariadb and systemd regarding startup/shutdown timings. https://github.com/MariaDB/server/pull/576 . Unfortunately requires systemd v236. So I'm really watching this to make sure I have some idea of which slow points in startup/shutdown can't be prevented.

            danblack Daniel Black added a comment - michaelcaplan FYI - working on getting co-operation between mariadb and systemd regarding startup/shutdown timings. https://github.com/MariaDB/server/pull/576 . Unfortunately requires systemd v236. So I'm really watching this to make sure I have some idea of which slow points in startup/shutdown can't be prevented.

            Sorry, but it turns out that there is one more cause of reading all .ibd files at startup. That one was added in MySQL 5.7, which was first merged to MariaDB 10.2.2. See MDEV-15333.

            marko Marko Mäkelä added a comment - Sorry, but it turns out that there is one more cause of reading all .ibd files at startup. That one was added in MySQL 5.7, which was first merged to MariaDB 10.2.2. See MDEV-15333 .

            People

              marko Marko Mäkelä
              daniel.rimal@cldn.cz Dan Rimal
              Votes:
              1 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.