Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-27074

container start-up to take a long time

Details

    • Bug
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Fixed
    • 10.6.5
    • 10.2.42, 10.3.33, 10.4.23, 10.5.14, 10.6.6, 10.7.2
    • Docker
    • None
    • Windows 10, Docker Desktop for Windows WSLv2, AMD Ryzen 5800X CPU, 32GB 3600MHz DDR4, and 2TB Samsung 980 Pro PCI-E 4.0 x4 SSD

    Description

      On startup the image throws the following warnings, each of which seems to contribute to the container taking longer to finish start-up tasks, making full start-up take over a minute.

      Least impactful is the first warning-

      2021-11-17 20:47:02 1 [Warning] Failed to load slave replication state from table mysql.gtid_slave_pos: 1017: Can't find file: './mysql/' (errno: 2 "No such file or directory")

      This one is probably fine as-is, but if it can be addressed, would be nice.

      The bigger issues are these three-

      2021-11-17 20:47:07 0 [Note] mariadbd: ready for connections.
      Version: '10.6.5-MariaDB-1:10.6.5+maria~focal-log'  socket: '/run/mysqld/mysqld.sock'  port: 0  mariadb.org binary distribution
      2021-11-17 20:47:07+00:00 [Note] [Entrypoint]: Temporary server started.
      Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
      Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it.
      Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it.
      2021-11-17 20:47:57 6 [Warning] 'proxies_priv' entry '@% root@3ca6dd5b620d' ignored in --skip-name-resolve mode.
      2021-11-17 20:47:57+00:00 [Note] [Entrypoint]: Creating database test
      2021-11-17 20:47:57+00:00 [Note] [Entrypoint]: Creating user tester
      2021-11-17 20:47:57+00:00 [Note] [Entrypoint]: Giving user tester access to schema test
      2021-11-17 20:47:57+00:00 [Note] [Entrypoint]: Stopping temporary server
      2021-11-17 20:47:57 0 [Note] mariadbd (initiated by: root[root] @ localhost []): Normal shutdown
      2021-11-17 20:47:57 0 [Note] InnoDB: FTS optimize thread exiting.

      Key here are the time stamps- it is 20:47:07 before the Warning: Unable to load... entries occur and then 20:47:57 after. It seems like the system waits for these to time out before proceeding.

      The amount of time here doesn't look significant, but this was a faster than typical run for this- it can take longer. It also feels inordinately long compared to when the temporary server starts up (and also quite long compared to Nginx and PHP-FPM container start-up times, making MariaDB the bottleneck for launching new EMP stacks). If we can resolve or work-around this issue, the MariaDB Docker container will feel much more responsive and yield a much improved user experience.

      Attachments

        Issue Links

          Activity

            danblack Daniel Black added a comment -

            The unable to load message itself isn't incuring a timeout, these are just that these particular files are zoneinfo file. The timezone loading is quite IO intense. I did reduce the time in https://github.com/MariaDB/mariadb-docker/commit/88ff4eed96a7662c1a06b16397592af954770070, however there are better ways like LOCK TABLES ... WRITE on these that I haven't done. 50 seconds does seem like far too long however.

            If you don't need the timezones - MARIADB_INITDB_SKIP_TZINFO=1 can help.

            On the replication, yes there is an every growing list of things that should be implicitly part of mysql_install_db. It will be addressed soon.

            I'll look into the number of restarts and what is being done in each. Do you have a mariadb configuration file that may hint as to which bits are taking time in the server itself?

            danblack Daniel Black added a comment - The unable to load message itself isn't incuring a timeout, these are just that these particular files are zoneinfo file. The timezone loading is quite IO intense. I did reduce the time in https://github.com/MariaDB/mariadb-docker/commit/88ff4eed96a7662c1a06b16397592af954770070 , however there are better ways like LOCK TABLES ... WRITE on these that I haven't done. 50 seconds does seem like far too long however. If you don't need the timezones - MARIADB_INITDB_SKIP_TZINFO=1 can help. On the replication, yes there is an every growing list of things that should be implicitly part of mysql_install_db. It will be addressed soon. I'll look into the number of restarts and what is being done in each. Do you have a mariadb configuration file that may hint as to which bits are taking time in the server itself?

            [mariadb]
            server_id=1001
             
            max_connections=60
            max_user_connections=50
             
            lower_case_table_names=1
            character_set_server=utf8mb4
            collation_server=utf8mb4_unicode_ci
             
            log_bin = mariadb-bin
            binlog_format = ROW
            sync_binlog=1
             
            innodb_flush_log_at_trx_commit=1
            innodb_buffer_pool_size=2G
            innodb_log_file_size=2G

            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) added a comment - [mariadb] server_id=1001   max_connections=60 max_user_connections=50   lower_case_table_names=1 character_set_server=utf8mb4 collation_server=utf8mb4_unicode_ci   log_bin = mariadb-bin binlog_format = ROW sync_binlog=1   innodb_flush_log_at_trx_commit=1 innodb_buffer_pool_size=2G innodb_log_file_size=2G
            danblack Daniel Black added a comment -

            Thanks rob.schwyzer@mariadb.com

            Can you please test the image quay.io/danielgblack/mariadb-test:10.6_tz_init_lock_rather_than_alter which includes this change. Can you include -e MARIADB_RANDOM_ROOT_PASSWORD=1 in your test as the root password in the output provides a good separate between the time consumed by TZ initialization compared to the other generic user setup.

            Enabling the binlog, esp with sync_binlog is obviously going to be added IO. Overall the above fix might improve this somewhat, but I think I can do better.

            There's no need for a record of the binary log from the initialization (please correct me if there is). So the part 2, on top of the above, is to remove the the log-bin during initialization. So please test the timings of quay.io/danielgblack/mariadb-test:10.6_no_log_bin_on_temp too which is based on this change.

            danblack Daniel Black added a comment - Thanks rob.schwyzer@mariadb.com Can you please test the image quay.io/danielgblack/mariadb-test:10.6_tz_init_lock_rather_than_alter which includes this change . Can you include -e MARIADB_RANDOM_ROOT_PASSWORD=1 in your test as the root password in the output provides a good separate between the time consumed by TZ initialization compared to the other generic user setup. Enabling the binlog, esp with sync_binlog is obviously going to be added IO. Overall the above fix might improve this somewhat, but I think I can do better. There's no need for a record of the binary log from the initialization (please correct me if there is). So the part 2, on top of the above, is to remove the the log-bin during initialization. So please test the timings of quay.io/danielgblack/mariadb-test:10.6_no_log_bin_on_temp too which is based on this change .

            Thank you danblack!

            I first tested the image as-is:

            2021-12-08 21:09:06+00:00 [Note] [Entrypoint]: Temporary server started.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it.
            stack-db-1       | 2021-12-08 21:09:55 6 [Warning] 'proxies_priv' entry '@% root@361818db6f5d' ignored in --skip-name-resolve mode.

            I next tested the image to use -e MARIADB_RANDOM_ROOT_PASSWORD=1 instead:

            stack-db-1       | 2021-12-08 21:12:30+00:00 [Note] [Entrypoint]: Temporary server started.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it.
            stack-db-1       | 2021-12-08 21:13:18+00:00 [Note] [Entrypoint]: GENERATED ROOT PASSWORD: p&_%W9~FqsSI"jsx9|[:l_e{*"]4?1w.
            stack-db-1       | 2021-12-08 21:13:18 6 [Warning] 'proxies_priv' entry '@% root@6154ea28819f' ignored in --skip-name-resolve mode.

            No, notable improvement.

            Now, using the 2nd image you provided instead, with the generated root password:

            stack-db-1       | 2021-12-08 21:16:47+00:00 [Note] [Entrypoint]: Temporary server started.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it.
            stack-db-1       | 2021-12-08 21:16:48+00:00 [Note] [Entrypoint]: GENERATED ROOT PASSWORD: J+An%nDp|<Q054TG;mo>e^Du)zql>:#H
            stack-db-1       | 2021-12-08 21:16:48 5 [Warning] 'proxies_priv' entry '@% root@897de353c4f9' ignored in --skip-name-resolve mode.

            Night and day difference. Finally, testing with the original ENV config:

            stack-db-1       | 2021-12-08 21:17:43+00:00 [Note] [Entrypoint]: Temporary server started.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it.
            stack-db-1       | 2021-12-08 21:17:45 5 [Warning] 'proxies_priv' entry '@% root@54fd176a0175' ignored in --skip-name-resolve mode.

            So quay.io/danielgblack/mariadb-test:10.6_no_log_bin_on_temp is the real difference-maker here.

            Finally, checking on my end with sync_binlog=1 commented out while using the original MariaDB 10.6 container image:

            stack-db-1       | 2021-12-08 21:20:02+00:00 [Note] [Entrypoint]: Temporary server started.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it.
            stack-db-1       | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it.
            stack-db-1       | 2021-12-08 21:20:04 6 [Warning] 'proxies_priv' entry '@% root@257c09280fc4' ignored in --skip-name-resolve mode.

            As expected, this is all caused by sync_binlog=1 (I spent more time evaluating hardware resource use on my end yesterday and noticed my Samsung 980 Pro 2TB was reporting 100% usage despite pumping out only ~5MB/s which usually means it's doing an enormous number of tiny writes), so your work-around of disabling binary logging for the temporary server part fixes the issue caused by sync_binlog=1. And honestly... this is perfect! There are use-cases where sync_binlog=1 makes sense and where we want to test with that on, and quay.io/danielgblack/mariadb-test:10.6_no_log_bin_on_temp enables this use-case wonderfully.

            If quay.io/danielgblack/mariadb-test:10.6_no_log_bin_on_temp can be merged upstream, my concerns here are fully addressed. Thank you!

            rob.schwyzer@mariadb.com Rob Schwyzer (Inactive) added a comment - Thank you danblack ! I first tested the image as-is: 2021-12-08 21:09:06+00:00 [Note] [Entrypoint]: Temporary server started. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it. stack-db-1 | 2021-12-08 21:09:55 6 [Warning] 'proxies_priv' entry '@% root@361818db6f5d' ignored in --skip-name-resolve mode. I next tested the image to use -e MARIADB_RANDOM_ROOT_PASSWORD=1 instead: stack-db-1 | 2021-12-08 21:12:30+00:00 [Note] [Entrypoint]: Temporary server started. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it. stack-db-1 | 2021-12-08 21:13:18+00:00 [Note] [Entrypoint]: GENERATED ROOT PASSWORD: p&_%W9~FqsSI"jsx9|[:l_e{*"]4?1w. stack-db-1 | 2021-12-08 21:13:18 6 [Warning] 'proxies_priv' entry '@% root@6154ea28819f' ignored in --skip-name-resolve mode. No, notable improvement. Now, using the 2nd image you provided instead, with the generated root password: stack-db-1 | 2021-12-08 21:16:47+00:00 [Note] [Entrypoint]: Temporary server started. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it. stack-db-1 | 2021-12-08 21:16:48+00:00 [Note] [Entrypoint]: GENERATED ROOT PASSWORD: J+An%nDp|<Q054TG;mo>e^Du)zql>:#H stack-db-1 | 2021-12-08 21:16:48 5 [Warning] 'proxies_priv' entry '@% root@897de353c4f9' ignored in --skip-name-resolve mode. Night and day difference. Finally, testing with the original ENV config: stack-db-1 | 2021-12-08 21:17:43+00:00 [Note] [Entrypoint]: Temporary server started. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it. stack-db-1 | 2021-12-08 21:17:45 5 [Warning] 'proxies_priv' entry '@% root@54fd176a0175' ignored in --skip-name-resolve mode. So quay.io/danielgblack/mariadb-test:10.6_no_log_bin_on_temp is the real difference-maker here. Finally, checking on my end with sync_binlog=1 commented out while using the original MariaDB 10.6 container image: stack-db-1 | 2021-12-08 21:20:02+00:00 [Note] [Entrypoint]: Temporary server started. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/leapseconds' as time zone. Skipping it. stack-db-1 | Warning: Unable to load '/usr/share/zoneinfo/tzdata.zi' as time zone. Skipping it. stack-db-1 | 2021-12-08 21:20:04 6 [Warning] 'proxies_priv' entry '@% root@257c09280fc4' ignored in --skip-name-resolve mode. As expected, this is all caused by sync_binlog=1 (I spent more time evaluating hardware resource use on my end yesterday and noticed my Samsung 980 Pro 2TB was reporting 100% usage despite pumping out only ~5MB/s which usually means it's doing an enormous number of tiny writes), so your work-around of disabling binary logging for the temporary server part fixes the issue caused by sync_binlog=1 . And honestly... this is perfect! There are use-cases where sync_binlog=1 makes sense and where we want to test with that on, and quay.io/danielgblack/mariadb-test:10.6_no_log_bin_on_temp enables this use-case wonderfully. If quay.io/danielgblack/mariadb-test:10.6_no_log_bin_on_temp can be merged upstream, my concerns here are fully addressed. Thank you!
            danblack Daniel Black added a comment -

            Thanks for testing.

            https://github.com/MariaDB/mariadb-docker/commit/c6c9e6cd78d12cdb837348c644de4d543ad3dee1
            https://github.com/MariaDB/mariadb-docker/commit/146e09c9c7aa5d301ac96c1939fdd112f802f4ae

            Will be out next release, or maybe sooner if I get a few major features of healthcheck/mariadb-upgrade resolved.

            I've omitted the TZ initialization improvements, but rerolled up a fix for MDEV-23326 and pinged the reviewer for a server fix. Then you can gain the last second of start up time .

            danblack Daniel Black added a comment - Thanks for testing. https://github.com/MariaDB/mariadb-docker/commit/c6c9e6cd78d12cdb837348c644de4d543ad3dee1 https://github.com/MariaDB/mariadb-docker/commit/146e09c9c7aa5d301ac96c1939fdd112f802f4ae Will be out next release, or maybe sooner if I get a few major features of healthcheck/mariadb-upgrade resolved. I've omitted the TZ initialization improvements, but rerolled up a fix for MDEV-23326 and pinged the reviewer for a server fix. Then you can gain the last second of start up time .

            People

              danblack Daniel Black
              rob.schwyzer@mariadb.com Rob Schwyzer (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.