Details

    • 5.5.55, 10.2.5-1

    Attachments

      1. jan.opt
        0.1 kB
      2. jan.test
        0.9 kB

      Issue Links

        Activity

          jplindst Jan Lindström (Inactive) created issue -
          ratzpo Rasmus Johansson (Inactive) made changes -
          Field Original Value New Value
          Sprint 5.5.55&10.0.30 [ 138 ]
          ratzpo Rasmus Johansson (Inactive) made changes -
          Rank Ranked lower
          jplindst Jan Lindström (Inactive) made changes -
          Status Open [ 1 ] In Progress [ 3 ]

          On a related note: When the log files are missing on startup and they have just been created by the very same InnoDB startup run, also in that case InnoDB will display messages about doing crash recovery, even though there is nothing to recover. This is the case at least in 10.2, maybe also 10.0 and 10.1.

          marko Marko Mäkelä added a comment - On a related note: When the log files are missing on startup and they have just been created by the very same InnoDB startup run, also in that case InnoDB will display messages about doing crash recovery, even though there is nothing to recover. This is the case at least in 10.2, maybe also 10.0 and 10.1.

          For review, please use this benchmark which is applicable to 10.0 and later:

          ./mtr innodb.log_file_size
          wc -l var/log/mysqld.1.err
          

          Display the number of lines in the server error log before and after the fix. Preferrably, display the message lines that were omitted by the patched code.

          marko Marko Mäkelä added a comment - For review, please use this benchmark which is applicable to 10.0 and later: ./mtr innodb.log_file_size wc -l var/log/mysqld.1.err Display the number of lines in the server error log before and after the fix. Preferrably, display the message lines that were omitted by the patched code.

          If we avoid assigning recv_needed_recovery=true when the redo log files are missing on startup, we will also avoid a costly process of opening all *.ibd files.

          marko Marko Mäkelä added a comment - If we avoid assigning recv_needed_recovery=true when the redo log files are missing on startup, we will also avoid a costly process of opening all *.ibd files.

          MySQL Bug#78844 was filed against MySQL 5.7, so it would only apply to MariaDB 10.2.2 onwards.
          I think that fixing MySQL Bug#80788 would have a better impact.

          marko Marko Mäkelä added a comment - MySQL Bug#78844 was filed against MySQL 5.7, so it would only apply to MariaDB 10.2.2 onwards. I think that fixing MySQL Bug#80788 would have a better impact.

          Can't use patch on #80788 as we do not have necessary license (provided one is not for MariaDB). Maybe you could invent your own solution for that "bug" on its own MDEV.

          jplindst Jan Lindström (Inactive) added a comment - Can't use patch on #80788 as we do not have necessary license (provided one is not for MariaDB). Maybe you could invent your own solution for that "bug" on its own MDEV.
          jplindst Jan Lindström (Inactive) added a comment - - edited

          With attached test case, I could reduce the Doing recovery... outputs from 26 to 1:

          ~/mysql/10.0/mysql-test/var/log$ grep "Doing recovery" mysqld.err | wc -l
          26
          ~/mysql/10.0/mysql-test/var/log$ grep "Doing recovery" mysqld.1.err | wc -l
          1
          

          jplindst Jan Lindström (Inactive) added a comment - - edited With attached test case, I could reduce the Doing recovery... outputs from 26 to 1: ~/mysql/10.0/mysql-test/var/log$ grep "Doing recovery" mysqld.err | wc -l 26 ~/mysql/10.0/mysql-test/var/log$ grep "Doing recovery" mysqld.1.err | wc -l 1
          jplindst Jan Lindström (Inactive) made changes -
          Attachment jan.opt [ 43339 ]
          Attachment jan.test [ 43340 ]
          marko Marko Mäkelä made changes -

          I filed MDEV-12103 for fixing MySQL Bug#80788 in MariaDB.
          This request is about increasing the signal-to-noise ratio of the error log output of crash recovery. The InnoDB crash recovery consists of a few tasks:

          1. Apply the redo log.
          2. If a DDL transaction was active when the server was killed, roll it back.
          3. Drop any incompletely created indexes.
          4. Start a background thread to roll back any recovered transactions.
          5. Start accepting connections.

          I think that as part of this ticket, we should concentrate on the redo log apply. The background rollback can be somewhat noisy too, but it can be fixed separately.
          The redo log apply does not have a concept of transactions. Instead, it only deals with mini-transactions, redo log records (measured in LSN or bytes) and data file pages.
          If the last redo log checkpoint was far away, crash recovery may comprise multiple passes. Usually one pass will be enough. Each pass comprises the following steps:

          1. Read and parse the redo log records, putting them into recv_sys->addr_hash table by (space_id,page_number), in a linked list, ordered by LSN. This should be fast, as it only is sequential I/O.
          2. Take a random bucket from recv_sys->addr_hash, and read a range of pages starting from the (space_id,page_number) of that bucket.
          3. When the read for a given page completes, apply all buffered records for the page, and schedule the page for writing.
          4. If this is the last pass, apply the change buffer to the page if applicable (and write redo log about this).
          5. Write the pages back to the data files.
          6. Empty the recv_sys->addr_hash.

          Because the I/O is asynchronous, it is somewhat challenging to provide a meaningful progress measure.
          I think that we should consider a progress update that is taking place every 5 to 60 seconds (maybe introduce a parameter --innodb-log-recovery-reports= for this). This progress counter could expose the following information:

          • Size of redo log scanned so far (last checkpoint LSN, scanned-up-to-LSN, and whether the end of the log (end of last pass) was reached). A percentage cannot be reported, because we do not know where the log ends until we reach the end.
          • Number of redo log passes so far. (Or emit a message every time a new pass is started.)
          • Number of pages to process in the current pass (number of distinct pages in recv_sys->addr_hash)
          • Number of pages that have been read in the current pass (<= the above)
          • Number of pages that have been written in the current pass (<= the above)

          I would not necessarily expose the change buffer merges separately. If that is needed, it could be done separately. We do know in advance how many pages need a change buffer merge, because that information is available in the change buffer bitmap pages.

          marko Marko Mäkelä added a comment - I filed MDEV-12103 for fixing MySQL Bug#80788 in MariaDB. This request is about increasing the signal-to-noise ratio of the error log output of crash recovery. The InnoDB crash recovery consists of a few tasks: Apply the redo log. If a DDL transaction was active when the server was killed, roll it back. Drop any incompletely created indexes. Start a background thread to roll back any recovered transactions. Start accepting connections. I think that as part of this ticket, we should concentrate on the redo log apply. The background rollback can be somewhat noisy too, but it can be fixed separately. The redo log apply does not have a concept of transactions. Instead, it only deals with mini-transactions, redo log records (measured in LSN or bytes) and data file pages. If the last redo log checkpoint was far away, crash recovery may comprise multiple passes. Usually one pass will be enough. Each pass comprises the following steps: Read and parse the redo log records, putting them into recv_sys->addr_hash table by (space_id,page_number), in a linked list, ordered by LSN. This should be fast, as it only is sequential I/O. Take a random bucket from recv_sys->addr_hash, and read a range of pages starting from the (space_id,page_number) of that bucket. When the read for a given page completes, apply all buffered records for the page, and schedule the page for writing. If this is the last pass, apply the change buffer to the page if applicable (and write redo log about this). Write the pages back to the data files. Empty the recv_sys->addr_hash. Because the I/O is asynchronous, it is somewhat challenging to provide a meaningful progress measure. I think that we should consider a progress update that is taking place every 5 to 60 seconds (maybe introduce a parameter --innodb-log-recovery-reports= for this). This progress counter could expose the following information: Size of redo log scanned so far (last checkpoint LSN, scanned-up-to-LSN, and whether the end of the log (end of last pass) was reached). A percentage cannot be reported, because we do not know where the log ends until we reach the end. Number of redo log passes so far. (Or emit a message every time a new pass is started.) Number of pages to process in the current pass (number of distinct pages in recv_sys->addr_hash) Number of pages that have been read in the current pass (<= the above) Number of pages that have been written in the current pass (<= the above) I would not necessarily expose the change buffer merges separately. If that is needed, it could be done separately. We do know in advance how many pages need a change buffer merge, because that information is available in the change buffer bitmap pages.
          jplindst Jan Lindström (Inactive) made changes -
          Assignee Jan Lindström [ jplindst ] Marko Mäkelä [ marko ]
          serg Sergei Golubchik made changes -
          Sprint 5.5.55 [ 138 ] 5.5.55, 10.0.30 [ 138, 140 ]
          ratzpo Rasmus Johansson (Inactive) made changes -
          Sprint 5.5.55, 10.0.30 [ 138, 140 ] 5.5.55, 10.1.22 [ 138, 143 ]
          ratzpo Rasmus Johansson (Inactive) made changes -
          Rank Ranked lower
          sjmudd Simon Mudd added a comment -

          Marko: MySQL Bug#78844 was filed against 5.7 but has been there earlier. It's not "critical" but just horribly noisy and a pain. I've seen thousands of log lines generated during recovery on a busy server which crashed and this level of noise is just pointless. So all it does is making inspecting the log files after a crash to see that the server recovered properly more of a pain. Hope that clarifies.

          sjmudd Simon Mudd added a comment - Marko: MySQL Bug#78844 was filed against 5.7 but has been there earlier. It's not "critical" but just horribly noisy and a pain. I've seen thousands of log lines generated during recovery on a busy server which crashed and this level of noise is just pointless. So all it does is making inspecting the log files after a crash to see that the server recovered properly more of a pain. Hope that clarifies.
          marko Marko Mäkelä made changes -

          sjmudd: Yes, InnoDB recovery has always been noisy, since the very first release (MySQL 3.23). I will try to find a reasonable balance between the size of the diagnostic output and the time elapsed with no apparent progress.

          MDEV-6456 increased the diagnostic output volume in 10.1, but with the objective of increasing user-friendliness. Maybe the 15-second time chosen in MDEV-6456 is a reasonable constant also for other redo log progress reporting.

          marko Marko Mäkelä added a comment - sjmudd : Yes, InnoDB recovery has always been noisy, since the very first release (MySQL 3.23). I will try to find a reasonable balance between the size of the diagnostic output and the time elapsed with no apparent progress. MDEV-6456 increased the diagnostic output volume in 10.1, but with the objective of increasing user-friendliness. Maybe the 15-second time chosen in MDEV-6456 is a reasonable constant also for other redo log progress reporting.

          sjmudd, would you be happy with rate-limited output like this?

          2017-03-07 14:58:57 4147463936 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1238777
          2017-03-07 14:58:57 4147463936 [Note] InnoDB: Read redo log up to LSN=1242112
          2017-03-07 14:58:57 4147463936 [Note] InnoDB: Starting log recovery batch of 12 pages
          2017-03-07 14:58:57 4147463936 [Note] InnoDB: To recover: 12 pages from log
          

          The first and third message would be output unconditionally, while the second and last message would only be output at most once per 15 seconds. The second message would be issued directly after completing redo log file I/O, while the last message would be issued at the end of recv_recover_page(), which is indirectly tied to data file I/O.

          It would be tricky to add progress reporting to the page writes, so I would avoid it in the first cut.
          On a related note, MDEV-10509 (another major source of InnoDB spam in MariaDB) should be addressed by implementing MariaDB progress reporting for ALTER TABLE…ALGORITHM=INPLACE.

          marko Marko Mäkelä added a comment - sjmudd , would you be happy with rate-limited output like this? 2017-03-07 14:58:57 4147463936 [Note] InnoDB: Starting crash recovery from checkpoint LSN=1238777 2017-03-07 14:58:57 4147463936 [Note] InnoDB: Read redo log up to LSN=1242112 2017-03-07 14:58:57 4147463936 [Note] InnoDB: Starting log recovery batch of 12 pages 2017-03-07 14:58:57 4147463936 [Note] InnoDB: To recover: 12 pages from log The first and third message would be output unconditionally, while the second and last message would only be output at most once per 15 seconds. The second message would be issued directly after completing redo log file I/O, while the last message would be issued at the end of recv_recover_page(), which is indirectly tied to data file I/O. It would be tricky to add progress reporting to the page writes, so I would avoid it in the first cut. On a related note, MDEV-10509 (another major source of InnoDB spam in MariaDB) should be addressed by implementing MariaDB progress reporting for ALTER TABLE…ALGORITHM=INPLACE.
          marko Marko Mäkelä made changes -
          marko Marko Mäkelä added a comment - - edited

          bb-10.1-marko
          bb-10.2-marko
          I would also appreciate some feedback from users.

          marko Marko Mäkelä added a comment - - edited bb-10.1-marko bb-10.2-marko I would also appreciate some feedback from users.
          marko Marko Mäkelä made changes -
          Assignee Marko Mäkelä [ marko ] Jan Lindström [ jplindst ]
          Status In Progress [ 3 ] In Review [ 10002 ]

          ok to push, did you consider 10.0/5.5 ?

          jplindst Jan Lindström (Inactive) added a comment - ok to push, did you consider 10.0/5.5 ?
          jplindst Jan Lindström (Inactive) made changes -
          Status In Review [ 10002 ] Stalled [ 10000 ]
          marko Marko Mäkelä made changes -
          Assignee Jan Lindström [ jplindst ] Marko Mäkelä [ marko ]

          Thanks for the review! I will backport; the risk should be very low.
          The patches will be a bit different, because InnoDB in 5.5 is in C, not in C++. Also the sd_notifyf() calls for systemd integration are not present before 10.1.

          marko Marko Mäkelä added a comment - Thanks for the review! I will backport; the risk should be very low. The patches will be a bit different, because InnoDB in 5.5 is in C, not in C++. Also the sd_notifyf() calls for systemd integration are not present before 10.1.
          ratzpo Rasmus Johansson (Inactive) made changes -
          Sprint 5.5.55, 10.1.22 [ 138, 143 ] 5.5.55, 10.2.5-1 [ 138, 144 ]
          marko Marko Mäkelä made changes -
          Fix Version/s 5.5.55 [ 22311 ]
          Fix Version/s 10.0.30 [ 22313 ]
          Fix Version/s 10.1.22 [ 22502 ]
          Fix Version/s 10.2.5 [ 22117 ]
          Fix Version/s 10.0 [ 16000 ]
          dbart Daniel Bartholomew made changes -
          Fix Version/s 10.0.31 [ 22501 ]
          Fix Version/s 10.0.30 [ 22313 ]
          dbart Daniel Bartholomew made changes -
          Fix Version/s 10.0.30 [ 22313 ]
          Fix Version/s 10.0.31 [ 22501 ]

          Finally, the merge from 10.0 to 10.1 is completed as well.

          marko Marko Mäkelä added a comment - Finally, the merge from 10.0 to 10.1 is completed as well.
          marko Marko Mäkelä made changes -
          issue.field.resolutiondate 2017-03-09 07:00:32.0 2017-03-09 07:00:32.265
          marko Marko Mäkelä made changes -
          Resolution Fixed [ 1 ]
          Status Stalled [ 10000 ] Closed [ 6 ]
          marko Marko Mäkelä made changes -
          marko Marko Mäkelä made changes -
          marko Marko Mäkelä made changes -
          serg Sergei Golubchik made changes -
          Workflow MariaDB v3 [ 77762 ] MariaDB v4 [ 151051 ]
          marko Marko Mäkelä made changes -

          People

            marko Marko Mäkelä
            jplindst Jan Lindström (Inactive)
            Votes:
            1 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.