Details

    Description

      Hello,
      on one of our MariaDB servers (running Piwik) we observe lots of database page corruptions.
      Since it's running on the same SSD SAN as dozens of other MySQL/MariaDB/Percona servers, I can exclude the storage as the source of the problem.
      Several other DB servers on the same hardware node show no such problems either.

      Could there be some kind of bug? I have already restored some of those tables several times because of the page corruptions. I attached a log file starting on 28.03.2017; on that date all tables were working (I restored some crashed tables on 27.03.2017).

      Today again:

      mysqlcheck --all-databases --auto-repair
         Repairing tables
         piwik.piwik_archive_blob_2015_07
         Error    : Table 'piwik.piwik_archive_blob_2015_07' doesn't exist in engine
         status   : Operation failed
       
      ll /var/lib/mysql/piwik/piwik_archive_blob_2015_07*
         -rw-rw---- 1 mysql mysql      2380 Apr  2 03:09 /var/lib/mysql/piwik/piwik_archive_blob_2015_07.frm
         -rw-rw---- 1 mysql mysql 364904448 Apr  2 03:10 /var/lib/mysql/piwik/piwik_archive_blob_2015_07.ibd
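
A minimal sketch, assuming the `mysqlcheck` output has the shape shown above (a `db.table` line followed by `Label : message` lines), of how the failing tables could be collected from a captured run. The function name `failed_tables` and the `SAMPLE` text are illustrative only, and lines that fit neither pattern are simply ignored.

```python
import re

# "Error    : ..." / "status   : ..." lines printed under a table name.
MESSAGE_RE = re.compile(r"^(\w+)\s*:\s*(.*)$")
# A bare "database.table" line with no whitespace in it.
TABLE_RE = re.compile(r"^\S+\.\S+$")


def failed_tables(output):
    """Return {table: [error messages]} for tables whose status is not OK."""
    results = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        match = MESSAGE_RE.match(line)
        if match and current is not None:
            label, text = match.group(1).lower(), match.group(2)
            entry = results.setdefault(current, {"errors": [], "status": None})
            if label == "error":
                entry["errors"].append(text)
            elif label == "status":
                entry["status"] = text
        elif TABLE_RE.match(line):
            current = line
    return {t: e["errors"] for t, e in results.items() if e["status"] != "OK"}


# Sample copied from the report above.
SAMPLE = """
Repairing tables
piwik.piwik_archive_blob_2015_07
Error    : Table 'piwik.piwik_archive_blob_2015_07' doesn't exist in engine
status   : Operation failed
"""

if __name__ == "__main__":
    for table, errors in failed_tables(SAMPLE).items():
        print(table, errors)
```

Collecting the table names this way makes it easier to compare which tables break from one run to the next.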
      

      Andreas Schnederle-Wagner

      Attachments

        1. mariadb.log
          2.87 MB
          Andreas Schnederle-Wagner
        2. create_table.txt
          0.4 kB
          Andreas Schnederle-Wagner
        3. show_variables.txt
          18 kB
          Andreas Schnederle-Wagner
        4. icinga.jpg
          159 kB
          Andreas Schnederle-Wagner
        5. piwik_archive_blob_2015_07.jpg
          286 kB
          Andreas Schnederle-Wagner
        6. piwik_archive_blob_2015_07_show_blob.jpg
          595 kB
          Andreas Schnederle-Wagner
        7. mariadb.log
          562 kB
          Andreas Schnederle-Wagner
        8. mariadb.28062017.log
          562 kB
          Andreas Schnederle-Wagner

          Activity

            Hey @Marko Mäkelä,
            it turned out that the ploop defragmentation of Virtuozzo 6 Storage sometimes corrupted IBD files within containers. The Virtuozzo devs tried to find the problem but had no luck (Virtuozzo support case: 15178).

            Virtuozzo Support:

            ...
            I have spoken with the responsible developer, so far there is no clue how this corruption could happen, 
            with all the debug trace we have for defrag operations there are completely no evidence that something may go wrong.
            Now we are thinking of a way to add debug for IO that was created not by defrag but by MariaDB.
            ---
            It turns out that we cannot make a sufficient debug method without affecting ploop performance.
            ...
            

            Since it was not reproducible on a test container, only on the live production container where we couldn't accept downtime, the investigation was stopped and we subsequently moved to Virtuozzo 7.
            No corruptions within Virtuozzo 7 containers so far. So I guess it must be something really specific to the combination of MariaDB and Virtuozzo 6 ploop defrag ... I guess the details will remain a mystery ... :-/

            Thank you for your help on this, bye from sunny Austria
            Andreas

            Futureweb Andreas Schnederle-Wagner added a comment

            Small update in case someone faces the same problem: it seems that upgrading to the latest Virtuozzo 7 (+ latest Virtuozzo Storage) did not entirely solve this problem. We are still facing the corruptions; they have become far less frequent, but still happen every now and then ...
            New Virtuozzo Ticket: #21768

            Futureweb Andreas Schnederle-Wagner added a comment

            I may have found a possible explanation for this in MDEV-31347.

            Futureweb, is the database executing ALTER TABLE or OPTIMIZE TABLE on InnoDB tables?

            marko Marko Mäkelä added a comment
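
One way to answer the ALTER TABLE / OPTIMIZE TABLE question without knowing the application's internals would be to scan a statement log for such statements. The following is only a sketch under assumptions: it presumes statements are available as plain text lines (for example from a general query log, if one is enabled), and the function name `find_table_rebuilds` and the sample lines are hypothetical, not taken from this server.

```python
import re

# ALTER [ONLINE] [IGNORE] TABLE name / OPTIMIZE [NO_WRITE_TO_BINLOG | LOCAL] TABLE name
DDL_RE = re.compile(
    r"\b(ALTER|OPTIMIZE)\s+"
    r"(?:(?:ONLINE|IGNORE|NO_WRITE_TO_BINLOG|LOCAL)\s+)*"
    r"TABLE\s+(\S+)",
    re.IGNORECASE,
)


def find_table_rebuilds(log_lines):
    """Return (statement, table) pairs for ALTER/OPTIMIZE TABLE statements."""
    hits = []
    for line in log_lines:
        match = DDL_RE.search(line)
        if match:
            # Drop identifier quoting and a trailing semicolon.
            table = match.group(2).replace("`", "").rstrip(";")
            hits.append((match.group(1).upper(), table))
    return hits


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    sample = [
        "SELECT idarchive FROM piwik_archive_blob_2015_07 WHERE idsite = 1",
        "OPTIMIZE TABLE piwik_archive_blob_2015_07;",
        "alter table `piwik_log_visit` ADD COLUMN example_col INT",
    ]
    for statement, table in find_table_rebuilds(sample):
        print(statement, table)
```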

            Unfortunately I don't know the internals of Matomo, so I can't say for sure whether it alters or optimizes the tables in question.
            I haven't observed a corruption for quite some time now ... maybe some of the circumstances that led to those corruptions have changed, though I can't say which ...

            Futureweb Andreas Schnederle-Wagner added a comment

            Futureweb, are the corruptions now a thing of the past? In 10.6 we introduced and rather recently fixed some regressions: MDEV-30531 and MDEV-31767 were the worst.

            marko Marko Mäkelä added a comment

            People

              marko Marko Mäkelä
              Futureweb Andreas Schnederle-Wagner
