  1. MariaDB Server
  2. MDEV-16136

Various ASAN failures when testing 10.2/10.3

Details

    Description

      MariaDB 10.3 commit 8b087c63b56408edfae21f3234bae0b5391759b6 (2018-05-09)
      compiled with ASAN.

      I have some rather simple RQG test containing mostly DDL.
      When this test is executed via combinations.pl with many trials in parallel (which leads to a heavily loaded box), a significant share of the test runs fail with ASAN failures like
      SUMMARY: AddressSanitizer: use-after-poison .../storage/innobase/row/row0upd.cc:3422 in row_upd_step(que_thr_t*)
      SUMMARY: AddressSanitizer: use-after-poison .../storage/innobase/trx/trx0purge.cc:224 in trx_purge_add_undo_to_history(trx_t const*, trx_undo_t*&, mtr_t*)
      SUMMARY: AddressSanitizer: use-after-poison .../storage/innobase/trx/trx0purge.cc:226 in trx_purge_add_undo_to_history(trx_t const*, trx_undo_t*&, mtr_t*)

      There were at least 52 distinct ASAN summary lines.
      (grep -h 'SUMMARY: AddressSanitizer: ' last_comb_workdir/trial*.log | sort -u)
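
      The grep above only lists the distinct summary lines. To also see how often each one occurred across the trials, something like the following Python sketch could be used. It assumes the same log layout as the grep command (last_comb_workdir/trial*.log); the function names count_summaries and count_in_workdir are made up for this illustration and are not part of RQG or combinations.pl.

```python
from collections import Counter
from pathlib import Path

# Prefix that ASAN prints in front of its one-line summary.
PREFIX = "SUMMARY: AddressSanitizer: "


def count_summaries(lines):
    """Count each distinct ASAN summary found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        pos = line.find(PREFIX)
        if pos != -1:
            # Key on the text after the prefix: error kind, location, function.
            counts[line[pos + len(PREFIX):].strip()] += 1
    return counts


def count_in_workdir(workdir="last_comb_workdir", pattern="trial*.log"):
    """Aggregate summary counts over all trial logs (assumed layout)."""
    total = Counter()
    for log_path in sorted(Path(workdir).glob(pattern)):
        # errors="replace" keeps going if a log contains undecodable bytes.
        with log_path.open(errors="replace") as fh:
            total.update(count_summaries(fh))
    return total
```

      Calling count_in_workdir().most_common() then yields the summaries ordered by frequency, which may help to decide which failure to try to replay first.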

      I am aware that

      • a significant fraction of these ASAN failures have already been reported.
        But those reports often lack a test case that replays the failure quickly.
      • no clear decision about which part of MariaDB is "guilty" (InnoDB, the server, or both) can be made based on the information currently available.
      • there is a significant, though not large, likelihood that the failures reported during testing are caused by
          • exceeding the resources of the OS/testing box -> server/InnoDB meet conditions they cannot currently handle well enough -> ....
            At least there are no signs that the OS starts to "attack" the mass of perl processes because of resource shortages or similar.
          • weaknesses in the RQG mechanics.
            RQG itself sometimes has problems handling slowly reacting servers/processes.
            Sorry if that turns out to be the case.
        The dilemma is that we need extreme CPU and memory IO load to get a short bug replay time etc. On a system with low load the test passes nearly all the time.

            People

              marko Marko Mäkelä
              mleich Matthias Leich
