MariaDB Server / MDEV-19660

wsrep_rec_get_foreign_key() is dereferencing a stale pointer to a page that was previously latched

Details

    Description

      In row_ins_foreign_check_on_constraint(), the clustered index record is passed to wsrep_append_foreign_key() after the latch has been released by mtr_commit(). If another thread has changed the record in the meantime, this can lead to a crash when
      wsrep_rec_get_foreign_key() tries to access the record.

      The following is the problematic code:

              btr_pcur_store_position(pcur, mtr); 
       
              if (index == clust_index) {
                      btr_pcur_copy_stored_position(cascade->pcur, pcur);
              } else {
                      btr_pcur_store_position(cascade->pcur, mtr);
              }
       
              mtr_commit(mtr);
       
              ut_a(cascade->pcur->rel_pos == BTR_PCUR_ON);
              
              cascade->state = UPD_NODE_UPDATE_CLUSTERED;
              
      #ifdef WITH_WSREP
              err = wsrep_append_foreign_key(
                                              thr_get_trx(thr),
                                              foreign,
                                              clust_rec,
                                              clust_index,
                                              FALSE,
                                              (node) ? TRUE : FALSE);
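
      The hazard above can be illustrated outside InnoDB with a minimal, self-contained sketch. Everything in it is hypothetical: Page, Cursor, and copy_then_unlatch() are stand-ins invented for illustration, not InnoDB types or functions. The sketch assumes that the safe pattern is to copy the record while the latch is held (as a persistent cursor does when it stores its position) and to read only that copy after the latch is released.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for an index page protected by a latch.
struct Page {
    std::mutex latch;
    std::vector<std::string> records;
};

// Hypothetical stand-in for a persistent cursor that keeps a private
// copy of the record it was positioned on.
struct Cursor {
    std::string old_rec;
    void store_position(const std::string& rec) { old_rec = rec; }
};

// Copies the record while the latch is held, then releases the latch.
// The returned value comes from the cursor's copy, never from the page,
// so later changes to the page cannot affect it.
std::string copy_then_unlatch(Page& page, Cursor& cur, std::size_t slot) {
    {
        std::lock_guard<std::mutex> guard(page.latch);
        cur.store_position(page.records[slot]);
    }  // latch released here, analogous to committing the mini-transaction
    return cur.old_rec;
}
```

      In this sketch, a caller that keeps a pointer or reference into page.records after the latch is released would be in the same position as the clust_rec argument above, whereas the value held in cur.old_rec stays valid regardless of what other threads do to the page.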
      

        Activity

          marko Marko Mäkelä added a comment - As far as I can tell, this was introduced in 5.5.25-galera, 10.0.19-galera, 10.1.6.

          jplindst Jan Lindström (Inactive) added a comment - In row_ins_check_foreign_constraint(), the same function is called inside an active mtr.
          jplindst Jan Lindström (Inactive) added a comment - https://github.com/MariaDB/server/commit/42a1ad314700b705077333f42393250c978c92d7

          marko Marko Mäkelä added a comment - I think that we need a test case that exercises the error handling code. Note: I am not asking for a test that reproduces the race condition. Also, please address my review comments regarding the code changes.

          marko Marko Mäkelä added a comment - I wonder if we could simply replace clust_rec with cascade->pcur->old_rec in the call.
          jplindst Jan Lindström (Inactive) added a comment - https://github.com/MariaDB/server/commit/859052dfcc3da3d61e3023c7401ef009cea50bcd

          marko Marko Mäkelä added a comment - Can we merely replace the clust_rec with cascade->pcur->old_rec and omit all other changes? I am concerned about adding so much code for error handling or reporting, especially when that code is not being backed by an ‘organic’ test case that does not resort to fault injection.

          jplindst Jan Lindström (Inactive) added a comment - Sure, I will do that for 10.1-10.4.

          People

            jplindst Jan Lindström (Inactive)
            thiru Thirunarayanan Balathandayuthapani