MariaDB Server / MDEV-742

LP:803649 - Xa recovery failed on client disconnection

Details

    Description

      Even though the prepare phase has been reached, XA RECOVER returns nothing if the client gets disconnected before the commit.

      In the following PHP script, an error is generated on purpose (see the invalid statement XA COMMITEE) to force the COMMIT to fail and close the connection.

       
      <?php
       
      $dt = date_create();
      $xid = date_timestamp_get($dt);
      echo "xid : ".$xid;
       
      $logger1 = new mysqli('localhost', 'root', 'xxxxx', 'test');
      if (mysqli_connect_error()) {
          die('Connection error (' . mysqli_connect_errno() . ') '. mysqli_connect_error());
      }
      $potest = new mysqli('192.168.45.166', 'test', 'xxxxxxx', 'test');
      if (mysqli_connect_error()) {
          $logger1->close();
          die('Connection error (' . mysqli_connect_errno() . ') '. mysqli_connect_error());
      }
       
      echo 'Success logger1 ' . $logger1->host_info . "\n";
      echo 'Success potest ' . $potest->host_info . "\n";
       
       
      ##################################
      $queryXA = 'XA START "'.$xid.'"';
       
      $queryRes = $logger1->query($queryXA);
      if (!$queryRes) {
          $message  = 'Invalid query logger1: ' . $logger1->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          $logger1->close();
          $potest->close();
          die($message);
      }
      $queryRes = $potest->query($queryXA);
      if (!$queryRes) {
          $message  = 'Invalid query potest: ' . $potest->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          $logger1->close();
          $potest->close();
          die($message);
      }
      ##################################
       
       
      ##################################
      $queryXA = 'INSERT INTO mthoxa(xamtho_tc) values ("'.$xid.'")';
      $queryRes = $logger1->query($queryXA);
      if (!$queryRes) {
          $message  = 'Invalid query logger1: ' . $logger1->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          $logger1->close();
          $potest->close();
          die($message);
      }
       
      $queryRes = $potest->query($queryXA);
      if (!$queryRes) {
          // This branch handles a failure on $potest, so report its error.
          $message  = 'Invalid query potest: ' . $potest->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          $logger1->close();
          $potest->close();
          die($message);
      }
      ##################################
       
       
      ##################################
      $queryXA = 'XA END "'.$xid.'"';
       
      $queryRes = $logger1->query($queryXA);
      if (!$queryRes) {
          $message  = 'Invalid query logger1: ' . $logger1->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          $logger1->close();
          $potest->close();
          die($message);
      }
      $queryRes = $potest->query($queryXA);
      if (!$queryRes) {
          $message  = 'Invalid query potest: ' . $potest->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          $logger1->close();
          $potest->close();
          die($message);
      }
      ##################################
       
       
      ##################################
      $queryXA = 'XA PREPARE "'.$xid.'"';
      $queryRes = $logger1->query($queryXA);
      if (!$queryRes) {
          $message  = 'Invalid query logger1: ' . $logger1->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          $logger1->close();
          $potest->close();
          die($message);
      }
       
      $queryRes = $potest->query($queryXA);
      if (!$queryRes) {
          $message  = 'Invalid query potest: ' . $potest->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          $logger1->close();
          $potest->close();
          die($message);
      }
       
      ##################################
       
       
      ##################################
      $queryXA = 'XA COMMIT "'.$xid.'"';
      $queryRes = $logger1->query($queryXA);
      if (!$queryRes) {
          $message  = 'Invalid query logger1: ' . $logger1->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          $logger1->close();
          $potest->close();
          die($message);
      }
       
      $queryXA = 'XA COMMITEE "'.$xid.'"';
      $queryRes = $potest->query($queryXA);
      if (!$queryRes) {
          $message  = 'Invalid query potest: ' . $potest->error . "\n";
          $message .= 'Full query: ' . $queryXA;
          
          $queryXA = 'XA RECOVER';
          $queryRes = $potest->query($queryXA);
          while ($r=$queryRes->fetch_assoc())
          	print_r($r);
          
          
          $logger1->close();
          $potest->close();
          die($message);
      }
      ##################################
       
      $logger1->close();
      $potest->close();
       
      echo "Finish !\n"
       
      ?>

          Activity

            psergei Sergei Petrunia added a comment - - edited

            Take-aways from discussions in Frankfurt:

            detach_prepared_tx() is ok
            The transaction remains prepared inside RocksDB.
            MyRocks' Rdb_transaction object is de-associated from the RocksDB transaction and
            is destroyed. This is fine.

            replace_native_transaction_in_thd is redundant

            The idea behind this method is that it is needed on the slave. A slave worker
            may run the following sequence of operations:

            XA BEGIN 'trx1';
            ... -- actions made by trx1
            XA END 'trx1';
            XA PREPARE 'trx1';
            -- at this point, all actions by trx1 have been done
            -- the only two actions that are possible are commit or rollback.
             
            XA BEGIN 'trx2';
            ... -- actions made by trx2
            XA END 'trx2';
            XA PREPARE 'trx2';
             
            XA (COMMIT|ROLLBACK) 'trx1';
            -- This is where we would want to "re-attach" to the prepared transaction.
            

            The problem with re-attaching is that Rdb_transaction holds quite a bit of context about the transaction outside the m_rocksdb_tx member.
            If one just replaces m_rocksdb_tx, we end up with a mismatch between Rdb_transaction members.

            It is much better if the SQL layer uses the existing API and calls

            engine_hton->(commit|rollback)_by_xid(...);
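
            To illustrate the by-XID approach, here is a minimal toy model, not MariaDB code. ToyEngine, PreparedTrx and every method name below are assumptions made up for this sketch. The only point it makes is that once a transaction is prepared, the engine can finish it from the XID alone, so the session that commits does not need to re-attach to any per-connection transaction object.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a storage engine; NOT MariaDB's handlerton.
// All names here (ToyEngine, PreparedTrx, ...) are hypothetical.
struct PreparedTrx {
    std::vector<std::string> pending_rows;  // changes made before XA PREPARE
};

class ToyEngine {
public:
    // XA PREPARE: the engine takes ownership of the transaction, keyed by XID.
    // The session that prepared it keeps no reference afterwards.
    void prepare(const std::string& xid, PreparedTrx trx) {
        prepared_[xid] = std::move(trx);
    }

    // Commit a prepared transaction identified only by its XID.
    // Returns false if no prepared transaction with that XID exists.
    bool commit_by_xid(const std::string& xid) {
        auto it = prepared_.find(xid);
        if (it == prepared_.end()) return false;
        for (auto& row : it->second.pending_rows) {
            committed_.push_back(std::move(row));
        }
        prepared_.erase(it);
        return true;
    }

    // Discard a prepared transaction identified only by its XID.
    bool rollback_by_xid(const std::string& xid) {
        return prepared_.erase(xid) == 1;
    }

    // Analogue of XA RECOVER: list the XIDs still in the prepared state.
    std::vector<std::string> recover() const {
        std::vector<std::string> xids;
        for (const auto& kv : prepared_) xids.push_back(kv.first);
        return xids;
    }

    const std::vector<std::string>& committed() const { return committed_; }

private:
    std::map<std::string, PreparedTrx> prepared_;
    std::vector<std::string> committed_;
};
```

            In this model a worker can prepare 'trx1', then prepare 'trx2', and later call commit_by_xid("trx1") without holding any state from the first transaction, which mirrors the slave sequence above.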
            

            There was another argument: both InnoDB and MyRocks try to re-use their internal transaction object, and this is why replace_trx_in_thd saves it away. This is how we end up with method signatures like this:

            +innodb_replace_trx_in_thd(
            +       THD*    thd,
            +       void*   new_trx_arg,
            +       void**  ptr_trx_arg)
            +{
            

            My opinion is that this sort of caching should not be exposed through the storage engine API.
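
            As a rough sketch of what engine-private caching could look like, assuming a simple free list: TrxPool and EngineTrx are hypothetical names for this example and do not correspond to InnoDB's or MyRocks' actual classes. The reuse happens entirely behind acquire() and release(), so nothing about it needs to appear in the storage engine API.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical engine-private transaction object; not InnoDB's trx_t
// and not MyRocks' Rdb_transaction.
struct EngineTrx {
    int statements = 0;
    void reset() { statements = 0; }
};

// Engine-internal free list of transaction objects.
// Callers never see whether an object is fresh or recycled.
class TrxPool {
public:
    // Hand out a recycled object if one is available, else allocate.
    std::unique_ptr<EngineTrx> acquire() {
        if (free_.empty()) {
            ++allocations_;
            return std::unique_ptr<EngineTrx>(new EngineTrx());
        }
        std::unique_ptr<EngineTrx> trx = std::move(free_.back());
        free_.pop_back();
        return trx;
    }

    // Return an object to the pool after clearing its per-transaction state.
    void release(std::unique_ptr<EngineTrx> trx) {
        trx->reset();
        free_.push_back(std::move(trx));
    }

    // Number of objects actually allocated so far.
    int allocations() const { return allocations_; }

private:
    std::vector<std::unique_ptr<EngineTrx>> free_;
    int allocations_ = 0;
};
```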

            Elkin Andrei Elkin added a comment - - edited

            psergey, thanks for your notes!

            I think you have a good point about replace_native_transaction_in_thd, which indeed arranges a form
            of caching that the SE manages on its own anyway. Using the SE caching instead may not cost much performance-wise on the slave. We can measure the impact anyway, which I'll try to do.
            I am fine with removing this extension in the meantime (and permanently, if the measurement shows that is worthwhile).

            Elkin Andrei Elkin added a comment -

            Howdy, Sergei!

            Could you please check the patch? Its latest version is in bb-10.5-mdev_742.
            I am also asking Marko, Sergei P., and Sergei Voitovich to take a look. Sergei V. could take a close
            look at the usage of LF_HASH in the parallel slave code.

            Thanks and cheers,

            Andrei


            marko Marko Mäkelä added a comment -

            I do not think that the XA functionality can meaningfully be tested with DDL operations before MDEV-15532 and MDEV-21602 have been fixed.

            svoj Sergey Vojtovich added a comment -

            marko, definitely. Although the problem is not new: recovery wouldn't recover MDL locks. This patch just extends the problem a bit further.

            People

              Elkin Andrei Elkin
              stephanevaroqui Stephane VAROQUI (Inactive)