MariaDB Server / MDEV-33174

nondeterministic test results in spider/bugfix.self_reference_multi

Details

    Description

      The test fails quite often:

      https://buildbot.mariadb.net/ci/reports/cross_reference#branch=&revision=&platform=&fail_name=spider/bugfix.self_reference_multi&fail_variant=&fail_info_full=&typ=&info=&dt=&limit=100&fail_info_short=

      This could probably be fixed quickly with a regex replace on the query result: it doesn't really matter which node of the loop the error is reported on, since all of these outcomes are correct.

      It would also be good to find out why this happens.

        Activity

          ycp Yuchen Pei added a comment -

          The first select (from t0) should print

          ERROR HY000: An infinite loop is detected when opening table test.t0

          because it queries the table status of t0's remote table, which is
          t1, then that of t1's remote table, which is t2, and then that of
          t2's remote table, which is t0, at which point it detects the
          self-reference (of t0) and reports the error.
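The traversal described above can be sketched as a walk along remote links that stops at the first table seen twice. This is only an illustrative model, not Spider's actual code: the `RemoteLinks` map and the `find_self_reference` function are hypothetical names introduced here.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical model: each table name maps to the name of its remote table.
using RemoteLinks = std::unordered_map<std::string, std::string>;

// Follows remote links starting at `start`. Returns the name of the first
// table that is reached a second time, or an empty string if the chain ends
// without looping back.
std::string find_self_reference(const RemoteLinks& links,
                                const std::string& start) {
  std::unordered_set<std::string> visited;
  std::string current = start;
  while (true) {
    if (!visited.insert(current).second)
      return current;  // already visited: the chain loops back to this table
    RemoteLinks::const_iterator next = links.find(current);
    if (next == links.end())
      return std::string();  // the chain ends at a table with no remote
    current = next->second;
  }
}
```

With the links t0 -> t1 -> t2 -> t0, a walk that starts at t0 returns t0. In this model a walk that starts at t1 or t2 would name that table instead, so the node named depends on where the walk begins.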

          The second and third selects (from t1 and t2) print the same
          message because ha_spider::info reuses the error from the first
          select: not enough time has passed for it to try querying the
          remote table for table status again (see the line marked with an
          arrow below).

          10.5 81d01855f

          int ha_spider::info(
            uint flag
          ) {
          //  [... 48 lines elided]
            if (flag &
              (HA_STATUS_TIME | HA_STATUS_CONST | HA_STATUS_VARIABLE | HA_STATUS_AUTO))
            {
          //  [... 14 lines elided]
              if (!share->sts_init)
              {
                pthread_mutex_lock(&share->sts_mutex);
                if (share->sts_init)
                  pthread_mutex_unlock(&share->sts_mutex);
                else {
                  if ((spider_init_error_table =
                    spider_get_init_error_table(wide_handler->trx, share, FALSE)))
                  {
                    DBUG_PRINT("info",("spider diff=%f",
                      difftime(tmp_time, spider_init_error_table->init_error_time)));
                    if (difftime(tmp_time,
                      spider_init_error_table->init_error_time) <
                      spider_param_table_init_error_interval())
                    {
          //  [... 11 lines elided]
                      if (spider_init_error_table->init_error_with_message)
                        my_message(spider_init_error_table->init_error,   // <-
                          spider_init_error_table->init_error_msg, MYF(0));
                      DBUG_RETURN(check_error_mode(spider_init_error_table->init_error));
                    }
                  }
          //  [... 6 lines elided]
                }
              }
          //  [... 262 lines elided]
          }
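The reuse of a recent error in the excerpt above can be modelled as a small time-bounded cache. This is a simplified sketch under assumptions, not Spider's implementation: `InitError`, `InitErrorCache`, and keying by table name are hypothetical; only the `difftime` comparison against an interval mirrors the excerpt.

```cpp
#include <ctime>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical record of a failed table initialisation.
struct InitError {
  int code;
  std::string message;
  std::time_t when;
};

// Hypothetical cache: remembers the last init error per table and hands it
// back while it is younger than `interval_seconds`.
class InitErrorCache {
 public:
  explicit InitErrorCache(double interval_seconds)
      : interval_(interval_seconds) {}

  void record(const std::string& table, int code, std::string message,
              std::time_t now) {
    errors_[table] = InitError{code, std::move(message), now};
  }

  // Returns the cached error if it is still within the interval, otherwise
  // nullptr, meaning the caller should query the remote table again.
  const InitError* recent_error(const std::string& table,
                                std::time_t now) const {
    auto it = errors_.find(table);
    if (it == errors_.end()) return nullptr;
    if (std::difftime(now, it->second.when) < interval_) return &it->second;
    return nullptr;
  }

 private:
  double interval_;
  std::unordered_map<std::string, InitError> errors_;
};
```

In this model, a lookup made shortly after an error was recorded returns the stored message unchanged, and a lookup made once the interval has elapsed returns nothing. Whether a cached or a fresh error is reported therefore depends on timing.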

          It is not clear why the test fails nondeterministically, as I
          cannot reproduce the failure locally. Therefore we simply apply a
          replacement to the query result, so that the test passes when the
          error is reported on any of the three nodes of the cycle.
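The idea of the replacement can be illustrated with a small normalisation function. This is a sketch in C++ rather than the test-file change itself: `normalize_loop_error` and the exact pattern are assumptions, and the actual fix may use a different expression.

```cpp
#include <regex>
#include <string>

// Rewrites the table named in the loop error to t0, whichever of the three
// nodes (t0, t1, t2) it was reported on. Other text is left unchanged.
std::string normalize_loop_error(const std::string& line) {
  static const std::regex node(
      R"((An infinite loop is detected when opening table test\.)t[012])");
  return std::regex_replace(line, node, "$1t0");
}
```

After this normalisation, the error reported on t0, t1, or t2 produces the same output line, so the recorded result no longer depends on which node is named.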

          ycp Yuchen Pei added a comment -

          Hi holyfoot, ptal thanks

          upstream/bb-10.5-mdev-33174 b8acdfe37cf2447d41f162abf0468f1bcbc9b28e
          MDEV-33174 Fixing nondeterministic self-referencing test result
          


          holyfoot Alexey Botchkov added a comment -

          ok to push.
          ycp Yuchen Pei added a comment -

          thanks for the review. pushed d40eaf2dab66f76e1e3749ddb863ad5bf32772da to 10.5, after a 3-hour wait on an amd64-ubuntu-2204-debug-ps rebuild...


          People

            ycp Yuchen Pei
            ycp Yuchen Pei