  1. MariaDB Server
  2. MDEV-30994

Earlier detection of spider table self-reference




      Here's how the self-reference detection works. Say we have a spider
      table t0 on server s0, and it connects to a table t1 on server s1. Let
      us denote this by s0.t0 -> s1.t1. Say we also have s1.t1 -> s2.t2 and
      s2.t2 -> s0.t0 (a self-reference!). When opening the table s0.t0, the
      server executes a query on s1, setting a user var
      @spider_lc_$path_to_t1 to the value $u0$path_to_t0-, where
      u0 is a string consisting of the MAC address and process ID of
      s0, delimited by hyphens. An example would be

      set @`spider_lc_./test/t1` = '-1234567890ab-cdef01-./test/t0-'

      where ./test/t0 is the path to table t0, ./test/t1 is the path
      to table t1, 1234567890ab is the MAC address of s0, and cdef01 the
      process id.
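The naming scheme above can be modelled in a few lines. This is only an illustrative sketch in Python, not the Spider implementation; the helper names `link_token` and `set_uservar_query` are hypothetical, and the format is inferred from the example statement.

```python
# Hypothetical helpers modelling the format described above; the real logic
# lives inside the Spider storage engine, not in Python.

def link_token(mac: str, pid: str, table_path: str) -> str:
    """Return '-MAC-PID-path-' for one server/table pair."""
    return f"-{mac}-{pid}-{table_path}-"


def set_uservar_query(remote_table_path: str, value: str) -> str:
    """Build the SET statement that is sent to the remote server."""
    return f"set @`spider_lc_{remote_table_path}` = '{value}'"


# s0 (MAC 1234567890ab, pid cdef01) opens t0, whose remote table is t1.
query = set_uservar_query(
    "./test/t1", link_token("1234567890ab", "cdef01", "./test/t0")
)
print(query)
```

With the sample MAC address, process id and paths, this prints the same `set` statement as in the example above.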

      After this query sets the uservar, s0 executes another query
      that causes t1 to be opened on server s1, which then executes
      the same kind of query on s2, except that the value is now the
      concatenated information of s1.t1 and s0.t0:

      set @`spider_lc_./test/t2` = '-234567890abc-def012-./test/t1--1234567890ab-cdef01-./test/t0-'

      where 234567890abc and def012 are the MAC addr and pid of s1.

      Then s1 also executes a query that causes t2 to be opened on s2, which
      then executes the same kind of query on s0, this time for t0:

      set @`spider_lc_./test/t0` = '-34567890abcd-ef0123-./test/t2--234567890abc-def012-./test/t1--1234567890ab-cdef01-./test/t0-'

      where 34567890abcd and ef0123 are the MAC addr and pid of s2. Now s2
      also executes a second query that causes t0 to be opened on s0 again.
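The way the value grows along the chain can be sketched as follows. This is a simplified model assuming each server prepends its own `-MAC-PID-path-` token to the value it received; the function names are hypothetical and the MAC addresses and process ids are the sample ones used above.

```python
# Simplified model of how the uservar value accumulates along
# s0.t0 -> s1.t1 -> s2.t2 -> s0.t0. Names here are illustrative only.

def link_token(mac: str, pid: str, table_path: str) -> str:
    return f"-{mac}-{pid}-{table_path}-"


def propagate(hops):
    """For each (mac, pid, path) hop, return the value it sends onward.

    Each server prepends its own token to the value it inherited.
    """
    value = ""
    sent = []
    for mac, pid, path in hops:
        value = link_token(mac, pid, path) + value
        sent.append(value)
    return sent


hops = [
    ("1234567890ab", "cdef01", "./test/t0"),  # s0.t0, sets the uservar on s1
    ("234567890abc", "def012", "./test/t1"),  # s1.t1, sets the uservar on s2
    ("34567890abcd", "ef0123", "./test/t2"),  # s2.t2, sets the uservar on s0
]

for value in propagate(hops):
    print(value)
```

The three printed values correspond to the three `set` statements above, in order.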

      So far we've omitted a step that checks for self-references. It
      happens during the opening of a spider table, before setting the
      uservar on the remote server. The check finds nothing on the first
      opening of s0.t0, nor on the opening of s1.t1 and s2.t2. But during
      the second opening of s0.t0, it finds the self-reference. It does
      this by reconstructing $u0$path_to_t0- and checking whether it is a
      substring of the value of the uservar @spider_lc_$path_to_t0.
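The substring check itself is small. The sketch below models it in Python, with the session's user variables represented as a plain dict; `is_self_reference` is a hypothetical name and the dict is an assumption made for illustration, not how Spider stores them.

```python
# Illustrative model of the check run when a spider table is opened.
# `uservars` stands in for the session's user variables (an assumption).

def is_self_reference(mac: str, pid: str, table_path: str, uservars: dict) -> bool:
    """True if this server/table pair already appears in its own uservar."""
    token = f"-{mac}-{pid}-{table_path}-"
    value = uservars.get(f"spider_lc_{table_path}", "")
    return token in value


# First opening of s0.t0: nothing has set the uservar yet.
first = is_self_reference("1234567890ab", "cdef01", "./test/t0", {})

# Second opening of s0.t0: s2 has set the uservar on s0.
uservars_on_s0 = {
    "spider_lc_./test/t0": (
        "-34567890abcd-ef0123-./test/t2-"
        "-234567890abc-def012-./test/t1-"
        "-1234567890ab-cdef01-./test/t0-"
    )
}
second = is_self_reference("1234567890ab", "cdef01", "./test/t0", uservars_on_s0)

print(first, second)
```

Because the token includes the MAC address and process id, a table with the same path on a different server, or under a different server process, does not match.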

      It is this double opening of s0.t0 that causes bugs like
      MDEV-29676. If we could detect the self-reference during the opening
      of s2.t2 and abort the opening then, we would avoid opening s0.t0
      twice. On s2, we have access to the value of the uservar set by s1
      (shown above), and we know the remote table is s0.t0. The problem is
      that we have no access to the MAC address and PID of s0. This could
      be fixed if, after sending the query to s0 to set the uservar, s2
      sent another query to s0 that does the self-reference check without
      opening any tables and reports the result, which s2 could then pick
      up to report the self-reference. The only query I know of that does
      not open a table, and for which it makes sense to return this sort
      of information, is SHOW ENGINE Spider STATUS: we could add logic to
      the query so that the result lists all self-referencing spider
      tables. The logic simply finds all uservars named in the format
      spider_lc_$path_to_table and runs the check on each of them.
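The proposed scan over all uservars might look roughly like this. It is a sketch of the idea only: the function name, the dict standing in for the session's user variables, and the returned list are assumptions for illustration, and this is not an existing Spider or SHOW ENGINE interface.

```python
# Sketch of the proposed logic: on the server that receives the status
# query, look at every uservar named spider_lc_$path_to_table and run the
# self-reference check on it. All names here are hypothetical.

PREFIX = "spider_lc_"


def self_referencing_tables(mac: str, pid: str, uservars: dict) -> list:
    """Return the table paths whose uservar already contains this server."""
    found = []
    for name, value in uservars.items():
        if not name.startswith(PREFIX):
            continue  # not a spider loop-check variable
        table_path = name[len(PREFIX):]
        if f"-{mac}-{pid}-{table_path}-" in value:
            found.append(table_path)
    return sorted(found)


# State on s0 after s2 has set the uservar, before t0 is opened again.
uservars_on_s0 = {
    "spider_lc_./test/t0": (
        "-34567890abcd-ef0123-./test/t2-"
        "-234567890abc-def012-./test/t1-"
        "-1234567890ab-cdef01-./test/t0-"
    ),
    # A non-cyclic link from another server, for contrast.
    "spider_lc_./test/t9": "-34567890abcd-ef0123-./test/t8-",
    "unrelated_var": "42",
}

print(self_referencing_tables("1234567890ab", "cdef01", uservars_on_s0))
```

In this model s2 would read the reported list and abort opening s2.t2 if the remote table it is about to open appears in it, so s0.t0 is never opened a second time.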


              ycp Yuchen Pei