Details

    Description

      The FEDERATEDX engine doesn't seem to respect DISTINCT. I see the following behavior in 10.2.6 and 10.2.12. (Names and values redacted.)

      Here's a table FOO in bar.

      CREATE TABLE `FOO` (
      `foo_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
      `foo_name` varchar(255) DEFAULT NULL,
      `parent_foo_id` bigint(20) unsigned DEFAULT NULL,
      PRIMARY KEY (`foo_id`),
      KEY `foo_name` (`foo_name`),
      KEY `parent_foo_id` (`parent_foo_id`)
      ) ENGINE=InnoDB AUTO_INCREMENT=978304 DEFAULT CHARSET=utf8
      

      There's another table bar_FOO in another mariadb instance, defined the same except for using FEDERATEDX:

      CREATE TABLE `bar_FOO` (
      ...
      ) ENGINE=FEDERATED DEFAULT CHARSET=utf8 CONNECTION='mysql://user:pw@bar-host:3306/bar/FOO'
      

      The FEDERATEDX plugin is installed. From show plugins:

      +-------------------------------+----------+--------------------+------------------+---------+
      | Name                          | Status   | Type               | Library          | License |
      +-------------------------------+----------+--------------------+------------------+---------+
      | ...                           |          |                    |                  |         |
      | FEDERATED                     | ACTIVE   | STORAGE ENGINE     | ha_federatedx.so | GPL     |
      +-------------------------------+----------+--------------------+------------------+---------+
      

      Here's the query run against the InnoDB table:

      MariaDB [bar]> select DISTINCT foo_id,parent_foo_id,foo_name from FOO where parent_foo_id = 822857 or foo_name like 'STRING%' order by foo_id ;
      +------------+---------------+------------+
      | foo_id     | parent_foo_id | foo_name   |
      +------------+---------------+------------+
      |     822857 |          NULL | STRING     |
      |     968903 |        822857 | STRING - 0 |
      |     968953 |        822857 | STRING - 1 |
      |     971603 |        822857 | STRING - 2 |
      |     971803 |        822857 | STRING - 3 |
      |     975103 |        822857 | STRING - 4 |
      +------------+---------------+------------+
      6 rows in set (0.01 sec)
      

      Here's the query run against the FEDERATEDX table. The result is the same with or without DISTINCT.

      MariaDB [rab]> select DISTINCT foo_id,parent_foo_id,foo_name from bar_FOO where parent_foo_id = 822857 or foo_name like 'STRING%' order by foo_id ;
      +--------+---------------+------------+
      | foo_id | parent_foo_id | foo_name   |
      +--------+---------------+------------+
      | 822857 |          NULL | STRING     |
      | 968903 |        822857 | STRING - 0 |
      | 968903 |        822857 | STRING - 0 |
      | 968953 |        822857 | STRING - 1 |
      | 968953 |        822857 | STRING - 1 |
      | 971603 |        822857 | STRING - 2 |
      | 971603 |        822857 | STRING - 2 |
      | 971803 |        822857 | STRING - 3 |
      | 971803 |        822857 | STRING - 3 |
      | 975103 |        822857 | STRING - 4 |
      | 975103 |        822857 | STRING - 4 |
      +--------+---------------+------------+
      11 rows in set (0.00 sec)
      

      Attachments

        Activity

          elenst Elena Stepanova added a comment - - edited

          Thanks for the report and test case, reproducible as described.

          MTR-friendly version of test case

          INSTALL SONAME 'ha_federatedx';
           
          CREATE TABLE t1 (
            `foo_id` bigint(20) unsigned NOT NULL,
            `foo_name` varchar(255) DEFAULT NULL,
            `parent_foo_id` bigint(20) unsigned DEFAULT NULL,
            PRIMARY KEY (`foo_id`),
            KEY `foo_name` (`foo_name`),
            KEY `parent_foo_id` (`parent_foo_id`)
          ) DEFAULT CHARSET=utf8;
           
          eval CREATE TABLE `fed_t1` ENGINE=FEDERATED DEFAULT CHARSET=utf8 CONNECTION='mysql://root@127.0.0.1:$MASTER_MYPORT/test/t1';
           
          INSERT INTO t1 VALUES (968903, 'STRING - 0', 822857);
          INSERT INTO t1 VALUES (968953, 'STRING - 1', 822857);
          INSERT INTO t1 VALUES (971603, 'STRING - 2', 822857);
          INSERT INTO t1 VALUES (971803, 'STRING - 3', 822857);
          INSERT INTO t1 VALUES (975103, 'STRING - 4', 822857);
          INSERT INTO t1 VALUES (822857, 'STRING', NULL);
           
          select foo_id,parent_foo_id,foo_name from t1 where parent_foo_id = 822857 or foo_name like 'STRING%';
          select foo_id,parent_foo_id,foo_name from fed_t1 where parent_foo_id = 822857 or foo_name like 'STRING%';
           
          # Cleanup
          DROP TABLE fed_t1, t1;
          UNINSTALL SONAME 'ha_federatedx';
          

          The queries internally produced by Federated are awful, unfortunately JIRA cannot handle them pasted as they are, so I replaced strings of \0 and FF below.

          5 Query     SELECT `foo_id`, `foo_name`, `parent_foo_id` FROM `t1` WHERE  ( (`parent_foo_id` = 822857) ) AND ( (1=1) )
          5 Query     SELECT `foo_id`, `foo_name`, `parent_foo_id` FROM `t1` WHERE  ( (`foo_name` >= 'STRING <bunch of \0s>') ) AND ( (`foo_name` <= 'STRING<bunch of FFs>') )
          

          elenst Elena Stepanova added a comment - - edited Thanks for the report and test case, reproducible as described. MTR-friendly version of test case INSTALL SONAME 'ha_federatedx' ;   CREATE TABLE t1 ( `foo_id` bigint (20) unsigned NOT NULL , `foo_name` varchar (255) DEFAULT NULL , `parent_foo_id` bigint (20) unsigned DEFAULT NULL , PRIMARY KEY (`foo_id`), KEY `foo_name` (`foo_name`), KEY `parent_foo_id` (`parent_foo_id`) ) DEFAULT CHARSET=utf8;   eval CREATE TABLE `fed_t1` ENGINE=FEDERATED DEFAULT CHARSET=utf8 CONNECTION = 'mysql://root@127.0.0.1:$MASTER_MYPORT/test/t1' ;   INSERT INTO t1 VALUES (968903, 'STRING - 0' , 822857); INSERT INTO t1 VALUES (968953, 'STRING - 1' , 822857); INSERT INTO t1 VALUES (971603, 'STRING - 2' , 822857); INSERT INTO t1 VALUES (971803, 'STRING - 3' , 822857); INSERT INTO t1 VALUES (975103, 'STRING - 4' , 822857); INSERT INTO t1 VALUES (822857, 'STRING' , NULL );   select foo_id,parent_foo_id,foo_name from t1 where parent_foo_id = 822857 or foo_name like 'STRING%' ; select foo_id,parent_foo_id,foo_name from fed_t1 where parent_foo_id = 822857 or foo_name like 'STRING%' ;   # Cleanup DROP TABLE fed_t1, t1; UNINSTALL SONAME 'ha_federatedx' ; The queries internally produced by Federated are awful, unfortunately JIRA cannot handle them pasted as they are, so I replaced strings of \0 and FF below. 5 Query SELECT `foo_id`, `foo_name`, `parent_foo_id` FROM `t1` WHERE ( (`parent_foo_id` = 822857) ) AND ( (1=1) ) 5 Query SELECT `foo_id`, `foo_name`, `parent_foo_id` FROM `t1` WHERE ( (`foo_name` >= 'STRING <bunch of \0s>' ) ) AND ( (`foo_name` <= 'STRING<bunch of FFs>' ) )

          This is happening because

          • there's an index on parent_foo_id and on foo_name
          • the query has OR, and the optimizer decides to do an index merge
          • meaning, it does two index searches, one for parent_foo_id and one for foo_name.
          • It merges two result sets by looking at row positions
          • In FederatedX a row "position" is the address of MYSQL_ROW result structure in memory
          • So the same row in two different result sets will have a different "position" and the server thinks these are two different rows
          serg Sergei Golubchik added a comment - This is happening because there's an index on parent_foo_id and on foo_name the query has OR , and the optimizer decides to do an index merge meaning, it does two index searches, one for parent_foo_id and one for foo_name . It merges two result sets by looking at row positions In FederatedX a row "position" is the address of MYSQL_ROW result structure in memory So the same row in two different result sets will have a different "position" and the server thinks these are two different rows

          Federated::position() and Federatex::position() is storing in 'ref' a
          pointer into a local result set buffer. This means that one cannot
          compare 'ref' from different handler instances to see if they point to the
          same physical record.

          This bug caused federated.federatedx to return wrong results when the
          optimizer tried to use index_merge to resolve some queries.

          Fixed by introducing table flag HA_NON_COMPARABLE_ROWID and using this
          with the above handlers.

          Future TODO (another task) to fix multi-table updates and deletes with Federated:

          • Fix multi_delete(), multi_update and read_records() to use primary key
            instead of 'ref' if case HA_NON_COMPARABLE_ROWID is set. The current
            code only works if we have only one range (like table scan) for the
            tables that will be updated in the second pass.
          • Enable DBUG_ASSERT() in ha_federated::cmp_ref() and
            ha_federatedx::cmp_ref().
          monty Michael Widenius added a comment - Federated::position() and Federatex::position() is storing in 'ref' a pointer into a local result set buffer. This means that one cannot compare 'ref' from different handler instances to see if they point to the same physical record. This bug caused federated.federatedx to return wrong results when the optimizer tried to use index_merge to resolve some queries. Fixed by introducing table flag HA_NON_COMPARABLE_ROWID and using this with the above handlers. Future TODO (another task) to fix multi-table updates and deletes with Federated: Fix multi_delete(), multi_update and read_records() to use primary key instead of 'ref' if case HA_NON_COMPARABLE_ROWID is set. The current code only works if we have only one range (like table scan) for the tables that will be updated in the second pass. Enable DBUG_ASSERT() in ha_federated::cmp_ref() and ha_federatedx::cmp_ref().

          People

            monty Michael Widenius
            bgrossman Brian Grossman
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.