MariaDB Server
MDEV-16610

Torn reads from spider, with XA and isolation level REPEATABLE READ

Details

    Description

      Writes through spider on external data nodes are guaranteed to be consistent through XA, but reads are not consistent and can be torn. This means spider cannot be considered an ACID engine, even with the SERIALIZABLE isolation level and XA: a simple transaction, such as atomically inserting or deleting two rows, can still be observed by a concurrent read as if only one of the rows exists.
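      As a minimal two-session sketch of that anomaly (the full reproduction with stored procedures follows below), assuming ids 0 and 1 hash to different data nodes:

      # session 1 (writer), on the spider head node
      START TRANSACTION;
      INSERT INTO spider_replication_test.t2 (id, filler) VALUES (0, 'a'), (1, 'b');
      COMMIT;  # spider issues XA PREPARE / XA COMMIT per data node

      # session 2 (reader), on the spider head node, running concurrently
      SELECT COUNT(*) FROM spider_replication_test.t2 WHERE id IN (0, 1);
      # expected 0 or 2, but can occasionally return 1: a torn read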

      On the spider head node:

      create database spider_replication_test;
      create table t2 (
        id int unsigned PRIMARY KEY,
        filler varchar(64) DEFAULT 'Default Filler non-sense!'
      ) ENGINE=SPIDER COMMENT='wrapper = "mysql", table = "t2", database = "spider_replication_test"'
      PARTITION BY HASH (`id`)
      (PARTITION `p0` COMMENT = 'srv "datanode1"' ENGINE = SPIDER,
       PARTITION `p1` COMMENT = 'srv "datanode2"' ENGINE = SPIDER,
       PARTITION `p2` COMMENT = 'srv "datanode3"' ENGINE = SPIDER,
       PARTITION `p3` COMMENT = 'srv "datanode4"' ENGINE = SPIDER);
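      The partition comments above refer to server definitions datanode1 through datanode4, which are assumed to exist already. For completeness, they would be created with something along these lines (hostname, port and credentials here are placeholders, not part of the original report):

      CREATE SERVER datanode1 FOREIGN DATA WRAPPER mysql OPTIONS
        (HOST 'datanode1.example.com', PORT 3306,
         DATABASE 'spider_replication_test', USER 'spider_test', PASSWORD '...');
      # likewise for datanode2, datanode3 and datanode4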
       
       
      DELIMITER //
       
      DROP PROCEDURE IF EXISTS insert_and_delete//
      CREATE PROCEDURE insert_and_delete(IN loops INT)
        BEGIN
          SET @x = 1;
          SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
          # To make it easier to find in the query.log:
          SELECT * FROM spider_replication_test.t2 WHERE filler NOT LIKE 'START OF insert_and_delete';
          REPEAT
              SET @x = @x + 1;
              start transaction;
              INSERT INTO spider_replication_test.t2 (id, filler) VALUES (0, 'p0 consistent view check'), (1, 'p1 consistent view check'), (2, 'p2 consistent view check'), (3, 'p3 consistent view check');
              commit;
              start transaction;
              DELETE FROM spider_replication_test.t2 WHERE id IN (0,1,2,3);
              commit;
          UNTIL @x > loops END REPEAT;
          # To make it easier to find in the query.log:
          SELECT * FROM spider_replication_test.t2 WHERE filler NOT LIKE 'END OF insert_and_delete';
        END
      //
      DROP PROCEDURE IF EXISTS check_select_consistency//
      CREATE PROCEDURE check_select_consistency(IN loops INT, OUT outparam INT)
        BEGIN
          DECLARE var_sum INT;
          SET @x = 1;
          SET outparam = 0;
          SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
          # To make it easier to find in the query.log:
          SELECT * FROM spider_replication_test.t2 WHERE filler NOT LIKE 'START OF check_select_consistency';
          REPEAT
              SET @x = @x + 1;
              SELECT sum(id) INTO var_sum FROM spider_replication_test.t2 WHERE id in (0,1,2,3);
              IF (var_sum IS NOT NULL AND var_sum != 6 ) THEN
                  SET outparam = outparam + 1;
              END IF;
       
          UNTIL @x > loops END REPEAT;
          # To make it easier to find in the query.log:
          SELECT * FROM spider_replication_test.t2 WHERE filler NOT LIKE 'END OF check_select_consistency';
        END
      //
      DELIMITER ;
      

      Insert some minimal data to avoid 0/1 row optimisations:

      INSERT INTO t2 (id, filler) VALUES (4, 'p0 dummy data'), (5, 'p1 dummy data'), (6, 'p2 dummy data'), (7, 'p3 dummy data'), (8, 'P0 dummy data 2'), (9, 'P1 dummy data 2'), (10, 'P2 dummy data 2'), (11, 'P3 dummy data 2');
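      To double-check that ids 0 through 3 really spread across all four partitions (and hence all four data nodes), something like the following could be run on the head node; with PARTITION BY HASH (`id`) and 4 partitions, each id should map to partition id MOD 4:

      EXPLAIN PARTITIONS
      SELECT id FROM spider_replication_test.t2 WHERE id IN (0, 1, 2, 3);
      # the partitions column should list p0,p1,p2,p3, i.e. one row per data node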
      

      On the data nodes:

      use spider_replication_test;
      create table t2 (
        id int unsigned PRIMARY KEY,
        filler varchar(64) DEFAULT 'Default Filler non-sense!'
      ) ENGINE=INNODB;
      

      Spider head variables:

      spider-head [spider_replication_test]> show variables WHERE Variable_name like '%spider%' AND value not like '-1';
      +---------------------------------------+--------+
      | Variable_name                         | Value  |
      +---------------------------------------+--------+
      | spider_bka_engine                     |        |
      | spider_block_size                     | 16384  |
      | spider_conn_recycle_mode              | 0      |
      | spider_conn_recycle_strict            | 0      |
      | spider_conn_wait_timeout              | 10     |
      | spider_connect_error_interval         | 1      |
      | spider_connect_mutex                  | OFF    |
      | spider_connect_retry_count            | 1000   |
      | spider_connect_retry_interval         | 1000   |
      | spider_direct_dup_insert              | 1      |
      | spider_dry_access                     | OFF    |
      | spider_force_commit                   | 1      |
      | spider_general_log                    | ON     |
      | spider_index_hint_pushdown            | OFF    |
      | spider_internal_unlock                | OFF    |
      | spider_internal_xa                    | ON     |
      | spider_internal_xa_id_type            | 0      |
      | spider_internal_xa_snapshot           | 0      |
      | spider_local_lock_table               | OFF    |
      | spider_lock_exchange                  | OFF    |
      | spider_log_result_error_with_sql      | 3      |
      | spider_log_result_errors              | 3      |
      | spider_max_connections                | 0      |
      | spider_net_read_timeout               | 86400  |
      | spider_net_write_timeout              | 86400  |
      | spider_ping_interval_at_trx_start     | 3600   |
      | spider_quick_mode                     | 3      |
      | spider_remote_access_charset          |        |
      | spider_remote_default_database        |        |
      | spider_remote_time_zone               | SYSTEM |
      | spider_same_server_link               | OFF    |
      | spider_semi_table_lock                | 1      |
      | spider_semi_trx                       | ON     |
      | spider_support_xa                     | ON     |
      | spider_sync_autocommit                | ON     |
      | spider_sync_time_zone                 | ON     |
      | spider_sync_trx_isolation             | ON     |
      | spider_table_crd_thread_count         | 10     |
      | spider_table_init_error_interval      | 1      |
      | spider_table_sts_thread_count         | 10     |
      | spider_udf_table_lock_mutex_count     | 20     |
      | spider_udf_table_mon_mutex_count      | 20     |
      | spider_use_all_conns_snapshot         | OFF    |
      | spider_use_consistent_snapshot        | OFF    |
      | spider_use_default_database           | ON     |
      | spider_use_flash_logs                 | OFF    |
      | spider_use_snapshot_with_flush_tables | 0      |
      | spider_version                        | 3.3.13 |
      | spider_xa_register_mode               | 1      |
      +---------------------------------------+--------+
      49 rows in set (0.00 sec)
      

      Spider head session 1

      spider-head [spider_replication_test]> call insert_and_delete(1000);
      Empty set (0.01 sec)
       
      Empty set (20.42 sec)
       
      Query OK, 8000 rows affected (20.42 sec)
      

      Spider head session 2, run concurrently with session 1, showing >15% torn reads (385 of 2000 iterations)!

      spider-head [spider_replication_test]> call check_select_consistency(2000, @o); select @o;
      Empty set (0.01 sec)
       
      Empty set (15.89 sec)
       
      Query OK, 2000 rows affected (15.89 sec)
       
      +------+
      | @o   |
      +------+
      |  385 |
      +------+
      1 row in set (0.00 sec)
      

      From the query.log on a data node, showing concurrent reads and writes. See '5785670 Query select a.id,sum(b.`id`)...': a read (connection 5785670) executing while another transaction (connection 5785669) is in the in-doubt XA state, between XA PREPARE and XA COMMIT:

                      5785669 Connect spider_test@spider-head.example.com as anonymous on 
                      5785669 Query   set session transaction isolation level read committed;set session autocommit = 1;xa start 0x3339613139,0x63656634376134,1
                      5785669 Query   SET NAMES utf8mb4
                      5785669 Init DB spider_replication_test
                      5785669 Query   drop temporary table if exists spider_replication_test.tmp_spider_bka_0x7f79d40a9870;create temporary table spider_replication_test.tmp_spider_bka_0x7f79d40a9870(id bigint,c0 int(10) unsigned)engine=memory default charset=utf8mb4 collate utf8mb4_general_ci;insert into spider_replication_test.tmp_spider_bka_0x7f79d40a9870(id,c0)values(0,0),(1,1),(2,2),(3,3)
                      5785669 Query   select a.id,b.`id` from spider_replication_test.tmp_spider_bka_0x7f79d40a9870 a,`spider_replication_test`.`t2` b where a.c0 <=> b.`id` and (b.`id` in( 0 , 1 , 2 , 3)) for update
                      5785669 Query   delete from `spider_replication_test`.`t2` where `id` = 0 limit 1
                      5785669 Query   drop temporary table if exists spider_replication_test.tmp_spider_bka_0x7f79d40a9870
                      5785669 Query   xa end 0x3339613139,0x63656634376134,1
                      5785669 Query   xa prepare 0x3339613139,0x63656634376134,1
                      5785670 Connect spider_test@spider-head.example.com as anonymous on 
                      5785670 Query   set session transaction isolation level read committed;set session autocommit = 1;xa start 0x3339633438,0x63656634376134,1
                      5785670 Query   SET NAMES utf8mb4
                      5785670 Init DB spider_replication_test
                      5785670 Query   drop temporary table if exists spider_replication_test.tmp_spider_bka_0x7f79d40a9870;create temporary table spider_replication_test.tmp_spider_bka_0x7f79d40a9870(id bigint,c0 int(10) unsigned)engine=memory default charset=utf8mb4 collate utf8mb4_general_ci;insert into spider_replication_test.tmp_spider_bka_0x7f79d40a9870(id,c0)values(0,0),(1,1),(2,2),(3,3)
                      5785670 Query   select a.id,sum(b.`id`),b.`id` from spider_replication_test.tmp_spider_bka_0x7f79d40a9870 a,`spider_replication_test`.`t2` b where a.c0 <=> b.`id` and (b.`id` in( 0 , 1 , 2 , 3))
                      5785669 Query   xa commit 0x3339613139,0x63656634376134,1
                      5785669 Quit    
                      5785671 Connect spider_test@spider-head.example.com as anonymous on 
                      5785671 Query   set session transaction isolation level read committed;set session autocommit = 1;xa start 0x3339613139,0x63656634376134,1
                      5785671 Query   SET NAMES utf8mb4
                      5785671 Init DB spider_replication_test
                      5785671 Query   insert high_priority into `spider_replication_test`.`t2`(`id`,`filler`)values(0,'p0 consistent view check')
                      5785670 Query   drop temporary table if exists spider_replication_test.tmp_spider_bka_0x7f79d40a9870
                      5785670 Query   xa end 0x3339633438,0x63656634376134,1
                      5785670 Query   xa prepare 0x3339633438,0x63656634376134,1
                      5785670 Query   xa commit 0x3339633438,0x63656634376134,1
                      5785671 Query   xa end 0x3339613139,0x63656634376134,1
                      5785670 Quit    
      

      I think this can be fixed by not allowing queries in spider to cross an XA commit, i.e. the XA commit needs to wait for currently running queries before being sent to any node, and new queries need to be blocked until the XA commit has completed on all nodes. But this would limit parallelism and performance.
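      As a rough sketch of that idea (a hypothetical ordering, not what Spider currently does; 'trx1' is a placeholder XID, Spider actually generates hex XIDs as in the log above), the statement stream sent to each data node would gain a barrier around the two-phase commit:

      XA START 'trx1';
      INSERT INTO spider_replication_test.t2 (id, filler) VALUES (0, 'p0 consistent view check');
      XA END 'trx1';
      # barrier begins: wait for queries already running against the data nodes to
      # finish, and hold back new queries from other head-node sessions
      XA PREPARE 'trx1';
      XA COMMIT 'trx1';
      # barrier ends: only once the commit has completed on every data node are the
      # queued queries dispatched again, so none can observe the in-doubt window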

          Activity

            serg Sergei Golubchik added a comment - - edited

            I don't think it is. "start with consistent snapshot" will simply ask all engines to start a transaction now. It'll still be sequential, so race conditions are possible. And from the spider point of view it'll be one call, something like spider_hton->start_consistent_snapshot(), where spider will need to iterate all nodes and start a consistent snapshot on each, which, again, is not atomic and is prone to race conditions.

            Generally, I think the DTP XA standard only provides proper isolation at the SERIALIZABLE isolation level.

            On the other hand, if you serialize the "starting a transaction on all nodes" code, for example with a mutex, then you'll have consistent reads, but at the cost of reduced concurrency.

            Anyway, as far as I can see, MDEV-14090 will not be of any use to you here; it looks quite unrelated.


            Kentoku Kentoku Shiba (Inactive) added a comment

            serg
            What I would like to suggest is something like "start with consistent snapshot at '2019-05-16 01:21:00.000000000'". Similarly, commit would be like "commit at '2019-05-16 01:21:00.000000000'".
            I assume that using the same time on all nodes makes it possible to access data with consistent views without any locks.
            Should I create a new Jira task instead of MDEV-14090?

            Kentoku Kentoku Shiba (Inactive) added a comment - - edited

            Spider can be considered to be an ACID engine with the SERIALIZABLE isolation level and XA, because all SELECTs take shared locks and wait for commit/rollback if the rows being read are inserted/updated/deleted.

            marko Marko Mäkelä added a comment - - edited

            monty suggested a possible solution to this problem:

            • Spider head node(s) will send information about transaction commits to a monitor server, which also assigns InnoDB trx_t::id to all Spider data nodes.
            • In a special mode of InnoDB, trx_t::id for write transactions will be assigned by the monitor server.
            • The spider head node and the monitor server will control read view creation and delay the purge, by extending the BEGIN or START TRANSACTION syntax with the InnoDB local transaction identifiers, something like this:

            BEGIN 100, (103,104);
            

            This tells InnoDB two things:

            • The purge view is not allowed to advance beyond transaction identifier 99 until another such extended BEGIN statement arrives.
            • No matter what transactions were already committed, InnoDB will pretend that transactions 103, 104 (which must have started earlier) were not committed.

            It is an error to request a BEGIN with a smaller number for the purge limit, such as:

            BEGIN 99, (100,103,104); -- error: previously we got BEGIN 100, …
            BEGIN 100, (101,102,105); -- OK: the previous limit was BEGIN 100, …
            BEGIN 101, (101,102,105); -- OK, returning the same read view (allowing purge view to advance)
            BEGIN 100, (101,102,105); -- error: previously we got BEGIN 101, …
            

            Note: Not all listed transaction identifiers are necessarily known by all Spider data nodes, because not all transactions will always modify tables on all data nodes.


            marko Marko Mäkelä added a comment

            I feel that some clarifications are necessary:

            • This extended functionality should be enabled only if a special parameter is set. Only then can we avoid a scenario where a data node is restarted and then suddenly starts purging history before the first extended BEGIN statement is issued.
            • All Spider data nodes that are serving a Spider head node would use the same 48-bit InnoDB transaction identifiers.
            • InnoDB would allocate those transaction identifiers from a new monitor server, which does not have any persistent state.
            • On startup, the monitor server would query all data nodes for the maximum transaction identifier, and initialize the sequence from that.
            • We seem to need XA 2-phase-commit syntax for distributed transaction commit between Spider data nodes.
            • But, XIDs will be lost after XA COMMIT, and we must be able to refer to committed not-yet-purged transactions.
            • Perhaps we could introduce an identity mapping, such as sscanf(xid, "xid" TRX_ID_FMT, &trx_id)?
            • If a Spider system is being replicated by Galera, all nodes should use the same transaction identifiers, assigned by the monitor server.

            I have not thought out how to handle distributed transactions between multiple Spider clusters. Above, we would be internally using the XA mechanism for transactions that are not distributed from the point of view of a client that connects to a Spider head node.


            People

              ycp Yuchen Pei
              mattiasjonsson Mattias Jonsson