Uploaded image for project: 'MariaDB MaxScale'
  1. MariaDB MaxScale
  2. MXS-3353

Tee filter loses statements if branch target is slower

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 2.5.6, 2.5.10
    • 6.2.0
    • tee
    • None
    • Linux, Debian 10, n.a.
    • MXS-SPRINT-122, MXS-SPRINT-140

    Description

      When we are running our simple insert test or our a bit more complex mixed test (SIUD) at a very high pace (10 us delay between statements) maxscale is not sending all statements to the tee'd instance after a short time.
      When we lower the pace to 10 ms delay this does not happen.
      Unfortunately maxscale is not throwing any warning or error (in maxscale error log) to make us aware of this.
      This means tee filter cannot be used reliably in production.
      Possibly a statement cache or something like this is overflowing?
      Would be good to have something like a "synchronous" mode to guarantee that statements have been received my both database backends with possibly some timeout (similar to semi-synchronous replication).
      We want to use this for some kind of quality assurance between 2 different backends and/or for doing a push replication from the secure to the insecure network zone in a security sensitive environment.

      Attachments

        Issue Links

          Activity

            Hi Markus

            I repeated the test. Please find all the information attached. You can also provoke the situation when you set on the tee'd instance innodb_flush_log_at_trx_commit = 1 and sync_binlog = 1 while you have these values set to 2 and 0 on the primary instance. Even with the slow pace:

            Test with slow pace (10000 us sleep, everything is fine):

            shell> date ; ./mixed_test.php
            Wed 6 Jan 09:17:34 CET 2021
            IIII.........IIII^CCaught signal: 2
            shell> date
            Wed 6 Jan 09:19:29 CET 2021

            Primary instance: mariadb-105

            SQL> SELECT COUNT, CURRENT_TIMESTAMP() FROM test;
            -----------------------------+

            count current_timestamp()

            -----------------------------+

            6163 2021-01-06 09:20:49

            -----------------------------+

            End of General Log:

            330 Query SELECT * FROM test WHERE id = 383966
            330 Query INSERT INTO test (id, data, ts) values (383966, "Test data insert", CURRENT_TIMESTAMP(6))
            210106 9:19:27 330 Quit

            Tee'd instance: mysql-57

            SQL> SELECT COUNT, CURRENT_TIMESTAMP() FROM test;
            -----------------------------+

            count current_timestamp()

            -----------------------------+

            6163 2021-01-06 09:21:02

            -----------------------------+

            End of General Log:

            2021-01-06T08:19:26.942277Z 360 Query SELECT * FROM test WHERE id = 383966
            2021-01-06T08:19:26.943038Z 360 Query INSERT INTO test (id, data, ts) values (383966, "Test data insert", CURRENT_TIMESTAMP(6))
            2021-01-06T08:19:27.522731Z 360 Quit


            Test with fast pace (10 us sleep, rows are missing):

            shell> date ; ./mixed_test.php ; date
            Wed 6 Jan 09:24:26 CET 2021
            IIIIIIII...DUDI^CCaught signal: 2
            Wed 6 Jan 09:27:30 CET 2021

            Primary instance: mariadb-105

            350 Query SELECT * FROM test WHERE id = 934
            350 Query INSERT INTO test (id, data, ts) values (934, "Test data insert", CURRENT_TIMESTAMP(6))
            210106 9:27:30 350 Quit

            SQL> TRUNCATE TABLE test.test;
            SQL> SELECT COUNT, CURRENT_TIMESTAMP() FROM test;
            -----------------------------+

            count current_timestamp()

            -----------------------------+

            0 2021-01-06 09:24:12
            35817 2021-01-06 09:25:37
            55553 2021-01-06 09:26:28
            59872 2021-01-06 09:27:01
            61970 2021-01-06 09:27:43
            61970 2021-01-06 09:28:02
            61970 2021-01-06 09:29:29
            61970 2021-01-06 09:30:14

            -----------------------------+

            Secondary instance: mysql-57

            2021-01-06T08:27:30.470174Z 383 Query DELETE FROM test WHERE id = 61171
            2021-01-06T08:27:30.470592Z 383 Query SELECT * FROM test WHERE id = 24971
            --> Quit is missing! And last statement was an INSERT NOT a DELETE!

            SQL> TRUNCATE TABLE test.test;
            SQL> SELECT COUNT, CURRENT_TIMESTAMP() FROM test;
            -----------------------------+

            count current_timestamp()

            -----------------------------+

            0 2021-01-06 09:24:07
            35228 2021-01-06 09:25:36
            50725 2021-01-06 09:26:30
            59681 2021-01-06 09:27:00
            61401 2021-01-06 09:27:42
            61401 2021-01-06 09:28:00
            61401 2021-01-06 09:29:28
            61401 2021-01-06 09:30:13

            -----------------------------+

            570 rows missing. Now warning in the MaxScale error log.

            The test table:

            CREATE TABLE `test` (
            `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
            `data` varchar(128) DEFAULT NULL,
            `ts` timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6),
            PRIMARY KEY (`id`)
            ) ENGINE=InnoDB AUTO_INCREMENT=100000 DEFAULT CHARSET=latin1

            And the test script:

            oli Oli Sennhauser added a comment - Hi Markus I repeated the test. Please find all the information attached. You can also provoke the situation when you set on the tee'd instance innodb_flush_log_at_trx_commit = 1 and sync_binlog = 1 while you have these values set to 2 and 0 on the primary instance. Even with the slow pace: Test with slow pace (10000 us sleep, everything is fine): shell> date ; ./mixed_test.php Wed 6 Jan 09:17:34 CET 2021 IIII.........IIII^CCaught signal: 2 shell> date Wed 6 Jan 09:19:29 CET 2021 Primary instance: mariadb-105 SQL> SELECT COUNT , CURRENT_TIMESTAMP() FROM test; --------- --------------------+ count current_timestamp() --------- --------------------+ 6163 2021-01-06 09:20:49 --------- --------------------+ End of General Log: 330 Query SELECT * FROM test WHERE id = 383966 330 Query INSERT INTO test (id, data, ts) values (383966, "Test data insert", CURRENT_TIMESTAMP(6)) 210106 9:19:27 330 Quit Tee'd instance: mysql-57 SQL> SELECT COUNT , CURRENT_TIMESTAMP() FROM test; --------- --------------------+ count current_timestamp() --------- --------------------+ 6163 2021-01-06 09:21:02 --------- --------------------+ End of General Log: 2021-01-06T08:19:26.942277Z 360 Query SELECT * FROM test WHERE id = 383966 2021-01-06T08:19:26.943038Z 360 Query INSERT INTO test (id, data, ts) values (383966, "Test data insert", CURRENT_TIMESTAMP(6)) 2021-01-06T08:19:27.522731Z 360 Quit Test with fast pace (10 us sleep, rows are missing): shell> date ; ./mixed_test.php ; date Wed 6 Jan 09:24:26 CET 2021 IIIIIIII...DUDI^CCaught signal: 2 Wed 6 Jan 09:27:30 CET 2021 Primary instance: mariadb-105 350 Query SELECT * FROM test WHERE id = 934 350 Query INSERT INTO test (id, data, ts) values (934, "Test data insert", CURRENT_TIMESTAMP(6)) 210106 9:27:30 350 Quit SQL> TRUNCATE TABLE test.test; SQL> SELECT COUNT , CURRENT_TIMESTAMP() FROM test; --------- --------------------+ count current_timestamp() --------- --------------------+ 0 2021-01-06 09:24:12 35817 2021-01-06 09:25:37 55553 2021-01-06 09:26:28 59872 2021-01-06 09:27:01 61970 2021-01-06 09:27:43 61970 2021-01-06 09:28:02 61970 2021-01-06 09:29:29 61970 2021-01-06 09:30:14 --------- --------------------+ Secondary instance: mysql-57 2021-01-06T08:27:30.470174Z 383 Query DELETE FROM test WHERE id = 61171 2021-01-06T08:27:30.470592Z 383 Query SELECT * FROM test WHERE id = 24971 --> Quit is missing! And last statement was an INSERT NOT a DELETE! SQL> TRUNCATE TABLE test.test; SQL> SELECT COUNT , CURRENT_TIMESTAMP() FROM test; --------- --------------------+ count current_timestamp() --------- --------------------+ 0 2021-01-06 09:24:07 35228 2021-01-06 09:25:36 50725 2021-01-06 09:26:30 59681 2021-01-06 09:27:00 61401 2021-01-06 09:27:42 61401 2021-01-06 09:28:00 61401 2021-01-06 09:29:28 61401 2021-01-06 09:30:13 --------- --------------------+ 570 rows missing. Now warning in the MaxScale error log. The test table: CREATE TABLE `test` ( `id` int(10) unsigned NOT NULL AUTO_INCREMENT, `data` varchar(128) DEFAULT NULL, `ts` timestamp(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6) ON UPDATE CURRENT_TIMESTAMP(6), PRIMARY KEY (`id`) ) ENGINE=InnoDB AUTO_INCREMENT=100000 DEFAULT CHARSET=latin1 And the test script:
            oli Oli Sennhauser added a comment - mixed_test.php
            oli Oli Sennhauser added a comment - All the other details here: https://fromdual.com/traffic-mirroring-with-mariadb-maxscale
            markus makela markus makela added a comment -

            I've managed to reproduce this with a local setup. This seems to happen much more easily when the branched target is slower than the main target. I've created MXS-3370 for the documentation problem and changed this back to a bug.

            markus makela markus makela added a comment - I've managed to reproduce this with a local setup. This seems to happen much more easily when the branched target is slower than the main target. I've created MXS-3370 for the documentation problem and changed this back to a bug.
            markus makela markus makela added a comment -

            The fix to this was inadequate and it caused a session reference leak. This leak made it appear as if it worked as expected.

            markus makela markus makela added a comment - The fix to this was inadequate and it caused a session reference leak. This leak made it appear as if it worked as expected.

            People

              markus makela markus makela
              oli Oli Sennhauser
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.