[MDEV-22573] galera.galera_last_committed_id MTR failed: Result content mismatch Created: 2020-05-15  Updated: 2022-05-03

Status: Stalled
Project: MariaDB Server
Component/s: Galera, Tests
Affects Version/s: 10.5.3
Fix Version/s: 10.5

Type: Bug Priority: Major
Reporter: Stepan Patryshev (Inactive) Assignee: Julius Goryavsky
Resolution: Unresolved Votes: 0
Labels: None
Environment:

kvm-deb-xenial-amd64


Issue Links:
Blocks
blocks MDEV-22122 Galera test failures on 10.5 Open

 Description   

galera.galera_last_committed_id MTR failed on BB, 10.5: Result content mismatch.

stdio.log:

10.5.3 CS, d50f776930425e540678238798b4f7666b9cbb76, kvm-deb-xenial-amd64

galera.galera_last_committed_id 'binlogon,innodb' w2 [ fail ]
        Test ended at 2020-05-04 14:25:51
 
CURRENT_TEST: galera.galera_last_committed_id
--- /usr/share/mysql/mysql-test/suite/galera/r/galera_last_committed_id.result	2020-05-04 08:22:31.000000000 -0400
+++ /dev/shm/var/2/log/galera_last_committed_id.reject	2020-05-04 14:25:51.478655282 -0400
@@ -8,24 +8,24 @@
 INSERT INTO t1 VALUES (1);
 connect node_1a, 127.0.0.1, root, , test, $NODE_MYPORT_1;;
 connection node_1a;
-SELECT WSREP_LAST_WRITTEN_GTID() != '100-1-2' AS wsrep_written_does_not_match_different_conn;
+SELECT WSREP_LAST_WRITTEN_GTID() != '100-1-1' AS wsrep_written_does_not_match_different_conn;
 wsrep_written_does_not_match_different_conn
 1
 connection node_2;
-SELECT WSREP_LAST_WRITTEN_GTID() != '100-1-2' AS wsrep_written_does_not_match_different_nodes;
+SELECT WSREP_LAST_WRITTEN_GTID() != '100-1-1' AS wsrep_written_does_not_match_different_nodes;
 wsrep_written_does_not_match_different_nodes
 1
 connection node_1;
 INSERT INTO t1 VALUES (1);
 connection node_2;
 wsrep_last_written_seen_id_match
-1
+0
 connection node_1;
 SET AUTOCOMMIT=OFF;
 START TRANSACTION;
 INSERT INTO t1 VALUES (1);
-WSREP_LAST_SEEN_GTID() = '100-1-3'
-1
+WSREP_LAST_SEEN_GTID() = '100-1-1'
+0
 wsrep_last_written_id_match
 1
 COMMIT;
 
mysqltest: Result content mismatch



 Comments   
Comment by Stepan Patryshev (Inactive) [ 2020-05-28 ]

It is still failing on BB, 10.5.

Comment by Stepan Patryshev (Inactive) [ 2020-06-03 ]

mkaruza if they are not related to the tests themselves, what do you think causes these failures?

Comment by Mario Karuza (Inactive) [ 2020-06-04 ]

This specific run was influenced by MW-328B. Tests MW-328A and MW-328B were enabled on 10.5, I assume not long ago (https://github.com/MariaDB/server/commit/fde94b4cd6c916f118ccb2785c09dafef391298c).

The issue with this specific test, and with all tests in the same run, is that they cannot bind to the port that was used by the MW-328B test, so they fail. As explained before in this ticket, the galera.galera_last_committed_id issue itself was fixed by Marko's patch.

Comment by Stepan Patryshev (Inactive) [ 2020-06-04 ]

mkaruza Thank you for your explanation, but I still don't understand how this failure could have been fixed by Marko's commits from MDEV-22452 a month ago if galera.galera_last_committed_id last failed on BB only 5 days ago (http://buildbot.askmonty.org/buildbot/builders/kvm-deb-xenial-aarch64/builds/4229)?

Comment by Mario Karuza (Inactive) [ 2020-06-04 ]

Please look closely at the difference between the original failure and the ones that have been problematic lately.

1. The original failure was a difference between the test run output and the recorded result file; exactly that was fixed by the already mentioned patch.

2. Now the failure is different, and this time (http://buildbot.askmonty.org/buildbot/builders/kvm-deb-xenial-aarch64/builds/4229) it is triggered by MW-328A. Looking at the result file, you can see that ALL tests after MW-328A fail:

Errors/warnings were found in logfiles during server shutdown after running the
following sequence(s) of tests:
galera.MW-328A
galera.MW-328A
galera.galera_gtid_trx_conflict
galera.galera_last_committed_id
galera.galera_last_committed_id
galera.galera_gtid_trx_conflict
galera.galera_log_bin
galera.galera_last_committed_id
galera.galera_mdev_15611
galera.galera_last_committed_id
galera.galera_query_cache
galera.galera_query_cache_sync_wait
galera.galera_sbr_binlog
galera.galera_log_bin

Failure can be seen here:

***Warnings generated in error logs during shutdown after running tests: galera.galera_last_committed_id

2020-05-29 23:03:55 0 [Warning] WSREP: error while trying to listen 'tcp://0.0.0.0:16002?socket.non_blocking=1', asio error 'bind: Address already in use'
2020-05-29 23:03:55 0 [ERROR] WSREP: failed to open gcomm backend connection: 98: error while trying to listen 'tcp://0.0.0.0:16002?socket.non_blocking=1', asio error 'bind: Address already in use': 98 (Address already in use)
2020-05-29 23:03:55 0 [ERROR] WSREP: gcs/src/gcs_core.cpp:gcs_core_open():220: Failed to open backend connection: -98 (Address already in use)
2020-05-29 23:03:55 0 [ERROR] WSREP: gcs connect failed: Address already in use
2020-05-29 23:03:55 0 [ERROR] Aborting
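
When triaging runs like this one, it can help to pull the affected listen port out of the error log automatically. The following is a minimal sketch, assuming the log lines have the same shape as the excerpt quoted above; the names `SAMPLE_LOG`, `LISTEN_FAILURE` and `ports_in_use` are hypothetical helpers for illustration and are not part of MTR or Galera.

```python
import re

# Sample lines copied from the shutdown warnings quoted in this comment.
SAMPLE_LOG = """\
2020-05-29 23:03:55 0 [Warning] WSREP: error while trying to listen 'tcp://0.0.0.0:16002?socket.non_blocking=1', asio error 'bind: Address already in use'
2020-05-29 23:03:55 0 [ERROR] WSREP: gcs connect failed: Address already in use
2020-05-29 23:03:55 0 [ERROR] Aborting
"""

# Assumed pattern: a quoted tcp:// listen URL followed, on the same line,
# by the "Address already in use" message.
LISTEN_FAILURE = re.compile(
    r"error while trying to listen 'tcp://(?P<host>[^:/']+):(?P<port>\d+)[^']*'"
    r".*Address already in use"
)


def ports_in_use(log_text):
    """Return the sorted list of ports that failed to bind in log_text."""
    ports = set()
    for line in log_text.splitlines():
        match = LISTEN_FAILURE.search(line)
        if match:
            ports.add(int(match.group("port")))
    return sorted(ports)


if __name__ == "__main__":
    print(ports_in_use(SAMPLE_LOG))
```

Only lines that contain both the listen URL and the bind error are counted, so the follow-up "gcs connect failed" and "Aborting" lines do not produce duplicate entries.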

From the traces you can see that the test CAN'T bind to the specific port associated with worker 2. All other tests that run after MW-328A fail with the same error. I believe that one of our team members is working on this issue.

As mentioned in the last comment, these 2 tests (MW-328A, MW-328B) were enabled 17 days ago.
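
The "Address already in use" condition can be illustrated outside of Galera. The sketch below is only an assumption-laden demonstration and not how MTR or Galera checks ports: it holds a listener on an OS-chosen local port (standing in for a process left over from an earlier test) and shows that a second bind to the same port fails with `EADDRINUSE` until the holder is closed. The function names `port_is_free` and `demo` are hypothetical.

```python
import errno
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to (host, port), else False."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other error is unexpected here
    return True


def demo():
    """Bind a holder socket, probe the port while held and after release."""
    holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    holder.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    holder.listen(1)
    port = holder.getsockname()[1]

    free_while_held = port_is_free(port)
    holder.close()
    free_after_close = port_is_free(port)
    return free_while_held, free_after_close


if __name__ == "__main__":
    print(demo())
```

A check like this, run against the worker's port before a test starts, would distinguish a leftover listener from a genuine test regression.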

Comment by Stepan Patryshev (Inactive) [ 2020-06-05 ]

mkaruza Thank you for the detailed clarifications. Now it is clear.

Comment by Stepan Patryshev (Inactive) [ 2020-06-05 ]

Linked to MDEV-22666 as "blocked by".
