[MCOL-5293] Replication not working after failover -restart pm1 Created: 2022-11-01  Updated: 2022-12-13  Resolved: 2022-12-12

Status: Closed
Project: MariaDB ColumnStore
Component/s: cmapi
Affects Version/s: None
Fix Version/s: 22.08.7

Type: Bug Priority: Blocker
Reporter: Daniel Lee (Inactive) Assignee: Roman
Resolution: Fixed Votes: 0
Labels: cluster

Attachments: File a0001_failover.result     Text File yes_mxs.log    
Issue Links:
Blocks
is blocked by MCOL-5306 Broken connections in mariadb while p... Closed
Relates
relates to MCOL-5286 HA Failing when losing a node - resta... Closed
Sprint: 2022-22
Assigned for Testing: Daniel Lee (Inactive)

 Description   

Build tested: 22.08.2, latest build from drone (#5838)

Steps:

1. Create a 3PM docker cluster
2. Check cluster status on PM1. master=PM1, slave=PM2, PM3
3. Create a database and a table, insert a row in PM1
4. Verify table gets replicated to PM2
5. Execute "docker container stop mcs1", wait 90 seconds
6. Execute "docker container start mcs1", wait 60 seconds
7. Check cluster status on PM1. master=PM1, slave=PM2, PM3
Yesterday, I noticed that MaxScale had PM2 set as the master.
Today, MaxScale also had PM1 set as the master (I don't know why the behavior differed today).

For this test, PM2 was expected to take over as the master

8. Create another table and insert a row on PM1
9. The table did not get replicated to PM2 or PM3
10. "show slave status" on PM2 returned nothing
11. "show slave status" on PM3 did return status, and no error
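
For anyone repeating steps 5 and 6, the stop/wait/start/wait sequence can be scripted. The following is only a minimal sketch: the two docker commands and the 90 and 60 second waits come from the steps above, while the function names `restart_plan` and `execute` and the dry-run switch are hypothetical helpers introduced for illustration.

```python
import subprocess
import time


def restart_plan(container="mcs1", down_seconds=90, settle_seconds=60):
    """Return steps 5 and 6 as an ordered list of (action, argument) pairs."""
    return [
        ("run", ["docker", "container", "stop", container]),
        ("sleep", down_seconds),
        ("run", ["docker", "container", "start", container]),
        ("sleep", settle_seconds),
    ]


def execute(plan, dry_run=True):
    """Walk the plan; with dry_run=True only print what would be done."""
    for action, arg in plan:
        if dry_run:
            print(action, arg)
        elif action == "run":
            # check=True raises if the docker command exits non-zero.
            subprocess.run(arg, check=True)
        else:
            time.sleep(arg)


if __name__ == "__main__":
    # Dry run by default so the sketch is safe to execute anywhere.
    execute(restart_plan(), dry_run=True)
```

Calling `execute(restart_plan(), dry_run=False)` on the docker host would perform the actual stop and start; the cluster status checks in step 7 still have to be done separately.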

PM3 slave status showed master log mariadb-bin.000002, position 4684
PM1 master status showed master log mariadb-bin.000004, position 568
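
The gap between these two coordinates can be checked mechanically. The sketch below assumes that the master log reported by PM3 refers to PM1's binary log (the ticket does not confirm which server PM3 was replicating from) and that binlog file names end in a numeric sequence suffix. The helper names `binlog_coordinate` and `replica_is_behind` are hypothetical.

```python
import re


def binlog_coordinate(file_name, position):
    """Turn ('mariadb-bin.000004', 568) into a sortable (4, 568) tuple."""
    match = re.fullmatch(r".+\.(\d+)", file_name)
    if match is None:
        raise ValueError(f"unexpected binlog file name: {file_name!r}")
    return int(match.group(1)), int(position)


def replica_is_behind(replica, primary):
    """Compare (file, position) pairs; only valid for the same server's binlog."""
    return binlog_coordinate(*replica) < binlog_coordinate(*primary)


# Values reported in this ticket.
pm3_replica = ("mariadb-bin.000002", 4684)
pm1_primary = ("mariadb-bin.000004", 568)

print(replica_is_behind(pm3_replica, pm1_primary))  # True
```

Under that assumption, PM3 is two binlog files behind PM1, which is consistent with the new table not showing up on PM3 even though its slave status reported no error.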



 Comments   
Comment by Roman [ 2022-12-02 ]

The scenario David.Hall mentioned should be re-tested when MCOL-5306 is tested.

Comment by Daniel Lee (Inactive) [ 2022-12-12 ]

Build verified: 22.08.7

engine: e243a5332b8613ce0e370a503461990fefc24fce
server: d3049350bb5c61340f5a7518b155d3c9dacdcb33
buildNo: 6202

Executed test case in mustest, test advance.a000_failover.test

Steps performed.

    echo Checking MaxScale status......
    echo Checking ColumnStore status on mcs1......
    echo Running sanity test on mcs1......
    echo Checking ColumnStore status on mcs1......
    echo Stopping node mcs1......
    echo Checking MaxScale status......
    echo Checking ColumnStore status on mcs2......
    echo Starting node mcs1......
    echo Checking MaxScale status......
    echo Checking ColumnStore status on mcs1......
    echo Create a 1g DBT2 database on mcs2......
    echo Check row counts on mcs1 for replication......
    echo Drop test database......
    echo Ending of test.

The test result (the output from the test) has been attached.
