[MXS-4240] MXS-4239 readconnroute module routing read queries to inconsistent slave node Created: 2022-08-11  Updated: 2022-11-09  Resolved: 2022-08-29

Status: Closed
Project: MariaDB MaxScale
Component/s: readconnroute
Affects Version/s: 2.5
Fix Version/s: 2.5.22, 6.4.2, 22.08.1

Type: Bug Priority: Critical
Reporter: Pramod Mahto Assignee: Esa Korhonen
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
is blocked by MXS-4239 Maxscale shows replication status a... Closed
Sprint: MXS-SPRINT-164, MXS-SPRINT-165

 Description   

Replication on Slave node is broken due to wrong password but Maxscale still shows Slave node as a valid Slave : *Slave, Running *

Routing read query via readconnroute module during these timeframe routing all read queries to Slave node (Inconsistent Node ) rather then Master node.

 
MariaDB [test]> show slave status\G
*************************** 1. row ***************************
                Slave_IO_State: Connecting to master
                   Master_Host: 192.168.47.23
                   Master_User: repuser
                   Master_Port: 3306
                 Connect_Retry: 60
               Master_Log_File:
           Read_Master_Log_Pos: 4
                Relay_Log_File: node_02-relay-bin.000001
                 Relay_Log_Pos: 4
         Relay_Master_Log_File:
              Slave_IO_Running: Connecting
             Slave_SQL_Running: Yes
               Replicate_Do_DB:
           Replicate_Ignore_DB:
            Replicate_Do_Table:
        Replicate_Ignore_Table:
       Replicate_Wild_Do_Table:
   Replicate_Wild_Ignore_Table:
                    Last_Errno: 0
                    Last_Error:
                  Skip_Counter: 0
           Exec_Master_Log_Pos: 4
               Relay_Log_Space: 256
               Until_Condition: None
                Until_Log_File:
                 Until_Log_Pos: 0
            Master_SSL_Allowed: No
            Master_SSL_CA_File:
            Master_SSL_CA_Path:
               Master_SSL_Cert:
             Master_SSL_Cipher:
                Master_SSL_Key:
         Seconds_Behind_Master: NULL
 Master_SSL_Verify_Server_Cert: No
                 Last_IO_Errno: 1045
                 Last_IO_Error: error connecting to master 'repuser@192.168.47.23:3306' - retry-time: 60  maximum-retries: 100000  message: Access denied for user 'repuser'@'192.168.47.24' (using password: YES)
                Last_SQL_Errno: 0
                Last_SQL_Error:
   Replicate_Ignore_Server_Ids:
              Master_Server_Id: 1
                Master_SSL_Crl:
            Master_SSL_Crlpath:
                    Using_Gtid: Slave_Pos
                   Gtid_IO_Pos: 1-1-23991
       Replicate_Do_Domain_Ids:
   Replicate_Ignore_Domain_Ids:
                 Parallel_Mode: optimistic
                     SQL_Delay: 0
           SQL_Remaining_Delay: NULL
       Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
              Slave_DDL_Groups: 44
Slave_Non_Transactional_Groups: 0
    Slave_Transactional_Groups: 23988
1 row in set (0.000 sec)

Maxscale configuration part :-

 
[ReadOnlySlaveService]
type=service
router=readconnroute
servers=node1,node2
user=maxuser
password=###########
router_options=slave
 
 
[ReadOnlySlaveListener]
type=listener
service=ReadOnlySlaveService
protocol=mariadbclient
address=0.0.0.0
port=4406
authenticator_options=lower_case_table_names=1
 
 
[ReadWriteSplitService]
type=service
router=readwritesplit
servers=node1,node2
user=maxuser
password=###########
master_failure_mode=fail_instantly
max_slave_replication_lag=1s
 
[ReadWriteSplitListener]
type=listener
service=ReadWriteSplitService
protocol=mariadbclient
address=0.0.0.0
port=4404
authenticator_options=lower_case_table_names=1

maxctrl output :-

 
root@max_02 tmp]# maxctrl list servers
 
┌────────┬───────────────┬──────┬─────────────┬─────────────────┬────────────┐
│ Server │ Address       │ Port │ Connections │ State           │ GTID       │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node1  │ 192.168.47.23 │ 3306 │ 2           │ Master, Running │ 1-1-598192 │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node2  │ 192.168.47.24 │ 3306 │ 0           │ Slave, Running  │ 1-1-23991  │
└────────┴───────────────┴──────┴─────────────┴─────────────────┴────────────┘

Using readwritesplit module (Port 4404) , working as expected.

 
[root@max_02 tmp]# maxctrl list servers
 
 
┌────────┬───────────────┬──────┬─────────────┬─────────────────┬────────────┐
│ Server │ Address       │ Port │ Connections │ State           │ GTID       │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node1  │ 192.168.47.23 │ 3306 │ 3           │ Master, Running │ 1-1-625551 │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node2  │ 192.168.47.24 │ 3306 │ 0           │ Slave, Running  │ 1-1-23991  │
└────────┴───────────────┴──────┴─────────────┴─────────────────┴────────────┘
 
[root@sysbench ~]# for i in `seq 1 2`; do mariadb -u sbtestuser -psbtestuser -sN -h 192.168.47.26 -P 4404 -e "select count(*), 'read done from: ',@@hostname, sleep(1) from sbtest.sbtest1;"; done
474125  read done from:         node_01 0
474447  read done from:         node_01 0

Using readconnroute module (Port 4406) , not working as expected. Routing to faulty Slave node.

 
┌────────┬───────────────┬──────┬─────────────┬─────────────────┬────────────┐
│ Server │ Address       │ Port │ Connections │ State           │ GTID       │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node1  │ 192.168.47.23 │ 3306 │ 2           │ Master, Running │ 1-1-630668 │
├────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
│ node2  │ 192.168.47.24 │ 3306 │ 1           │ Slave, Running  │ 1-1-23991  │
└────────┴───────────────┴──────┴─────────────┴─────────────────┴────────────┘
 
 
 
 
[root@sysbench ~]# for i in `seq 1 2`; do mariadb -u sbtestuser -psbtestuser -sN -h 192.168.47.26 -P 4406 -e "select count(*), 'read done from: ',@@hostname, sleep(1) from sbtest.sbtest1;"; done
186491  read done from:         node_02 0
186491  read done from:         node_02 0

Maxscale error log :-

 
2022-07-15 01:22:34   info   : (64) [readconnroute] (ReadOnlySlaveService) New session for server node2. Connections : 1
2022-07-15 01:22:34   info   : (64) Started ReadOnlySlaveService client session [64] for 'sbtestuser' from 192.168.47.27
2022-07-15 01:22:34   info   : (2) (ReadWriteSplitService) > Autocommit: [enabled], trx is [not open], cmd: (0x03) COM_QUERY, plen: 243, type: QUERY_TYPE_WRITE, stmt: INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 4993, '16148217111-13518232889-91641119311-90458143703-96285901191-91564609824-29243606076-29697426785-88416212636-22906525826', '77685601517-26310348041-11842097123-50607119939-02685123519') 
2022-07-15 01:22:34   info   : (2) [readwritesplit] (ReadWriteSplitService) Route query to master: node1 <
2022-07-15 01:22:34   info   : (43) (ReadWriteSplitService) > Autocommit: [enabled], trx is [not open], cmd: (0x17) COM_STMT_EXECUTE, plen: 24, type: QUERY_TYPE_WRITE, stmt: ID: 1 
2022-07-15 01:22:34   info   : (43) [readwritesplit] (ReadWriteSplitService) Route query to master: node1 <
2022-07-15 01:22:34   info   : (43) [readwritesplit] (ReadWriteSplitService) COM_STMT_EXECUTE on node1
 
 
2022-07-15 01:22:34   info   : (64) Connected to 'node2' with thread id 594
2022-07-15 01:22:34   info   : (43) [readwritesplit] (ReadWriteSplitService) Reply complete, last reply from node1
2022-07-15 01:22:34   info   : (64) [readconnroute] (ReadOnlySlaveService) Routed [COM_QUERY] to 'node2' select count(*), 'read done from: ',@@hostname, sleep(1) from sbtest.sbtest1
2022-07-15 01:22:34   info   : (2) [readwritesplit] (ReadWriteSplitService) Reply complete, last reply from node1
2022-07-15 01:22:34   info   : (64) Authentication to 'node2' succeeded.
 
 
2022-07-15 01:22:34   info   : (2) (ReadWriteSplitService) > Autocommit: [enabled], trx is [not open], cmd: (0x03) COM_QUERY, plen: 243, type: QUERY_TYPE_WRITE, stmt: INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 4995, '93709769087-61733446348-22902750083-03723859371-93490945024-16714819133-48008668078-33683508223-19043347463-89028002640', '67846948656-67478753035-57629612508-39087358299-59008635272') 
2022-07-15 01:22:34   info   : (2) [readwritesplit] (ReadWriteSplitService) Route query to master: node1 <
2022-07-15 01:22:34   info   : (2) [readwritesplit] (ReadWriteSplitService) Reply complete, last reply from node1
2022-07-15 01:22:34   info   : (2) (ReadWriteSplitService) > Autocommit: [enabled], trx is [not open], cmd: (0x03) COM_QUERY, plen: 243, type: QUERY_TYPE_WRITE, stmt: INSERT INTO sbtest1 (id, k, c, pad) VALUES (0, 5042, '17236671370-96473937995-19628916897-22684382456-84554409544-69359675695-30189541135-56391311016-96496343593-54732571723', '37880484456-54793030384-06887477137-20007750699-83358850158') 
2022-07-15 01:22:34   info   : (2) [readwritesplit] (ReadWriteSplitService) Route query to master: node1 <
2022-07-15 01:22:34   info   : (2) [readwritesplit] (ReadWriteSplitService) Reply complete, last reply from node1


Generated at Thu Feb 08 04:27:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.