Uploaded image for project: 'MariaDB MaxScale'
  1. MariaDB MaxScale
  2. MXS-4126

With Semi-Sync Replication | Lag doesn't prevent traffic routing to Slave node having second_behind_master=0

    XMLWordPrintable

Details

    • New Feature
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Won't Do
    • None
    • N/A
    • N/A

    Description

      With semi-sync replication , replication waits for just one replica to acknowledge that it has received and logged the events. In such case considering second_behind_master parameter value to route the traffic to other slave node lead to false reporting.

      May be we can think of alternate option of second_behind_master while routing the traffic on nodes via maxscale (May be comparing gtid_current_pos value across the nodes ) or we need to have second_behind_master>0 in-place if the slave node is running behind the master in case of semi-sync replication.

      Bulk load using readwritesplit module

       
      sysbench /usr/share/sysbench/oltp_read_write.lua --mysql-host=192.168.47.22 --mysql-port=4404 --mysql-user=sbtestuser --mysql-password=sbtestuser --mysql-db=sbtest --db-driver=mysql --tables=20 --threads=75 --rate=140 --events=0 --thread-stack-size=128K --report-interval=10 --time=2400 --forced-shutdown='0' run
       
       
      sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)
       
      Running the test with following options:
      Number of threads: 75
      Target transaction rate: 140/sec
      Report intermediate results every 10 second(s)
      Initializing random number generator from current time
       
      Forcing shutdown in 2400 seconds
       
      Initializing worker threads...
       
      Threads started!
       
      [ 10s ] thds: 75 tps: 96.68 qps: 1987.71 (r/w/o: 1396.62/351.43/239.65) lat (ms,95%): 3095.38 err/s: 0.00 reconn/s: 0.00
      [ 10s ] queue length: 462, concurrency: 75
      [ 20s ] thds: 75 tps: 98.80 qps: 1968.43 (r/w/o: 1378.62/352.80/237.00) lat (ms,95%): 6713.97 err/s: 0.00 reconn/s: 0.00
       
       
       
      [root@max_01 ~]# maxctrl list servers
      ┌─────────┬───────────────┬──────┬─────────────┬─────────────────┬────────────┐
      │ Server  │ Address       │ Port │ Connections │ State           │ GTID       │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_01 │ 192.168.47.23 │ 3306 │ 0           │ Master, Running │ 1-1-146538 │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_02 │ 192.168.47.24 │ 3306 │ 0           │ Slave, Running  │ 1-1-146538 │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_03 │ 192.168.47.25 │ 3306 │ 0           │ Slave, Running  │ 1-1-146538 │
      └─────────┴───────────────┴──────┴─────────────┴─────────────────┴────────────┘
      
      

      Lets stop mariadb service on slave node_03

       
      [root@node_03 ~]# systemctl stop mariadb
       
       
       
      Every 2.0s: maxctrl list servers                                                                                                     Mon May  9 17:33:11 2022
       
      ┌─────────┬───────────────┬──────┬─────────────┬─────────────────┬────────────┐
      │ Server  │ Address       │ Port │ Connections │ State           │ GTID       │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_01 │ 192.168.47.23 │ 3306 │ 75          │ Master, Running │ 1-1-148993 │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_02 │ 192.168.47.24 │ 3306 │ 75          │ Slave, Running  │ 1-1-148993 │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_03 │ 192.168.47.25 │ 3306 │ 0           │ Down            │ 1-1-146936 │
      └─────────┴───────────────┴──────┴─────────────┴─────────────────┴────────────┘
      
      

      Lets start mariadb service on node_03

       
      [root@node_03 ~]# systemctl start  mariadb
       
       
      Every 2.0s: maxctrl list servers                                                                                                     Mon May  9 17:34:05 2022
       
      ┌─────────┬───────────────┬──────┬─────────────┬─────────────────┬────────────┐
      │ Server  │ Address       │ Port │ Connections │ State           │ GTID       │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_01 │ 192.168.47.23 │ 3306 │ 75          │ Master, Running │ 1-1-155397 │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_02 │ 192.168.47.24 │ 3306 │ 75          │ Slave, Running  │ 1-1-155397 │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_03 │ 192.168.47.25 │ 3306 │ 0           │ Slave, Running  │ 1-1-147361 │
      └─────────┴───────────────┴──────┴─────────────┴─────────────────┴────────────┘
      
      

       
      MariaDB [(none)]> show slave status\G
      *************************** 1. row ***************************
                      Slave_IO_State: Waiting for master to send event
                         Master_Host: 192.168.47.23
                         Master_User: repuser
                         Master_Port: 3306
                       Connect_Retry: 60
                     Master_Log_File: node_01.000002
                 Read_Master_Log_Pos: 121408196
                      Relay_Log_File: node_03-relay-bin.000001
                       Relay_Log_Pos: 4
               Relay_Master_Log_File: node_01.000002
                    Slave_IO_Running: Yes
                   Slave_SQL_Running: Yes
                     Replicate_Do_DB:
                 Replicate_Ignore_DB:
                  Replicate_Do_Table:
              Replicate_Ignore_Table:
             Replicate_Wild_Do_Table:
         Replicate_Wild_Ignore_Table:
                          Last_Errno: 0
                          Last_Error:
                        Skip_Counter: 0
                 Exec_Master_Log_Pos: 121408196
                     Relay_Log_Space: 256
                     Until_Condition: None
                      Until_Log_File:
                       Until_Log_Pos: 0
                  Master_SSL_Allowed: No
                  Master_SSL_CA_File:
                  Master_SSL_CA_Path:
                     Master_SSL_Cert:
                   Master_SSL_Cipher:
                      Master_SSL_Key:
               Seconds_Behind_Master: 0
       Master_SSL_Verify_Server_Cert: No
                       Last_IO_Errno: 0
                       Last_IO_Error:
                      Last_SQL_Errno: 0
                      Last_SQL_Error:
         Replicate_Ignore_Server_Ids:
                    Master_Server_Id: 1
                      Master_SSL_Crl:
                  Master_SSL_Crlpath:
                          Using_Gtid: Slave_Pos
                         Gtid_IO_Pos: 1-1-147361
             Replicate_Do_Domain_Ids:
         Replicate_Ignore_Domain_Ids:
                       Parallel_Mode: conservative
                           SQL_Delay: 0
                 SQL_Remaining_Delay: NULL
             Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
                    Slave_DDL_Groups: 0
      Slave_Non_Transactional_Groups: 0
          Slave_Transactional_Groups: 0
      1 row in set (0.000 sec)
      
      

      Another read load via readwritesplit module through maxscale

       
      [root@sysbench ~]# sysbench /usr/share/sysbench/oltp_read_only.lua --mysql-host=192.168.47.22 --mysql-port=4404 --mysql-user=sbtestuser --mysql-password=sbtestuser --mysql-db=sbtest --db-driver=mysql --tables=20 --threads=10 --rate=140 --events=0 --thread-stack-size=128K --report-interval=10 --time=2400 --forced-shutdown='0' run
      sysbench 1.0.20 (using bundled LuaJIT 2.1.0-beta2)
       
      Running the test with following options:
      Number of threads: 10
      Target transaction rate: 140/sec
      Report intermediate results every 10 second(s)
      Initializing random number generator from current time
       
      Forcing shutdown in 2400 seconds
       
      Initializing worker threads...
       
      Threads started!
       
      [ 10s ] thds: 10 tps: 14.00 qps: 226.94 (r/w/o: 197.95/0.00/28.99) lat (ms,95%): 8795.93 err/s: 0.00 reconn/s: 0.00
      [ 10s ] queue length: 1263, concurrency: 10
      [ 20s ] thds: 10 tps: 14.60 qps: 236.90 (r/w/o: 208.10/0.00/28.80) lat (ms,95%): 17435.99 err/s: 0.00 reconn/s: 0.00
      [ 20s ] queue length: 2537, concurrency: 10
      [ 30s ] thds: 10 tps: 14.40 qps: 234.90 (r/w/o: 205.70/0.00/29.20) lat (ms,95%): 26861.48 err/s: 0.00 reconn/s: 0.00
      [ 30s ] queue length: 3798, concurrency: 10
       
       
      Every 2.0s: maxctrl list servers                                                                                                     Mon May  9 17:39:40 2022
       
      ┌─────────┬───────────────┬──────┬─────────────┬─────────────────┬────────────┐
      │ Server  │ Address       │ Port │ Connections │ State           │ GTID       │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_01 │ 192.168.47.23 │ 3306 │ 85          │ Master, Running │ 1-1-187704 │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_02 │ 192.168.47.24 │ 3306 │ 85          │ Slave, Running  │ 1-1-187700 │
      ├─────────┼───────────────┼──────┼─────────────┼─────────────────┼────────────┤
      │ node_03 │ 192.168.47.25 │ 3306 │ 10          │ Slave, Running  │ 1-1-181709 │
      └─────────┴───────────────┴──────┴─────────────┴─────────────────┴────────────┘
      

      Traffic routed to lagging slave node having second_behind_master=0

       
      MariaDB [(none)]> show slave status\G
      *************************** 1. row ***************************
                      Slave_IO_State: Waiting for master to send event
                         Master_Host: 192.168.47.23
                         Master_User: repuser
                         Master_Port: 3306
                       Connect_Retry: 60
                     Master_Log_File: node_01.000002
                 Read_Master_Log_Pos: 149545891
                      Relay_Log_File: node_03-relay-bin.000001
                       Relay_Log_Pos: 4
               Relay_Master_Log_File: node_01.000002
                    Slave_IO_Running: Yes
                   Slave_SQL_Running: Yes
                     Replicate_Do_DB:
                 Replicate_Ignore_DB:
                  Replicate_Do_Table:
              Replicate_Ignore_Table:
             Replicate_Wild_Do_Table:
         Replicate_Wild_Ignore_Table:
                          Last_Errno: 0
                          Last_Error:
                        Skip_Counter: 0
                 Exec_Master_Log_Pos: 149545891
                     Relay_Log_Space: 256
                     Until_Condition: None
                      Until_Log_File:
                       Until_Log_Pos: 0
                  Master_SSL_Allowed: No
                  Master_SSL_CA_File:
                  Master_SSL_CA_Path:
                     Master_SSL_Cert:
                   Master_SSL_Cipher:
                      Master_SSL_Key:
               Seconds_Behind_Master: 0
       Master_SSL_Verify_Server_Cert: No
                       Last_IO_Errno: 0
                       Last_IO_Error:
                      Last_SQL_Errno: 0
                      Last_SQL_Error:
         Replicate_Ignore_Server_Ids:
                    Master_Server_Id: 1
                      Master_SSL_Crl:
                  Master_SSL_Crlpath:
                          Using_Gtid: Slave_Pos
                         Gtid_IO_Pos: 1-1-181709
             Replicate_Do_Domain_Ids:
         Replicate_Ignore_Domain_Ids:
                       Parallel_Mode: conservative
                           SQL_Delay: 0
                 SQL_Remaining_Delay: NULL
             Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
                    Slave_DDL_Groups: 0
      Slave_Non_Transactional_Groups: 0
          Slave_Transactional_Groups: 0
      1 row in set (0.000 sec)
       
      
      

      Attachments

        1. node_03_my.cnf
          0.3 kB
        2. node_02_my.cnf
          0.2 kB
        3. node_01_my.cnf
          0.2 kB
        4. maxscale.cnf
          2 kB

        Issue Links

          Activity

            People

              toddstoffel Todd Stoffel (Inactive)
              pramod.mahto@mariadb.com Pramod Mahto
              Votes:
              1 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.