[MXS-3508] causal_reads=global results in missing data reads Created: 2021-04-21  Updated: 2021-08-13  Resolved: 2021-08-02

Status: Closed
Project: MariaDB MaxScale
Component/s: readwritesplit
Affects Version/s: 2.5.10
Fix Version/s: 2.5.15

Type: Bug Priority: Major
Reporter: Alex Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None
Environment:

Debian 10


Issue Links:
Relates
relates to MXS-3695 Causal Consistency with MaxScale's Re... Closed

 Description   

We have a strange problem.

When causal_reads=global is set, reads issued after inserts sometimes return no data.

Running 'show processlist' on the slave, I can see MASTER_GTID_WAIT in the processlist, but we still get missing data.

Setting it back to fast fixes the problem.

│ Parameters          │ {                                                   │
│                     │     "auth_all_servers": true,                       │
│                     │     "causal_reads": "global",                       │
│                     │     "causal_reads_timeout": 3000,                   │
│                     │     "cluster": "TheMonitor",                        │
│                     │     "connection_keepalive": 300,                    │
│                     │     "connection_timeout": 0,                        │
│                     │     "delayed_retry": true,                          │
│                     │     "delayed_retry_timeout": 30000,                 │
│                     │     "disable_sescmd_history": false,                │
│                     │     "enable_root_user": false,                      │
│                     │     "lazy_connect": false,                          │
│                     │     "localhost_match_wildcard_host": true,          │
│                     │     "log_auth_warnings": true,                      │
│                     │     "master_accept_reads": true,                    │
│                     │     "master_failure_mode": "fail_on_write",         │
│                     │     "master_reconnection": true,                    │
│                     │     "max_connections": 0,                           │
│                     │     "max_sescmd_history": 1500,                     │
│                     │     "max_slave_connections": "255",                 │
│                     │     "max_slave_replication_lag": 1000,              │
│                     │     "net_write_timeout": 0,                         │
│                     │     "optimistic_trx": false,                        │
│                     │     "password": "*****",                            │
│                     │     "prune_sescmd_history": true,                   │
│                     │     "rank": "primary",                              │
│                     │     "retain_last_statements": -1,                   │
│                     │     "retry_failed_reads": true,                     │
│                     │     "router_options": null,                         │
│                     │     "session_trace": false,                         │
│                     │     "session_track_trx_state": false,               │
│                     │     "slave_connections": 255,                       │
│                     │     "slave_selection_criteria": "ADAPTIVE_ROUTING", │
│                     │     "strict_multi_stmt": true,                      │
│                     │     "strict_sp_calls": true,                        │
│                     │     "strip_db_esc": true,                           │
│                     │     "targets": null,                                │
│                     │     "transaction_replay": true,                     │
│                     │     "transaction_replay_attempts": 20,              │
│                     │     "transaction_replay_max_size": "1073741824",    │
│                     │     "transaction_replay_retry_on_deadlock": false,  │
│                     │     "use_sql_variables_in": "master",               │
│                     │     "user": "maxscale",                             │
│                     │     "version_string": null                          │
│                     │ }                                                   │



 Comments   
Comment by markus makela [ 2021-04-21 ]

Would you happen to have a test case that reproduces the problem?

Comment by Alex [ 2021-04-21 ]

It is quite a simple case.

Insert into a table and then select from the table by primary key. About once in several hundred attempts the select returns no data.
The server is under some load. SSD disks.

The master uses InnoDB, the slave uses TokuDB.
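
A minimal sketch of the insert-then-select loop described above, in case it helps turn this into a test case. It is not the reporter's actual test: the table name causal_test, the function name count_missing_reads and the placeholder argument are made up for illustration. The function takes any Python DB-API 2.0 connection. Against MaxScale you would pass a connection from a MariaDB/MySQL driver pointed at the readwritesplit listener and keep the default "%s" placeholder. The demo at the bottom uses an in-memory SQLite database only so that the script runs on its own.

```python
import sqlite3


def count_missing_reads(conn, iterations, placeholder="%s"):
    """Insert a row, read it straight back by primary key, count the misses.

    conn        -- any DB-API 2.0 connection (assumption: table causal_test exists)
    placeholder -- parameter marker of the driver ("%s" for most MariaDB/MySQL
                   drivers, "?" for sqlite3)
    """
    insert_sql = (
        f"INSERT INTO causal_test (id, payload) "
        f"VALUES ({placeholder}, {placeholder})"
    )
    select_sql = f"SELECT payload FROM causal_test WHERE id = {placeholder}"

    cur = conn.cursor()
    missing = 0
    for i in range(iterations):
        # Write: with readwritesplit this would be routed to the master.
        cur.execute(insert_sql, (i, f"row-{i}"))
        conn.commit()
        # Read by primary key: with readwritesplit this may be routed to a slave.
        cur.execute(select_sql, (i,))
        if cur.fetchone() is None:
            missing += 1
    return missing


if __name__ == "__main__":
    # Stand-in database so the sketch is runnable without a MariaDB cluster.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE causal_test (id INTEGER PRIMARY KEY, payload TEXT)")
    print("missing reads:", count_missing_reads(demo, 1000, placeholder="?"))
```

On a single database the count is always 0. A non-zero count through MaxScale with causal_reads=global would match the symptom reported here.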

Comment by markus makela [ 2021-07-12 ]

The value of causal_reads_timeout is relatively low at three seconds. If you increase this to 15 seconds, does the problem go away?

Comment by Alex [ 2021-07-15 ]

Does it really matter? As I can see in the docs: "If the slave has not caught up to the master within the configured time, it will be retried on the master."
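
The documented behaviour quoted above can be modelled roughly like this. It is a simplified sketch for discussion only, not MaxScale's implementation. The function route_causal_read and its callable arguments are hypothetical stand-ins for "wait for the slave to reach the required GTID", "read on the slave" and "read on the master".

```python
def route_causal_read(slave_caught_up, read_on_slave, read_on_master, timeout_s):
    """Model of the documented causal read fallback.

    slave_caught_up(timeout_s) -- hypothetical callable, True if the slave
                                  reached the required GTID within timeout_s
    read_on_slave()            -- hypothetical callable, runs the read on the slave
    read_on_master()           -- hypothetical callable, runs the read on the master
    """
    if slave_caught_up(timeout_s):
        # The slave has replicated the write, so reading from it is safe.
        return read_on_slave()
    # Timeout hit: per the docs the read is retried on the master instead.
    return read_on_master()


if __name__ == "__main__":
    # A slave that never catches up: the read must come from the master.
    result = route_causal_read(
        slave_caught_up=lambda timeout_s: False,
        read_on_slave=lambda: "stale or empty",
        read_on_master=lambda: "row from master",
        timeout_s=3,
    )
    print(result)
```

Under this model either branch returns the inserted row, so a short causal_reads_timeout should only cost extra reads on the master and should not lose data.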

Comment by markus makela [ 2021-07-15 ]

Hmm, that is true. Hitting the timeout wouldn't have the effect I thought it would. Have you ever tested with causal_reads=local?

Comment by Alex [ 2021-07-15 ]

Yes, we are using local mode in production now.

Comment by Alex [ 2021-08-13 ]

Hi, when will 2.5.15 be publicly available?
