[MXS-4677] MaxScale BinlogRouter skips large transactions causing data Inconsistency on attached slave Created: 2023-07-20  Updated: 2023-08-08  Resolved: 2023-08-08

Status: Closed
Project: MariaDB MaxScale
Component/s: binlogrouter
Affects Version/s: 6.4.7
Fix Version/s: 2.5.28, 6.4.9, 22.08.8, 23.02.4

Type: Bug Priority: Critical
Reporter: ALEKSEI KHAMIDOV Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None
Environment:

OS: CentOS Stream release 8
MariaDB: 10.11.2
MaxScale: 6.4.7


Issue Links:
Relates
relates to MXS-4690 Binlogrouter runs out of memory on ve... Closed
relates to MXS-4691 Binlogrouter cannot write binlog file... Closed
Sprint: MXS-SPRINT-187, MXS-SPRINT-188

 Description   

Due to the presence of longtext columns in the usersDiscoveredLessons table, a substantial amount of data ends up in the binlog. Since we are using row-based replication, every row is copied into the binlog, exacerbating the data volume. Table creation script:

MariaDB [(none)]> show create table test_long_transaction_bug.usersDiscoveredLessons \G;
*************************** 1. row ***************************
       Table: usersDiscoveredLessons
Create Table: CREATE TABLE `usersDiscoveredLessons` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `userId` int(11) unsigned NOT NULL,
  `groupId` int(11) unsigned DEFAULT NULL,
  `slideIds` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`slideIds`)),
  `videoIds` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`videoIds`)),
  `taskIds` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`taskIds`)),
  `createdAt` timestamp NOT NULL DEFAULT current_timestamp(),
  `updatedAt` timestamp NOT NULL DEFAULT current_timestamp(),
  `deletedAt` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `userId_groupId_uniq` (`userId`,`groupId`)
) ENGINE=InnoDB AUTO_INCREMENT=29297215 DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci
1 row in set (0.001 sec)

Data sample (the array of slideIds values is significantly shortened, it could include up to 500 Ids):

MariaDB [(none)]> select * from test_long_transaction_bug.usersDiscoveredLessons LIMIT 1 \G;
*************************** 1. row ***************************
       id: 28566982
   userId: 102592167
   groupId: 2
slideIds: [114586208,112104238,28115292,...,96955794]
  videoIds: [112312333,123123444]
taskIds: []
createdAt: 2023-07-13 00:00:00
updatedAt: 2023-07-13 00:00:00
deletedAt: NULL
1 row in set (0.001 sec)

The size of the file from which data will be loaded into the table:

[root@sc-db-main-stage-01 ~]# ls -lah /var/lib/mysql/user*
-rw-r--r-- 1 root root 7.9G Jul 18 18:47 /var/lib/mysql/usersDiscoveredLessons.sql

The sizes of binlogs:

$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "ls -lath /var/lib/mysql/mysqld*"
sc-db-main-stage-03 | CHANGED | rc=0 >>
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000005
-rw-rw---- 1 mysql mysql   20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index
sc-db-main-stage-01 | CHANGED | rc=0 >>
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000402
-rw-rw---- 1 mysql mysql   20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index
sc-db-main-stage-04 | CHANGED | rc=0 >>
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000008
-rw-rw---- 1 mysql mysql   20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index
sc-db-main-stage-02 | CHANGED | rc=0 >>
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000400
-rw-rw---- 1 mysql mysql   20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index

I am demonstrating that the table is empty after truncating and before initiating the load:

$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "mysql -e 'select count(*) from test_long_transaction_bug.usersDiscoveredLessons;'"
sc-db-main-stage-01 | CHANGED | rc=0 >>
count(*)
0
sc-db-main-stage-02 | CHANGED | rc=0 >>
count(*)
0
sc-db-main-stage-03 | CHANGED | rc=0 >>
count(*)
0
sc-db-main-stage-04 | CHANGED | rc=0 >>
count(*)
0

The actual execution of the load:

[root@sc-db-main-stage-01 ~]# mysql test_long_transaction_bug -e "SET autocommit=0; source /var/lib/mysql/usersDiscoveredLessons.sql; COMMIT;" &

MaxScale binlog router error:

[root@sc-maxscale-binlog-stage-01 ~]# tail -n500 /var/log/maxscale/maxscale.log  | grep -v warning
2023-07-20 05:10:07   error  : (1260) [MariaDBProtocol] Client (172.21.128.78) sent an invalid HandShakeResponse.
2023-07-20 07:10:04   error  : (1293) [MariaDBProtocol] Client (172.21.128.78) sent an invalid HandShakeResponse.
2023-07-20 07:59:08   error  : (srv_main_repl); Error received during replication from '172.18.16.194:3306': std::bad_alloc

The result - the data is inconsistent:

$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "mysql -e 'select count(*) from test_long_transaction_bug.usersDiscoveredLessons;'"
sc-db-main-stage-01 | CHANGED | rc=0 >>
count(*)
632852
sc-db-main-stage-02 | CHANGED | rc=0 >>
count(*)
632852
sc-db-main-stage-03 | CHANGED | rc=0 >>
count(*)
632852
sc-db-main-stage-04 | CHANGED | rc=0 >>
count(*)
0

The cluster continues to operate with inconsistent data without any apparent issues:

$ ansible sc_maxscale -i inventory_stage -u lexusrules -b -m shell -a "maxctrl list servers"
sc-maxscale-stage-01 | CHANGED | rc=0 >>
┌────────────────────────┬───────────────┬──────┬─────────────┬───────────────────────┬─────────────────┐
│ Server                 │ Address       │ Port │ Connections │ State                 │ GTID            │
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
│ sc-db-main-stage-01    │ 172.18.16.194 │ 3306 │ 582         │ Master, Running       │ 0-14-336882805  │
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
│ sc-db-main-stage-02    │ 172.18.16.103 │ 3306 │ 383         │ Slave, Running        │ 0-14-336882805  │
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
│ sc-db-main-stage-03    │ 172.18.16.197 │ 3306 │ 383         │ Slave, Running        │ 0-14-336882805  │
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
│ sc-db-main-stage-04    │ 172.18.16.114 │ 3306 │ 0           │ Maintenance, Running  │ 0-14-336882805  │
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
│ sc-maxscale-binlog     │ 172.18.16.69  │ 3306 │ 0           │ Binlog Relay, Running │ 0-14-336882805  │
└────────────────────────┴───────────────┴──────┴─────────────┴───────────────────────┴─────────────────┘

The standard binlog size is 1Gb, but due to large transactions, it can significantly increase (in this case, up to 17Gb). The sizes of new binlogs:

$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "ls -lath /var/lib/mysql/mysqld*"
[DEPRECATION WARNING]: [defaults]callback_whitelist option, normalizing names to new standard, use callbacks_enabled instead. This feature will be removed from ansible-core in version 2.15. Deprecation warnings
can be disabled by setting deprecation_warnings=False in ansible.cfg.
sc-db-main-stage-01 | CHANGED | rc=0 >>
-rw-rw---- 1 mysql mysql 9.2M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000403
-rw-rw---- 1 mysql mysql  17G Jul 20 07:58 /var/lib/mysql/mysqld-bin.000402
-rw-rw---- 1 mysql mysql   40 Jul 20 07:58 /var/lib/mysql/mysqld-bin.index
sc-db-main-stage-02 | CHANGED | rc=0 >>
-rw-rw---- 1 mysql mysql 9.2M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000401
-rw-rw---- 1 mysql mysql  17G Jul 20 08:05 /var/lib/mysql/mysqld-bin.000400
-rw-rw---- 1 mysql mysql   40 Jul 20 08:05 /var/lib/mysql/mysqld-bin.index
sc-db-main-stage-03 | CHANGED | rc=0 >>
-rw-rw---- 1 mysql mysql 9.2M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000006
-rw-rw---- 1 mysql mysql  17G Jul 20 08:04 /var/lib/mysql/mysqld-bin.000005
-rw-rw---- 1 mysql mysql   40 Jul 20 08:04 /var/lib/mysql/mysqld-bin.index
sc-db-main-stage-04 | CHANGED | rc=0 >>
-rw-rw---- 1 mysql mysql 348M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000008
-rw-rw---- 1 mysql mysql   20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index

On version 6.4.3, we encountered the following error. Subsequently, the MaxScale binlog router became unresponsive and necessitated a restart.

2023-06-30 13:54:36 error: (srv_main_repl); Error received during replication from '172.21.17.206:3306': Could not write final ROTATE to /var/lib/maxscale/binlogs/mysqld-bin.385707
2023-06-30 13:56:03 error: (srv_main_repl); Error received during replication from '172.21.17.206:3306': Could not write final ROTATE to /var/lib/maxscale/binlogs/mysqld-bin.385710



 Comments   
Comment by ALEKSEI KHAMIDOV [ 2023-07-20 ]

Additional Setup Information

It's important to note that only sc-db-main-stage-04 is connected to the binlog router. The other two slaves, sc-db-main-stage-02 and sc-db-main-stage-03, are directly connected to master.

Comment by markus makela [ 2023-07-20 ]

A couple of observations. These message are likely caused by something other than a MariaDB/MySQL client connecting to MaxScale:

2023-07-20 05:10:07   error  : (1260) [MariaDBProtocol] Client (172.21.128.78) sent an invalid HandShakeResponse.
2023-07-20 07:10:04   error  : (1293) [MariaDBProtocol] Client (172.21.128.78) sent an invalid HandShakeResponse.

The following error suggests that a memory allocation failed:

2023-07-20 07:59:08   error  : (srv_main_repl); Error received during replication from '172.18.16.194:3306': std::bad_alloc

What is vm.overcommit_memory set to?

Comment by ALEKSEI KHAMIDOV [ 2023-07-20 ]

[root@sc-maxscale-binlog-stage-01 ~]# sysctl vm.overcommit_memory
vm.overcommit_memory = 0

Comment by markus makela [ 2023-07-20 ]

OK, that suggests that an excessively large memory allocation was attempted which ended up throwing a std::bad_alloc exception. Perhaps the binlog event length was interpreted wrong?

I'll try and reproduce this.

Comment by markus makela [ 2023-08-03 ]

I think that once MXS-4690 and MXS-4691 are fixed, this problem should go away.

Generated at Thu Feb 08 04:30:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.