Due to the presence of longtext columns in the usersDiscoveredLessons table, a substantial amount of data ends up in the binlog. Since we are using row-based replication, every row is copied into the binlog, exacerbating the data volume. Table creation script:
MariaDB [(none)]> show create table test_long_transaction_bug.usersDiscoveredLessons \G;
|
*************************** 1. row ***************************
|
Table: usersDiscoveredLessons
|
Create Table: CREATE TABLE `usersDiscoveredLessons` (
|
`id` int(11) NOT NULL AUTO_INCREMENT,
|
`userId` int(11) unsigned NOT NULL,
|
`groupId` int(11) unsigned DEFAULT NULL,
|
`slideIds` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`slideIds`)),
|
`videoIds` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`videoIds`)),
|
`taskIds` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`taskIds`)),
|
`createdAt` timestamp NOT NULL DEFAULT current_timestamp(),
|
`updatedAt` timestamp NOT NULL DEFAULT current_timestamp(),
|
`deletedAt` timestamp NULL DEFAULT NULL,
|
PRIMARY KEY (`id`),
|
UNIQUE KEY `userId_groupId_uniq` (`userId`,`groupId`)
|
) ENGINE=InnoDB AUTO_INCREMENT=29297215 DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci
|
1 row in set (0.001 sec)
|
Data sample (the array of slideIds values is significantly shortened, it could include up to 500 Ids):
MariaDB [(none)]> select * from test_long_transaction_bug.usersDiscoveredLessons LIMIT 1 \G;
|
*************************** 1. row ***************************
|
id: 28566982
|
userId: 102592167
|
groupId: 2
|
slideIds: [114586208,112104238,28115292,...,96955794]
|
videoIds: [112312333,123123444]
|
taskIds: []
|
createdAt: 2023-07-13 00:00:00
|
updatedAt: 2023-07-13 00:00:00
|
deletedAt: NULL
|
1 row in set (0.001 sec)
|
The size of the file from which data will be loaded into the table:
[root@sc-db-main-stage-01 ~]# ls -lah /var/lib/mysql/user*
|
-rw-r--r-- 1 root root 7.9G Jul 18 18:47 /var/lib/mysql/usersDiscoveredLessons.sql
|
The sizes of binlogs:
$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "ls -lath /var/lib/mysql/mysqld*"
|
sc-db-main-stage-03 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000005
|
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index
|
sc-db-main-stage-01 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000402
|
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index
|
sc-db-main-stage-04 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000008
|
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index
|
sc-db-main-stage-02 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000400
|
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index
|
I am demonstrating that the table is empty after truncating and before initiating the load:
$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "mysql -e 'select count(*) from test_long_transaction_bug.usersDiscoveredLessons;'"
|
sc-db-main-stage-01 | CHANGED | rc=0 >>
|
count(*)
|
0
|
sc-db-main-stage-02 | CHANGED | rc=0 >>
|
count(*)
|
0
|
sc-db-main-stage-03 | CHANGED | rc=0 >>
|
count(*)
|
0
|
sc-db-main-stage-04 | CHANGED | rc=0 >>
|
count(*)
|
0
|
The actual execution of the load:
[root@sc-db-main-stage-01 ~]# mysql test_long_transaction_bug -e "SET autocommit=0; source /var/lib/mysql/usersDiscoveredLessons.sql; COMMIT;" &
|
MaxScale binlog router error:
[root@sc-maxscale-binlog-stage-01 ~]# tail -n500 /var/log/maxscale/maxscale.log | grep -v warning
|
2023-07-20 05:10:07 error : (1260) [MariaDBProtocol] Client (172.21.128.78) sent an invalid HandShakeResponse.
|
2023-07-20 07:10:04 error : (1293) [MariaDBProtocol] Client (172.21.128.78) sent an invalid HandShakeResponse.
|
2023-07-20 07:59:08 error : (srv_main_repl); Error received during replication from '172.18.16.194:3306': std::bad_alloc
|
The result - the data is inconsistent:
$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "mysql -e 'select count(*) from test_long_transaction_bug.usersDiscoveredLessons;'"
|
sc-db-main-stage-01 | CHANGED | rc=0 >>
|
count(*)
|
632852
|
sc-db-main-stage-02 | CHANGED | rc=0 >>
|
count(*)
|
632852
|
sc-db-main-stage-03 | CHANGED | rc=0 >>
|
count(*)
|
632852
|
sc-db-main-stage-04 | CHANGED | rc=0 >>
|
count(*)
|
0
|
The cluster continues to operate with inconsistent data without any apparent issues:
$ ansible sc_maxscale -i inventory_stage -u lexusrules -b -m shell -a "maxctrl list servers"
|
sc-maxscale-stage-01 | CHANGED | rc=0 >>
|
┌────────────────────────┬───────────────┬──────┬─────────────┬───────────────────────┬─────────────────┐
|
│ Server │ Address │ Port │ Connections │ State │ GTID │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-db-main-stage-01 │ 172.18.16.194 │ 3306 │ 582 │ Master, Running │ 0-14-336882805 │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-db-main-stage-02 │ 172.18.16.103 │ 3306 │ 383 │ Slave, Running │ 0-14-336882805 │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-db-main-stage-03 │ 172.18.16.197 │ 3306 │ 383 │ Slave, Running │ 0-14-336882805 │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-db-main-stage-04 │ 172.18.16.114 │ 3306 │ 0 │ Maintenance, Running │ 0-14-336882805 │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-maxscale-binlog │ 172.18.16.69 │ 3306 │ 0 │ Binlog Relay, Running │ 0-14-336882805 │
|
└────────────────────────┴───────────────┴──────┴─────────────┴───────────────────────┴─────────────────┘
|
The standard binlog size is 1Gb, but due to large transactions, it can significantly increase (in this case, up to 17Gb). The sizes of new binlogs:
$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "ls -lath /var/lib/mysql/mysqld*"
|
[DEPRECATION WARNING]: [defaults]callback_whitelist option, normalizing names to new standard, use callbacks_enabled instead. This feature will be removed from ansible-core in version 2.15. Deprecation warnings
|
can be disabled by setting deprecation_warnings=False in ansible.cfg.
|
sc-db-main-stage-01 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 9.2M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000403
|
-rw-rw---- 1 mysql mysql 17G Jul 20 07:58 /var/lib/mysql/mysqld-bin.000402
|
-rw-rw---- 1 mysql mysql 40 Jul 20 07:58 /var/lib/mysql/mysqld-bin.index
|
sc-db-main-stage-02 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 9.2M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000401
|
-rw-rw---- 1 mysql mysql 17G Jul 20 08:05 /var/lib/mysql/mysqld-bin.000400
|
-rw-rw---- 1 mysql mysql 40 Jul 20 08:05 /var/lib/mysql/mysqld-bin.index
|
sc-db-main-stage-03 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 9.2M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000006
|
-rw-rw---- 1 mysql mysql 17G Jul 20 08:04 /var/lib/mysql/mysqld-bin.000005
|
-rw-rw---- 1 mysql mysql 40 Jul 20 08:04 /var/lib/mysql/mysqld-bin.index
|
sc-db-main-stage-04 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 348M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000008
|
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index
|
On version 6.4.3, we encountered the following error. Subsequently, the MaxScale binlog router became unresponsive and necessitated a restart.
2023-06-30 13:54:36 error: (srv_main_repl); Error received during replication from '172.21.17.206:3306': Could not write final ROTATE to /var/lib/maxscale/binlogs/mysqld-bin.385707
|
2023-06-30 13:56:03 error: (srv_main_repl); Error received during replication from '172.21.17.206:3306': Could not write final ROTATE to /var/lib/maxscale/binlogs/mysqld-bin.385710
|
|