Details
-
Bug
-
Status: Closed (View Workflow)
-
Critical
-
Resolution: Fixed
-
6.4.7
-
None
-
OS: CentOS Stream release 8
MariaDB: 10.11.2
MaxScale: 6.4.7
-
MXS-SPRINT-187, MXS-SPRINT-188
Description
Due to the presence of longtext columns in the usersDiscoveredLessons table, a substantial amount of data ends up in the binlog. Since we are using row-based replication, every row is copied into the binlog, exacerbating the data volume. Table creation script:
MariaDB [(none)]> show create table test_long_transaction_bug.usersDiscoveredLessons \G;
|
*************************** 1. row ***************************
|
Table: usersDiscoveredLessons
|
Create Table: CREATE TABLE `usersDiscoveredLessons` (
|
`id` int(11) NOT NULL AUTO_INCREMENT, |
`userId` int(11) unsigned NOT NULL,
|
`groupId` int(11) unsigned DEFAULT NULL,
|
`slideIds` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`slideIds`)),
|
`videoIds` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`videoIds`)),
|
`taskIds` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL CHECK (json_valid(`taskIds`)),
|
`createdAt` timestamp NOT NULL DEFAULT current_timestamp(),
|
`updatedAt` timestamp NOT NULL DEFAULT current_timestamp(),
|
`deletedAt` timestamp NULL DEFAULT NULL,
|
PRIMARY KEY (`id`), |
UNIQUE KEY `userId_groupId_uniq` (`userId`,`groupId`)
|
) ENGINE=InnoDB AUTO_INCREMENT=29297215 DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_unicode_ci
|
1 row in set (0.001 sec) |
Data sample (the array of slideIds values is significantly shortened, it could include up to 500 Ids):
MariaDB [(none)]> select * from test_long_transaction_bug.usersDiscoveredLessons LIMIT 1 \G; |
*************************** 1. row ***************************
|
id: 28566982 |
userId: 102592167
|
groupId: 2
|
slideIds: [114586208,112104238,28115292,...,96955794]
|
videoIds: [112312333,123123444]
|
taskIds: []
|
createdAt: 2023-07-13 00:00:00
|
updatedAt: 2023-07-13 00:00:00
|
deletedAt: NULL
|
1 row in set (0.001 sec) |
The size of the file from which data will be loaded into the table:
[root@sc-db-main-stage-01 ~]# ls -lah /var/lib/mysql/user* |
-rw-r--r-- 1 root root 7.9G Jul 18 18:47 /var/lib/mysql/usersDiscoveredLessons.sql |
The sizes of binlogs:
$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "ls -lath /var/lib/mysql/mysqld*" |
sc-db-main-stage-03 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000005 |
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index |
sc-db-main-stage-01 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000402 |
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index |
sc-db-main-stage-04 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000008 |
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index |
sc-db-main-stage-02 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 331M Jul 20 07:46 /var/lib/mysql/mysqld-bin.000400 |
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index |
I am demonstrating that the table is empty after truncating and before initiating the load:
$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "mysql -e 'select count(*) from test_long_transaction_bug.usersDiscoveredLessons;'" |
sc-db-main-stage-01 | CHANGED | rc=0 >>
|
count(*)
|
0
|
sc-db-main-stage-02 | CHANGED | rc=0 >>
|
count(*)
|
0
|
sc-db-main-stage-03 | CHANGED | rc=0 >>
|
count(*)
|
0
|
sc-db-main-stage-04 | CHANGED | rc=0 >>
|
count(*)
|
0
|
The actual execution of the load:
[root@sc-db-main-stage-01 ~]# mysql test_long_transaction_bug -e "SET autocommit=0; source /var/lib/mysql/usersDiscoveredLessons.sql; COMMIT;" & |
MaxScale binlog router error:
[root@sc-maxscale-binlog-stage-01 ~]# tail -n500 /var/log/maxscale/maxscale.log | grep -v warning |
2023-07-20 05:10:07 error : (1260) [MariaDBProtocol] Client (172.21.128.78) sent an invalid HandShakeResponse.
|
2023-07-20 07:10:04 error : (1293) [MariaDBProtocol] Client (172.21.128.78) sent an invalid HandShakeResponse.
|
2023-07-20 07:59:08 error : (srv_main_repl); Error received during replication from '172.18.16.194:3306': std::bad_alloc |
The result - the data is inconsistent:
$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "mysql -e 'select count(*) from test_long_transaction_bug.usersDiscoveredLessons;'" |
sc-db-main-stage-01 | CHANGED | rc=0 >>
|
count(*)
|
632852
|
sc-db-main-stage-02 | CHANGED | rc=0 >>
|
count(*)
|
632852
|
sc-db-main-stage-03 | CHANGED | rc=0 >>
|
count(*)
|
632852
|
sc-db-main-stage-04 | CHANGED | rc=0 >>
|
count(*)
|
0
|
The cluster continues to operate with inconsistent data without any apparent issues:
$ ansible sc_maxscale -i inventory_stage -u lexusrules -b -m shell -a "maxctrl list servers" |
sc-maxscale-stage-01 | CHANGED | rc=0 >>
|
┌────────────────────────┬───────────────┬──────┬─────────────┬───────────────────────┬─────────────────┐
|
│ Server │ Address │ Port │ Connections │ State │ GTID │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-db-main-stage-01 │ 172.18.16.194 │ 3306 │ 582 │ Master, Running │ 0-14-336882805 │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-db-main-stage-02 │ 172.18.16.103 │ 3306 │ 383 │ Slave, Running │ 0-14-336882805 │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-db-main-stage-03 │ 172.18.16.197 │ 3306 │ 383 │ Slave, Running │ 0-14-336882805 │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-db-main-stage-04 │ 172.18.16.114 │ 3306 │ 0 │ Maintenance, Running │ 0-14-336882805 │
|
├────────────────────────┼───────────────┼──────┼─────────────┼───────────────────────┼─────────────────┤
|
│ sc-maxscale-binlog │ 172.18.16.69 │ 3306 │ 0 │ Binlog Relay, Running │ 0-14-336882805 │
|
└────────────────────────┴───────────────┴──────┴─────────────┴───────────────────────┴─────────────────┘
|
The standard binlog size is 1Gb, but due to large transactions, it can significantly increase (in this case, up to 17Gb). The sizes of new binlogs:
$ ansible sc-db-main-stage-0* -i inventory_stage -u lexusrules -b -m shell -a "ls -lath /var/lib/mysql/mysqld*" |
[DEPRECATION WARNING]: [defaults]callback_whitelist option, normalizing names to new standard, use callbacks_enabled instead. This feature will be removed from ansible-core in version 2.15. Deprecation warnings |
can be disabled by setting deprecation_warnings=False in ansible.cfg. |
sc-db-main-stage-01 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 9.2M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000403 |
-rw-rw---- 1 mysql mysql 17G Jul 20 07:58 /var/lib/mysql/mysqld-bin.000402 |
-rw-rw---- 1 mysql mysql 40 Jul 20 07:58 /var/lib/mysql/mysqld-bin.index |
sc-db-main-stage-02 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 9.2M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000401 |
-rw-rw---- 1 mysql mysql 17G Jul 20 08:05 /var/lib/mysql/mysqld-bin.000400 |
-rw-rw---- 1 mysql mysql 40 Jul 20 08:05 /var/lib/mysql/mysqld-bin.index |
sc-db-main-stage-03 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 9.2M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000006 |
-rw-rw---- 1 mysql mysql 17G Jul 20 08:04 /var/lib/mysql/mysqld-bin.000005 |
-rw-rw---- 1 mysql mysql 40 Jul 20 08:04 /var/lib/mysql/mysqld-bin.index |
sc-db-main-stage-04 | CHANGED | rc=0 >>
|
-rw-rw---- 1 mysql mysql 348M Jul 20 08:11 /var/lib/mysql/mysqld-bin.000008 |
-rw-rw---- 1 mysql mysql 20 Jul 19 13:39 /var/lib/mysql/mysqld-bin.index |
On version 6.4.3, we encountered the following error. Subsequently, the MaxScale binlog router became unresponsive and necessitated a restart.
2023-06-30 13:54:36 error: (srv_main_repl); Error received during replication from '172.21.17.206:3306': Could not write final ROTATE to /var/lib/maxscale/binlogs/mysqld-bin.385707 |
2023-06-30 13:56:03 error: (srv_main_repl); Error received during replication from '172.21.17.206:3306': Could not write final ROTATE to /var/lib/maxscale/binlogs/mysqld-bin.385710 |