[MXS-4404] Maxscale: KafkaCDC writes to current_gtid.txt causes high disk utilisation.

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 22.08.0
Fix Version/s: 2.5.24, 6.4.4, 22.08.3
Component/s: kafkacdc
Labels:
- Maxscale
- performance
Environment:

OS: RHEL 8.4
VM: vCenter 7.0.3, build 20150588, ESXi 6.7.
Hardware: Cisco UCS C220 M4 Small Form Factor (SFF) 1RU server.
Database: MariaDB v10.7
Load balancer: Maxscale v22.08
Three database & three load balancer clusters hosted on three VMs.
The database cluster uses Galera replication. The load balancers use the read-write, read-only & kafka-cdc routers, and galeramon.

Description

Hi, KafkaCDC truncates and then rewrites the current_gtid.txt file for each GTID it processes. The file lives in the MaxScale data directory. We've observed that this causes very high disk utilisation (almost 100%) and doubles the normal system IOWait. Disk utilisation was effectively 0% before KafkaCDC was enabled. Data does appear in the Kafka topic that KafkaCDC writes to, but KafkaCDC cannot keep up with the database binary logs: they are purged before it has read all of them. The Kafka topic has only one partition. The Kafka broker is hosted on a three-host cluster. The database has only three tables, two of which KafkaCDC excludes. Note that, to minimise database contention, the transaction binary logs, Galera cache file and database logfile reside on a different virtual disk from the one the database resides on.

Can you provide an option to write the GTID value to memory, instead of/as well as to file?
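
To illustrate the kind of option being asked for, here is a minimal sketch in Python. It is not MaxScale code and says nothing about how KafkaCDC is actually implemented; the class name `GtidCheckpoint`, the `min_interval` parameter and its one-second default are all hypothetical. The idea is to keep the latest GTID in memory on every event and persist it to the file at most once per interval, writing to a temporary file and renaming it so the file is never seen half-written.

```python
import os
import tempfile
import time


class GtidCheckpoint:
    """Hypothetical helper: track the latest GTID in memory, flush to disk rarely."""

    def __init__(self, path, min_interval=1.0, clock=time.monotonic):
        self.path = path
        self.min_interval = min_interval  # assumed minimum seconds between file writes
        self.clock = clock
        self.current = None      # latest GTID seen, held in memory only
        self.persisted = None    # last GTID actually written to the file
        self.last_flush = None   # clock value at the last file write
        self.writes = 0          # number of file writes, for observation

    def update(self, gtid):
        """Record a processed GTID; write the file only if the interval has elapsed."""
        self.current = gtid
        now = self.clock()
        if self.last_flush is None or now - self.last_flush >= self.min_interval:
            self.flush(now)

    def flush(self, now=None):
        """Persist the in-memory GTID, e.g. on a timer or at clean shutdown."""
        if self.current is None or self.current == self.persisted:
            return  # nothing new to write
        directory = os.path.dirname(self.path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as tmp:
            tmp.write(self.current + "\n")
        # Atomic rename instead of truncate-then-write.
        # No fsync here; durability is deliberately left out of this sketch.
        os.replace(tmp_path, self.path)
        self.persisted = self.current
        self.last_flush = self.clock() if now is None else now
        self.writes += 1
```

With something along these lines, a burst of GTIDs arriving inside one interval results in a single file write rather than one per GTID. The trade-off is that after an unclean shutdown the file could lag behind by up to one interval, so a reader that resumes from the file would presumably re-send some events to Kafka.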

tail -f /data10/maxscale/Kafka-CDC/current_gtid.txt
1-1-6459191831
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191834
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191835
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191836
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191839
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191840
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191841
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191842
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191843
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191844
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191845
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191846
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191847
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191848
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
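
For anyone wanting to put numbers on a capture like the one above, the following rough Python sketch counts the truncation notices and extracts the GTID sequence numbers. The function name `summarise_tail` is made up for illustration, and the only assumption about the data is the MariaDB GTID layout of domain-server_id-sequence. The sample is the first few lines of the output above.

```python
import re

# MariaDB GTIDs have the form <domain>-<server_id>-<sequence>.
GTID_RE = re.compile(r"^(\d+)-(\d+)-(\d+)$")

SAMPLE = """\
1-1-6459191831
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191834
tail: /data10/maxscale/Kafka-CDC/current_gtid.txt: file truncated
1-1-6459191835
"""


def summarise_tail(lines):
    """Return (number of truncation notices, list of GTID sequence numbers)."""
    truncations = 0
    sequences = []
    for line in lines:
        line = line.strip()
        if line.endswith("file truncated"):
            truncations += 1
            continue
        match = GTID_RE.match(line)
        if match:
            sequences.append(int(match.group(3)))
    return truncations, sequences


if __name__ == "__main__":
    truncations, sequences = summarise_tail(SAMPLE.splitlines())
    print("truncations:", truncations)
    print("sequences:", sequences)
```

Feeding it a timed capture (for example, a few seconds of `tail -f` output) would give a rough count of file rewrites per second to compare against the disk utilisation graph.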

Database & Load balancer configs attached.
Netdata disk utilisation graph attached.

Thanks.

Attachments

- KafkaCDC-01.PNG, 121 kB, 2022-11-17 05:42
- maxscale.cnf, 2 kB, 2022-11-17 05:39
- my.cnf, 3 kB, 2022-11-17 05:39

People

Assignee: markus makela

Reporter: Presnickety

Votes: 0

Watchers: 2

Dates

Created: 2022-11-17 05:23

Updated: 2022-11-22 05:54

Resolved: 2022-11-22 05:54
