Details

    Description

      Our mxs_adapter service runs on server cdc1, querying MaxScale on server maxscale1. On restart of the mxs_adapter service, MaxScale takes ~2.5 minutes to read the entire x.x.000001.avro file until it reaches the GTID position specified by mxs_adapter.

      During this startup, requests to MaxScale fail with a socket timeout.

      Once MaxScale locates this position, it seems to perform just fine.

      Our .avro file is now 2.5GB in size after a day and a half of processing data.

      Is there a way to handle data pruning in the avro file? And is there some way to improve CDC startup so that MaxScale requests can still be handled during the process?

      Attachments

        Activity

          DBA666 added a comment -

          Our .avro file is now at 4GB after 2 days of operations. What is the process by which this file can be pruned or removed without the loss of streaming data?

          Am I right in thinking the entire file is loaded into memory during CDC streaming too? Our RAM usage appears to grow in correlation with the file size.

          markus makela added a comment - edited

          It's loaded into memory block by block; each block is around 16KB by default.
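          To illustrate the bounded-memory behavior described above, here is a minimal Python sketch (not MaxScale source code) of reading a large file in fixed-size blocks: peak memory is bounded by the block size, not the file size.

          ```python
          # Illustration only: stream a file in ~16KB blocks, mirroring the
          # default block size mentioned above. Only one block is held in
          # memory at a time, regardless of how large the file grows.
          BLOCK_SIZE = 16 * 1024

          def read_in_blocks(path, block_size=BLOCK_SIZE):
              """Yield successive blocks of at most block_size bytes."""
              with open(path, "rb") as f:
                  while True:
                      block = f.read(block_size)
                      if not block:
                          break
                      yield block
          ```

          With this pattern, a 4GB file costs the same peak memory as a 4MB one; growing RAM usage would point at something retaining the blocks after they are read.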

          The safest way to prune data would be to stop MaxScale, remove the .avro files and then start MaxScale. Upon startup, MaxScale would use the .avsc files to read the table schemas into memory.
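          The stop/remove/start sequence above could be scripted along these lines; a hedged sketch, assuming the avro files live under the avrorouter's data directory (the `avro_dir` path and the `maxscale` systemd unit name are assumptions to adjust for your installation). The `run_cmd` parameter exists only so the destructive steps can be dry-run; verify on a test system before relying on it.

          ```python
          import subprocess
          from pathlib import Path

          def prune_avro_files(avro_dir, run_cmd=subprocess.run):
              """Stop MaxScale, delete the .avro data files (keeping the .avsc
              schema files MaxScale re-reads on startup), then start MaxScale.
              """
              run_cmd(["systemctl", "stop", "maxscale"], check=True)
              removed = []
              for avro_file in Path(avro_dir).glob("*.avro"):
                  avro_file.unlink()  # .avsc schema files are left in place
                  removed.append(avro_file.name)
              run_cmd(["systemctl", "start", "maxscale"], check=True)
              return removed
          ```

          Keeping the `.avsc` files is the important part: per the comment above, MaxScale uses them to rebuild the table schemas in memory on startup.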

          DBA666 added a comment -

          Are previous 16KB blocks dropped from memory once read, or are they retained for the duration of the CDC read session?

          Pruning that way feels dangerous to me. Is there not a way to force a file increment to 000002.avro, so that the previous file can be removed once we're confident CDC has processed the latest transactions?

          markus makela added a comment -

          I'll have to investigate this and see how the startup speed could be improved.


          People

            markus makela
            DBA666
            Votes: 0
            Watchers: 3

