Details

    Description

      Our mxs_adapter service runs on server cdc1, querying MaxScale on server maxscale1. On restart of the mxs_adapter service, MaxScale takes ~2.5 minutes to read the entire x.x.000001.avro file until it reaches the GTID position specified by mxs_adapter.

      During this startup, requests to MaxScale fail with a socket timeout.

      Once MaxScale locates this position, it seems to perform just fine.

      Our .avro file is now 2.5GB in size after a day and a half of processing data.

      Is there a way to handle data pruning in the avro file? And is there some way to improve CDC startup so that MaxScale requests can still be handled during the process?

      Attachments

        Activity

          DBA666 added a comment -

          Our .avro file is now at 4GB after 2 days of operations. What is the process by which this file can be pruned or removed without the loss of streaming data?

          Am I right in thinking the entire file is loaded into memory during CDC streaming too? Our RAM usage appears to grow in correlation with the file size.

          markus makela added a comment - edited

          It's loaded into memory block by block; each block is around 16KB by default.
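          To illustrate the bounded-memory behavior described above, here is a minimal Python sketch (not MaxScale source code) of reading a large file in fixed-size blocks: peak memory is bounded by the block size, not the file size.

          ```python
          # Illustration only: stream a file in ~16KB blocks, mirroring the
          # default block size mentioned above. Only one block is held in
          # memory at a time, regardless of how large the file grows.
          BLOCK_SIZE = 16 * 1024

          def read_in_blocks(path, block_size=BLOCK_SIZE):
              """Yield successive blocks of at most block_size bytes."""
              with open(path, "rb") as f:
                  while True:
                      block = f.read(block_size)
                      if not block:
                          break
                      yield block
          ```

          With this pattern, a 4GB file costs the same peak memory as a 4MB one; growing RAM usage would point at something retaining the blocks after they are read.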

          The safest way to prune data would be to stop MaxScale, remove the .avro files and then start MaxScale. Upon startup, MaxScale would use the .avsc files to read the table schemas into memory.
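          The stop/remove/start sequence above could be scripted along these lines; a hedged sketch, assuming the avro files live under the avrorouter's data directory (the `avro_dir` path and the `maxscale` systemd unit name are assumptions to adjust for your installation). The `run_cmd` parameter exists only so the destructive steps can be dry-run; verify on a test system before relying on it.

          ```python
          import subprocess
          from pathlib import Path

          def prune_avro_files(avro_dir, run_cmd=subprocess.run):
              """Stop MaxScale, delete the .avro data files (keeping the .avsc
              schema files MaxScale re-reads on startup), then start MaxScale.
              """
              run_cmd(["systemctl", "stop", "maxscale"], check=True)
              removed = []
              for avro_file in Path(avro_dir).glob("*.avro"):
                  avro_file.unlink()  # .avsc schema files are left in place
                  removed.append(avro_file.name)
              run_cmd(["systemctl", "start", "maxscale"], check=True)
              return removed
          ```

          Keeping the `.avsc` files is the important part: per the comment above, MaxScale uses them to rebuild the table schemas in memory on startup.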

          DBA666 added a comment -

          Are previous 16KB blocks dropped from memory once read, or are they retained for the duration of the CDC read session?

          Pruning that way feels dangerous to me. Is there not a way to force a file increment to 000002.avro, so that the previous file can be removed once we're confident CDC has processed the latest transactions?

          markus makela added a comment -

          I'll have to investigate this and see how the startup speed could be improved.


          People

            markus makela
            DBA666
            Votes: 0
            Watchers: 3

