[MXS-4475] MaxScale 22.08.3 received fatal signal 11 Created: 2023-01-10  Updated: 2023-09-05  Resolved: 2023-09-04

Status: Closed
Project: MariaDB MaxScale
Component/s: kafkacdc
Affects Version/s: 22.08.3
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Jayant Patil Assignee: markus makela
Resolution: Done Votes: 0
Labels: Kafka, MariaDB, Maxscale, cdc, replication
Environment:

CentOS 7.9, 8 CPUs, 32 GB RAM, MariaDB 10.3 with AES encryption enabled, MaxScale 22.08.3
[mariadb]
file_key_management_encryption_algorithm=AES_CTR


Sprint: MXS-SPRINT-175, MXS-SPRINT-176, MXS-SPRINT-177, MXS-SPRINT-190

 Description   

maxscale.cnf configuration

# The MariaDB-to-Kafka CDC service
[Kafka-CDC]
type=service
router=kafkacdc
servers=server1
user=user
password=password
bootstrap_servers=backup-db-server:9092
topic=test-cdc-topic
gtid=0-1-32177538

We are getting this error, and messages are not being sent across to the Kafka server.
— Error

2023-01-10 08:53:42   alert  : MaxScale 22.08.3 received fatal signal 11. Commit ID: 2949f820a7d9de38d7fd51909f66d561627d1eed System name: Linux Release string: NAME="CentOS Linux"
2023-01-10 08:53:42   alert  : Statement currently being classified: none/unknown
2023-01-10 08:53:42   notice : For a more detailed stacktrace, install GDB and add 'debug=gdb-stacktrace' under the [maxscale] section.
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3Rpl22process_row_event_dataERK5TablePhS3_S3_+0x1006): ??:?
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3Rpl16handle_row_eventEP10REP_HEADERPh+0x4be): ??:?
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3Rpl12handle_eventE10REP_HEADERPh+0x67): ??:?
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3cdc10Replicator3Imp17process_one_eventERSt10unique_ptrI20st_mariadb_rpl_eventSt8functionIFvPS3_EEE+0xdd): ??:?
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3cdc10Replicator3Imp14process_eventsEv+0x140): ??:?
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(+0x4096d0): thread48.o:?
  /lib64/libpthread.so.0(+0x7ea5): pthread_create.c:?
  /lib64/libc.so.6(clone+0x6d): ??:?
alert  : Writing core dump.



 Comments   
Comment by markus makela [ 2023-01-10 ]

Do you know at which GTID the crash occurred? If you do, can you find out which binlog event caused it?

Comment by Jayant Patil [ 2023-01-10 ]

It is GTID '0-1-535510'

2023-01-10 11:40:31   notice : Started REST API on [127.0.0.1]:8989
2023-01-10 11:40:31   notice : 'server1' sent version string '10.3.30-MariaDB-log'. Detected type: 'MariaDB', version: 10.3.30.
2023-01-10 11:40:31   notice : Server 'server1' charset: latin1
2023-01-10 11:40:31   notice : Starting a total of 1 services...
2023-01-10 11:40:31   warning: Service 'Kafka-CDC' has no listeners defined.
2023-01-10 11:40:31   notice : Service 'Kafka-CDC' started (1/1)
2023-01-10 11:40:32   notice : Started replicating from [10.142.0.22]:3306 at GTID '0-1-535510'

Comment by Jayant Patil [ 2023-01-13 ]

Hi Markus,

Any resolution for this? We are unable to debug it, and messages are stuck and not being processed to the Kafka server.

Comment by markus makela [ 2023-01-13 ]

Without the binlogs that cause this or a way to reproduce it, there's not much we can do. You can enable info level logging by adding log_info=true under the [maxscale] section. With the info level logging enabled, we'll have more information about where the crash is happening.
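That is, a one-line addition to the global section of maxscale.cnf:

```ini
# /etc/maxscale.cnf
[maxscale]
log_info=true
```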

You can use SHOW BINLOG EVENTS and SHOW BINARY LOGS to inspect the events stored in the database. Alternatively, you can use mysqlbinlog to read and parse the binlog files in a more detailed manner.
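For example (the binlog file name and path here are placeholders; use the file that contains the crashing GTID):

```sh
# In the mysql client, locate the binlog containing the GTID:
#   SHOW BINARY LOGS;
#   SHOW BINLOG EVENTS IN 'mysql-bin.001211' LIMIT 100;

# Then decode the row events in that file; --base64-output=decode-rows
# together with --verbose prints row images as commented pseudo-SQL.
mysqlbinlog --base64-output=decode-rows --verbose \
    /var/lib/mysql/mysql-bin.001211 > decoded.sql
```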

Comment by Jayant Patil [ 2023-01-13 ]

Hi Markus,

Kindly see the log from maxscale.

2023-01-13 11:40:58 notice : Started replicating from [10.142.0.22]:3306 at GTID '0-1-79426003'
2023-01-13 11:40:58 error : Failed to read replicated event: 1236, Could not find GTID state requested by slave in any binlog files. Probably the slave state is too old and required binlog files have been purged.

Comment by markus makela [ 2023-01-13 ]

Yes, that tells us that MaxScale started from position 0-1-79426003. Does it log any other information after you turn on log_info?

Comment by Jayant Patil [ 2023-01-13 ]

Hi Markus,

Following is full log.

MariaDB MaxScale  /var/log/maxscale/maxscale.log  Fri Jan 13 13:16:50 2023
----------------------------------------------------------------------------
2023-01-13 13:16:50   notice : The systemd watchdog is Enabled. Internal timeout = 30s
2023-01-13 13:16:50   notice : The logging of info messages has been enabled.
2023-01-13 13:16:50   notice : Using up to 4.69GiB of memory for query classifier cache
2023-01-13 13:16:50   notice : syslog logging is disabled.
2023-01-13 13:16:50   notice : maxlog logging is enabled.
2023-01-13 13:16:50   notice : Host: 'prod-database-secondary-replica' OS: Linux@3.10.0-1160.80.1.el7.x86_64, #1 SMP Tue Nov 8 15:48:59 UTC 2022, x86_64 with 8 processor cores (8.00 available).
2023-01-13 13:16:50   notice : Total main memory: 31.26GiB (31.26GiB usable).
2023-01-13 13:16:50   notice : MariaDB MaxScale 22.08.3 started (Commit: 2949f820a7d9de38d7fd51909f66d561627d1eed)
2023-01-13 13:16:50   notice : MaxScale is running in process 19817
2023-01-13 13:16:50   notice : Configuration file: /etc/maxscale.cnf
2023-01-13 13:16:50   notice : Log directory: /var/log/maxscale
2023-01-13 13:16:50   notice : Data directory: /var/lib/maxscale
2023-01-13 13:16:50   notice : Module directory: /usr/lib64/maxscale
2023-01-13 13:16:50   notice : Service cache: /var/cache/maxscale
2023-01-13 13:16:50   notice : Working directory: /var/log/maxscale
2023-01-13 13:16:50   notice : Module 'qc_sqlite' loaded from '/usr/lib64/maxscale/libqc_sqlite.so'.
2023-01-13 13:16:50   info   : qc_sqlite loaded.
2023-01-13 13:16:50   notice : Query classification results are cached and reused. Memory used per thread: 600.15MiB
2023-01-13 13:16:50   notice : Password encryption key file '/var/lib/maxscale/.secrets' not found, using configured passwords as plaintext.
2023-01-13 13:16:50   info   : Epoll instance for listening sockets added to worker epoll instance.
2023-01-13 13:16:50   info   : Epoll instance for listening sockets added to worker epoll instance.
2023-01-13 13:16:50   info   : Epoll instance for listening sockets added to worker epoll instance.
2023-01-13 13:16:50   info   : Epoll instance for listening sockets added to worker epoll instance.
2023-01-13 13:16:50   info   : Epoll instance for listening sockets added to worker epoll instance.
2023-01-13 13:16:50   info   : Epoll instance for listening sockets added to worker epoll instance.
2023-01-13 13:16:50   info   : Epoll instance for listening sockets added to worker epoll instance.
2023-01-13 13:16:50   info   : Epoll instance for listening sockets added to worker epoll instance.
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140077005805312.
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140076997412608.
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140076989019904.
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140076980627200.
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140076972234496.
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140076963841792.
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140076955449088.
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140076947056384.
2023-01-13 13:16:50   notice : MaxScale started with 8 worker threads.
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140077238930560.
2023-01-13 13:16:50   info   : No 'auto_tune' parameters specified, no auto tuning will be performed.
2023-01-13 13:16:50   notice : Module 'kafkacdc' loaded from '/usr/lib64/maxscale/libkafkacdc.so'.
2023-01-13 13:16:50   notice : Module 'mariadbmon' loaded from '/usr/lib64/maxscale/libmariadbmon.so'.
2023-01-13 13:16:50   warning: match: Missing slashes (/) around a regular expression is deprecated.
2023-01-13 13:16:50   warning: [kafkacdc] [thrd:app]: Configuration property group.id is a consumer property and will be ignored by this producer instance
2023-01-13 13:16:50   notice : Continuing from GTID '0-1-103040432'
2023-01-13 13:16:50   info   : [qc_sqlite] In-memory sqlite database successfully opened for thread 140076427376384.
2023-01-13 13:16:50   notice : Using HS256 for JWT signatures
2023-01-13 13:16:50   warning: The MaxScale GUI is enabled but encryption for the REST API is not enabled, the GUI will not be enabled. Configure `admin_ssl_key` and `admin_ssl_cert` to enable HTTPS or add `admin_secure_gui=false` to allow use of the GUI without encryption.
2023-01-13 13:16:50   notice : Started REST API on [127.0.0.1]:8989
2023-01-13 13:16:50   notice : 'server1' sent version string '10.3.30-MariaDB-log'. Detected type: 'MariaDB', version: 10.3.30.
2023-01-13 13:16:50   notice : Server 'server1' charset: latin1
2023-01-13 13:16:50   info   : [kafkacdc] Kafka watermarks: High: 0 Low: 0
2023-01-13 13:16:50   notice : Starting a total of 1 services...
2023-01-13 13:16:50   warning: Service 'Kafka-CDC' has no listeners defined.
2023-01-13 13:16:50   notice : Service 'Kafka-CDC' started (1/1)
2023-01-13 13:16:50   notice : Started replicating from [10.142.0.22]:3306 at GTID '0-1-103040432'
2023-01-13 13:16:51   info   : GTID: 0-1-103040433
2023-01-13 13:16:51   info   : Row Event for 'nda_prod.test' at 329057205
2023-01-13 13:16:51   info   : [0] LONG
2023-01-13 13:16:51   info   : [1] BLOB: field: 3 bytes, data: 36 bytes
2023-01-13 13:16:51   info   : [2] TIMESTAMP: 1904-10-17 20:24:03
2023-01-13 13:16:51   info   : Row Event for 'nda_prod.patient' at 329057286
2023-01-13 13:16:51   info   : [0] LONG
2023-01-13 13:16:51   info   : [1] LONG
2023-01-13 13:16:51   info   : [2] NULL
2023-01-13 13:16:51   info   : [3] NULL
2023-01-13 13:16:51   info   : [4] NULL
2023-01-13 13:16:51   info   : [5] NULL
2023-01-13 13:16:51   info   : [6] LONG
2023-01-13 13:16:51   info   : [7] VARCHAR: field: 360 bytes, data: 20 bytes
2023-01-13 13:16:51   info   : [8] NULL
2023-01-13 13:16:51   info   : [9] VARCHAR: field: 15 bytes, data: 0 bytes
2023-01-13 13:16:51   info   : [10] VARCHAR: field: 150 bytes, data: 5 bytes
2023-01-13 13:16:51   info   : [11] VARCHAR: field: 150 bytes, data: 0 bytes
2023-01-13 13:16:51   info   : [12] VARCHAR: field: 150 bytes, data: 6 bytes
2023-01-13 13:16:51   info   : [13] VARCHAR: field: 288 bytes, data: 20 bytes
2023-01-13 13:16:51   info   : [14] NULL
2023-01-13 13:16:51   info   : [15] NULL
2023-01-13 13:16:51   info   : [16] NULL
2023-01-13 13:16:51   info   : [17] VARCHAR: field: 36 bytes, data: 0 bytes
2023-01-13 13:16:51   info   : [18] VARCHAR: field: 36 bytes, data: 12 bytes
2023-01-13 13:16:51   info   : [19] DATE: 1987-11-11
2023-01-13 13:16:51   info   : [20] ENUM: 1 bytes
2023-01-13 13:16:51   info   : [21] VARCHAR: field: 765 bytes, data: 27 bytes
2023-01-13 13:16:51   info   : [22] VARCHAR: field: 765 bytes, data: 0 bytes
2023-01-13 13:16:51   info   : [23] VARCHAR: field: 192 bytes, data: 9 bytes
2023-01-13 13:16:51   info   : [24] VARCHAR: field: 192 bytes, data: 2 bytes
2023-01-13 13:16:51   info   : [25] VARCHAR: field: 30 bytes, data: 5 bytes
2023-01-13 13:16:51   info   : [26] NULL
2023-01-13 13:16:51   info   : [27] TINY
2023-01-13 13:16:51   info   : [28] TINY
2023-01-13 13:16:51   info   : [29] NULL
2023-01-13 13:16:51   info   : [30] DATETIME2: 2023-01-13 13:15:49
2023-01-13 13:16:51   info   : [31] DATETIME2: 2023-01-13 13:15:49
2023-01-13 13:16:51   info   : [32] NULL
2023-01-13 13:16:51   info   : [33] VARCHAR: field: 360 bytes, data: 26 bytes
2023-01-13 13:16:51   info   : [34] NULL
2023-01-13 13:16:51   info   : [35] TINY
2023-01-13 13:16:51   info   : [36] VARCHAR: field: 3 bytes, data: 1 bytes
2023-01-13 13:16:51   info   : [37] LONG
2023-01-13 13:16:51   info   : [38] ENUM: 1 bytes
2023-01-13 13:16:51   info   : [39] NULL
2023-01-13 13:16:51   info   : [40] VARCHAR: field: 30 bytes, data: 6 bytes
2023-01-13 13:16:51   info   : [41] LONG
2023-01-13 13:16:51   info   : [42] LONGLONG
2023-01-13 13:16:51   info   : [43] LONG
2023-01-13 13:16:51   info   : [44] LONG
2023-01-13 13:16:51   info   : [45] NULL
2023-01-13 13:16:51   info   : [46] NULL
2023-01-13 13:16:51   info   : [47] NULL
2023-01-13 13:16:51   info   : [48] NULL
2023-01-13 13:16:51   info   : [49] VARCHAR: field: 765 bytes, data: 6 bytes
2023-01-13 13:16:51   info   : [50] VARCHAR: field: 765 bytes, data: 5 bytes
2023-01-13 13:16:51   info   : [51] NULL
2023-01-13 13:16:51   info   : [52] NULL
2023-01-13 13:16:51   info   : [53] NULL
2023-01-13 13:16:51   info   : [54] NULL
2023-01-13 13:16:51   info   : [55] NULL
2023-01-13 13:16:51   info   : [56] NULL
2023-01-13 13:16:51   info   : [57] NULL
2023-01-13 13:16:51   info   : [58] NULL
2023-01-13 13:16:51   info   : [59] NULL
2023-01-13 13:16:51   info   : [60] NULL
2023-01-13 13:16:51   info   : [61] NULL
2023-01-13 13:16:51   info   : [62] VARCHAR: field: 150 bytes, data: 7 bytes
2023-01-13 13:16:51   info   : [63] NULL
2023-01-13 13:16:51   info   : [64] NULL
2023-01-13 13:16:51   info   : [65] NULL
2023-01-13 13:16:51   info   : [66] ENUM: 1 bytes
2023-01-13 13:16:51   info   : [67] NULL
2023-01-13 13:16:51   info   : [68] VARCHAR: field: 600 bytes, data: 11 bytes
2023-01-13 13:16:51   info   : [69] VARCHAR: field: 600 bytes, data: 11 bytes
2023-01-13 13:16:51   info   : [70] LONG
2023-01-13 13:16:51   info   : [71] NULL
2023-01-13 13:16:51   info   : [72] ENUM: 1 bytes
2023-01-13 13:16:51   info   : [73] ENUM: 1 bytes
2023-01-13 13:16:51   info   : Row Event for 'nda_prod.patient_log' at 329057608
2023-01-13 13:16:51   info   : [0] LONG
2023-01-13 13:16:51   info   : [1] LONG
2023-01-13 13:16:51   info   : [2] NULL
2023-01-13 13:16:51   info   : [3] NULL
2023-01-13 13:16:51   info   : [4] NULL
2023-01-13 13:16:51   info   : [5] NULL
2023-01-13 13:16:51   info   : [6] LONG
2023-01-13 13:16:51   info   : [7] VARCHAR: field: 360 bytes, data: 20 bytes
2023-01-13 13:16:51   info   : [8] NULL
2023-01-13 13:16:51   info   : [9] VARCHAR: field: 15 bytes, data: 0 bytes
2023-01-13 13:16:51   info   : [10] VARCHAR: field: 150 bytes, data: 5 bytes
2023-01-13 13:16:51   info   : [11] VARCHAR: field: 150 bytes, data: 0 bytes
2023-01-13 13:16:51   info   : [12] VARCHAR: field: 150 bytes, data: 6 bytes
2023-01-13 13:16:51   info   : [13] VARCHAR: field: 288 bytes, data: 20 bytes
2023-01-13 13:16:51   info   : [14] NULL
2023-01-13 13:16:51   info   : [15] NULL
2023-01-13 13:16:51   info   : [16] NULL
2023-01-13 13:16:51   info   : [17] VARCHAR: field: 36 bytes, data: 0 bytes
2023-01-13 13:16:51   info   : [18] VARCHAR: field: 36 bytes, data: 12 bytes
2023-01-13 13:16:51   info   : [19] DATE: 1987-11-11
2023-01-13 13:16:51   info   : [20] ENUM: 1 bytes
2023-01-13 13:16:51   info   : [21] VARCHAR: field: 765 bytes, data: 27 bytes
2023-01-13 13:16:51   info   : [22] VARCHAR: field: 765 bytes, data: 0 bytes
2023-01-13 13:16:51   info   : [23] VARCHAR: field: 192 bytes, data: 9 bytes
2023-01-13 13:16:51   info   : [24] VARCHAR: field: 192 bytes, data: 2 bytes
2023-01-13 13:16:51   info   : [25] VARCHAR: field: 30 bytes, data: 5 bytes
2023-01-13 13:16:51   info   : [26] NULL
2023-01-13 13:16:51   info   : [27] TINY
2023-01-13 13:16:51   info   : [28] TINY
2023-01-13 13:16:51   info   : [29] NULL
2023-01-13 13:16:51   info   : [30] DATETIME2: 2023-01-13 13:15:49
2023-01-13 13:16:51   info   : [31] DATETIME2: 2023-01-13 13:15:49
2023-01-13 13:16:51   info   : [32] NULL
2023-01-13 13:16:51   info   : [33] VARCHAR: field: 360 bytes, data: 26 bytes
2023-01-13 13:16:51   info   : [34] NULL
2023-01-13 13:16:51   info   : [35] TINY
2023-01-13 13:16:51   info   : [36] VARCHAR: field: 3 bytes, data: 1 bytes
2023-01-13 13:16:51   info   : [37] LONG
2023-01-13 13:16:51   info   : [38] ENUM: 1 bytes
2023-01-13 13:16:51   info   : [39] NULL
2023-01-13 13:16:51   info   : [40] VARCHAR: field: 30 bytes, data: 6 bytes
2023-01-13 13:16:51   info   : [41] LONG
2023-01-13 13:16:51   info   : [42] LONGLONG
2023-01-13 13:16:51   info   : [43] LONG
2023-01-13 13:16:51   info   : [44] LONG
2023-01-13 13:16:51   info   : [45] NULL
2023-01-13 13:16:51   info   : [46] NULL
2023-01-13 13:16:51   info   : [47] NULL
2023-01-13 13:16:51   info   : [48] NULL
2023-01-13 13:16:51   info   : [49] VARCHAR: field: 765 bytes, data: 6 bytes
2023-01-13 13:16:51   info   : [50] VARCHAR: field: 765 bytes, data: 5 bytes
2023-01-13 13:16:51   info   : [51] NULL
2023-01-13 13:16:51   info   : [52] NULL
2023-01-13 13:16:51   info   : [53] NULL
2023-01-13 13:16:51   info   : [54] NULL
2023-01-13 13:16:51   info   : [55] NULL
2023-01-13 13:16:51   info   : [56] NULL
2023-01-13 13:16:51   info   : [57] NULL
2023-01-13 13:16:51   info   : [58] NULL
2023-01-13 13:16:51   info   : [59] NULL
2023-01-13 13:16:51   info   : [60] NULL
2023-01-13 13:16:51   info   : [61] NULL
2023-01-13 13:16:51   info   : [62] VARCHAR: field: 150 bytes, data: 7 bytes
2023-01-13 13:16:51   info   : [63] NULL
2023-01-13 13:16:51   info   : [64] NULL
2023-01-13 13:16:51   info   : [65] NULL
2023-01-13 13:16:51   info   : [66] ENUM: 1 bytes
2023-01-13 13:16:51   info   : [67] NULL
2023-01-13 13:16:51   info   : [68] VARCHAR: field: 600 bytes, data: 11 bytes
2023-01-13 13:16:51   info   : [69] VARCHAR: field: 600 bytes, data: 11 bytes
2023-01-13 13:16:51   info   : [70] LONG
2023-01-13 13:16:51   info   : [71] NULL
2023-01-13 13:16:51   info   : [72] DATETIME2: 2023-01-13 13:15:49
2023-01-13 13:16:51   info   : [73] ENUM: 1 bytes
2023-01-13 13:16:51   info   : [74] ENUM: 1 bytes
2023-01-13 13:16:51   info   : Row Event for 'nda_prod.patient_linking' at 329057935
2023-01-13 13:16:51   info   : [0] LONG
2023-01-13 13:16:51   info   : [1] LONG
2023-01-13 13:16:51   info   : [2] LONG
2023-01-13 13:16:51   info   : [3] LONG
2023-01-13 13:16:51   info   : [4] LONG
2023-01-13 13:16:51   info   : [5] LONG
2023-01-13 13:16:51   alert  : MaxScale 22.08.3 received fatal signal 11. Commit ID: 2949f820a7d9de38d7fd51909f66d561627d1eed System name: Linux Release string: NAME="CentOS Linux"
2023-01-13 13:16:51   alert  : Statement currently being classified: none/unknown
2023-01-13 13:16:51   notice : For a more detailed stacktrace, install GDB and add 'debug=gdb-stacktrace' under the [maxscale] section.
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3Rpl22process_row_event_dataERK5TablePhS3_S3_+0x1006): ??:?
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3Rpl16handle_row_eventEP10REP_HEADERPh+0x4be): ??:?
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3Rpl12handle_eventE10REP_HEADERPh+0x67): ??:?
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3cdc10Replicator3Imp17process_one_eventERSt10unique_ptrI20st_mariadb_rpl_eventSt8functionIFvPS3_EEE+0xdd): ??:?
  /usr/lib64/maxscale/libreplicator.so.1.0.0(_ZN3cdc10Replicator3Imp14process_eventsEv+0x140): ??:?
  /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(+0x4096d0): thread48.o:?
  /lib64/libpthread.so.0(+0x7ea5): pthread_create.c:?
  /lib64/libc.so.6(clone+0x6d): ??:?
alert  : Writing core dump.

Comment by markus makela [ 2023-01-13 ]

Can you find out which binlog that event is in? If you can, you should be able to see with mysqlbinlog what was being inserted into the table. If you can also get the table schema for the nda_prod.patient_linking table, that would help us reproduce this.

Comment by Jayant Patil [ 2023-01-14 ]

Hi Markus,

We have more than 100 tables, of which we need to process only one table (not nda_prod.patient_linking) from the binlog to Kafka using kafkacdc. Is there any way to skip these problematic events and the old slave state that has now been purged?

Comment by Jayant Patil [ 2023-01-17 ]

Hi Markus,

We got a new issue. Please check this as well; it is related to the same process.

notice : Started replicating from [10.142.0.22]:3306 at GTID '0-1-161157966'
2023-01-17 07:51:10 error : Failed to read replicated event: 4052, A slave with the same server_uuid/server_id as this slave has connected to the master; the first event '.' at 0, the last event read from 'mysql-bin.001211' at 6223070, the last byte read from 'mysql-bin.001211' at 6231130.
2023-01-17 07:51:15 notice : Started replicating from [10.142.0.22]:3306 at GTID '0-1-161157966'
2023-01-17 07:51:15 error : Failed to read replicated event: 4052, A slave with the same server_uuid/server_id as this slave has connected to the master; the first event '.' at 0, the last event read from 'mysql-bin.001211' at 6843690, the last byte read from 'mysql-bin.001211' at 6851750.
2023-01-17 07:51:20 notice : Started replicating from [10.142.0.22]:3306 at GTID '0-1-161157966'
2023-01-17 07:51:20 error : Failed to read replicated event: 4052, A slave with the same server_uuid/server_id as this slave has connected to the master; the first event '.' at 0, the last event read from 'mysql-bin.001211' at 5514762, the last byte read from 'mysql-bin.001211' at 5522794.
2023-01-17 07:51:25 notice : Started replicating from [10.142.0.22]:3306 at GTID '0-1-161157966'
2023-01-17 07:51:25 error : Failed to read replicated event: 4052, A slave with the same server_uuid/server_id as this slave has connected to the master; the first event '.' at 0, the last event read from 'mysql-bin.001211' at 5273802, the last byte read from 'mysql-bin.001211' at 5281834.
2023-01-17 07:51:30 notice : Started replicating from [10.142.0.22]:3306 at GTID '0-1-161157966'
2023-01-17 07:51:30 error : Failed to read replicated event: 4052, A slave with the same server_uuid/server_id as this slave has connected to the master; the first event '.' at 0, the last event read from 'mysql-bin.001211' at 1797285, the last byte read from 'mysql-bin.001211' at 1797416.
2023-01-17 07:51:35 notice : Started replicating from [10.142.0.22]:3306 at GTID '0-1-161157966'
2023-01-17 07:51:35 error : Failed to read replicated event: 4052, A slave with the same server_uuid/server_id as this slave has connected to the master; the first event '.' at 0, the last event read from 'mysql-bin.001211' at 5056938, the last byte read from 'mysql-bin.001211' at 5064970.

Comment by Jayant Patil [ 2023-01-19 ]

Hi Markus,

Kindly provide information for resolving these issues, based on the details provided to you.

Comment by markus makela [ 2023-01-19 ]

We'll either need a way to reproduce this or the actual binlogs that cause these problems. If you can figure out which table and which statement is causing this problem, we can investigate further. Without any information on the actual data that kafkacdc was processing, it's hard to deduce anything about this.

The CREATE TABLE statement for the nda_prod.patient_linking table would be a good starting point.

Right now there's no way to filter out these events. You could try to move past the problematic GTID by removing the current_gtid.txt file in /var/lib/maxscale/ and then using the gtid parameter to define the next event after that.

The error you're getting means there's some other server replicating with the same server_id. You'll need to specify a different one for the kafkacdc service with the server_id parameter.
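A sketch of the two workarounds described above (the GTID value, server_id, and file path are illustrative; the exact location of current_gtid.txt depends on your data directory and service name):

```sh
# Skip past a problematic event: stop MaxScale, drop the stored CDC
# position, and point the service's gtid parameter at the next event.
systemctl stop maxscale
rm /var/lib/maxscale/current_gtid.txt   # illustrative path

# In /etc/maxscale.cnf, under the [Kafka-CDC] service:
#   gtid=0-1-103040433    # first GTID after the problematic one
#   server_id=5432        # unique ID, avoids the 4052 conflict

systemctl start maxscale
```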

Comment by markus makela [ 2023-01-27 ]

We'll need some way to reproduce this to fix it.

Comment by Jayant Patil [ 2023-01-27 ]

Hi Markus,

What kind of binlog data could cause these issues?

1) The size of the message?
2) Table DML, or other queries in the binlog that may not be compatible with MaxScale?
3) Any other similar issues that you have already fixed?

Not sure if you could possibly join a call with us, so we can run tests and share them with you online. Let me know.

Comment by markus makela [ 2023-01-28 ]

jayantworld@gmail.com I think we can start off with the `CREATE TABLE` statement for the table. You can get that with SHOW CREATE TABLE nda_prod.patient_linking. Make sure to remove any confidential details from it before posting it here.

The next step would be to get the binlog event associated with the crash. Since it contains actual data, I'd recommend starting with just the CREATE TABLE statement.
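For example (redact any sensitive column names or comments before posting):

```sql
-- Run in the mysql client on the primary:
SHOW CREATE TABLE nda_prod.patient_linking\G
```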

Comment by Jayant Patil [ 2023-01-31 ]

Hi Markus,

We are now receiving a new error; the size of the message is around 10 MB:
error : [kafkacdc] Broker: Message size too large

Comment by markus makela [ 2023-01-31 ]

jayantworld@gmail.com based on this StackOverflow post, increasing the following values in Kafka should help:

message.max.bytes
replica.fetch.max.bytes
max.request.size
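For reference, the first two are broker-side settings and belong in server.properties on the Kafka host, while max.request.size is a producer-side setting (see the Kafka configuration docs). A sketch with illustrative values sized for ~10 MB messages:

```properties
# server.properties on the Kafka broker
message.max.bytes=20000000
replica.fetch.max.bytes=20000000
```

Per-topic limits can also be raised without a broker restart, e.g. with kafka-configs.sh --alter --entity-type topics --entity-name test-cdc-topic --add-config max.message.bytes=20000000 (topic name here is illustrative; adjust to your setup).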

Comment by markus makela [ 2023-01-31 ]

I created a similar table to the one that you were having problems with but so far I haven't been able to reproduce it. It's possible that the problem relates to one of the earlier tables (e.g. the one right before it) and we might need the whole transaction that causes MaxScale to crash. If you can produce a small set of SQL commands that reproduces the problem, it would help us fix this.

Another option is to install the debug version of the 22.08 packages (you can find them here). With them, if any internal assertions fail, we'll have more information that could potentially help us fix this.

Comment by Jayant Patil [ 2023-01-31 ]

[Kafka-CDC]
type=service
router=kafkacdc
servers=server1
user=monitor_user
password=monitor_pw
bootstrap_servers=10.142.0.10:9092
topic=ocean-cdc-topic
match=h_prd_light_nov22[.](mes_etl|mes)
max.request.size=20000000
replica.fetch.max.bytes=20000000

[root@qa-mariadb10 opendr]# service maxscale restart
Redirecting to /bin/systemctl restart maxscale.service
Job for maxscale.service failed because the control process exited with error code. See "systemctl status maxscale.service" and "journalctl -xe" for details.
[root@qa-mariadb10 opendr]# journalctl -xe
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
– Documentation: http://www.freedesktop.org/wiki/Software/systemd/multiseat

– A new session with the ID 3473 has been created for the user opendr.

– The leading process of the session is 9787.
Jan 31 09:27:10 qa-mariadb10 sshd[9787]: pam_unix(sshd:session): session opened for user opendr by (uid=0)
Jan 31 09:28:12 qa-mariadb10 sudo[9870]: opendr : TTY=pts/0 ; PWD=/home/opendr ; USER=root ; COMMAND=/bin/su
Jan 31 09:28:12 qa-mariadb10 sudo[9870]: pam_unix(sudo:session): session opened for user root by opendr(uid=0)
Jan 31 09:28:12 qa-mariadb10 su[9871]: (to root) opendr on pts/0
Jan 31 09:28:12 qa-mariadb10 su[9871]: pam_unix(su:session): session opened for user root by opendr(uid=0)
Jan 31 09:29:40 qa-mariadb10 collectd[1069]: uc_update: Value too old: name = qa-mariadb10/processes-all/io_octets; value time = 1675157380.313; last cache update = 1675157380.314;
Jan 31 09:30:00 qa-mariadb10 polkitd[477]: Registered Authentication Agent for unix-process:10040:615446765 (system bus name :1.11174 [/usr/bin/pkttyagent --notify-fd 5 --fallback], o
Jan 31 09:30:00 qa-mariadb10 systemd[1]: Stopping MariaDB MaxScale Database Proxy...
– Subject: Unit maxscale.service has begun shutting down
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit maxscale.service has begun shutting down.
Jan 31 09:30:01 qa-mariadb10 systemd[1]: Stopped MariaDB MaxScale Database Proxy.
– Subject: Unit maxscale.service has finished shutting down
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit maxscale.service has finished shutting down.
Jan 31 09:30:01 qa-mariadb10 systemd[1]: Starting MariaDB MaxScale Database Proxy...
– Subject: Unit maxscale.service has begun start-up
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit maxscale.service has begun starting up.
Jan 31 09:30:01 qa-mariadb10 maxscale[10066]: The systemd watchdog is Enabled. Internal timeout = 30s
Jan 31 09:30:01 qa-mariadb10 maxscale[10066]: Using up to 2.33GiB of memory for query classifier cache
Jan 31 09:30:01 qa-mariadb10 systemd[1]: maxscale.service: control process exited, code=exited status=1
Jan 31 09:30:01 qa-mariadb10 systemd[1]: Failed to start MariaDB MaxScale Database Proxy.
– Subject: Unit maxscale.service has failed
– Defined-By: systemd
– Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

– Unit maxscale.service has failed.

– The result is failed.
Jan 31 09:30:01 qa-mariadb10 systemd[1]: Unit maxscale.service entered failed state.
Jan 31 09:30:01 qa-mariadb10 systemd[1]: maxscale.service failed.
Jan 31 09:30:01 qa-mariadb10 polkitd[477]: Unregistered Authentication Agent for unix-process:10040:615446765 (system bus name :1.11174, object path /org/freedesktop/PolicyKit1/Authen
[root@qa-mariadb10 opendr]#

Comment by markus makela [ 2023-01-31 ]

Those are configuration options for the Kafka broker, not MaxScale: https://kafka.apache.org/documentation/#brokerconfigs_message.max.bytes

Comment by Jayant Patil [ 2023-01-31 ]

Hi markus,

We are using the Kafka CDC router of MaxScale, so how do we change that producer config in MaxScale for the message size setting?

Comment by markus makela [ 2023-01-31 ]

You'll have to edit the configuration on the Kafka instance where MaxScale is writing. Depending on how you installed, the location of the property file might differ. I'd recommend following the Kafka documentation on how to configure the broker: https://kafka.apache.org/documentation/#configuration

Comment by Jayant Patil [ 2023-01-31 ]

So, our setup is as follows:

Server1 (Primary) -> Server2 (MaxScale, kafkacdc configured) -> Server3 (Kafka server + consumer)

So you mean that we need to change server.properties on the Server3 Kafka server?

Please clarify.

Comment by markus makela [ 2023-01-31 ]

Yes, that's what you'll need to do.

Comment by Jayant Patil [ 2023-01-31 ]

Hi markus,

It is still the same error, even after changing the properties.

Comment by markus makela [ 2023-02-02 ]

Have you managed to reproduce this outside of your production environment?

Comment by markus makela [ 2023-09-04 ]

It might be that this commit fixed the problem. It was included in a recent release, so you should be able to upgrade to it and test whether it solves the problem.

Comment by markus makela [ 2023-09-04 ]

I'll close this as Done based on the fact that the problem appeared in the same function that the aforementioned commit fixes. If you can reproduce this with the latest release of 22.08, please let us know and we'll reopen this.

Generated at Thu Feb 08 04:28:54 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.