Uploaded image for project: 'MariaDB MaxScale'
  1. MariaDB MaxScale
  2. MXS-2567

CDC module can wind up in an infinite loop

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 2.3.4, 2.3.8
    • Fix Version/s: N/A
    • Component/s: cdc
    • Labels:
      None
    • Environment:
      docker image mariadb/maxscale
    • Sprint:
      MXS-SPRINT-91

      Description

      Hello,

      i want to stream a mysql dump to kafka. I use a similar setup as in https://mariadb.com/resources/blog/real-time-data-streaming-to-kafka-with-maxscale-cdc/. It does work but maxscale CPU usage stays high (100% cpu usage) even when no event is streamed .
      The workflow is as follows.

      • start maxscale and mariadb docker containers
      • setup users, right etc ...
      • start maxscale as read slave
      • add a db dump to mariadb
      • start the cdc script
        • this is when the infinite loop happens

      The problem seems to be tied to the dump itself. It doesn't happen for all dumbs.
      It appears that the CDC component gets stuck in an infinite loop.
      Here the maxscale log:

      2019-06-19 09:47:23   notice : (2) avro-service: new connection from [::ffff:172.28.0.4]
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing.
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing.
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing.
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing.
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing.
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing.
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing.
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing.
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing.
      2019-06-19 09:47:23   warning: (8) Events have already been injected to current DCB, discarding existing. (subsequent similar messages suppressed for 10000 milliseconds)
      

      Following the warning let me to the following peace of code that suggest it might be an infinite loop:

      MaxScale/server/core/dcb.cc

       3225 static uint32_t dcb_handler(DCB* dcb, uint32_t events)
       3226 {
       3227     this_thread.current_dcb = dcb;
       3228     uint32_t rv = dcb_process_poll_events(dcb, events);
       3229 
       3230     // When all I/O events have been handled, we will immediately
       3231     // process an added fake event. As the handling of a fake event
       3232     // may lead to the addition of another fake event we loop until
       3233     // there is no fake event or the dcb has been closed.
       3234 
       3235     while ((dcb->n_close == 0) && (dcb->fake_event != 0))
       3236     {
       3237         events = dcb->fake_event;
       3238         dcb->fake_event = 0;
       3239 
       3240         rv |= dcb_process_poll_events(dcb, events);
       3241     }
       3242 
       3243     this_thread.current_dcb = NULL;
       3244 
       3245     return rv;
       3246 }
      

      Due to data privacy reasons I cannot provide the dump though.
      May be the replication diagnostics is helpful:

      	Service:                             replication-service
      	Router:                              binlogrouter
      	State:                               Started
      	Master connection DCB:               0x5613cf39b4b0
      	Master connection state:                     Binlog Dump
      	Binlog directory:                            /var/lib/maxscale
      	Heartbeat period (seconds):                  300
      	Number of master connects:                   1
      	Number of delayed reconnects:                0
      	Number of connect retries:                   0
      	Connect retry interval:                      60
      	Connect retry count limit:                   1000
      	Current binlog file:                         binlog.000003
      	Current binlog position:                     7351830
      	Number of slave servers:                     0
      	No. of binlog events received this session:  36943
      	Total no. of binlog events received:         36943
      	No. of bad CRC received from master:         0
      	Number of binlog events per minute
      	Current        5        10       15       30 Min Avg
      	      0       0.0      0.0      0.0      0.0
      	Number of fake binlog events:                0
      	Number of artificial binlog events:          3
      	Number of binlog events in error:            0
      	Number of binlog rotate events:              2
      	Number of heartbeat events:                  0
      	Number of packets received:                  168
      	Number of residual data packets:             0
      	Average events per packet:                   219.9
      	Last event from master at:                   Wed Jun 19 09:40:15 2019 (16 seconds ago)
      	Last event from master:                      0x2, Query Event
      	Last binlog event timestamp:                 1560937215 (Wed Jun 19 09:40:15 2019)
      	Events received:
      		Invalid                                  0
      		Start Event V3                           0
      		Query Event                              7392
      		Stop Event                               2
      		Rotate Event                             3
      		Integer Session Variable                 0
      		Load Event                               0
      		Slave Event                              0
      		Create File Event                        0
      		Append Block Event                       0
      		Exec Load Event                          0
      		Delete File Event                        0
      		New Load Event                           0
      		Rand Event                               0
      		User Variable Event                      0
      		Format Description Event                 3
      		Transaction ID Event (2 Phase Commit)    38
      		Begin Load Query Event                   0
      		Execute Load Query Event                 0
      		Table Map Event                          7221
      		Write Rows Event (v0)                    0
      		Update Rows Event (v0)                   0
      		Delete Rows Event (v0)                   0
      		Write Rows Event (v1)                    7627
      		Update Rows Event (v1)                   0
      		Delete Rows Event (v1)                   0
      		Incident Event                           0
      		Heartbeat Event                          0
      		Ignorable Event                          0
      		Rows Query Event                         0
      		Write Rows Event (v2)                    0
      		Update Rows Event (v2)                   0
      		Delete Rows Event (v2)                   0
      		GTID Event                               0
      		Anonymous GTID Event                     0
      		Previous GTIDS Event                     0
      		MariaDB 10 Annotate Rows Event                      7221
      		MariaDB 10 Binlog Checkpoint Event                  3
      		MariaDB 10 GTID Event                               7430
      		MariaDB 10 GTID List Event                          3
      		MariaDB 10 Start Encryption Event                   0
      	Started:                             Wed Jun 19 09:40:07 2019
      	Root user access:                    Disabled
      	Backend databases:
      		[mariadb_test]:3306    Protocol: mariadbbackend    Name: binlog_router_master_host
      	Total connections:                   3
      	Currently connected:                 2
      
      

        Attachments

          Activity

            People

            Assignee:
            markus makela markus makela
            Reporter:
            jseiler Jonas Seiler
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

              Dates

              Created:
              Updated:
              Resolved:

                Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.