[MXS-2106] Maxscale CDC JSON output does not respect null values Created: 2018-10-18  Updated: 2018-11-08  Resolved: 2018-11-08

Status: Closed
Project: MariaDB MaxScale
Component/s: avrorouter, cdc
Affects Version/s: 2.2.14
Fix Version/s: 2.2.17

Type: Bug Priority: Major
Reporter: Sergiy Kyrychenko Assignee: markus makela
Resolution: Fixed Votes: 0
Labels: None
Environment:

Maxscale 2.2.14 connecting to MariaDB 10.1 for CDC



 Description   

I have a MariaDB 10.1 database and I've installed and configured Maxscale 2.2.14 to try BinlogRouter, Avrorouter, and CDC Protocol.

The problem is with JSON output from CDC Protocol. It seems that null values are not coming out as nulls but rather zeroes and empty strings on insert, and when value is updated to null, after_update event has old value reported instead of null or 0.

Below are sample table and DML statements, relevant CDC Protocol output, maxscale.cnf and binlog/master.ini.

Any help would be appreciated.
-------------------------------------------------------------
Sample table and DMLs:
-------------------------------------------------------------

CREATE TABLE `test1` (
  `test1_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `some_id` int(10) unsigned DEFAULT NULL,
  `desc` varchar(50) DEFAULT NULL,
  `some_date` timestamp NULL DEFAULT NULL,
  `updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`test1_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
 
INSERT INTO test1(some_id,`desc`,some_date)
VALUES
(1,NULL,NULL),
(NULL,'value1',NOW()),
(NULL,NULL,NULL);
 
UPDATE test1
SET some_id = NULL, `desc` = 'value2', some_date = NOW()
WHERE test1_id = 1;
UPDATE test1
SET some_id = 35, `desc` = NULL, some_date = NULL
WHERE test1_id = 2;

-------------------------------------------------------------
CDC Output:
-------------------------------------------------------------

{"namespace": "MaxScaleChangeDataSchema.avro", "type": "record", "name": "ChangeRecord", "fields": [{"name": "domain", "type": "int"}, {"name": "server_id", "type": "int"}, {"name": "sequence", "type": "int"}, {"name": "event_number", "type": "int"}, {"name": "timestamp", "type": "int"}, {"name": "event_type", "type": {"type": "enum", "name": "EVENT_TYPES", "symbols": ["insert", "update_before", "update_after", "delete"]}}, {"name": "test1_id", "type": "int", "real_type": "int", "length": 10}, {"name": "some_id", "type": "int", "real_type": "int", "length": 10}, {"name": "desc", "type": "string", "real_type": "varchar", "length": 50}, {"name": "some_date", "type": "string", "real_type": "timestamp", "length": -1}, {"name": "updated", "type": "string", "real_type": "timestamp", "length": -1}]}
{"domain": 0, "server_id": 11009, "sequence": 4552, "event_number": 1, "timestamp": 1539306025, "event_type": "insert", "test1_id": 1, "some_id": 1, "desc": "", "some_date": "", "updated": "2018-10-11 19:00:25"}
{"domain": 0, "server_id": 11009, "sequence": 4552, "event_number": 2, "timestamp": 1539306025, "event_type": "insert", "test1_id": 2, "some_id": 1, "desc": "value1", "some_date": "2018-10-11 19:00:25", "updated": "2018-10-11 19:00:25"}
{"domain": 0, "server_id": 11009, "sequence": 4552, "event_number": 3, "timestamp": 1539306025, "event_type": "insert", "test1_id": 3, "some_id": 1, "desc": "value1", "some_date": "2018-10-11 19:00:25", "updated": "2018-10-11 19:00:25"}
{"domain": 0, "server_id": 11009, "sequence": 5282, "event_number": 1, "timestamp": 1539306107, "event_type": "update_before", "test1_id": 1, "some_id": 1, "desc": "", "some_date": "", "updated": "2018-10-11 19:00:25"}
{"domain": 0, "server_id": 11009, "sequence": 5282, "event_number": 2, "timestamp": 1539306107, "event_type": "update_after", "test1_id": 1, "some_id": 1, "desc": "value2", "some_date": "2018-10-11 19:01:47", "updated": "2018-10-11 19:01:47"}
{"domain": 0, "server_id": 11009, "sequence": 5552, "event_number": 1, "timestamp": 1539306135, "event_type": "update_before", "test1_id": 2, "some_id": 0, "desc": "value1", "some_date": "2018-10-11 19:00:25", "updated": "2018-10-11 19:00:25"}
{"domain": 0, "server_id": 11009, "sequence": 5552, "event_number": 2, "timestamp": 1539306135, "event_type": "update_after", "test1_id": 2, "some_id": 35, "desc": "value1", "some_date": "2018-10-11 19:00:25", "updated": "2018-10-11 19:02:15"}

-------------------------------------------------------------
masxscale.cnf:
-------------------------------------------------------------

[vagrant@localhost ~]$ cat /etc/maxscale.cnf
# MaxScale documentation:
# https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale-22/
 
# Global parameters
#
# Complete list of configuration options:
# https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale-22-mariadb-maxscale-configuration-usage-scenarios/
 
[maxscale]
threads=auto
 
# Server definitions
#
# Set the address of the server to the network
# address of a MariaDB server.
#
 
[server1]
type=server
address=bpdb.co1.nar.beatportcorp.net
port=3306
protocol=mariadbbackend
 
# Monitor for the servers
#
# This will keep MaxScale aware of the state of the servers.
# MariaDB Monitor documentation:
# https://mariadb.com/kb/en/mariadb-enterprise/mariadb-maxscale-22-mariadb-monitor/
 
[MaxAdmin-Service]
type=service
router=cli
 
[MaxAdmin-Listener]
type=listener
service=MaxAdmin-Service
protocol=maxscaled
socket=default
 
[replication-service]
type=service
router=binlogrouter
router_options=server_id=11009,
               master_id=11009,
               user=testuser,
               password=testpass,
               binlogdir=/var/lib/maxscale/binlog/,
               mariadb10-compatibility=1
user=testuser
passwd=testpass
 
 
[avro-service]
type=service
router=avrorouter
source=replication-service
router_options=avrodir=/var/lib/maxscale/avro/,
               filestem=master-bin,
               start_index=1
user=testuser
passwd=testpass
 
[replication-listener]
type=listener
service=replication-service
protocol=MySQLClient
port=5308
 
[avro-listener]
type=listener
service=avro-service
protocol=CDC
port=4001

-------------------------------------------------------------

[root@localhost binlog]# cat master.ini
[binlog_configuration]
master_host=bpdb.co1.nar.beatportcorp.net
master_port=3306
master_user=testuser
master_password=testpass
filestem=master-bin
master_heartbeat_period=300
master_connect_retry=60



 Comments   
Comment by markus makela [ 2018-10-29 ]

Managed to reproduce this, appears to be caused by either the generated Avro schema and/or the client library.

Comment by Sergiy Kyrychenko [ 2018-10-29 ]

Can this be worked around by supplying custom avro spec to the router instead of generated one? I tried it and could not figure out proper format for avsc file to handle nulls.

Comment by markus makela [ 2018-10-29 ]

I tried a quick fix of turning all values into union types but that didn't seem work with the Avro C client library and caused other sorts of errors. I'll continue investigating this but at this point it looks like there's no workaround.

Comment by markus makela [ 2018-11-01 ]

Turns out that the union value handling in the C API had a different functionality from the normal values and I simply misunderstood how it worked. With a set of changes to the avrorouter and associated libraries, NULL values can be properly handled. The fix to this is currently on the MXS-2.2-markusjm branch and will be a part of the 2.2.17 release of MaxScale. I'll build development packages for verification and testing since we're just publishing the 2.2.16 release and 2.2.17 will take some time to ship.

Comment by markus makela [ 2018-11-01 ]

Here are the packages that can be used for testing: http://max-tst-01.mariadb.com/ci-repository/MXS-2.2-markusjm-nov1/mariadb-maxscale/

These haven't gone through full regression testing so they should not be used in production environments.

Comment by Sergiy Kyrychenko [ 2018-11-01 ]

That is great news. I'll try it in my environment. One question though:
Which directory should I download package from? What is the difference?

[DIR] 7/ 2018-11-01 11:52 -
[DIR] 7Server/ 2018-11-01 11:52 -
[DIR] 7server/ 2018-11-01 11:52 -

Comment by markus makela [ 2018-11-01 ]

They are all the same, those are just symbolic links that some operating systems required.

Comment by Sergiy Kyrychenko [ 2018-11-02 ]

Fix seems to work. Looking forward for 2.2.17 release. Thanks for addressing it in such a short time.

Comment by Sergiy Kyrychenko [ 2018-11-02 ]

Are you planning to update MaxScale/server/modules/protocol/examples/cdc_schema.py and MaxScale/server/modules/protocol/examples/cdc_schema.go so that correct schema files can be generated for existing tables?

Comment by markus makela [ 2018-11-02 ]

Oh yes, those seem to have slipped past me. I'll fix them immediately.

Generated at Thu Feb 08 04:11:43 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.