[MXS-4459] Improve match/exclude documentation for avrorouter and kafkacdc
Created: 2022-12-20  Updated: 2023-02-24  Resolved: 2023-02-24

Status: Closed
Project: MariaDB MaxScale
Component/s: avrorouter
Affects Version/s: 22.08.3
Fix Version/s: 2.5.25, 6.4.6, 22.08.5

Type: Bug
Priority: Major
Reporter: Naresh Chandra
Assignee: markus makela
Resolution: Fixed
Votes: 0
Labels: None

Sprint: MXS-SPRINT-177

 Description   

The documentation on the match and exclude parameters for the avrorouter and kafkacdc is not very clear about what they should be used for. The documentation could also explain the common use-case of matching a small set of tables without also matching other tables whose names contain the pattern as a substring.


Original description:

When we match tables with a match pattern, it also replicates tables that match only partially, as shown below.

CASE1: When we use the match pattern below in the avrorouter.
EX: match=test[.]avro_test1

After that, if we create tables such as avro_test1, avro_test2 and avro_test3, only the avro_test1 table is replicated/filtered.

CASE2: When we use the match pattern below in the avrorouter.
EX: match=test[.]avro_test

After that, if we create tables such as avro_test1, avro_test2 and avro_test3, all tables whose names contain avro_test are replicated/filtered. We expected only the avro_test table to be filtered.
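
The difference between CASE1 and CASE2 can be reproduced outside MaxScale. The following is a minimal sketch, not MaxScale code: it assumes the pattern is searched for anywhere in a "database.table" string, and it uses Python's re module as a stand-in for PCRE2 (the two agree for patterns this simple). The helper name matched is hypothetical.

```python
import re

# Assumption: the router looks for the pattern anywhere in "database.table".
# Python's re module stands in for PCRE2 in this sketch.
tables = ["test.avro_test", "test.avro_test1", "test.avro_test2", "test.avro_test3"]


def matched(pattern, names):
    """Return the names in which the pattern is found anywhere."""
    rx = re.compile(pattern)
    return [name for name in names if rx.search(name)]


case1 = matched(r"test[.]avro_test1", tables)  # only one name contains "avro_test1"
case2 = matched(r"test[.]avro_test", tables)   # every name contains "avro_test"

print(case1)
print(case2)
```

Under that assumption the CASE2 pattern selects all four names, because each of them contains test.avro_test as a substring.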

CASE3: When we use the match pattern below in the avrorouter.
EX: match=tds[.](tot_extension|tots)|test[.](avro_test1|avro_test)

With the above match pattern we still get the partially matched tables.
[root@test404 avro]# ls -lrth
total 3.0M
-rw-r--r-- 1 maxscale maxscale 9.4K Dec 20 10:08 tds.tot_extension.000001.avsc
-rw-r--r-- 1 maxscale maxscale 6.6K Dec 20 10:08 tds.tots.000001.avsc
-rw-r--r-- 1 maxscale maxscale 641 Dec 20 10:09 test.avro_test.000001.avsc
-rw-r--r-- 1 maxscale maxscale 641 Dec 20 10:09 test.avro_test.000001.avro
-rw-r--r-- 1 maxscale maxscale 642 Dec 20 10:09 test.avro_test1.000001.avsc
-rw-r--r-- 1 maxscale maxscale 641 Dec 20 10:09 test.avro_test1.000001.avro
-rw-r--r-- 1 maxscale maxscale 642 Dec 20 10:09 test.avro_test2.000001.avsc
-rw-r--r-- 1 maxscale maxscale 679 Dec 20 10:09 test.avro_test2.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test3.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test3.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test4.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test4.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test5.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test5.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test6.000001.avsc
-rw-r--r-- 1 maxscale maxscale 640 Dec 20 10:09 test.avro_test6.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test7.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test7.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test8.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test8.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test9.000001.avsc
-rw-r--r-- 1 maxscale maxscale 607 Dec 20 10:09 test.avro_test9.000001.avro
-rw-r--r-- 1 maxscale maxscale 548 Dec 20 10:10 test.avro_test10.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:10 test.avro_test10.000001.avro
-rw-r--r-- 1 maxscale maxscale 775K Dec 20 10:11 tds.tots.000001.avro
-rw-r--r-- 1 maxscale maxscale 2.1M Dec 20 10:11 tds.tot_extension.000001.avro
-rw-r----- 1 maxscale maxscale 66 Dec 20 10:11 current_gtid.txt
[root@test404 avro]#

CASE4: When we use the match pattern below in the avrorouter.
EX: match=tds[.](tot_extension|tots)|test[.](avro_test1|avro_test)

In this case I created a table with create table avro_tes(id int); but it is not filtered.

[root@test404 avro]# ls -lrht test.avro_tes*
-rw-r--r-- 1 maxscale maxscale 641 Dec 20 10:09 test.avro_test.000001.avsc
-rw-r--r-- 1 maxscale maxscale 641 Dec 20 10:09 test.avro_test.000001.avro
-rw-r--r-- 1 maxscale maxscale 642 Dec 20 10:09 test.avro_test1.000001.avsc
-rw-r--r-- 1 maxscale maxscale 641 Dec 20 10:09 test.avro_test1.000001.avro
-rw-r--r-- 1 maxscale maxscale 642 Dec 20 10:09 test.avro_test2.000001.avsc
-rw-r--r-- 1 maxscale maxscale 679 Dec 20 10:09 test.avro_test2.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test3.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test3.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test4.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test4.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test5.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test5.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test6.000001.avsc
-rw-r--r-- 1 maxscale maxscale 640 Dec 20 10:09 test.avro_test6.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test7.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test7.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test8.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:09 test.avro_test8.000001.avro
-rw-r--r-- 1 maxscale maxscale 547 Dec 20 10:09 test.avro_test9.000001.avsc
-rw-r--r-- 1 maxscale maxscale 607 Dec 20 10:09 test.avro_test9.000001.avro
-rw-r--r-- 1 maxscale maxscale 548 Dec 20 10:10 test.avro_test10.000001.avsc
-rw-r--r-- 1 maxscale maxscale 574 Dec 20 10:10 test.avro_test10.000001.avro
[root@test404 avro]#

I am trying to list the files for the "avro_tes" table but I am unable to find them.

[root@test404 avro]# ls -lrht test.avro_tes.*
ls: cannot access test.avro_tes.*: No such file or directory
[root@test404 avro]#

In the 4th CASE, I don't see the "avro_tes" table in the partially matching list.
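
A likely reason for the CASE4 result is that a regular expression has to find its whole literal in the name: a name shorter than the literal cannot match, while a longer name that contains it can. The sketch below checks this with Python's re module as a stand-in for PCRE2, again assuming the pattern is searched for anywhere in a "database.table" string. The variable names are illustrative only.

```python
import re

# Assumption: substring-search semantics against "database.table".
pattern = re.compile(r"tds[.](tot_extension|tots)|test[.](avro_test1|avro_test)")

# "test.avro_tes" is shorter than the literal "avro_test", so nothing matches.
short_name_matches = bool(pattern.search("test.avro_tes"))

# "test.avro_test10" contains "test.avro_test1", so the pattern is found.
longer_name_matches = bool(pattern.search("test.avro_test10"))

print(short_name_matches, longer_name_matches)
```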

Filtering should apply only when the table/schema matches exactly; anything that matches only partially should be ignored.



 Comments   
Comment by markus makela [ 2022-12-20 ]

The match and exclude parameters are regular expression patterns, PCRE2 to be specific. Some of the behavior you are seeing is expected, as the patterns aren't strict enough. If you want to prevent excessive matching, use ^ at the start of the pattern and $ at the end of the pattern. This will cause it to match only if the string fully matches the pattern.
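
As a quick illustration of the effect of ^ and $, here is a small sketch using Python's re module in place of PCRE2. It assumes the pattern is tested against a "database.table" string; the table names, including mytest.avro_test, are made up for the example.

```python
import re

# Hypothetical table names in "database.table" form.
tables = ["test.avro_test", "test.avro_test1", "test.avro_test10", "mytest.avro_test"]

unanchored = re.compile(r"test[.]avro_test")
anchored = re.compile(r"^test[.]avro_test$")

# Without anchors the pattern may match anywhere inside the name.
loose = [t for t in tables if unanchored.search(t)]

# With ^ and $ the pattern must cover the whole name.
strict = [t for t in tables if anchored.search(t)]

print(loose)
print(strict)
```

In this sketch the unanchored pattern selects all four names, while the anchored one selects only test.avro_test.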

Comment by Naresh Chandra [ 2022-12-21 ]

Markus, thanks for the help, it is working fine. It's not documented anywhere on the Avro router or Kafka router pages.
We can close the ticket for now.

match=tds[.](tot_extension|tots)|^test[.](avro_test|avro_test1)$
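
One detail about a pattern of this shape: alternation (|) binds more loosely than the anchors, so ^ and $ apply only to the test[.] alternative and the tds[.] alternative stays unanchored. The sketch below compares it with a version that wraps all alternatives in one group. It uses Python's re module as a stand-in for PCRE2 and assumes matching against "database.table"; the grouped pattern and the name tds.tots_archive are hypothetical, not taken from the MaxScale documentation.

```python
import re

# Pattern as posted: only the second alternative is anchored.
posted = re.compile(r"tds[.](tot_extension|tots)|^test[.](avro_test|avro_test1)$")

# Hypothetical variant: one outer group, so the anchors cover every alternative.
grouped = re.compile(r"^(tds[.](tot_extension|tots)|test[.](avro_test|avro_test1))$")

# "tds.tots_archive" is a made-up table used to show the difference.
names = ["tds.tots", "tds.tots_archive", "test.avro_test", "test.avro_test2"]

posted_hits = [n for n in names if posted.search(n)]
grouped_hits = [n for n in names if grouped.search(n)]

print(posted_hits)
print(grouped_hits)
```

With these names the posted pattern still picks up tds.tots_archive, while the grouped variant selects only tds.tots and test.avro_test.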

Comment by markus makela [ 2022-12-21 ]

The documentation for both avrorouter and kafkacdc already mentions that they are regular expressions.

Comment by Naresh Chandra [ 2022-12-21 ]

Thanks Markus, maybe I was only looking for examples. Thanks for the help.

Comment by markus makela [ 2022-12-21 ]

No worries, regular expressions are somewhat tricky to get right. I think we could still improve the documentation as the use-case for filtering only one table seems common enough.

Comment by Naresh Chandra [ 2022-12-21 ]

Thanks, Markus, for the doc improvement. As users we don't know how to use these parameters, so could you please document all the possible use cases?

Comment by markus makela [ 2023-02-24 ]

Added some examples to the documentation on how to match one or more tables.
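
For readers who want to generate such a pattern for one or more tables, the following is a rough sketch of one way to do it. It is not taken from the MaxScale documentation: the helper build_match_pattern is hypothetical, it follows the [.] style used earlier in this ticket, and it assumes the value is matched against "database.table".

```python
import re


def build_match_pattern(tables):
    """Build one anchored pattern that matches only the listed names.

    Hypothetical helper: special characters are escaped with re.escape and
    the escaped dot is rewritten as [.] to follow the style in this ticket.
    """
    alternatives = [re.escape(t).replace(r"\.", "[.]") for t in tables]
    return "^(" + "|".join(alternatives) + ")$"


pattern = build_match_pattern(["test.avro_test", "tds.tots"])
print("match=" + pattern)
```

The outer group keeps both anchors attached to every listed table, so names that merely contain one of the listed names are not selected.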
