|
Please let me know if you need anything else from me.
I plan to use the automatic failover feature of 2.2.2 in production once testing is complete.
|
|
I am using Ubuntu 16.04.
|
|
Managed to reproduce the failure with 2.2.2.
|
|
The bug was caused by a regression in the TLS/SSL handling code: the TLS handshake was sent twice.
|
|
We've built packages to verify that the fix works. You can find the current development packages here: http://max-tst-01.mariadb.com/ci-repository/2.2-markusjm-mar5/mariadb-maxscale/
If possible, please test these packages to see whether the SSL error is fixed. The packages were built from commit 4f6b9d2bc3514e96da13d64650a6491eb0144186.
|
|
The issue has been fixed; I can connect to the DB via MaxScale.
I am seeing another issue now: when I bring the old master back up after a failover to the new master, MaxScale crashes and will not come up even after a restart.
The error I see is:
2018-03-05 18:51:52 error : [mariadbmon] debug assert at /home/ubuntu/MaxScale/server/modules/monitor/mariadbmon/mariadbmon.cc:209 failed: found
2018-03-05 18:51:52 alert : Fatal: MaxScale 2.2.3 received fatal signal 6. Attempting backtrace.
2018-03-05 18:51:52 alert : Commit ID: 4f6b9d2bc3514e96da13d64650a6491eb0144186 System name: Linux Release string: Ubuntu 16.04.3 LTS
2018-03-05 18:51:52 alert : /usr/bin/maxscale() [0x405268]: ??:0
2018-03-05 18:51:52 alert : /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390) [0x7f0c16c73390]: ??:?
2018-03-05 18:51:52 alert : /lib/x86_64-linux-gnu/libpthread.so.0(raise+0x29) [0x7f0c16c73269]: ??:?
2018-03-05 18:51:53 alert : /usr/lib/x86_64-linux-gnu/maxscale/libmariadbmon.so(_ZN4GtidC2EPKcl+0x243) [0x7f0c11fe4d87]: /home/ubuntu/MaxScale/server/modules/monitor/mariadbmon/mariadbmon.cc:211
2018-03-05 18:51:53 alert : /usr/lib/x86_64-linux-gnu/maxscale/libmariadbmon.so(+0x19ade) [0x7f0c11fe1ade]: /home/ubuntu/MaxScale/server/modules/monitor/mariadbmon/mariadbmon.cc:4011 (discriminator 4)
2018-03-05 18:51:53 alert : /usr/lib/x86_64-linux-gnu/maxscale/libmariadbmon.so(+0x1c10f) [0x7f0c11fe410f]: /home/ubuntu/MaxScale/server/modules/monitor/mariadbmon/mariadbmon.cc:4579
2018-03-05 18:51:53 alert : /usr/lib/x86_64-linux-gnu/maxscale/libmariadbmon.so(+0x1c4ad) [0x7f0c11fe44ad]: /home/ubuntu/MaxScale/server/modules/monitor/mariadbmon/mariadbmon.cc:4678 (discriminator 1)
2018-03-05 18:51:53 alert : /usr/lib/x86_64-linux-gnu/maxscale/libmariadbmon.so(+0x152cf) [0x7f0c11fdd2cf]: /home/ubuntu/MaxScale/server/modules/monitor/mariadbmon/mariadbmon.cc:2571
2018-03-05 18:51:53 alert : /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f0c16c696ba]: ??:?
2018-03-05 18:51:53 alert : /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f0c155ea3dd]: ??:0
MariaDB MaxScale /var/log/maxscale/maxscale.log Mon Mar 5 18:51:53 2018
----------------------------------------------------------------------------
2018-03-05 18:51:53 error : No data read from child process pipe.
2018-03-05 18:51:53 MariaDB MaxScale is shut down.
----------------------------------------------------
|
|
I am not sure whether the above issue is related to the fix you made. Can you please help?
|
|
When I shut down the DB server that was the original master, I can start MaxScale again:
root@node2:~# service maxscale restart
root@node2:~# maxctrl list servers
┌────────┬─────────────┬──────┬─────────────┬─────────────────┬─────────────────────┐
│ Server │ Address     │ Port │ Connections │ State           │ GTID                │
├────────┼─────────────┼──────┼─────────────┼─────────────────┼─────────────────────┤
│ node4  │ 10.10.10.13 │ 3306 │ 0           │ Down            │                     │
├────────┼─────────────┼──────┼─────────────┼─────────────────┼─────────────────────┤
│ node5  │ 10.10.10.14 │ 3306 │ 0           │ Master, Running │ 10101014-10101014-2 │
├────────┼─────────────┼──────┼─────────────┼─────────────────┼─────────────────────┤
│ node6  │ 10.10.10.15 │ 3306 │ 0           │ Slave, Running  │ 10101014-10101014-2 │
└────────┴─────────────┴──────┴─────────────┴─────────────────┴─────────────────────┘
|
|
This was a debug version of the packages, which appears to have triggered a debug assertion. The source code shows that the assertion is in the GTID tracking mechanism, which assumes that the slaves use the same GTID as the master. Can you show the output of the following query?
SHOW VARIABLES LIKE '%gtid%';
|
Meanwhile, I'll remove the assertion and start a fresh build.
|
|
From node4 (Old master)
+------------------------+----------------------+
| Variable_name          | Value                |
+------------------------+----------------------+
| gtid_binlog_pos        | 10101013-10101013-20 |
| gtid_binlog_state      | 10101013-10101013-20 |
| gtid_current_pos       | 10101013-10101013-20 |
| gtid_domain_id         | 10101013             |
| gtid_ignore_duplicates | ON                   |
| gtid_seq_no            | 0                    |
| gtid_slave_pos         |                      |
| gtid_strict_mode       | ON                   |
| last_gtid              |                      |
| wsrep_gtid_domain_id   | 0                    |
| wsrep_gtid_mode        | OFF                  |
+------------------------+----------------------+
|
Node6 (Slave)
MariaDB [(none)]> SHOW VARIABLES LIKE '%gtid%';
+------------------------+------------------------------------------+
| Variable_name          | Value                                    |
+------------------------+------------------------------------------+
| gtid_binlog_pos        | 10101013-10101013-20,10101014-10101014-2 |
| gtid_binlog_state      | 10101013-10101013-20,10101014-10101014-2 |
| gtid_current_pos       | 10101013-10101013-20,10101014-10101014-2 |
| gtid_domain_id         | 10101015                                 |
| gtid_ignore_duplicates | ON                                       |
| gtid_seq_no            | 0                                        |
| gtid_slave_pos         | 10101013-10101013-20,10101014-10101014-2 |
| gtid_strict_mode       | ON                                       |
| last_gtid              |                                          |
| wsrep_gtid_domain_id   | 0                                        |
| wsrep_gtid_mode        | OFF                                      |
+------------------------+------------------------------------------+
|
Node5 (New Master / Earlier slave)
vagrant@node5:~$ sudo mysql -e "SHOW VARIABLES LIKE '%gtid%';"
+------------------------+------------------------------------------+
| Variable_name          | Value                                    |
+------------------------+------------------------------------------+
| gtid_binlog_pos        | 10101013-10101013-20,10101014-10101014-2 |
| gtid_binlog_state      | 10101013-10101013-20,10101014-10101014-2 |
| gtid_current_pos       | 10101013-10101013-20,10101014-10101014-2 |
| gtid_domain_id         | 10101014                                 |
| gtid_ignore_duplicates | ON                                       |
| gtid_seq_no            | 0                                        |
| gtid_slave_pos         | 10101013-10101013-20                     |
| gtid_strict_mode       | ON                                       |
| last_gtid              |                                          |
| wsrep_gtid_domain_id   | 0                                        |
| wsrep_gtid_mode        | OFF                                      |
+------------------------+------------------------------------------+
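
To make the difference between the three outputs easier to see, here is a minimal Python sketch. It is not MaxScale code: `parse_gtid_list` and `find_domain` are hypothetical helper names. The idea that the monitor searches a server's GTID position for one particular domain is an assumption inferred from the backtrace symbol `_ZN4GtidC2EPKcl`, which demangles to `Gtid::Gtid(char const*, long)`. The only facts it relies on are the variable values posted above and MariaDB's `domain-server_id-sequence` GTID format.

```python
from typing import Dict, Optional, Tuple

# Values copied from the SHOW VARIABLES output posted in this thread.
GTID_CURRENT_POS = {
    "node4": "10101013-10101013-20",
    "node5": "10101013-10101013-20,10101014-10101014-2",
    "node6": "10101013-10101013-20,10101014-10101014-2",
}
GTID_DOMAIN_ID = {"node4": 10101013, "node5": 10101014, "node6": 10101015}


def parse_gtid_list(value: str) -> Dict[int, Tuple[int, int]]:
    """Parse 'domain-server_id-sequence[,...]' into {domain: (server_id, sequence)}."""
    result: Dict[int, Tuple[int, int]] = {}
    for triplet in value.split(","):
        triplet = triplet.strip()
        if not triplet:
            continue  # an empty variable value yields an empty mapping
        domain, server_id, sequence = (int(part) for part in triplet.split("-"))
        result[domain] = (server_id, sequence)
    return result


def find_domain(value: str, domain: int) -> Optional[Tuple[int, int]]:
    """Return (server_id, sequence) for `domain`, or None when it is absent."""
    return parse_gtid_list(value).get(domain)


if __name__ == "__main__":
    # Hypothetical check: look for the new master's domain on every node.
    master_domain = GTID_DOMAIN_ID["node5"]
    for node, position in GTID_CURRENT_POS.items():
        print(node, find_domain(position, master_domain))
```

With the values above, node4 has no entry for domain 10101014, so the lookup returns `None` for the old master, while node5 and node6 both return a position. Note also that each server has its own `gtid_domain_id`.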
|
|
|
I would expect MariaDB to sync the new slave (old master) from the new master, or just show it as Down, but not crash MaxScale.
|
|
The custom build you tried was a debug build, so a release build would not have crashed. I have built new packages from commit cfa7a02a0818a0f001912169149ddb7fa640a391, which fixes the debug assertion: http://max-tst-01.mariadb.com/ci-repository/2.2-markusjm-mar5-2/mariadb-maxscale/
As these are development packages, debug assertions are enabled to help us catch bugs when testing MaxScale. For verification purposes, please try these packages and see if they fix the problems.
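
As a rough illustration of why the same condition is fatal in a debug build but harmless in a release build, here is a small Python sketch. It only mimics the general pattern of a debug-only assertion and is not how MaxScale implements it; `DebugAssertionError`, `make_checker` and `lookup` are hypothetical names, and the exception merely stands in for the abort (signal 6) seen in the log.

```python
from typing import Callable, Dict, Optional


class DebugAssertionError(Exception):
    """Stands in for the process abort that a failed debug assertion causes."""


def make_checker(debug_build: bool) -> Callable[[bool, str], None]:
    """Return a check function that only enforces conditions in debug builds."""
    def check(condition: bool, expression: str) -> None:
        if debug_build and not condition:
            raise DebugAssertionError(f"debug assert failed: {expression}")
    return check


def lookup(domains: Dict[int, int], domain: int,
           check: Callable[[bool, str], None]) -> Optional[int]:
    """Look up the sequence number for `domain` (illustrative only)."""
    found = domain in domains
    check(found, "found")       # fatal in a debug build, a no-op otherwise
    return domains.get(domain)  # a release build carries on with "not found"
```

Calling `lookup({10101013: 20}, 10101014, make_checker(False))` returns `None`, whereas the same call with `make_checker(True)` raises `DebugAssertionError`.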
|
|
Thanks for taking care of this bug so quickly. The new build does not crash MaxScale.
Will this fix be part of 2.2.3 or 2.3?
|
|
The fix will be a part of 2.2.3.
|