[MXS-19] bugzillaId-718: maxscale stops and kill -9 is necessary Created: 2015-01-04  Updated: 2015-08-19  Resolved: 2015-03-19

Status: Closed
Project: MariaDB MaxScale
Component/s: Core
Affects Version/s: 1.0.4
Fix Version/s: 1.0.5

Type: Bug Priority: Minor
Reporter: lisu87 Assignee: Timofey Turenko
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment: Linux



 Description   

This is an import of http://bugs.mariadb.com/show_bug.cgi?id=718

Yesterday we upgraded from 1.0.1beta to the 1.0.4 stable version (installed from the MaxScale repository for Ubuntu 14.04).

During the night we observed two crashes with nothing logged to skygw_err1.log.

During the crashes MaxScale simply stopped responding to requests. We were unable to restart it, and the only way to get it working again was to kill the process with signal 9 (kill -9).

After the crashes we decided to downgrade to 1.0.1beta, and it has been working fine for the last few hours.



 Comments   
Comment by Dipti Joshi (Inactive) [ 2015-03-09 ]

This is the comment history from Bugzilla: http://bugs.mariadb.com/show_bug.cgi?id=718

Comment 1 Mark Riddoch 2015-02-13 10:00:48 UTC
Please could you let us have some more information regarding this. Can we have your configuration file to look at?

Also, is it possible to let us have your other log files, to see if there are any more clues there that could help us trace the issue?

Could you give us some indication of the characteristics of the traffic you are passing through MaxScale?

Thanks
Mark

Comment 2 lisu87 2015-02-16 09:38:47 UTC
Created attachment 185 [details]
Current maxscale config file

Comment 3 lisu87 2015-02-16 09:43:53 UTC
Current config file attached. It is here http://bugs.mariadb.com/attachment.cgi?id=185

Regarding other log files, I had no debug or trace logs enabled so they were empty.

Currently we're using MaxScale for read/write splitting of ordinary MySQL queries.

I'm also going to try the current configuration with 1.0.5-ga and the debug log enabled, and will give you more information soon.

Comment 4 lisu87 2015-02-17 09:09:06 UTC
It happened again during the night at 01:10 AM with 1.0.5-ga. This time I got the following error in skygw_err1.log:

2015-02-17 01:10:02 Fatal: MaxScale received fatal signal 11. Attempting backtrace.
2015-02-17 01:10:02 /usr/local/skysql/maxscale/bin/maxscale() [0x547435]
2015-02-17 01:10:02 /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f94e8b86340]
2015-02-17 01:10:02 [0x29905a0]

And syslog:

Feb 17 01:10:02 psy-mysqlproxy-2 MaxScale[9742]: Fatal: MaxScale received fatal signal 11. Attempting backtrace.
Feb 17 01:10:02 psy-mysqlproxy-2 MaxScale[9742]: /usr/local/skysql/maxscale/bin/maxscale() [0x547435]
Feb 17 01:10:02 psy-mysqlproxy-2 MaxScale[9742]: /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f94e8b86340]
Feb 17 01:10:02 psy-mysqlproxy-2 MaxScale[9742]: [0x29905a0]
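
The backtrace lines in both logs end with a frame address in square brackets, optionally preceded by a module path. A minimal sketch, assuming only that line shape, of extracting the frames in Python so the two logs can be compared or the addresses passed on to a symbolizer. The names FRAME_RE and parse_frames are hypothetical and are not part of MaxScale:

```python
import re

# Frame at the end of a log line: an optional "/path/to/module(symbol+offset) "
# part followed by "[0xADDRESS]". The timestamp or syslog prefix is ignored.
FRAME_RE = re.compile(
    r"(?:(?P<module>/\S+?)\((?P<symbol>[^)]*)\)\s+)?"
    r"\[(?P<addr>0x[0-9a-fA-F]+)\]\s*$"
)

def parse_frames(lines):
    """Return one dict per backtrace frame; non-frame lines are skipped."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match is None:
            continue  # e.g. the "Fatal: MaxScale received fatal signal" line
        frames.append({
            "module": match.group("module"),          # None for a bare address
            "symbol": match.group("symbol") or None,  # None when "()" is empty
            "addr": int(match.group("addr"), 16),
        })
    return frames

# Lines copied from the skygw_err1.log excerpt in this comment.
ERR_LOG = [
    "2015-02-17 01:10:02 Fatal: MaxScale received fatal signal 11. Attempting backtrace.",
    "2015-02-17 01:10:02 /usr/local/skysql/maxscale/bin/maxscale() [0x547435]",
    "2015-02-17 01:10:02 /lib/x86_64-linux-gnu/libpthread.so.0(+0x10340) [0x7f94e8b86340]",
    "2015-02-17 01:10:02 [0x29905a0]",
]

frames = parse_frames(ERR_LOG)
for frame in frames:
    print(frame["module"], frame["symbol"], hex(frame["addr"]))
```

Only the first frame points into the maxscale binary itself, so that is the address a tool such as addr2line could resolve, provided a matching binary with symbols is available.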

Another strange thing: when I ran 1.0.5-ga with my config I enabled the debug log in maxadmin, but despite this skygw_debug1.log contains only 2 hours of logs.

Comment 5 Timofey Turenko 2015-02-17 12:50:18 UTC
(In reply to comment #2)
> Created attachment 185 [details]
> Current maxscale config file

It is probably unrelated to the problem, but this configuration does not have a monitor, does it?

Comment 6 lisu87 2015-02-17 13:01:54 UTC
That's right, there is no monitor. We had some problems using it with older versions of MySQL (5.1), so we disabled it, and now all the master/slave flags are set manually via maxadmin.

As you said, it's unrelated, because this config works with MaxScale 1.0.1beta.

Comment 7 Timofey Turenko 2015-02-17 13:10:05 UTC
I'm trying to reproduce it.

The following information would be useful to me:

  • Which version of MySQL/MariaDB do you use for the backend servers?
  • Is the backend a Master/Slave setup, or Galera?
  • What kind of load do you have?

Comment 8 lisu87 2015-02-17 15:53:26 UTC
Versions:

carlsberg: mysql 5.0.96
psy-carslave-1: mariadb 5.5.40
psy-carslave-2: mariadb 5.5.40

billing: mysql 5.0.96
psy-bilslave-1: mariadb 5.5.40
psy-bilslave-2: mariadb 5.5.40

fatfrog: mysql 5.0.95
psy-fatslave-1: mariadb 5.5.40

We are running a master/slave setup for each of the server sets above.

Average loads:

carlsberg (2 cores): 1.1, 1.09, 1.06 (1min, 5min, 15min)
psy-carslave-1 (4 cores): 0.05, 0.05, 0.07 (1min, 5min, 15min)
psy-carslave-2 (4 cores): 0.14, 0.11, 0.11 (1min, 5min, 15min)

billing (2 cores): 0.1, 0.12, 0.13 (1min, 5min, 15min)
psy-bilslave-1 (4 cores): 0.10, 0.12, 0.13 (1min, 5min, 15min)
psy-bilslave-2 (4 cores): 0.19, 0.15, 0.14 (1min, 5min, 15min)

fatfrog (2 cores): 0.44, 0.41, 0.38 (1min, 5min, 15min)
psy-fatslave-1 (2 cores): 0.01, 0.02, 0.05 (1min, 5min, 15min)

That's basically it. If you need, I can also provide some graphs of CPU load and MySQL queries being performed on the above servers.
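
Since the masters have 2 cores and most slaves have 4, the load averages above are easier to compare once divided by core count. A minimal sketch in Python, assuming the 1-minute figures listed in this comment; the per-core normalisation and the names LOADS and per_core_load are illustrative only and were not part of the original report:

```python
# host -> (cores, 1-minute load average), restated from the list above
LOADS = {
    "carlsberg": (2, 1.1),
    "psy-carslave-1": (4, 0.05),
    "psy-carslave-2": (4, 0.14),
    "billing": (2, 0.1),
    "psy-bilslave-1": (4, 0.10),
    "psy-bilslave-2": (4, 0.19),
    "fatfrog": (2, 0.44),
    "psy-fatslave-1": (2, 0.01),
}

def per_core_load(loads):
    """Divide each host's load average by its number of cores."""
    return {host: load / cores for host, (cores, load) in loads.items()}

normalised = per_core_load(LOADS)
busiest = max(normalised, key=normalised.get)

# Print hosts from most to least loaded per core.
for host, value in sorted(normalised.items(), key=lambda item: -item[1]):
    print(f"{host}: {value:.3f}")
```

On these figures the carlsberg master carries the highest per-core load, while every slave stays well below 0.1 per core.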

Comment 9 Timofey Turenko 2015-02-17 19:38:49 UTC
We managed to reproduce one crash scenario. Continuing testing.

Comment by lisu87 [ 2015-03-11 ]

Any update on this?

Comment by Timofey Turenko [ 2015-03-18 ]

One crash was connected to a stopped monitor (at some point MaxScale tried to stop the monitor again and crashed). After the monitor fix I tried to reproduce the crash but was not able to. I will try running a long-lasting test.

Comment by Timofey Turenko [ 2015-03-19 ]

Tried to reproduce with 1.0.6; not reproducible so far.

Generated at Thu Feb 08 03:56:10 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.