[MXS-3226] Fatal: MaxScale 2.4.4 received fatal signal 11 (maxscale crash) Created: 2020-10-07  Updated: 2020-10-22  Resolved: 2020-10-21

Status: Closed
Project: MariaDB MaxScale
Component/s: readwritesplit
Affects Version/s: 2.4.4
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: steven lin Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Environment:

CentOS Linux release 7.7.1908 (Core)


Attachments: JPEG File Noname.jpg    

 Description   

Our MaxScale crashes frequently.
The host has a 16-core CPU and 8 GB of RAM.
----------------------------------------------------------------------------------------------------
2020-10-06 20:37:04 alert : Fatal: MaxScale 2.4.4 received fatal signal 11. Commit ID: 231f68b6dc70804b785df356679601ab3eb8e379 System name: Linux Release string: NAME="CentOS Linux"
2020-10-06 20:37:05 alert :
/usr/lib64/maxscale/libreadwritesplit.so(_ZN14RWSplitSession17handle_got_targetEP5GWBUFPN8maxscale9RWBackendEb+0xab): include/maxscale/protocol/mysql.hh:633
/usr/lib64/maxscale/libreadwritesplit.so(_ZN14RWSplitSession17route_single_stmtEP5GWBUF+0x667): server/modules/routing/readwritesplit/rwsplit_route_stmt.cc:375
/usr/lib64/maxscale/libreadwritesplit.so(_ZN14RWSplitSession10routeQueryEP5GWBUF+0x216): server/modules/routing/readwritesplit/rwsplitsession.cc:162
/usr/lib64/maxscale/libreadwritesplit.so(_ZN8maxscale6RouterI7RWSplit14RWSplitSessionE10routeQueryEP10mxs_routerP18mxs_router_sessionP5GWBUF+0x21): include/maxscale/router.hh:452
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(+0x102440): server/core/session.cc:1065
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker4tickEv+0x22f): maxutils/maxbase/include/maxbase/worker.hh:790
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase11WorkerTimer6handleEPNS_6WorkerEj+0x57): maxutils/maxbase/src/worker.cc:256
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker15poll_waiteventsEv+0x196): maxutils/maxbase/src/worker.cc:858
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(_ZN7maxbase6Worker3runEPNS_9SemaphoreE+0x53): maxutils/maxbase/src/worker.cc:559
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(+0x1b4e9f): thread48.o:?
/lib64/libpthread.so.0(+0x7e65): pthread_create.c:?
/lib64/libc.so.6(clone+0x6d): ??:?
---------------------------------------------------------------------------------------------
/etc/maxscale.cnf

[enode1]
type=server
address=10.1.0.138
port=3306
protocol=mariadbbackend
priority=2

[enode2]
type=server
address=10.1.0.139
port=3306
protocol=mariadbbackend
priority=3

[enode3]
type=server
address=10.1.0.130
port=3306
protocol=mariadbbackend
priority=1

[ESTORE_Monitor]
type=monitor
module=galeramon
servers=enode1,enode2,enode3
user=maxscale_monitor
password=xxxxxxxxxxxxxxxxxxxxxxxxxxxx
monitor_interval=3000ms
use_priority=true
#disable_master_failback=true

[ESTORE_Service]
type=service
router=readwritesplit
#slave_selection_criteria=LEAST_CURRENT_OPERATIONS
slave_selection_criteria=LEAST_GLOBAL_CONNECTIONS
servers=enode1,enode2,enode3
user=maxscale_monitor
password=xxxxxxxxxxxxxxxxxxxxxxxxxx
enable_root_user=1
connection_timeout=1800s
max_connections=1500
max_slave_connections=1
log_auth_warnings=true
master_accept_reads=true
master_failure_mode=fail_on_write
master_reconnection=true
max_sescmd_history=50
prune_sescmd_history=true

[ESTORE_Listener]
type=listener
service=ESTORE_Service
protocol=mariadbclient
address=0.0.0.0
port=3308



 Comments   
Comment by steven lin [ 2020-10-07 ]

It seems that MaxScale crashed when we issued large write statements,
for example: INSERT INTO table VALUES (xxxx),(xxxx),(xxxx)

Comment by steven lin [ 2020-10-07 ]

I use Percona XtraDB Cluster 5.7 as the backend database.

Thanks

Comment by markus makela [ 2020-10-07 ]

Does this happen with the latest 2.4 release?

Comment by steven lin [ 2020-10-07 ]

We can't upgrade right now because this is our production database.

Comment by markus makela [ 2020-10-07 ]

It's likely that this is fixed in newer releases. There are a few commits that could relate to this (MXS-2585) so I'd recommend upgrading to a recent release.

Comment by steven lin [ 2020-10-07 ]

I've upgraded to the latest version, 2.4.12.
I'll check how the new version behaves.

Thanks for your recommendation.

Comment by steven lin [ 2020-10-16 ]

May I ask a question?
Why does the MaxScale process use so much memory? It is still increasing,
even though our clients have closed their connections after using the database.

Comment by markus makela [ 2020-10-16 ]

Looks like a memory leak. Is this a production setup or can you run MaxScale under Valgrind to see whether it truly leaks memory?
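
For reference, running MaxScale in the foreground under Valgrind might look like the sketch below. The binary path and systemd unit name are assumptions based on a default RPM install; adjust them to the actual environment.

```shell
# Stop the normally-running instance first (assumes a systemd unit named "maxscale").
sudo systemctl stop maxscale

# Run MaxScale in the foreground (-d / --nodaemon) under Valgrind and
# write the leak report to a log file for later inspection.
sudo valgrind --leak-check=full --track-origins=yes \
    --log-file=/tmp/maxscale-valgrind.log \
    /usr/bin/maxscale -d
```

Note that Valgrind slows MaxScale down considerably, so this is only practical on a test system, not under production load.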

Comment by steven lin [ 2020-10-16 ]

Yes, it's our production system, and the version is 2.4.12.

Comment by markus makela [ 2020-10-16 ]

Could be related to MXS-3068.

Comment by steven lin [ 2020-10-16 ]

I think that with a busy MaxScale it is very easy to reproduce the issue.
The newest version, 2.4.12, also has it,
which suggests every version has the issue.

Comment by markus makela [ 2020-10-19 ]

Have you been able to reproduce this crash with the latest 2.4 release?

Comment by steven lin [ 2020-10-21 ]

I modified max_sescmd_history from 100 to 1000.
Actually, that was just an experiment.

Version 2.4.12 is stable and has not crashed so far,
and the memory use stays at 8 GB.
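
Sketched as a config fragment, the change described above would look like the following; the rest of the [ESTORE_Service] section in /etc/maxscale.cnf is assumed unchanged.

```ini
[ESTORE_Service]
# ... other service settings unchanged ...
# Keep up to 1000 session commands in the history (was 100),
# and let MaxScale prune the history when the limit is reached.
max_sescmd_history=1000
prune_sescmd_history=true
```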

If there is a crash, I will report here.
Thanks

Comment by markus makela [ 2020-10-21 ]

I'll close this as Fixed in 2.4.12 since the latest version doesn't suffer from it. If you see the crash again, let us know and we'll reopen the issue.

Regarding the memory use: if MaxScale is using large amounts of memory with the configuration you provided, there's probably something strange going on. Can you open a separate bug report for that if the memory usage is a lot higher than in the older releases?

Comment by steven lin [ 2020-10-22 ]

I created it here: MXS-3253

Generated at Thu Feb 08 04:19:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.