[MXS-2547] Stop MaxScale during Rest-API query cause process hung Created: 2019-06-06 Updated: 2020-01-08 Resolved: 2019-06-19 |
|
| Status: | Closed |
| Project: | MariaDB MaxScale |
| Component/s: | REST-API |
| Affects Version/s: | 2.2.21, 2.3.7 |
| Fix Version/s: | 2.2.22, 2.3.9 |
| Type: | Bug | Priority: | Major |
| Reporter: | lishubing | Assignee: | markus makela |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Description |
|
Stopping MaxScale stops all running workers (including worker#0), and the exit of worker#0 interrupts the processing of any in-flight microhttpd API query. In the resource_handle_request function, the microhttpd thread posts a task to worker#0 and waits on a semaphore. If worker#0 shuts down before it has processed that task, the semaphore is never signalled and the microhttpd thread stays blocked. The shutdown of MaxScale then continues and later calls MHD_stop_daemon, which joins the microhttpd threads and therefore hangs on the blocked thread. You may reproduce the bug in this way:
In most cases, the MaxScale process hangs and stops responding to any request (this is very easy for me to reproduce). Here is a sample stack trace of a blocked microhttpd thread: do_futex_wait.constprop 0x00007ffff7bccafb |
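The blocking pattern described above can be modelled with the standard library alone. This is a minimal sketch, not MaxScale code: `Worker` and `request` are hypothetical stand-ins for worker#0 and the microhttpd thread, and the timeout exists only so that the demonstration returns, whereas the report describes an untimed wait.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a MaxScale worker: runs posted tasks until stopped.
class Worker {
public:
    Worker() : m_thread([this] { run(); }) {}
    ~Worker() { stop(); }

    // Queue a task. As in the reported behaviour, this succeeds even if the
    // worker has already stopped, so the task may never run.
    void post(std::function<void()> task) {
        std::lock_guard<std::mutex> guard(m_lock);
        m_tasks.push_back(std::move(task));
        m_cond.notify_one();
    }

    // Stop the worker without draining tasks that are still queued.
    void stop() {
        {
            std::lock_guard<std::mutex> guard(m_lock);
            m_running = false;
        }
        m_cond.notify_one();
        if (m_thread.joinable()) {
            m_thread.join();
        }
    }

private:
    void run() {
        std::unique_lock<std::mutex> guard(m_lock);
        while (m_running) {
            if (m_tasks.empty()) {
                m_cond.wait(guard);
                continue;
            }
            auto task = std::move(m_tasks.front());
            m_tasks.pop_front();
            guard.unlock();
            task();
            guard.lock();
        }
    }

    std::mutex m_lock;
    std::condition_variable m_cond;
    std::deque<std::function<void()>> m_tasks;
    bool m_running = true;
    std::thread m_thread;  // declared last so the other members exist first
};

// Mimics the REST API thread: post a task, then wait for it to signal completion.
// Returns true if the worker ran the task before the timeout expired.
bool request(Worker& worker, std::chrono::milliseconds timeout) {
    auto done = std::make_shared<std::promise<void>>();
    std::future<void> reply = done->get_future();
    worker.post([done] { done->set_value(); });
    return reply.wait_for(timeout) == std::future_status::ready;
}
```

Calling `request` on a running `Worker` returns true, while calling it after `stop()` returns false once the timeout expires. With an untimed wait, that second call would block forever, and any later join on the calling thread would hang with it.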
| Comments |
| Comment by lishubing [ 2019-06-06 ] | |||||||||||||||||||||||||||
|
My workaround is: before stopping the MaxScale workers, call MHD_quiesce_daemon to stop the API server from listening, then sleep(1) to let the workers finish their tasks. After that, the shutdown proceeds and MaxScale terminates successfully. | |||||||||||||||||||||||||||
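The ordering in this workaround can be sketched as follows. This is an illustrative model under stated assumptions, not the actual patch: `RestApi` and `shutdown_in_order` are hypothetical names, clearing the `accepting` flag plays the role of MHD_quiesce_daemon, and a bounded wait on an in-flight counter is swapped in for the fixed sleep(1).

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical model of the REST API front end; not MaxScale's real classes.
struct RestApi {
    std::atomic<bool> accepting{true};
    std::atomic<int> in_flight{0};

    // Announce the request first, then check the flag, so a request can never
    // slip in unnoticed between the quiesce step and the drain check.
    bool begin_request() {
        ++in_flight;
        if (!accepting.load()) {
            --in_flight;
            return false;
        }
        return true;
    }

    void end_request() { --in_flight; }
};

// Stop accepting requests, wait up to max_wait for in-flight ones to finish,
// and only then stop the workers. Returns true if all requests drained in time.
bool shutdown_in_order(RestApi& api, std::chrono::milliseconds max_wait,
                       const std::function<void()>& stop_workers) {
    api.accepting.store(false);  // stands in for MHD_quiesce_daemon

    const auto deadline = std::chrono::steady_clock::now() + max_wait;
    while (api.in_flight.load() > 0 &&
           std::chrono::steady_clock::now() < deadline) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }

    const bool drained = api.in_flight.load() == 0;
    stop_workers();  // workers stop only after the API has been given time to drain
    return drained;
}
```

Waiting on a counter rather than sleeping for a fixed second lets shutdown finish as soon as the last request completes, and the return value reports the case where a request outlived the deadline.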
| Comment by markus makela [ 2019-06-10 ] | |||||||||||||||||||||||||||
|
I think the REST API must be stopped before the workers are stopped to prevent this from happening. | |||||||||||||||||||||||||||
| Comment by lishubing [ 2019-06-11 ] | |||||||||||||||||||||||||||
|
In my case, an external monitoring service keeps polling MaxScale information through the REST API (at a 3s interval). The external service has no way of knowing when to stop calling the REST API, so calling the REST API while MaxScale is shutting down is a common case. Back to your point that "REST API must be stopped before the workers are stopped": the shutdown procedure currently consists simply of exiting the workers. This means that once a shutdown is triggered, the worker is already stopped, so the REST API cannot be stopped before the shutdown. | |||||||||||||||||||||||||||
| Comment by markus makela [ 2019-06-18 ] | |||||||||||||||||||||||||||
|
Managed to partially reproduce this by adding a debug assertion that catches if a message is posted to a worker that has already stopped. | |||||||||||||||||||||||||||
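A check of this kind might look like the sketch below. It is an assumption about the general shape only: `CheckedWorker`, `try_post` and `post` are hypothetical names, the standard `assert` is used in place of whatever assertion macro MaxScale has, and tasks run inline to keep the example short.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical worker stub; a real worker would queue the task for its own thread.
class CheckedWorker {
public:
    void stop() { m_stopped.store(true); }

    // Reports whether the task was accepted instead of silently dropping it.
    bool try_post(const std::function<void()>& task) {
        if (m_stopped.load()) {
            return false;
        }
        task();
        return true;
    }

    // Debug-build check: aborts if anything posts to a worker that has stopped.
    void post(const std::function<void()>& task) {
        bool accepted = try_post(task);
        assert(accepted && "task posted to a stopped worker");
        (void)accepted;  // avoids an unused-variable warning when NDEBUG is set
    }

private:
    std::atomic<bool> m_stopped{false};
};
```

In a debug build, calling `post` after `stop()` aborts at the assertion, which turns the silent hang into an immediate failure that points at the offending caller.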
| Comment by markus makela [ 2019-06-18 ] | |||||||||||||||||||||||||||
|
Stacktrace:
|