Details
Type: Bug
Status: Closed
Priority: Major
Resolution: Cannot Reproduce
Description
I'm not sure how to recreate this. It sometimes happens when a client disconnects while the backend server is still processing a request. With 4 threads configured, MaxScale uses 300% CPU (I'm guessing 3 threads are stuck in infinite epoll loops and 1 is still working with the backend server).
It's sometimes impossible to get debug information when this happens, but I managed to get 'show epoll':
MaxScale> show epoll

Poll Statistics.

No. of epoll cycles: 109414647
No. of epoll cycles with wait: 64
No. of epoll calls returning events: 32
No. of non-blocking calls returning events: 21
No. of read events: 14
No. of write events: 17
No. of error events: 0
No. of hangup events: 3
No. of accept events: 3
No. of times no threads polling: 4
Current event queue length: 2
Maximum event queue length: 4
No. of DCBs with pending events: 1
No. of wakeups with pending queue: 2
No of poll completions with descriptors
No. of descriptors    No. of poll completions.
1                     32
2                     0
3                     0
4                     0
5                     0
6                     0
7                     0
8                     0
9                     0
>= 10                 0
MaxScale had been running for only a few seconds at this point; note the number of epoll cycles.
It seems to be an infinite non-blocking loop with no events. Here's the gdb session:
458 if (pollStats.evq_pending == 0 && timeout_bias < 10)
(gdb) n
463 atomic_add(&n_waiting, 1);
(gdb)
471 if (thread_data)
(gdb)
473 thread_data[thread_id].state = THREAD_POLLING;
(gdb)
476 atomic_add(&pollStats.n_polls, 1);
(gdb)
477 if ((nfds = epoll_wait(epoll_fd, events, MAX_EVENTS, 0)) == -1)
(gdb)
499 else if (nfds == 0 && pollStats.evq_pending == 0 && poll_spins++ > number_poll_spins)
(gdb)
514 atomic_add(&n_waiting, -1);
(gdb)
517 if (n_waiting == 0)
(gdb)
523 if (nfds > 0)
(gdb)
609 if (process_pollq(thread_id))
(gdb)
612 if (thread_data)
(gdb)
613 thread_data[thread_id].state = THREAD_ZPROCESSING;
(gdb)
614 zombies = dcb_process_zombies(thread_id);
(gdb)
615 if (thread_data)
(gdb)
616 thread_data[thread_id].state = THREAD_IDLE;
(gdb)
618 if (do_shutdown)
(gdb)
633 if (thread_data)
(gdb)
635 thread_data[thread_id].state = THREAD_IDLE;
(gdb)
637 } /*< while(1) */
(gdb)
458 if (pollStats.evq_pending == 0 && timeout_bias < 10)
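To make the trace easier to reason about, here is a minimal sketch of the two conditions it steps through: a zero-timeout epoll_wait, followed by a fallback to a blocking wait only when nothing is pending and the spin counter has passed its limit. This is not the MaxScale source. The function count_spins, its parameters, and the 10 ms fallback timeout are made up for illustration; only the shape of the condition on line 499 is taken from the gdb session. One possible reading, which I have not verified, is that a pending-event count that never drops to zero short-circuits that condition, so the thread never reaches the blocking call.

```c
#include <assert.h>
#include <sys/epoll.h>

#define MAX_EVENTS 16

/*
 * Hypothetical model of the loop in the gdb trace (not MaxScale code).
 * Returns the number of zero-timeout epoll_wait() calls made before the
 * loop fell back to a blocking wait, max_cycles if it never did, or -1
 * if epoll_wait() failed.
 */
int count_spins(int epoll_fd, int evq_pending, int spin_limit, int max_cycles)
{
    struct epoll_event events[MAX_EVENTS];
    int poll_spins = 0;
    int cycles;

    assert(max_cycles > 0);

    for (cycles = 0; cycles < max_cycles; cycles++) {
        /* Non-blocking poll, as on line 477 of the trace. */
        int nfds = epoll_wait(epoll_fd, events, MAX_EVENTS, 0);
        if (nfds == -1)
            return -1;

        /* Same shape as line 499: evq_pending != 0 short-circuits the
         * test, so poll_spins is never incremented in that case. */
        if (nfds == 0 && evq_pending == 0 && poll_spins++ > spin_limit) {
            /* Assumed fallback: a wait that is allowed to sleep. */
            if (epoll_wait(epoll_fd, events, MAX_EVENTS, 10) == -1)
                return -1;
            return cycles + 1;
        }
    }
    return max_cycles; /* never blocked: the busy loop seen with strace */
}
```

Called on an empty epoll instance (from epoll_create1(0)) with evq_pending set to 0 and spin_limit set to 2, the function makes a handful of non-blocking calls and then sleeps. With evq_pending set to 1 it runs all max_cycles iterations without ever sleeping, which matches the pattern in the trace.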
Here's an excerpt from strace -tt:
17:05:48.556305 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556319 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556334 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556349 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556363 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556378 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556392 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556407 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556421 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556436 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556450 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556465 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556479 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556493 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556507 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556522 epoll_wait(12, {}, 1000, 0) = 0
17:05:48.556536 epoll_wait(12, {}, 1000, 0) = 0
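For a rough sense of how fast the loop is spinning, the timestamps above can be turned into a mean interval between calls. The helpers below are a small sketch written for this report (parse_strace_ts and mean_interval_us are made-up names, not part of strace or MaxScale). They assume the HH:MM:SS.uuuuuu format that strace -tt prints, with all timestamps on the same day.

```c
#include <stdio.h>

/*
 * Parse an "HH:MM:SS.uuuuuu" timestamp (strace -tt format, always six
 * fractional digits) into microseconds since midnight. Returns -1 if the
 * string does not match.
 */
long long parse_strace_ts(const char *ts)
{
    int h, m, s;
    long us;

    if (sscanf(ts, "%d:%d:%d.%6ld", &h, &m, &s, &us) != 4)
        return -1;
    return (((long long)h * 60 + m) * 60 + s) * 1000000LL + us;
}

/*
 * Mean gap in microseconds between consecutive calls, given the first and
 * last timestamps and the number of calls in the excerpt. Returns -1.0 on
 * bad input.
 */
double mean_interval_us(const char *first, const char *last, int n_calls)
{
    long long t0 = parse_strace_ts(first);
    long long t1 = parse_strace_ts(last);

    if (t0 < 0 || t1 < 0 || n_calls < 2)
        return -1.0;
    return (double)(t1 - t0) / (n_calls - 1);
}
```

For the 17 calls in the excerpt (17:05:48.556305 through 17:05:48.556536), this gives 231 µs over 16 intervals, or about 14.4 µs per call, which works out to roughly 69,000 epoll_wait calls per second on the traced thread. Tracing adds overhead to every system call, so the untraced rate is probably higher.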