hholzgra, on the general question of why the server does not write a last goodbye message to the peer when closing the connection: in many cases, such as pooled connections, the client won't be actively reading. If the server sends a message and then closes the socket, the unread data on that connection means the client side does not close it either, so the connection persists in TCP, tying up its resources and the client's ephemeral ports.
Another example:
The server sends a large result set and the client is too slow reading it: the socket send buffer fills up and the server's send() runs into a timeout. At that point, sending anything more to the client, e.g. an error packet saying "I'm closing you because I'm running into a send() timeout", makes no sense; in fact, the server would just run into the send() timeout again while trying to send it.
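The effect is easy to reproduce. A minimal sketch (not MariaDB code; Python's settimeout() plays the role of the server's send timeout): once the slow client has stopped draining the kernel send buffer, every further send() stalls and then times out.

```python
import socket

# "server" writes, "client" never reads -- simulating a slow client.
server, client = socket.socketpair()
server.settimeout(0.5)  # analogous to the server-side send timeout

sent = 0
timed_out = False
try:
    while True:  # client is not reading, so this must eventually stall
        sent += server.send(b"x" * 4096)
except socket.timeout:
    # Sending one more "goodbye" packet here would hit the same timeout.
    timed_out = True
finally:
    server.close()
    client.close()

print(f"send() timed out after buffering {sent} bytes")
```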
So the correct thing to do, in most cases where the server closes the connection first, is, I think, just to close the connection and do it "forcefully", with a TCP reset. https://jira.mariadb.org/browse/MDEV-14113 discusses that in some more detail. Closing the connection gracefully only results in the client reading 0 bytes, a TCP ephemeral port shortage on the client, and other TCP resources tied up on the server due to the TIME_WAIT state.
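For reference, the standard way to get a reset-on-close on a TCP socket is SO_LINGER with l_onoff=1 and l_linger=0: close() then discards unsent data and emits RST instead of the FIN handshake, so the closing side skips TIME_WAIT. A minimal sketch on a loopback connection (illustrative only, not the server's actual code):

```python
import socket
import struct

def close_with_reset(sock):
    # l_onoff=1, l_linger=0: close() drops any unsent data and sends a
    # TCP RST instead of FIN, so this side does not enter TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    sock.close()

# Demonstration: the peer observes the reset as ECONNRESET, not EOF.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()

close_with_reset(server_side)
try:
    client.recv(1024)       # a graceful close would return b"" here
    reset_seen = False
except ConnectionResetError:
    reset_seen = True

client.close()
listener.close()
print("client saw RST:", reset_seen)
```

Note the peer gets a hard error rather than a clean EOF, which is exactly the intended behavior here.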
There are some exceptions to this, when the client connection is active, i.e. a query is running and the connection is being killed. A notable case where the server should send a message is problems during connection setup: "Too many connections", "user blocked", "authentication error", and such. The client is actively reading the server's response in those cases.
But otherwise, I guess logging the error server-side and killing the TCP connection without the last goodbye would be the correct thing to do.
Finally, there used to be some problems with "out-of-band" error messages being misinterpreted by multiple client drivers as protocol breakage; see MDEV-19893 for example.
Also, the well-meant "connection was killed" message turned out pretty badly for client connection pools, see https://github.com/sqlalchemy/sqlalchemy/issues/4945. A TCP close with reset would be much, much better.
What settings do you use, so that queueing time actually matters? 30 seconds of queueing, as in your example, is a long time.
The responsibility for the inactivity can lie on the server side, due to misconfiguration, or the network can be blamed, too. The server cannot accurately count the client's idle time: it is not the same process, and is often on a separate machine.
We do not know for sure when the client request arrives either. It can sit in the epoll queue until it is picked up by the listener. And the listener might temporarily assume the "worker" role, so that a thread group has no listener at all during that time. Usually this period is short, but it can of course be misconfigured with a huge stall_limit.
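The gap between "request is ready in the kernel" and "server notices it" can be sketched with a toy poll loop (using Python's selectors as a stand-in for the thread pool's epoll listener; the sleep simulates the listener being busy doing worker duty):

```python
import selectors
import socket
import time

sel = selectors.DefaultSelector()
a, b = socket.socketpair()
a.setblocking(False)
sel.register(a, selectors.EVENT_READ)

b.send(b"query")               # the request "arrives" now, queued in the kernel
time.sleep(0.2)                # ...but no listener is polling during this time

t0 = time.monotonic()
events = sel.select(timeout=0)  # listener finally polls; event is ready at once
picked_up = len(events) == 1

sel.close()
a.close()
b.close()
print("picked up after listener delay:", picked_up)
```

The server only observes the readiness event at the select() call, so any time spent before that is invisible to it, which is the point about not knowing the true arrival time.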