[MDEV-22173] OSX built mariadbd cannot accept connections: [ERROR] Error in accept: Bad file descriptor Created: 2020-04-07  Updated: 2020-07-07  Resolved: 2020-07-07

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.5.2, 10.4
Fix Version/s: 10.1.46, 10.2.33, 10.3.24, 10.4.14, 10.5.5

Type: Bug Priority: Critical
Reporter: Daniel Black Assignee: Vladislav Vaintroub
Resolution: Fixed Votes: 0
Labels: not-10.1
Environment:

OSX only


Issue Links:
Blocks
blocks MDEV-20178 MariaDB server does not compile on AIX Closed
blocks MDEV-22592 Travis-CI broken for 10.5 in recent c... Closed

 Description   

http://buildbot.askmonty.org/buildbot/builders/mac-1012-bintar/builds/6977

and travis e.g.

https://travis-ci.org/github/MariaDB/server/jobs/671255481

2020-04-06 14:47:52 0 [Note] Added new Master_info '' to hash table
2020-04-06 14:47:52 0 [Note] /Users/buildbot/maria-slave/mac-1012-bintar/build/sql/mariadbd: ready for connections.
Version: '10.5.3-MariaDB-log'  socket: '/Users/buildbot/maria-slave/mac-1012-bintar/build/mysql-test/var/tmp/mysqld.3.sock'  port: 16002  MariaDB Server
2020-04-06 14:47:53 0 [ERROR] Error in accept: Bad file descriptor



 Comments   
Comment by Daniel Black [ 2020-04-26 ]

Still failing to pass a single test case as of merge https://github.com/MariaDB/server/commit/fbe2712705d464bf8488df249c36115e2c1f63f7

https://travis-ci.org/github/MariaDB/server/jobs/679492536
http://buildbot.askmonty.org/buildbot/builders/mac-1012-bintar/builds/7062/steps/test/logs/stdio

Comment by Etienne Guesnet [ 2020-04-28 ]

Hi,
This bug is also present on AIX.
The error comes from accept(), which is called only in include/mysql/psi/mysql_socket.h, in the function inline_mysql_socket_accept().
If I have understood the problem correctly, the situation is as follows.
In sql/mysqld.cc, in the function handle_connections_sockets(), the function mysql_socket_accept() ( = inline_mysql_socket_accept) is called a first time, and it works. As there is a loop, it is called a second time, and it fails at

socket_accept.fd= accept(socket_listen.fd, addr, addr_len);

, but the value of fd (-1, because it has failed) is not tested. So, the next line

flags= fcntl(socket_accept.fd, F_GETFD);

also fails, and we end up with the final error, which is not really a bad file descriptor: fcntl() on fd -1 sets errno to EBADF, overwriting the errno left by accept().
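The sequence described above can be reproduced outside the server. The following is only a minimal sketch, not MariaDB code: make_nonblocking_listener() and unchecked_accept() are hypothetical helpers, and the use of a non-blocking TCP listener on loopback is an assumption made to get a failing accept() without a pending connection.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking TCP listener on 127.0.0.1 with an
   ephemeral port. Returns the listening fd, or -1 on failure. */
int make_nonblocking_listener(void)
{
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd < 0)
    return -1;

  struct sockaddr_in addr;
  memset(&addr, 0, sizeof addr);
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;                       /* let the kernel pick a port */

  int flags = fcntl(fd, F_GETFL);
  if (flags < 0 ||
      fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ||
      bind(fd, (struct sockaddr *) &addr, sizeof addr) < 0 ||
      listen(fd, 1) < 0)
  {
    close(fd);
    return -1;
  }
  return fd;
}

/* Hypothetical helper mimicking the unchecked sequence: accept(), then
   fcntl() on whatever accept() returned. Records errno after each call. */
void unchecked_accept(int listen_fd, int *errno_accept, int *errno_fcntl)
{
  int fd = accept(listen_fd, NULL, NULL);  /* no pending client: returns -1 */
  *errno_accept = errno;                   /* the real cause, e.g. EAGAIN */

  (void) fcntl(fd, F_GETFD);               /* fd is -1 here, so this fails */
  *errno_fcntl = errno;                    /* now EBADF, hiding the cause */

  if (fd >= 0)
    close(fd);
}
```

With no client connecting, the first errno should be EAGAIN (or EWOULDBLOCK) and the second EBADF, which matches the misleading "Bad file descriptor" message.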

Manual tests seem OK despite this error message. I have added an error check after accept() in include/mysql/psi/mysql_socket.h:

    if (socket_accept.fd == -1) {
      return socket_accept;
    }

This code must be added twice, once after each of the two accept() calls, around lines 1044 and 1068.
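As a standalone illustration of the check, here is a sketch of an accept wrapper that returns before touching the descriptor when accept() fails. accept_cloexec() is a hypothetical name, not the MariaDB function, and setting FD_CLOEXEC after the F_GETFD call is an assumption about what the surrounding code does with the flags.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical wrapper: accept a connection and mark it close-on-exec.
   On failure, returns -1 with errno describing the call that failed. */
int accept_cloexec(int listen_fd, struct sockaddr *addr, socklen_t *addr_len)
{
  int fd = accept(listen_fd, addr, addr_len);
  if (fd == -1)
    return -1;               /* errno still describes the accept() failure */

  /* Assumption: the flags are read in order to add FD_CLOEXEC. */
  int flags = fcntl(fd, F_GETFD);
  if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
  {
    int saved = errno;       /* close() may change errno, so keep a copy */
    close(fd);
    errno = saved;
    return -1;
  }
  return fd;
}
```

Because of the early return, a caller that inspects errno sees the value set by accept() (for example EAGAIN) rather than the EBADF that fcntl() would produce for fd -1.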

With this, the test suite is OK, with results similar to before on AIX.

I suppose OSX and AIX use accept() and Linux uses accept4(). They probably differ slightly when the socket is in use.

I do not know which commit or code change introduced this issue.

Comment by Daniel Black [ 2020-04-29 ]

Thanks EGuesnet, that seems to pass the tests against 10.5 https://travis-ci.org/github/grooverdan/mariadb-server/jobs/680842564.

In the code, I couldn't return immediately, as there was a small bit of instrumentation end that needed to happen; same effect overall, however.

https://github.com/MariaDB/server/pull/1518 submitted.

Original commit of that error was dced5146bdfc, though I suspect something else caused this to be noticed now. EAGAIN/EINTR are handled in handle_connections_sockets so it might be some effect of non-blocking (701e2a7edb30 maybe?).

Did you happen to notice the errno? And which socket (unix vs tcp)? If it's not one of EAGAIN/EINTR, it will get logged back in handle_connections_sockets (though only 1 in 255).
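To make the distinction concrete, here is a rough sketch of how a caller might sort accept() errors into "retry quietly" and "log occasionally". It is not the code from handle_connections_sockets: the enum, the function name, and the counter mask are all assumptions chosen to approximate the behaviour described above.

```c
#include <errno.h>

/* Hypothetical outcome of inspecting errno after a failed accept(). */
enum accept_action { ACCEPT_RETRY, ACCEPT_LOG, ACCEPT_SKIP };

/* Hypothetical classifier: transient errors are retried silently, all
   other errors are logged only occasionally to avoid flooding the log. */
enum accept_action classify_accept_error(int err, unsigned *error_count)
{
  if (err == EAGAIN || err == EWOULDBLOCK || err == EINTR)
    return ACCEPT_RETRY;            /* expected on a non-blocking socket */

  /* Assumption: rate-limit with a counter mask, so the first error and
     then every 256th one is logged. */
  if (((*error_count)++ & 255) == 0)
    return ACCEPT_LOG;
  return ACCEPT_SKIP;
}
```

Under this sketch, an EAGAIN from accept() would never reach the log, while an EBADF produced by a later fcntl() on fd -1 would be treated as a real error and reported.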

Comment by Etienne Guesnet [ 2020-04-30 ]

> Did you happen notice the errno?

perror is

Resource temporarily unavailable

So, EAGAIN (11 on AIX). The socket is unix, I think (.sock file on the local machine).

Comment by Marko Mäkelä [ 2020-05-19 ]

The pull request is for 10.1.

Recently, I have seen Mac OS X tests fail in 10.4 already. 10.2 and 10.3 appear to pass some tests, while for 10.1 44c8d84908e9d697bbcea55d6ebcd5f2250c4727, only plugins.feedback_plugin_send failed (in a similar way as it fails on 10.2+ on kvm-asan).

Generated at Thu Feb 08 09:12:45 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.