  1. MariaDB Server
  2. MDEV-22173

OSX-built mariadbd cannot accept connections: [ERROR] Error in accept: Bad file descriptor

Details

    Description

      http://buildbot.askmonty.org/buildbot/builders/mac-1012-bintar/builds/6977

      and travis e.g.

      https://travis-ci.org/github/MariaDB/server/jobs/671255481

      2020-04-06 14:47:52 0 [Note] Added new Master_info '' to hash table
      2020-04-06 14:47:52 0 [Note] /Users/buildbot/maria-slave/mac-1012-bintar/build/sql/mariadbd: ready for connections.
      Version: '10.5.3-MariaDB-log'  socket: '/Users/buildbot/maria-slave/mac-1012-bintar/build/mysql-test/var/tmp/mysqld.3.sock'  port: 16002  MariaDB Server
      2020-04-06 14:47:53 0 [ERROR] Error in accept: Bad file descriptor
      

          Activity

            danblack Daniel Black added a comment - Still failing to pass a single test case as of merge https://github.com/MariaDB/server/commit/fbe2712705d464bf8488df249c36115e2c1f63f7 (see https://travis-ci.org/github/MariaDB/server/jobs/679492536 and http://buildbot.askmonty.org/buildbot/builders/mac-1012-bintar/builds/7062/steps/test/logs/stdio).

            Hi,
            This bug is also present on AIX.
            The error comes from accept(), which is called only in include/mysql/psi/mysql_socket.h, in the function inline_mysql_socket_accept().
            If I have understood the problem correctly, the situation is as follows.
            In sql/mysqld.cc, in the function handle_connections_sockets(), mysql_socket_accept() (= inline_mysql_socket_accept) is called a first time, and it works. Because this happens in a loop, it is called a second time, and this time it fails at

            socket_accept.fd= accept(socket_listen.fd, addr, addr_len);
            

            , but the returned fd (-1, because the call failed) is not checked. So the next line

            flags= fcntl(socket_accept.fd, F_GETFD);
            

            also fails, and we end up with a final error that is not really a bad file descriptor.
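            The masking effect described above can be reproduced in isolation. The following is a minimal sketch, not the server code: the function name is hypothetical, and it simulates the failed accept() by setting errno by hand and passing -1 to fcntl().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/* Hypothetical helper: pretend accept() has just failed with `accept_errno`
   and returned -1, then run the follow-up fcntl() on that -1 anyway.
   Returns the errno that a later error report would see. */
int errno_seen_after_unchecked_accept(int accept_errno)
{
  int failed_fd = -1;               /* what accept() returns on failure */
  int rc;

  errno = accept_errno;
  rc = fcntl(failed_fd, F_GETFD);   /* fails: -1 is never a valid descriptor */
  assert(rc == -1);
  (void) rc;                        /* keep builds with NDEBUG warning-free */
  return errno;                     /* now EBADF; the original cause is lost */
}
```

            Called with EAGAIN, this returns EBADF, which would explain why the log shows "Bad file descriptor" rather than the real reason accept() failed.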

            Manual testing seems OK despite this error message. I have added an error check after accept() in include/mysql/psi/mysql_socket.h:

                if (socket_accept.fd == -1) {
                  return socket_accept;
                }
            

            This check must be added twice, once after each of the two accept() calls, around lines 1044 and 1068.
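            As a standalone illustration of that check, here is a hedged sketch outside the server tree. Both function names are hypothetical, and the use of fcntl() to set FD_CLOEXEC is an assumption about what the F_GETFD call is for. The listener is non-blocking with no client connecting, so accept() fails with EAGAIN and the early return keeps that errno intact.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: non-blocking TCP listener on 127.0.0.1, port chosen
   by the OS. Returns the listening descriptor, or -1 on failure. */
int make_nonblocking_listener(void)
{
  struct sockaddr_in addr;
  int fd = socket(AF_INET, SOCK_STREAM, 0);
  if (fd == -1)
    return -1;
  memset(&addr, 0, sizeof(addr));
  addr.sin_family = AF_INET;
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  addr.sin_port = 0;
  if (bind(fd, (struct sockaddr *) &addr, sizeof(addr)) == -1 ||
      listen(fd, 1) == -1 ||
      fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK) == -1)
  {
    close(fd);
    return -1;
  }
  return fd;
}

/* Accept one connection and mark it close-on-exec (assumed purpose of the
   fcntl() call). On failure, returns -1 with errno still set by accept(). */
int accept_cloexec_checked(int listen_fd)
{
  int flags;
  int fd;

  assert(listen_fd >= 0);
  fd = accept(listen_fd, NULL, NULL);
  if (fd == -1)
    return -1;                      /* skip fcntl() so errno is accept()'s */
  flags = fcntl(fd, F_GETFD);
  if (flags != -1)
    (void) fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
  return fd;
}
```

            With no pending connection, accept_cloexec_checked() returns -1 and errno stays EAGAIN (or EWOULDBLOCK), which the caller can then treat as a retryable condition.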

            With this change, the test suite is OK on AIX, with results similar to before.

            I suppose OSX and AIX use accept() and Linux uses accept4(). They probably differ slightly when the socket is in use.

            I do not know which commit or code change introduced this issue.

            EGuesnet Etienne Guesnet added the comment above.
            danblack Daniel Black added a comment -

            Thanks EGuesnet, that seems to pass the tests against 10.5 https://travis-ci.org/github/grooverdan/mariadb-server/jobs/680842564.

            In the code, I couldn't return immediately, as there was a small bit of instrumentation end that needed to happen; the overall effect is the same, however.

            https://github.com/MariaDB/server/pull/1518 submitted.

            Original commit of that error was dced5146bdfc, though I suspect something else caused this to be noticed now. EAGAIN/EINTR are handled in handle_connections_sockets so it might be some effect of non-blocking (701e2a7edb30 maybe?).

            Did you happen to notice the errno? And which socket (unix vs tcp)? If it's not one of EAGAIN/EINTR, it will get logged back in handle_connections_sockets (though only 1 in 255).
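            The error handling described here can be sketched roughly as follows. This is an illustration only: both function names are hypothetical, and the 255 mask (which reports the first error and then every 256th) is an assumption inferred from the "1 in 255" remark, not copied from handle_connections_sockets.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: errno values after which an accept loop simply retries
   without reporting anything. */
bool accept_error_is_transient(int err)
{
  return err == EAGAIN || err == EWOULDBLOCK || err == EINTR;
}

/* Hypothetical rate limiter for the remaining errors: returns true for the
   first error and then once every 256 calls, so a persistent failure does
   not flood the error log. */
bool should_log_accept_error(unsigned *error_count)
{
  assert(error_count != NULL);
  return ((*error_count)++ & 255) == 0;
}
```

            Under this scheme an EAGAIN from accept() would be retried silently, whereas an EBADF left behind by a later fcntl() call would be treated as a real error and reported.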


            > Did you happen notice the errno?

            perror is

            Resource temporarily unavailable
            

            So, EAGAIN (11 on AIX). The socket is a unix socket, I think (a .sock file on the local machine).

            EGuesnet Etienne Guesnet added the comment above.

            The pull request is for 10.1.

            Recently, I have seen Mac OS X tests fail in 10.4 already. 10.2 and 10.3 appear to pass some tests, while for 10.1 44c8d84908e9d697bbcea55d6ebcd5f2250c4727, only plugins.feedback_plugin_send failed (in a similar way as it fails on 10.2+ on kvm-asan).

            marko Marko Mäkelä added the comment above.

            People

              wlad Vladislav Vaintroub
              danblack Daniel Black