[MDEV-19508] Issue with: #define SI_KERNEL in include/my_pthread.h Created: 2019-05-17  Updated: 2021-07-07  Resolved: 2021-07-07

Status: Closed
Project: MariaDB Server
Component/s: Compiling
Affects Version/s: 10.4
Fix Version/s: 10.4.19, 10.5.10, 10.6.0

Type: Bug Priority: Major
Reporter: Anel Husakovic Assignee: Daniel Black
Resolution: Fixed Votes: 0
Labels: compile, foundation
Environment:

AIX


Issue Links:
Blocks
blocks MDEV-20178 MariaDB server does not compile on AIX Closed
Relates
relates to MDEV-19512 Foundation: MCAs patches Open

 Description   

Compilation error:

mariadb-10.4.3/sql/mysqld.cc:3272:36: error: 'SI_KERNEL' was not declared in this scope
       if (!abort_loop && origin != SI_KERNEL)



 Comments   
Comment by Etienne Guesnet [ 2019-06-20 ]

I confirm this bug on AIX 7.2 Power 8, with MariaDB 10.4.6.

SI_KERNEL is defined in include/my_pthread.h, line 866,

#ifdef HAVE_SIGWAITINFO
[...]
#else
#define SI_KERNEL 128

I find in CMakeCache.txt that we have HAVE_SIGWAITINFO to 1.

This bug can probably be found in all plateform with HAVE_SIGWAITINFO.

MariaDB may define SI_KERNEL in all case, or ensure the code using it (sql/mysqld.cc, line 3292) is not compiled if HAVE_SIGWAITINFO is 1.

Comment by Anel Husakovic [ 2019-06-26 ]

Hi EGuesnet,
can you please create patch and raise an Pull request, so your patch could be visible to other developers ?
Thanks.

Comment by Etienne Guesnet [ 2019-07-08 ]

Done [here](https://github.com/MariaDB/server/pull/1357). Sorry for the delay. I hope it is correct, I do not use Github frequently.

Comment by Sergey Vojtovich [ 2019-07-12 ]

I wonder why original fix didn't do something like setsid() instead?

Comment by Sergei Golubchik [ 2019-08-24 ]

EGuesnet, I see that AIX has sigwaitinfo() and siginfo_t. What are possible values of siginfo_t.si_code ?

Comment by Etienne Guesnet [ 2019-08-26 ]

It is an integer from 0 to 74, or -3. What exactly do you want?

Comment by Sergei Golubchik [ 2019-08-26 ]

In Linux it's something like

enum
{
  SI_ASYNCNL = -60,		/* Sent by asynch name lookup completion.  */
  SI_DETHREAD = -7,		/* Sent by execve killing subsidiary threads.  */
  SI_TKILL,			/* Sent by tkill.  */
  SI_SIGIO,			/* Sent by queued SIGIO. */
  SI_ASYNCIO,			/* Sent by AIO completion.  */
  SI_MESGQ,			/* Sent by real time mesq state change.  */
  SI_TIMER,			/* Sent by timer expiration.  */
  SI_QUEUE,			/* Sent by sigqueue.  */
  SI_USER,			/* Sent by kill, sigsend.  */
  SI_KERNEL = 0x80		/* Send by kernel.  */
};

and the code distinguishes SIGHUP sent by the user (kill -HUP) from the SIGHUP sent by the kernel (killing the conrolling terminal). Does the same logic work on AIX? May be with a different si_code value?

Comment by Etienne Guesnet [ 2019-08-26 ]

The code are

/*
 * the following are the code macros that are stored in the si_code
 * element of the siginfo_t struct, on a signal-specific basis.
 */
 
#define SI_USER         0       /* signal sent by another process with kill */
#define SI_UNDEFINED    8       /* siginfo_t contains partial information   */
#define SI_EMPTY        9       /* siginfo_t contains no useful information */
#define SI_QUEUE        71
#define SI_TIMER        72
#define SI_ASYNCIO      73
#define SI_MESGQ        74

As I know, there are no differences user and kernel SIGHUP.

Comment by Sergei Golubchik [ 2019-08-28 ]

Okay, then, indeed, perhaps we'd better use setsid() instead.

anel, to clarify the above, I mean:

  • revert my commit with SI_KERNEL
  • push into buildbot to verify that installation on Debian started failing again
  • implement setsid() fix
  • verify that installation on Debian no longer fails
Comment by Sergey Vojtovich [ 2019-08-28 ]

Note it may happen so that calling setsid() has to be conditional. It may have to go inline with setting up SIGHUP signal handler. E.g. "debug_gdb" and "bootstrap" modes may have to stay attached to controlling terminal, probably there're more modes to consider.

Comment by Daniel Black [ 2020-04-07 ]

The solution was here all along:

`<= SI_USER` is equivalent to `!= SI_KERNEL`. Works for Linux, AIX, Solaris and FreeBSD.

For completeness - FreeBSD-12.1

#if __POSIX_VISIBLE || __XSI_VISIBLE
#define SI_NOINFO       0               /* No signal info besides si_signo. */
#define SI_USER         0x10001         /* Signal sent by kill(). */
#define SI_QUEUE        0x10002         /* Signal sent by the sigqueue(). */
#define SI_TIMER        0x10003         /* Signal generated by expiration of */
                                        /* a timer set by timer_settime(). */
#define SI_ASYNCIO      0x10004         /* Signal generated by completion of */
                                        /* an asynchronous I/O request.*/
#define SI_MESGQ        0x10005         /* Signal generated by arrival of a */
                                        /* message on an empty message queue. */
#define SI_KERNEL       0x10006
#define SI_LWP          0x10007         /* Signal sent by thr_kill */
#endif

PR ready to go: https://github.com/MariaDB/server/pull/1392

EGuesnet has tested per https://github.com/MariaDB/server/pull/1357#issuecomment-610341999

Comment by Daniel Black [ 2020-04-07 ]

Or if absolute existing compatibility is required (as some FreeBSD make the substitution less obvious):

#ifdef SI_KERNEL
      if (!abort_loop && origin != SI_KERNEL)
#else
      if (!abort_loop && origin <= SI_USER)
#endif

Comment by Sergey Vojtovich [ 2020-04-28 ]

Competing fix: https://github.com/MariaDB/server/commit/6b032039cc2c50167558e805b52a13decd9fb52b

Comment by Daniel Black [ 2020-04-29 ]

svoj,

Nice reference https://www.freedesktop.org/software/systemd/man/daemon.html!

Note: https://developer.apple.com/library/archive/documentation/MacOSX/Conceptual/BPSystemStartup/Chapters/CreatingLaunchdJobs.html "Do not call setsid to create a new session.", maybe not applicable - I don't know how MariaDB works on OSX.

Comment by Sergey Vojtovich [ 2020-04-29 ]

One doesn't need to call setsid() under systemd either, because it is already called. But it must be harmless: it should be no-op in this case.

Comment by Daniel Black [ 2020-05-21 ]

Great. Can it be merged?

Comment by Robert Bindar [ 2020-10-28 ]

I reviewed svoj's patch a while ago and IIRC the only concern I had at that time was whether or not `setsid()` would affect the distribution of signals, and potentially break parts of the server or multi-process setups such as Galera or Debian scripts or even external unknown-to-us scripts that rely on signals happening in a particular way.

Svoj argued that windows and embedded are not affected and that systemd/launchd should work.

The question is how do we test the patch to make sure at least scripts we can control don't break?
Elena suggested that testing this is "as whitebox as it gets" and that, in terms of external scripts we don't control, we should also think if we want to guarantee any specific behavior upon sending any specific signals.

Can danblack or serg help with any suggestions on this?

Comment by Daniel Black [ 2020-11-03 ]

serg's test plan from comment 2019-08-29

Test 2 from 1st paragraph of original commit 07e9b1389857, run mysqld in terminal in foreground. gdb attach from different terminal. Kill terminal. Trace generated SIGHUP and if it results in a reload. Add setsid patch. Observe behaviour.

Comment by Robert Bindar [ 2020-11-11 ]

Hi danblack, sorry for the delay. Sure, but manual test will only conclude that SIGHUP is properly handled and that Sergei's bug doesn't reproduce anymore. But I don't see how it addresses the concerns Svoj, Elena and Robert pointed out, setsid might have side effects in some weird corner cases. It is also true that I don't have a solution on how to test this myself properly unless some more experienced dev helps me out. So if serg is ok with taking the risk to manually test this like he suggested, push the patch and address (if any) bugs that are discovered at some later point, feel free to push.

Comment by Daniel Black [ 2020-11-17 ]

By Elena's quote above, suggests that it's up to us to define if there is expected signal behaviour. By your concerns you seem to think we should conform to something regarding signal handling. If these two are compatible correct statements its up to you to examine what behaviour existing scripts, just a minimum of galera and debian packaging expect. Given these are bash scripts the signal handling is usually fairly explicit. Come up with an expected signal behaviour.

"debug_gdb" mode, I assume this is a `mtr --gdb`, have you tested this? --bootstrap AFAIK doesn't depend on any signalling or session information, just that stdin contains the bootstrap information.

Comment by Daniel Black [ 2021-04-08 ]

A work around commit was finally added in 10.4 - https://github.com/MariaDB/server/commit/f69c1c9dcb815d7597ec2035470a81ac3b6c9380

Generated at Thu Feb 08 08:52:14 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.