[MXS-4811] Error handling differences between running maxctrl directly or in a subshell Created: 2023-10-13  Updated: 2023-11-14  Resolved: 2023-10-16

Status: Closed
Project: MariaDB MaxScale
Component/s: maxctrl
Affects Version/s: 23.02.2
Fix Version/s: 6.4.11, 22.08.9, 23.02.5, 23.08.2

Type: Bug
Priority: Major
Reporter: Hartmut Holzgraefe
Assignee: markus makela
Resolution: Fixed
Votes: 0
Labels: None

Issue Links:
Problem/Incident
is caused by MXS-4810 --timeout doesn't work with multiple ... Closed

 Description   

This came up in the same context as MXS-4810. The background is an attempt to run maxctrl in a shell script and capture its output in a variable, basically:

result=$(maxctrl ... 2>&1)
if test $? -ne 0
then
  ... error handling ...
fi
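
A minimal sketch of the capture-and-check pattern described above. The `fake_maxctrl` function is a hypothetical stand-in so the snippet runs without a MaxScale installation; it is not part of maxctrl. The point is to save `$?` right after the assignment and compare it numerically, since `test` with a single non-empty argument is always true.

```shell
#!/bin/sh
# Hypothetical stand-in for maxctrl (assumption, for illustration only):
# prints an error message to stderr and fails with status 1.
fake_maxctrl() {
  echo "Error: timeout of 10000ms exceeded" >&2
  return 1
}

# Capture stdout and stderr together, then save the exit status
# immediately, before another command overwrites $?.
result=$(fake_maxctrl list servers 2>&1)
status=$?

# Compare numerically; a bare `test $?` would succeed for any status.
if test "$status" -ne 0
then
  echo "command failed with status $status: $result"
fi
```

Note that in bash, writing `local result=$(...)` inside a function would report the status of `local` rather than of the command, so the declaration and the assignment are best kept separate.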

When hitting MXS-4810, a direct call of maxctrl returns a non-zero exit code, as expected. But when maxctrl is run in a subshell (command substitution) to assign its output to a variable, the error message changes and the exit code is zero, indicating success.

When hitting an error other than MXS-4810, e.g. when only a single host is given and the 10 second timeout kicks in, a non-zero exit code is returned in both cases, as expected.

Below is the "how to reproduce" for both this and MXS-4810, as I originally described it in chat:

This is what I get when trying to contact a MaxScale instance that does not actually exist:

$ time maxctrl list servers --hosts 172.29.7.245:8989
Error: timeout of 10000ms exceeded
 
real	0m10.275s
user	0m0.233s
sys	0m0.048s

But when trying with two host entries that have no MaxScale instance behind them, it takes 2min10s instead of 10s, and the error message is different:

$ time maxctrl list servers --hosts 172.29.7.245:8989,172.29.7.246:8989
Error: connect ETIMEDOUT 172.29.7.246:8989
    at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1159:16) {
  errno: -110,
  code: 'ETIMEDOUT',
  syscall: 'connect',
  address: '172.29.7.246',
  port: 8989,
[... lots of additional JSON lines ...]
  response: undefined,
  isAxiosError: true,
  toJSON: [Function: toJSON]
}
 
real	2m9.818s
user	0m0.413s
sys	0m0.030s

In both cases the exit status in $? is at least non-zero, as expected.
Now let's try this in a subshell to capture the text returned on stdout in a variable:

$ result=$(time maxctrl list servers --hosts 172.29.7.245:8989)
Error: timeout of 10000ms exceeded
 
real	0m10.226s
user	0m0.180s
sys	0m0.039s
 
$ echo $?
1

No surprise yet, but now with two hosts again:

$ result=$(maxctrl list servers --hosts 172.29.7.245:8989,172.29.7.246:8989)
(node:74204) UnhandledPromiseRejectionWarning: TypeError: undefined is not a function
    at Object.strip_colors (/snapshot/maxctrl/lib/utils.js)
    at print (/snapshot/maxctrl/maxctrl.js)
    at err (/snapshot/maxctrl/maxctrl.js)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
(Use `maxctrl --trace-warnings ...` to show where the warning was created)
(node:74204) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 2)
(node:74204) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
 
$ echo $?
0

So now I get a weird Node exception instead and, worse, a SUCCESS exit code.

I can understand the different timeout behavior if the application-level timeout of 10s is not set up again before trying the next host entry.

But I am at a total loss as to why it would throw different errors when executed directly vs. within a subshell.
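
One observable difference between the two invocations is that inside `$(...)` standard output is a pipe rather than a terminal. The sketch below only demonstrates that shell-level difference; whether maxctrl takes a different output path in that situation (the `strip_colors` frame in the stack trace hints at one) is an assumption and is not confirmed by this report. The `check_tty` helper is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reports whether stdout (fd 1) is a terminal.
check_tty() {
  if test -t 1
  then
    echo "tty"
  else
    echo "not a tty"
  fi
}

# Inside command substitution stdout is connected to a pipe,
# so the helper always takes the "not a tty" branch here.
inside=$(check_tty)
echo "inside command substitution: $inside"
```

Calling `check_tty` directly from an interactive shell prints `tty`, which mirrors the direct vs. captured invocations compared in this report.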



 Comments   
Comment by markus makela [ 2023-10-13 ]

Managed to reproduce this locally:

[markusjm@monolith build-develop]$ result=$(bin/maxctrl list servers --hosts 172.29.7.245:8989,172.29.7.246:8989)
(node:654053) UnhandledPromiseRejectionWarning: TypeError: undefined is not a function
    at Object.strip_colors (/snapshot/maxctrl/lib/utils.js)
    at print (/snapshot/maxctrl/maxctrl.js)
    at err (/snapshot/maxctrl/maxctrl.js)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
(Use `maxctrl --trace-warnings ...` to show where the warning was created)
(node:654053) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 2)
(node:654053) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
[markusjm@monolith build-develop]$ echo $?
0

With the fix to MXS-4810 applied, the behavior seems to be correct:

[markusjm@monolith build-develop]$ result=$(bin/maxctrl list servers --hosts 172.29.7.245:8989,172.29.7.246:8989)
[markusjm@monolith build-develop]$ echo $?
1
