[MDEV-23915] error ER_KILL_DENIED_ERROR not passed a thread id Created: 2020-10-08  Updated: 2022-03-15  Resolved: 2022-03-15

Status: Closed
Project: MariaDB Server
Component/s: Server
Affects Version/s: 10.5.5
Fix Version/s: 10.2.44, 10.3.35, 10.4.25, 10.5.16, 10.6.8, 10.7.4, 10.8.3

Type: Bug Priority: Major
Reporter: Otto Kekäläinen Assignee: Daniel Black
Resolution: Fixed Votes: 0
Labels: None


 Description   

While importing 10.5.5 to Debian, I noticed it is constantly failing on this test on arch powerpc:

main.grant_kill                          w15 [ fail ]
        Test ended at 2020-10-07 22:22:01
 
CURRENT_TEST: main.grant_kill
--- /<<PKGBUILDDIR>>/mysql-test/main/grant_kill.result	2020-08-07 12:57:35.000000000 +0000
+++ /<<PKGBUILDDIR>>/mysql-test/main/grant_kill.reject	2020-10-07 22:22:01.121564314 +0000
@@ -20,7 +20,7 @@
 foo
 root
 KILL ID;
-ERROR HY000: You are not owner of thread ID
+ERROR HY000: You are not owner of thread 0
 disconnect foo;
 disconnect bar;
 connection default;

See build example at
https://buildd.debian.org/status/fetch.php?pkg=mariadb-10.5&arch=powerpc&ver=1%3A10.5.5-3%7Eexp2&stamp=1602109617&raw=0

Build passes since the debian/rules file is set to ignore powerpc test failures as in previous MariaDB versions there were way too many.

There is nothing to check in previous MariaDB versions since this check didn't exist until 10.5 when introduced by bar in a1e330de5a37f88339f4a5b46231a41eb60f43d2



 Comments   
Comment by Otto Kekäläinen [ 2020-10-11 ]

I filed this downstream in Debian as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=972056 in case some powerpc experts from Debian could help out.

Comment by Elena Stepanova [ 2020-10-18 ]

Do you have access to the machine? Can you run this test separately, previously having commented the line 26 (--replace_result $id ID).
The only partial explanation I have for the phenomenon is that the processlist returns as a process ID something which the error message is forced to convert into zero. I wonder what this "something" is.

Comment by Marko Mäkelä [ 2020-10-19 ]

I see the following in the build log:

-- Check if the system is big endian - big endian

The error code is ER_KILL_DENIED_ERROR. It is a possible return value of kill_one_thread() and kill_threads_for_user(). In sql_kill_user(), we seem to be issuing a mismatching error message:

static void __attribute__ ((noinline))
sql_kill_user(THD *thd, LEX_USER *user, killed_state state)
{
  uint error;
  ha_rows rows;
  if (likely(!(error= kill_threads_for_user(thd, user, state, &rows))))
    my_ok(thd, rows);
  else
  {
    /*
      This is probably ER_OUT_OF_RESOURCES, but in the future we may
      want to write the name of the user we tried to kill
    */
    my_error(error, MYF(0), user->host.str, user->user.str);
  }
}

The message expects one parameter of type unsigned long, not two "%s" parameters:

        eng "You are not owner of thread %lu"

However, the reported error probably is in another caller:

static
void sql_kill(THD *thd, longlong id, killed_state state, killed_type type)
{
  uint error;
  if (likely(!(error= kill_one_thread(thd, id, state, type))))
  {
    if (!thd->killed)
      my_ok(thd);
    else
      thd->send_kill_message();
  }
  else
    my_error(error, MYF(0), id);
}

Note: we are passing longlong id to my_error(), even though the format string is expecting unsigned long. Let us ignore the sign mismatch and consider what happens if sizeof(unsigned long) < sizeof(longlong). On a little-endian system, we would pass the least significant bytes first, and the output message would truncate the parameter to the least signficant 32 bits. But, on a big-endian system, we would pass the most significant bytes first, and we might output id >> 32, which would explain the difference.

I think that two things need to be fixed:

  1. The format parameter has to be changed to "%lld" to match sql_kill(). (This should fix the reported failure.)
  2. The function sql_kill_user() must be changed to pass correct parameters.
Comment by Daniel Black [ 2020-10-19 ]

From the build logs you are actually cross compiling - ppc64 -> powerpc

Cross compiling requires a set of parameters related to the destination architecture to prevent the detection that occurs from compile time checks.

ref:
https://mariadb.com/kb/en/cross-compiling-mariadb/

e.g. in the build log

from build log

-- Check size of sigset_t
-- Check size of sigset_t - done
-- Check size of mode_t
-- Check size of mode_t - done
-- Check size of sighandler_t
-- Check size of sighandler_t - done
-- Check size of in_addr_t
-- Check size of in_addr_t - done
-- Check size of char *
-- Check size of char * - done
-- Check size of long
-- Check size of long - done
-- Check size of size_t
-- Check size of size_t - done
-- Check size of short
-- Check size of short - done
-- Check size of int
-- Check size of int - done
-- Check size of long long
-- Check size of long long - done
-- Check size of off_t
-- Check size of off_t - done
..
-- Check size of uint
-- Check size of uint - done
-- Check size of ulong
-- Check size of ulong - done

Some of these are used in the code (git grep SIZEOF_). Without accurate numbers that represent the powerpc runtime these are likely to generate unpredictable results.

Most of the include header, symbol checks are going to be ok regarding the cross compile however get the sizes right, the stack direction is the same.

So its really only the sizes you need to get right. I assume something else in the debian build for powerpc has a cmake toolchain file that has these values.

Comment by Tuukka Pasanen [ 2022-02-22 ]

danblack and marko Is this bug relevant anymore? Do I understand this is mainly cross compiling problem which should be addressed on Debian build system?

Comment by Daniel Black [ 2022-02-22 ]

https://github.com/MariaDB/server/pull/1805 (merged) has the hooks to do a cross compulation properly. But yes Debian build system problem that could be worked around, but really are there powerpc (the really old 32bit) users that care?

Comment by Daniel Black [ 2022-02-22 ]

Actually, marko is correct also. Debian's build system is faking it wrong, but this is our error. Writing patch now.

Comment by Otto Kekäläinen [ 2022-02-26 ]

This is still present on 10.6.7 when uploaded to Debian, see log at https://buildd.debian.org/status/fetch.php?pkg=mariadb-10.6&arch=powerpc&ver=1%3A10.6.7-2%7Eexp1&stamp=1645782869&raw=0

Comment by Daniel Black [ 2022-03-01 ]

code in pull request 2028

Comment by Otto Kekäläinen [ 2022-03-02 ]

I backported this to 10.6 in Debian and it worked so far on both Launchpad and official Debian builders. Thanks!
https://salsa.debian.org/mariadb-team/mariadb-server/-/commit/d8b6f1399697c701ae769c8055c57d762f44e50a

Comment by Otto Kekäläinen [ 2022-03-13 ]

Weirdly arch powerpc only fails now with:

```
main.grant_kill w10 [ fail ]
Test ended at 2022-03-11 17:19:58

CURRENT_TEST: main.grant_kill
— /<<PKGBUILDDIR>>/mysql-test/main/grant_kill.result 2022-02-10 20:07:03.000000000 +0000
+++ /<<PKGBUILDDIR>>/mysql-test/main/grant_kill.reject 2022-03-11 17:19:58.158444112 +0000
@@ -20,7 +20,7 @@
foo
root
KILL ID;
-ERROR HY000: You are not owner of thread ID
+ERROR HY000: You are not owner of thread 0
disconnect foo;
disconnect bar;
connection default;

mysqltest: Result length mismatch
```

Full log: https://buildd.debian.org/status/fetch.php?pkg=mariadb-10.6&arch=powerpc&ver=1%3A10.6.7-3&stamp=1647019335&raw=0
Build overview of 1:10.6.7-3: https://buildd.debian.org/status/package.php?p=mariadb-10.6

Comment by Michael Widenius [ 2022-03-15 ]

Ok to push

Generated at Thu Feb 08 09:26:01 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.