[MDEV-6756] map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST Created: 2014-09-18  Updated: 2016-03-18  Resolved: 2015-09-17

Status: Closed
Project: MariaDB Server
Component/s: OTHER
Fix Version/s: 10.1.8

Type: Task Priority: Major
Reporter: Harry Higgs Assignee: Oleksandr Byelkin
Resolution: Fixed Votes: 4
Labels: None

Issue Links:
Relates
relates to MDEV-9760 mariadb 10.1 Could add OS Process ID ... Closed
Sprint: 10.1.6-1, 10.1.6-2, 10.1.8-1, 10.1.8-2

 Description   

We have have a multi-threaded program with multiple connections to the database. While this program is running, we often get a single child process (pid) that just hangs. This particular child process consumes 100% of any given CPU. We are trying to troubleshoot this and trying to determine which child pid maps to the SHOW PROCESSLIST Id.

With that in mind, we would like to know if there is a way to map a linux child pid that's associated with the linux parent pid (ppid) of the mysqld process to the Id shown in the output of SHOW PROCESSLIST.



 Comments   
Comment by Harry Higgs [ 2014-09-19 ]

Following is a suggestion, provided by one of our senior developers, which could be used to accomplish this feature request.

There is a code suggestion, made by Chrisian Hammers, 2008, back for MySQL 5.0.x,
documented in MySQL feature request:

http://bugs.mysql.com/bug.php?id=33799.

However, I have a correction for the 'FIXME' line:

- thd_info->posix_thread_id = *((unsigned int *)(tmp->real_id + 72)); // FIXME this is from glibc-2.6.1/nptl/descr.h
+ thd_info->posix_thread_id = gettid(); // #include <sys/types.h>

The PID &/or Thread IDs are easily available, but only to the parent process and child thread or process.

A) From the parent process perspective, the child PID or thread_id is only made available during thread creation:

The return code from fork(), is the PID of the child process. (Doc here http://pubs.opengroup.org/onlinepubs/007908799/xsh/fork.html)
The first parameter from the POSIX function pthread_create(). (Doc here http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_create.html)
The sixth parameter from the Solaris function thr_create(). (Doc here http://docs.oracle.com/cd/E19253-01/816-5137/sthreads-69878/index.html)

B) From within the child Thread/Lwp (via pthread_create()), the thread_id is made available through:

The return code from POSIX function, gettid().
EX:

    #include <sys/types.h>
    pid_t gettid(void);

The return code from syscall(SYS_gettid), where SYS_gettid is an enum.
EX:

    #include <sys/types.h>
    #include <sys/syscall.h>
    #define gettid() syscall(SYS_gettid);

The return code from Solaris function, thr_self().
EX:

    #include <thread.h>
    thread_t thr_self();

C) From within the child process (via fork()), the PID is made available, at anytime through:

The return code from getpid().
EX:

    #include <unistd.h>
    pid_t getpid();

Comment by Michael Widenius [ 2014-09-24 ]

Hi!

> With that in mind, we would like to know if there is a way to map a
> linux child pid that's associated with the linux parent pid (ppid) of
> the mysqld process to the Id shown in the output of SHOW PROCESSLIST".

To be able to help you, it would be useful to know how you plan to use
the linux child pid.

Is it to understand which thread, as given from 'ps -eLf', maps
to which connection id?

First, there is no stable association with a connection, which has a
THD associated with it, and a thread. If you are running with at
thread pool, then any thread may handle questions from any connection.

For the moment we store, and keep up the date, in each THD->real_id the
value of pthread_self(). This is the same value that is returned
from pthread_create(). This is however not the value returned from gettid().
This code is in sql_class.cc, see THD::store_globals().

It would be trivial to add into THD , the value of gettid() and show it in
information_schema.PROCESSLIST.

Would that help you in any way solving your problems.

I don't know if this is something that is useful for everyone,
especially as this would only work on Linux.

Regards,
Monty

Comment by Harry Higgs [ 2014-10-02 ]

Hi Monty, I apologize for not getting back to you sooner as I was on vacation last week. Anyway, below may be a better explanation of what we are trying to accomplish and may provide you with answers to your questions. This was provided from our senior developer who also provided the information above:

Senior Developer:

No, we are not using ‘thread pool’. Our systems tend to be ‘batch’ processing orientated and establish many SQL queries that are individually treated as a ‘open cursor’ where the results are iterated across by the client application.

Yes, Please!
Having the ‘gettid()’ value present in the information_schema.PROCESSLIST would greatly help achieve the association between a Unix LWP child of the mysqld (visible through the tools used by Sys Admins & DBAs) and the offending SQL query. Once the offending SQL query is identified, the originating process can be tracked down.

Regarding the display of gettid() vs pthread_self(), in the info_schema.PROCESSLIST, this seems to be platform specific where the value displayed should correlate to the output of tools used by Sys Admins (unix: ps -ef, htop, top, pgrep; Windows: taskmgr, plist). However, IF you know of a means to translate the pthread_self() value into those reported by unix Sys Admin tools (ps -efL, htop, top), then I think everyone would desire the POSIX standard value.

To facilitate our DBAs, our developers are required to embedded comments into the SQL queries that identify the originating process. Ex: “-- Program:$0 Host:$HOSTNAME ProgPid:$$ Params:$* DESC:Customer Cursor --\n SELECT x,y,z FROM table WHERE filter = ?” – Where $$ is the unix pid of the process/script.

Ultimately, the problem we are experiencing is a random MySQL LWP thread that sustains 100% busy for many minutes, sometimes hours. This is very visible from the unix tools ‘top’, ‘htop’ and ‘ps -efL’. However, we have not been able to identify which SQL query to which the mysqld LWP thread is servicing. During this situation, the info_schema.PROCESSLIST does not report any state or activity changes, thus we are unable to identify the offending SQL query. Once the offending SQL query is identified, the affected processes can be identified and debugged.

Thanks, Harry

Comment by Sergei Golubchik [ 2014-10-09 ]

Do you need it in 5.5 or 10.0?

Comment by Harry Higgs [ 2014-10-09 ]

We would like it in both versions. Currently we are on version 5.5.37 but at some point in the future we will upgrade to 10.0. I just don't have a hard date as to when we will go to 10.0.

Thanks, Harry

Comment by Michael Widenius [ 2014-10-30 ]

Would it be acceptable for you to get a patch for 5.5.37 and have it in 10.0 ?
(This would mean that you have to compile MariaDB yourselves for 5.5).

The reason is that we don't want to do too many user visible changes to stable versions of MariaDB.

Comment by John Dzilvelis [ 2014-10-30 ]

Hi Monty,

Harry and I talked about this a little earlier. I definitely agree with your point about visible changes on the 5.5.x versions. I expect that we will begin studying a migration to the 10.x sometime in early 2015.

As it turns out we are currently upgrading to Mariadb 5.5.40, so we'll only be on 5.5.37 for a couple more weeks in production. We are required to stay current to comply with our security practices.

We compile MariaDB for 5.5 now, so that is not a problem. However, whether you want to issue an additional patch for 5.5.40 is up to you.

I just want you to know that I'll always need to stay current on the 5.5.x version and I don't know how "portable" the patch would be for future 5.5 versions. I doubt you would want to keep re-issuing patches for the same feature. We already have your patch for our SkySQL issue #8015 ( the slow shutdown issue we discussed a few months ago) , and it's working fine on 5.5.40.

Thank You,

John Dzilvelis

Comment by Daniël van Eeden [ 2015-07-20 ]

This relates to some work I did recently.
I wrote some UDF's:

Note that tid and pthread id are related, but not equal. The pthread id is already shown in SHOW ENGINE INNODB STATUS.
Note that changed to the thread (name, cgroup) will stay effective after disconnect if the thread_cache is used.

Note that threads are already named on Windows.

Related MySQL bugs:
https://bugs.mysql.com/bug.php?id=70858

Comment by Jean Weisbuch [ 2015-07-25 ]

These UDFs seems interesting, especially the one to set the cgroup (if it could be used to limit resources per user/db, it would be really useful on a shared hosting environment for example).

It would be ideal to be able to add informations such as "CPU Time" or "Memory usage" of the process corresponding to the thread directly on I_S.PROCESSLIST which would be better than having to execute an external command to retrieve informations from "/proc" or use "ps".

Comment by Oleksandr Byelkin [ 2015-08-12 ]

I found no way to see ID returned by gettid() (actually syscall()) in output of ps command (its TID for MariaDB threads is equal to PID) but the number clearly belong to the same numeration no doubts (found in experimenting)

getpid() works and allow to find the victim in the ps output.

Comment by Oleksandr Byelkin [ 2015-08-13 ]

revision-id: 6811945eaf6a6e979d92739399b084bf4e00ae2d (mariadb-10.1.6-12-g6811945e)
parent(s): 86a3613d4e981c341e38291c9eeec5dc9f836fae
committer: Oleksandr Byelkin
timestamp: 2015-08-12 23:09:48 +0200
message:

MDEV-6756: map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST

Added tid & pid for Linux.

Comment by Oleksandr Byelkin [ 2015-09-14 ]

revision-id: 7f7cbb132fa0a40687e4b25a9c9e163c93ec6d15 (mariadb-10.1.6-12-g7f7cbb1)
parent(s): 86a3613d4e981c341e38291c9eeec5dc9f836fae
committer: Oleksandr Byelkin
timestamp: 2015-09-14 12:47:57 +0200
message:

MDEV-6756: map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST

Added tid (thread ID) for system where it is present.

ps -eL -o tid,pid,command

shows the thread on Linux

Comment by Oleksandr Byelkin [ 2015-09-15 ]

revision-id: a1f31cfa8d0f04cb166813786f7854db3dbcf715 (mariadb-10.1.6-12-ga1f31cf)
parent(s): 86a3613d4e981c341e38291c9eeec5dc9f836fae
committer: Oleksandr Byelkin
timestamp: 2015-09-15 21:50:28 +0200
message:

MDEV-6756: map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST

Added tid (thread ID) for system where it is present.

ps -eL -o tid,pid,command

shows the thread on Linux

Generated at Thu Feb 08 07:14:21 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.