Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-6756

map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST

Details

    • Task
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.1.8
    • OTHER
    • None
    • 10.1.6-1, 10.1.6-2, 10.1.8-1, 10.1.8-2

    Description

      We have have a multi-threaded program with multiple connections to the database. While this program is running, we often get a single child process (pid) that just hangs. This particular child process consumes 100% of any given CPU. We are trying to troubleshoot this and trying to determine which child pid maps to the SHOW PROCESSLIST Id.

      With that in mind, we would like to know if there is a way to map a linux child pid that's associated with the linux parent pid (ppid) of the mysqld process to the Id shown in the output of SHOW PROCESSLIST.

      Attachments

        Issue Links

          Activity

            hhiggs Harry Higgs added a comment - - edited

            Following is a suggestion, provided by one of our senior developers, which could be used to accomplish this feature request.

            There is a code suggestion, made by Chrisian Hammers, 2008, back for MySQL 5.0.x,
            documented in MySQL feature request:

            http://bugs.mysql.com/bug.php?id=33799.

            However, I have a correction for the 'FIXME' line:

            - thd_info->posix_thread_id = *((unsigned int *)(tmp->real_id + 72)); // FIXME this is from glibc-2.6.1/nptl/descr.h
            + thd_info->posix_thread_id = gettid(); // #include <sys/types.h>

            The PID &/or Thread IDs are easily available, but only to the parent process and child thread or process.

            A) From the parent process perspective, the child PID or thread_id is only made available during thread creation:

            The return code from fork(), is the PID of the child process. (Doc here http://pubs.opengroup.org/onlinepubs/007908799/xsh/fork.html)
            The first parameter from the POSIX function pthread_create(). (Doc here http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_create.html)
            The sixth parameter from the Solaris function thr_create(). (Doc here http://docs.oracle.com/cd/E19253-01/816-5137/sthreads-69878/index.html)

            B) From within the child Thread/Lwp (via pthread_create()), the thread_id is made available through:

            The return code from POSIX function, gettid().
            EX:

                #include <sys/types.h>
                pid_t gettid(void);

            The return code from syscall(SYS_gettid), where SYS_gettid is an enum.
            EX:

                #include <sys/types.h>
                #include <sys/syscall.h>
                #define gettid() syscall(SYS_gettid);

            The return code from Solaris function, thr_self().
            EX:

                #include <thread.h>
                thread_t thr_self();

            C) From within the child process (via fork()), the PID is made available, at anytime through:

            The return code from getpid().
            EX:

                #include <unistd.h>
                pid_t getpid();

            hhiggs Harry Higgs added a comment - - edited Following is a suggestion, provided by one of our senior developers, which could be used to accomplish this feature request. There is a code suggestion, made by Chrisian Hammers, 2008, back for MySQL 5.0.x, documented in MySQL feature request: http://bugs.mysql.com/bug.php?id=33799 . However, I have a correction for the 'FIXME' line: - thd_info->posix_thread_id = *((unsigned int *)(tmp->real_id + 72)); // FIXME this is from glibc-2.6.1/nptl/descr.h + thd_info->posix_thread_id = gettid(); // #include <sys/types.h> The PID &/or Thread IDs are easily available, but only to the parent process and child thread or process. A) From the parent process perspective, the child PID or thread_id is only made available during thread creation: The return code from fork(), is the PID of the child process. (Doc here http://pubs.opengroup.org/onlinepubs/007908799/xsh/fork.html ) The first parameter from the POSIX function pthread_create(). (Doc here http://pubs.opengroup.org/onlinepubs/007908799/xsh/pthread_create.html ) The sixth parameter from the Solaris function thr_create(). (Doc here http://docs.oracle.com/cd/E19253-01/816-5137/sthreads-69878/index.html ) B) From within the child Thread/Lwp (via pthread_create()), the thread_id is made available through: The return code from POSIX function, gettid(). EX: #include <sys/types.h> pid_t gettid(void); The return code from syscall(SYS_gettid), where SYS_gettid is an enum. EX: #include <sys/types.h> #include <sys/syscall.h> #define gettid() syscall(SYS_gettid); The return code from Solaris function, thr_self(). EX: #include <thread.h> thread_t thr_self(); C) From within the child process (via fork()), the PID is made available, at anytime through: The return code from getpid(). EX: #include <unistd.h> pid_t getpid();
            monty Michael Widenius added a comment - - edited

            Hi!

            > With that in mind, we would like to know if there is a way to map a
            > linux child pid that's associated with the linux parent pid (ppid) of
            > the mysqld process to the Id shown in the output of SHOW PROCESSLIST".

            To be able to help you, it would be useful to know how you plan to use
            the linux child pid.

            Is it to understand which thread, as given from 'ps -eLf', maps
            to which connection id?

            First, there is no stable association with a connection, which has a
            THD associated with it, and a thread. If you are running with at
            thread pool, then any thread may handle questions from any connection.

            For the moment we store, and keep up the date, in each THD->real_id the
            value of pthread_self(). This is the same value that is returned
            from pthread_create(). This is however not the value returned from gettid().
            This code is in sql_class.cc, see THD::store_globals().

            It would be trivial to add into THD , the value of gettid() and show it in
            information_schema.PROCESSLIST.

            Would that help you in any way solving your problems.

            I don't know if this is something that is useful for everyone,
            especially as this would only work on Linux.

            Regards,
            Monty

            monty Michael Widenius added a comment - - edited Hi! > With that in mind, we would like to know if there is a way to map a > linux child pid that's associated with the linux parent pid (ppid) of > the mysqld process to the Id shown in the output of SHOW PROCESSLIST". To be able to help you, it would be useful to know how you plan to use the linux child pid. Is it to understand which thread, as given from 'ps -eLf', maps to which connection id? First, there is no stable association with a connection, which has a THD associated with it, and a thread. If you are running with at thread pool, then any thread may handle questions from any connection. For the moment we store, and keep up the date, in each THD->real_id the value of pthread_self(). This is the same value that is returned from pthread_create(). This is however not the value returned from gettid(). This code is in sql_class.cc, see THD::store_globals(). It would be trivial to add into THD , the value of gettid() and show it in information_schema.PROCESSLIST. Would that help you in any way solving your problems. I don't know if this is something that is useful for everyone, especially as this would only work on Linux. Regards, Monty
            hhiggs Harry Higgs added a comment -

            Hi Monty, I apologize for not getting back to you sooner as I was on vacation last week. Anyway, below may be a better explanation of what we are trying to accomplish and may provide you with answers to your questions. This was provided from our senior developer who also provided the information above:

            Senior Developer:

            No, we are not using ‘thread pool’. Our systems tend to be ‘batch’ processing orientated and establish many SQL queries that are individually treated as a ‘open cursor’ where the results are iterated across by the client application.

            Yes, Please!
            Having the ‘gettid()’ value present in the information_schema.PROCESSLIST would greatly help achieve the association between a Unix LWP child of the mysqld (visible through the tools used by Sys Admins & DBAs) and the offending SQL query. Once the offending SQL query is identified, the originating process can be tracked down.

            Regarding the display of gettid() vs pthread_self(), in the info_schema.PROCESSLIST, this seems to be platform specific where the value displayed should correlate to the output of tools used by Sys Admins (unix: ps -ef, htop, top, pgrep; Windows: taskmgr, plist). However, IF you know of a means to translate the pthread_self() value into those reported by unix Sys Admin tools (ps -efL, htop, top), then I think everyone would desire the POSIX standard value.

            To facilitate our DBAs, our developers are required to embedded comments into the SQL queries that identify the originating process. Ex: “-- Program:$0 Host:$HOSTNAME ProgPid:$$ Params:$* DESC:Customer Cursor --\n SELECT x,y,z FROM table WHERE filter = ?” – Where $$ is the unix pid of the process/script.

            Ultimately, the problem we are experiencing is a random MySQL LWP thread that sustains 100% busy for many minutes, sometimes hours. This is very visible from the unix tools ‘top’, ‘htop’ and ‘ps -efL’. However, we have not been able to identify which SQL query to which the mysqld LWP thread is servicing. During this situation, the info_schema.PROCESSLIST does not report any state or activity changes, thus we are unable to identify the offending SQL query. Once the offending SQL query is identified, the affected processes can be identified and debugged.

            Thanks, Harry

            hhiggs Harry Higgs added a comment - Hi Monty, I apologize for not getting back to you sooner as I was on vacation last week. Anyway, below may be a better explanation of what we are trying to accomplish and may provide you with answers to your questions. This was provided from our senior developer who also provided the information above: Senior Developer: No, we are not using ‘thread pool’. Our systems tend to be ‘batch’ processing orientated and establish many SQL queries that are individually treated as a ‘open cursor’ where the results are iterated across by the client application. Yes, Please! Having the ‘gettid()’ value present in the information_schema.PROCESSLIST would greatly help achieve the association between a Unix LWP child of the mysqld (visible through the tools used by Sys Admins & DBAs) and the offending SQL query. Once the offending SQL query is identified, the originating process can be tracked down. Regarding the display of gettid() vs pthread_self(), in the info_schema.PROCESSLIST, this seems to be platform specific where the value displayed should correlate to the output of tools used by Sys Admins (unix: ps -ef, htop, top, pgrep; Windows: taskmgr, plist). However, IF you know of a means to translate the pthread_self() value into those reported by unix Sys Admin tools (ps -efL, htop, top), then I think everyone would desire the POSIX standard value. To facilitate our DBAs, our developers are required to embedded comments into the SQL queries that identify the originating process. Ex: “-- Program:$0 Host:$HOSTNAME ProgPid:$$ Params:$* DESC:Customer Cursor --\n SELECT x,y,z FROM table WHERE filter = ?” – Where $$ is the unix pid of the process/script. Ultimately, the problem we are experiencing is a random MySQL LWP thread that sustains 100% busy for many minutes, sometimes hours. This is very visible from the unix tools ‘top’, ‘htop’ and ‘ps -efL’. However, we have not been able to identify which SQL query to which the mysqld LWP thread is servicing. During this situation, the info_schema.PROCESSLIST does not report any state or activity changes, thus we are unable to identify the offending SQL query. Once the offending SQL query is identified, the affected processes can be identified and debugged. Thanks, Harry

            Do you need it in 5.5 or 10.0?

            serg Sergei Golubchik added a comment - Do you need it in 5.5 or 10.0?
            hhiggs Harry Higgs added a comment -

            We would like it in both versions. Currently we are on version 5.5.37 but at some point in the future we will upgrade to 10.0. I just don't have a hard date as to when we will go to 10.0.

            Thanks, Harry

            hhiggs Harry Higgs added a comment - We would like it in both versions. Currently we are on version 5.5.37 but at some point in the future we will upgrade to 10.0. I just don't have a hard date as to when we will go to 10.0. Thanks, Harry

            Would it be acceptable for you to get a patch for 5.5.37 and have it in 10.0 ?
            (This would mean that you have to compile MariaDB yourselves for 5.5).

            The reason is that we don't want to do too many user visible changes to stable versions of MariaDB.

            monty Michael Widenius added a comment - Would it be acceptable for you to get a patch for 5.5.37 and have it in 10.0 ? (This would mean that you have to compile MariaDB yourselves for 5.5). The reason is that we don't want to do too many user visible changes to stable versions of MariaDB.

            Hi Monty,

            Harry and I talked about this a little earlier. I definitely agree with your point about visible changes on the 5.5.x versions. I expect that we will begin studying a migration to the 10.x sometime in early 2015.

            As it turns out we are currently upgrading to Mariadb 5.5.40, so we'll only be on 5.5.37 for a couple more weeks in production. We are required to stay current to comply with our security practices.

            We compile MariaDB for 5.5 now, so that is not a problem. However, whether you want to issue an additional patch for 5.5.40 is up to you.

            I just want you to know that I'll always need to stay current on the 5.5.x version and I don't know how "portable" the patch would be for future 5.5 versions. I doubt you would want to keep re-issuing patches for the same feature. We already have your patch for our SkySQL issue #8015 ( the slow shutdown issue we discussed a few months ago) , and it's working fine on 5.5.40.

            Thank You,

            John Dzilvelis

            johndz John Dzilvelis added a comment - Hi Monty, Harry and I talked about this a little earlier. I definitely agree with your point about visible changes on the 5.5.x versions. I expect that we will begin studying a migration to the 10.x sometime in early 2015. As it turns out we are currently upgrading to Mariadb 5.5.40, so we'll only be on 5.5.37 for a couple more weeks in production. We are required to stay current to comply with our security practices. We compile MariaDB for 5.5 now, so that is not a problem. However, whether you want to issue an additional patch for 5.5.40 is up to you. I just want you to know that I'll always need to stay current on the 5.5.x version and I don't know how "portable" the patch would be for future 5.5 versions. I doubt you would want to keep re-issuing patches for the same feature. We already have your patch for our SkySQL issue #8015 ( the slow shutdown issue we discussed a few months ago) , and it's working fine on 5.5.40. Thank You, John Dzilvelis

            This relates to some work I did recently.
            I wrote some UDF's:

            Note that tid and pthread id are related, but not equal. The pthread id is already shown in SHOW ENGINE INNODB STATUS.
            Note that changed to the thread (name, cgroup) will stay effective after disconnect if the thread_cache is used.

            Note that threads are already named on Windows.

            Related MySQL bugs:
            https://bugs.mysql.com/bug.php?id=70858

            dveeden Daniël van Eeden added a comment - This relates to some work I did recently. I wrote some UDF's: https://github.com/dveeden/udf_gettid Gets the Linux tid (task id, child of pid) https://github.com/dveeden/udf_pthread_name Set the pthread name (usable in top and "ps -eLo pid,tid,comm | grep $(pgrep -x mysqld)") https://github.com/dveeden/udf_cgroup move thread to cgroup Note that tid and pthread id are related, but not equal. The pthread id is already shown in SHOW ENGINE INNODB STATUS. Note that changed to the thread (name, cgroup) will stay effective after disconnect if the thread_cache is used. Note that threads are already named on Windows. Related MySQL bugs: https://bugs.mysql.com/bug.php?id=70858
            jb-boin Jean Weisbuch added a comment -

            These UDFs seems interesting, especially the one to set the cgroup (if it could be used to limit resources per user/db, it would be really useful on a shared hosting environment for example).

            It would be ideal to be able to add informations such as "CPU Time" or "Memory usage" of the process corresponding to the thread directly on I_S.PROCESSLIST which would be better than having to execute an external command to retrieve informations from "/proc" or use "ps".

            jb-boin Jean Weisbuch added a comment - These UDFs seems interesting, especially the one to set the cgroup (if it could be used to limit resources per user/db, it would be really useful on a shared hosting environment for example). It would be ideal to be able to add informations such as "CPU Time" or "Memory usage" of the process corresponding to the thread directly on I_S.PROCESSLIST which would be better than having to execute an external command to retrieve informations from "/proc" or use "ps".

            I found no way to see ID returned by gettid() (actually syscall()) in output of ps command (its TID for MariaDB threads is equal to PID) but the number clearly belong to the same numeration no doubts (found in experimenting)

            getpid() works and allow to find the victim in the ps output.

            sanja Oleksandr Byelkin added a comment - I found no way to see ID returned by gettid() (actually syscall()) in output of ps command (its TID for MariaDB threads is equal to PID) but the number clearly belong to the same numeration no doubts (found in experimenting) getpid() works and allow to find the victim in the ps output.

            revision-id: 6811945eaf6a6e979d92739399b084bf4e00ae2d (mariadb-10.1.6-12-g6811945e)
            parent(s): 86a3613d4e981c341e38291c9eeec5dc9f836fae
            committer: Oleksandr Byelkin
            timestamp: 2015-08-12 23:09:48 +0200
            message:

            MDEV-6756: map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST

            Added tid & pid for Linux.

            sanja Oleksandr Byelkin added a comment - revision-id: 6811945eaf6a6e979d92739399b084bf4e00ae2d (mariadb-10.1.6-12-g6811945e) parent(s): 86a3613d4e981c341e38291c9eeec5dc9f836fae committer: Oleksandr Byelkin timestamp: 2015-08-12 23:09:48 +0200 message: MDEV-6756 : map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST Added tid & pid for Linux. —

            revision-id: 7f7cbb132fa0a40687e4b25a9c9e163c93ec6d15 (mariadb-10.1.6-12-g7f7cbb1)
            parent(s): 86a3613d4e981c341e38291c9eeec5dc9f836fae
            committer: Oleksandr Byelkin
            timestamp: 2015-09-14 12:47:57 +0200
            message:

            MDEV-6756: map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST

            Added tid (thread ID) for system where it is present.

            ps -eL -o tid,pid,command

            shows the thread on Linux

            sanja Oleksandr Byelkin added a comment - revision-id: 7f7cbb132fa0a40687e4b25a9c9e163c93ec6d15 (mariadb-10.1.6-12-g7f7cbb1) parent(s): 86a3613d4e981c341e38291c9eeec5dc9f836fae committer: Oleksandr Byelkin timestamp: 2015-09-14 12:47:57 +0200 message: MDEV-6756 : map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST Added tid (thread ID) for system where it is present. ps -eL -o tid,pid,command shows the thread on Linux —

            revision-id: a1f31cfa8d0f04cb166813786f7854db3dbcf715 (mariadb-10.1.6-12-ga1f31cf)
            parent(s): 86a3613d4e981c341e38291c9eeec5dc9f836fae
            committer: Oleksandr Byelkin
            timestamp: 2015-09-15 21:50:28 +0200
            message:

            MDEV-6756: map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST

            Added tid (thread ID) for system where it is present.

            ps -eL -o tid,pid,command

            shows the thread on Linux

            sanja Oleksandr Byelkin added a comment - revision-id: a1f31cfa8d0f04cb166813786f7854db3dbcf715 (mariadb-10.1.6-12-ga1f31cf) parent(s): 86a3613d4e981c341e38291c9eeec5dc9f836fae committer: Oleksandr Byelkin timestamp: 2015-09-15 21:50:28 +0200 message: MDEV-6756 : map a linux pid (child pid) to a connection id shown in the output of SHOW PROCESSLIST Added tid (thread ID) for system where it is present. ps -eL -o tid,pid,command shows the thread on Linux —

            People

              sanja Oleksandr Byelkin
              hhiggs Harry Higgs
              Votes:
              4 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.