Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34918

Sort aborted, warning causing application issues.

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Incomplete
    • 10.5.9
    • N/A
    • Memory management

    Description

      There are "Sort Abort" Warning messages Noticed in MariaDB Error log file. Increased the variables "sort_buffer_size=20Mb" and "innodb_sort_buffer_size=16Mb". After a period the alert reappears.

      2024-09-12  9:12:33 2771708 *[Warning] Sort aborted*, host: compute.internal, user: unicauser, thread: 2771708, query: SELECT OLTMMEID,OLTBATCHID,OLTSEQUENCENO,OLTRECIPIENTTYPE,OLTBOUNDARY,CellCode,CellID,OLTSERVICE_ID,CUSTOMER_NAME,TreatmentCode,TRACKINGCODE,EMAIL_ADDR FROM OLT1537 ORDER BY OLTSEQUENCENO
      2024-09-12 10:45:01 2782289 [Warning] Sort aborted, host:compute.internal, user: unicauser, thread: 2782289, query: SELECT * FROM UAC_217310_104 ORDER BY InternalRowNumber ASC, TreatmentInstID ASC, DirectResponse DESC
      2024-09-12 13:47:07 2802875 [Warning] Sort aborted, host:compute.internal, user: unicauser, thread: 2802875, query: SELECT OLTMMEID,OLTBATCHID,OLTSEQUENCENO,OLTRECIPIENTTYPE,OLTBOUNDARY,CellCode,CellID,OLTSERVICE_ID,TreatmentCode,TRACKINGCODE,EMAIL_ADDR,CUSTOMER_NAME FROM OLT1537 ORDER BY OLTSEQUENCENO
      2024-09-12 13:56:48 2803392 [Warning] Sort aborted, host: compute.internal, user: 
      

      Attachments

        Activity

          MurthyRamachandrula R S N Murthy added a comment - - edited

          we are getting error messages in application logs

          09/12/2024 23:02:18.920 (5767, T = 2842850880)  [I] [TABLE ACC] [IP16_CH]       CAMP_SYS_TABLES (thread 0x7f65a9154640): Data retrieval completed; 28885 records retrieved and returned to caller. [sdbtacc:408]
          09/12/2024 23:02:18.929 (5767, T = 2842850880)  [I] [CELL ACC]  [IP16_CH]       Cell Access: Not eligible for in-db processing:  Accessing data from multiple tables.
          09/12/2024 23:02:18.939 (5767, T = 2842850880)  [I] [CELL ACC]  [IP16_CH]       CellAccess: Requesting data from table "Extract".
          09/12/2024 23:02:18.953 (5767, T = 2836743744)  [I] [DB QUERY]  [IP16_CH]       CAMP_SYS_TABLES (thread 0x7f65a9154640): select SERVICE_NUM, ID_DOCUMENT_NUM, DERIVED_EMAIL_ADDR, SUBSCRIPTION_SQN, CUSTOMER_NAME, MASKED_MOBILE, BASE, CUST_ANONY_ID from UAC_EX_307537_16 order by SERVICE_NUM  [sdbtacc:3065]
          09/12/2024 23:02:18.972 (5767, T = 2836743744)  [E] [TABLE ACC] Database error (5): SQL call failed.
          

          cessSpecificFunction":"","result":"","row":652,"runState":0,"runStatus":0,"selected":false,"type":6}]}
          09/12/2024 23:02:20.543 (5767, T = 2842850880)  [D] [PROCESS]   [IP16_CH]       ReCopy Single JSON Response nID = 1, processID = 0, func_tag = 27, jsonResp = {"ProgressInfo":{"lastSuccessRunDate":"","lastSuccessRunMode":false,"percentDone":0.0,"runEndTime":0,"runMode":0,"runStartTime":0,"runState":2,"runStatus":0,"runType":0}}
          09/12/2024 23:02:20.554 (5767, T = 2842850880)  [E] [PROCESS]   [IP16_CH]       MailList PROCESS_RUN_ABORT: CODE 11528: Database Error
          Database error (5): SQL call failed.
          DB vendor code: 1028

          MurthyRamachandrula R S N Murthy added a comment - - edited we are getting error messages in application logs 09/12/2024 23:02:18.920 (5767, T = 2842850880) [I] [TABLE ACC] [IP16_CH] CAMP_SYS_TABLES (thread 0x7f65a9154640): Data retrieval completed; 28885 records retrieved and returned to caller. [sdbtacc:408] 09/12/2024 23:02:18.929 (5767, T = 2842850880) [I] [CELL ACC] [IP16_CH] Cell Access: Not eligible for in-db processing: Accessing data from multiple tables. 09/12/2024 23:02:18.939 (5767, T = 2842850880) [I] [CELL ACC] [IP16_CH] CellAccess: Requesting data from table "Extract". 09/12/2024 23:02:18.953 (5767, T = 2836743744) [I] [DB QUERY] [IP16_CH] CAMP_SYS_TABLES (thread 0x7f65a9154640): select SERVICE_NUM, ID_DOCUMENT_NUM, DERIVED_EMAIL_ADDR, SUBSCRIPTION_SQN, CUSTOMER_NAME, MASKED_MOBILE, BASE, CUST_ANONY_ID from UAC_EX_307537_16 order by SERVICE_NUM [sdbtacc:3065] 09/12/2024 23:02:18.972 (5767, T = 2836743744) [E] [TABLE ACC] Database error (5): SQL call failed. cessSpecificFunction":"","result":"","row":652,"runState":0,"runStatus":0,"selected":false,"type":6}]} 09/12/2024 23:02:20.543 (5767, T = 2842850880) [D] [PROCESS] [IP16_CH] ReCopy Single JSON Response nID = 1, processID = 0, func_tag = 27, jsonResp = {"ProgressInfo":{"lastSuccessRunDate":"","lastSuccessRunMode":false,"percentDone":0.0,"runEndTime":0,"runMode":0,"runStartTime":0,"runState":2,"runStatus":0,"runType":0}} 09/12/2024 23:02:20.554 (5767, T = 2842850880) [E] [PROCESS] [IP16_CH] MailList PROCESS_RUN_ABORT: CODE 11528: Database Error Database error (5): SQL call failed. DB vendor code: 1028
          MurthyRamachandrula R S N Murthy added a comment - - edited

          Plesae check , RAM is 39Gb but total MariaDB consuming 121G, is there a way to better configure the variables to come out of "Sort Abort" issue.

          [~]$ free -h
                        total        used        free      shared  buff/cache   available
          Mem:           39Gi        23Gi       2.1Gi       0.0Ki        14Gi        15Gi
          Swap:            0B          0B          0B
          [~]$
          

          MariaDB [(none)]> SELECT ROUND(
              ->     ( @@GLOBAL.key_buffer_size
              ->      + @@GLOBAL.query_cache_size
              ->      + @@GLOBAL.tmp_table_size
              ->      + @@GLOBAL.innodb_buffer_pool_size
              ->      + @@GLOBAL.innodb_log_buffer_size
              ->      + @@GLOBAL.max_connections * (
              ->          @@GLOBAL.sort_buffer_size
              ->        + @@GLOBAL.read_buffer_size
              ->        + @@GLOBAL.read_rnd_buffer_size
              ->        + @@GLOBAL.join_buffer_size
              ->        + @@GLOBAL.thread_stack
              ->        + @@GLOBAL.binlog_cache_size)
              ->     ) / 1024 / 1024, 1) `total MB`;
          +----------+
          | total MB |
          +----------+
          | 121635.0 |
          +----------+
          1 row in set (0.000 sec)
           
           
          MariaDB [(none)]> select @@GLOBAL.key_buffer_size, @@GLOBAL.query_cache_size, @@GLOBAL.tmp_table_size, @@GLOBAL.innodb_buffer_pool_size, @@GLOBAL.innodb_log_buffer_size, @@GLOBAL.max_connections,@@GLOBAL.sort_buffer_size, @@GLOBAL.read_buffer_size, @@GLOBAL.read_rnd_buffer_size, @@GLOBAL.join_buffer_size, @@GLOBAL.thread_stack, @@GLOBAL.binlog_cache_size \G
          *************************** 1. row ***************************
                  @@GLOBAL.key_buffer_size: 134217728
                 @@GLOBAL.query_cache_size: 33554432
                   @@GLOBAL.tmp_table_size: 134217728
          @@GLOBAL.innodb_buffer_pool_size: 17179869184
           @@GLOBAL.innodb_log_buffer_size: 268435456
                  @@GLOBAL.max_connections: 5000
                 @@GLOBAL.sort_buffer_size: 20971520
                 @@GLOBAL.read_buffer_size: 131072
             @@GLOBAL.read_rnd_buffer_size: 262144
                 @@GLOBAL.join_buffer_size: 262144
                     @@GLOBAL.thread_stack: 299008
                @@GLOBAL.binlog_cache_size: 32768
          1 row in set (0.000 sec)
           
          MariaDB [(none)]>
          

          MurthyRamachandrula R S N Murthy added a comment - - edited Plesae check , RAM is 39Gb but total MariaDB consuming 121G, is there a way to better configure the variables to come out of "Sort Abort" issue. [~]$ free -h total used free shared buff/cache available Mem: 39Gi 23Gi 2.1Gi 0.0Ki 14Gi 15Gi Swap: 0B 0B 0B [~]$ MariaDB [(none)]> SELECT ROUND( -> ( @@GLOBAL.key_buffer_size -> + @@GLOBAL.query_cache_size -> + @@GLOBAL.tmp_table_size -> + @@GLOBAL.innodb_buffer_pool_size -> + @@GLOBAL.innodb_log_buffer_size -> + @@GLOBAL.max_connections * ( -> @@GLOBAL.sort_buffer_size -> + @@GLOBAL.read_buffer_size -> + @@GLOBAL.read_rnd_buffer_size -> + @@GLOBAL.join_buffer_size -> + @@GLOBAL.thread_stack -> + @@GLOBAL.binlog_cache_size) -> ) / 1024 / 1024, 1) `total MB`; +----------+ | total MB | +----------+ | 121635.0 | +----------+ 1 row in set (0.000 sec)     MariaDB [(none)]> select @@GLOBAL.key_buffer_size, @@GLOBAL.query_cache_size, @@GLOBAL.tmp_table_size, @@GLOBAL.innodb_buffer_pool_size, @@GLOBAL.innodb_log_buffer_size, @@GLOBAL.max_connections,@@GLOBAL.sort_buffer_size, @@GLOBAL.read_buffer_size, @@GLOBAL.read_rnd_buffer_size, @@GLOBAL.join_buffer_size, @@GLOBAL.thread_stack, @@GLOBAL.binlog_cache_size \G *************************** 1. row *************************** @@GLOBAL.key_buffer_size: 134217728 @@GLOBAL.query_cache_size: 33554432 @@GLOBAL.tmp_table_size: 134217728 @@GLOBAL.innodb_buffer_pool_size: 17179869184 @@GLOBAL.innodb_log_buffer_size: 268435456 @@GLOBAL.max_connections: 5000 @@GLOBAL.sort_buffer_size: 20971520 @@GLOBAL.read_buffer_size: 131072 @@GLOBAL.read_rnd_buffer_size: 262144 @@GLOBAL.join_buffer_size: 262144 @@GLOBAL.thread_stack: 299008 @@GLOBAL.binlog_cache_size: 32768 1 row in set (0.000 sec)   MariaDB [(none)]>
          MurthyRamachandrula R S N Murthy added a comment - - edited

          There was no issues noticed on memory or CPU. Please find the SAR report

          CPU utilization:
          ============
           
          Linux 4.18.0-553.16.1.el8_10.x86_64 (puawsapse1stardbs1001.star.aws.hclsw.internal)     09/18/2024      _x86_64_        (16 CPU)
           
          12:32:06 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
          12:32:07 PM     all     18.56      0.00      0.50      0.06      0.00     80.88
          12:32:08 PM     all     18.66      0.00      0.19      0.00      0.00     81.15
          12:32:09 PM     all     18.68      0.00      0.37      0.00      0.00     80.95
          Average:        all     18.63      0.00      0.35      0.02      0.00     80.99
           
           
          Memory Usage:
          ===========
           
          12:35:52 PM kbmemfree   kbavail kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact   kbdirty
          12:35:53 PM   1490836  15803032  40450312     96.45    287380  14104784  37231328     88.77   8509792  30515732         0
          12:35:54 PM   1490864  15803064  40450284     96.45    287380  14104788  37231328     88.77   8509792  30515736         4
          12:35:55 PM   1490808  15803008  40450340     96.45    287380  14104788  37231328     88.77   8509792  30515736         4
          Average:      1490836  15803035  40450312     96.45    287380  14104787  37231328     88.77   8509792  30515735         3
           
           
          SWAP USAGE :
          ==========
           
          12:36:43 PM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
          12:36:44 PM         0         0      0.00         0      0.00
          12:36:45 PM         0         0      0.00         0      0.00
          12:36:46 PM         0         0      0.00         0      0.00
          Average:            0         0      0.00         0      0.00
           
          IO Utilization:
          ===========
          12:37:38 PM       tps      rtps      wtps   bread/s   bwrtn/s
          12:37:39 PM      1.00      0.00      1.00      0.00      8.00
          12:37:40 PM     11.00      1.00     10.00     64.00    112.00
          12:37:41 PM      5.00      0.00      5.00      0.00    152.00
          Average:         5.67      0.33      5.33     21.33     90.67
          

          MurthyRamachandrula R S N Murthy added a comment - - edited There was no issues noticed on memory or CPU. Please find the SAR report CPU utilization: ============   Linux 4.18.0-553.16.1.el8_10.x86_64 (puawsapse1stardbs1001.star.aws.hclsw.internal) 09/18/2024 _x86_64_ (16 CPU)   12:32:06 PM CPU %user %nice %system %iowait %steal %idle 12:32:07 PM all 18.56 0.00 0.50 0.06 0.00 80.88 12:32:08 PM all 18.66 0.00 0.19 0.00 0.00 81.15 12:32:09 PM all 18.68 0.00 0.37 0.00 0.00 80.95 Average: all 18.63 0.00 0.35 0.02 0.00 80.99     Memory Usage: ===========   12:35:52 PM kbmemfree kbavail kbmemused %memused kbbuffers kbcached kbcommit %commit kbactive kbinact kbdirty 12:35:53 PM 1490836 15803032 40450312 96.45 287380 14104784 37231328 88.77 8509792 30515732 0 12:35:54 PM 1490864 15803064 40450284 96.45 287380 14104788 37231328 88.77 8509792 30515736 4 12:35:55 PM 1490808 15803008 40450340 96.45 287380 14104788 37231328 88.77 8509792 30515736 4 Average: 1490836 15803035 40450312 96.45 287380 14104787 37231328 88.77 8509792 30515735 3     SWAP USAGE : ==========   12:36:43 PM kbswpfree kbswpused %swpused kbswpcad %swpcad 12:36:44 PM 0 0 0.00 0 0.00 12:36:45 PM 0 0 0.00 0 0.00 12:36:46 PM 0 0 0.00 0 0.00 Average: 0 0 0.00 0 0.00   IO Utilization: =========== 12:37:38 PM tps rtps wtps bread/s bwrtn/s 12:37:39 PM 1.00 0.00 1.00 0.00 8.00 12:37:40 PM 11.00 1.00 10.00 64.00 112.00 12:37:41 PM 5.00 0.00 5.00 0.00 152.00 Average: 5.67 0.33 5.33 21.33 90.67

          Right, so it seems your application does not log the error text. It could've been, for example

          Sort aborted: Server shutdown in progress
          

          (this was an example, you'll likely see something else)

          If you application does not log the error log text, you can use the sql_error_log plugin to log all application errors in a file.

          serg Sergei Golubchik added a comment - Right, so it seems your application does not log the error text. It could've been, for example Sort aborted: Server shutdown in progress (this was an example , you'll likely see something else) If you application does not log the error log text, you can use the sql_error_log plugin to log all application errors in a file.
          MurthyRamachandrula R S N Murthy added a comment - - edited

          The application log is noticed with "11528: Database Error. Database error (5): SQL call failed. " along with Sort Errors

          ===========Appliction Log====================

          09/12/2024 23:02:18.920 (5767, T = 2842850880)  [I] [TABLE ACC] [IP16_CH]       CAMP_SYS_TABLES (thread 0x7f65a9154640): Data retrieval completed; 28885 records retrieved and returned to caller. [sdbtacc:408]
          09/12/2024 23:02:18.929 (5767, T = 2842850880)  [I] [CELL ACC]  [IP16_CH]       Cell Access: Not eligible for in-db processing:  Accessing data from multiple tables.
          09/12/2024 23:02:18.939 (5767, T = 2842850880)  [I] [CELL ACC]  [IP16_CH]       CellAccess: Requesting data from table "Extract".
          09/12/2024 23:02:18.953 (5767, T = 2836743744)  [I] [DB QUERY]  [IP16_CH]       CAMP_SYS_TABLES (thread 0x7f65a9154640): select SERVICE_NUM, ID_DOCUMENT_NUM, DERIVED_EMAIL_ADDR, SUBSCRIPTION_SQN, CUSTOMER_NAME, MASKED_MOBILE, BASE, CUST_ANONY_ID from UAC_EX_307537_16 order by SERVICE_NUM  [sdbtacc:3065]
          09/12/2024 23:02:18.972 (5767, T = 2836743744)  [E] [TABLE ACC] Database error (5): SQL call failed.
          

          cessSpecificFunction":"","result":"","row":652,"runState":0,"runStatus":0,"selected":false,"type":6}]}
          09/12/2024 23:02:20.543 (5767, T = 2842850880)  [D] [PROCESS]   [IP16_CH]       ReCopy Single JSON Response nID = 1, processID = 0, func_tag = 27, jsonResp = {"ProgressInfo":{"lastSuccessRunDate":"","lastSuccessRunMode":false,"percentDone":0.0,"runEndTime":0,"runMode":0,"runStartTime":0,"runState":2,"runStatus":0,"runType":0}}
          09/12/2024 23:02:20.554 (5767, T = 2842850880)  [E] [PROCESS]   [IP16_CH]       MailList PROCESS_RUN_ABORT: CODE 11528: Database Error
          Database error (5): SQL call failed.
          DB vendor code: 1028

          ===========================

          MurthyRamachandrula R S N Murthy added a comment - - edited The application log is noticed with "11528: Database Error. Database error (5): SQL call failed. " along with Sort Errors ===========Appliction Log==================== 09/12/2024 23:02:18.920 (5767, T = 2842850880) [I] [TABLE ACC] [IP16_CH] CAMP_SYS_TABLES (thread 0x7f65a9154640): Data retrieval completed; 28885 records retrieved and returned to caller. [sdbtacc:408] 09/12/2024 23:02:18.929 (5767, T = 2842850880) [I] [CELL ACC] [IP16_CH] Cell Access: Not eligible for in-db processing: Accessing data from multiple tables. 09/12/2024 23:02:18.939 (5767, T = 2842850880) [I] [CELL ACC] [IP16_CH] CellAccess: Requesting data from table "Extract". 09/12/2024 23:02:18.953 (5767, T = 2836743744) [I] [DB QUERY] [IP16_CH] CAMP_SYS_TABLES (thread 0x7f65a9154640): select SERVICE_NUM, ID_DOCUMENT_NUM, DERIVED_EMAIL_ADDR, SUBSCRIPTION_SQN, CUSTOMER_NAME, MASKED_MOBILE, BASE, CUST_ANONY_ID from UAC_EX_307537_16 order by SERVICE_NUM [sdbtacc:3065] 09/12/2024 23:02:18.972 (5767, T = 2836743744) [E] [TABLE ACC] Database error (5): SQL call failed. cessSpecificFunction":"","result":"","row":652,"runState":0,"runStatus":0,"selected":false,"type":6}]} 09/12/2024 23:02:20.543 (5767, T = 2842850880) [D] [PROCESS] [IP16_CH] ReCopy Single JSON Response nID = 1, processID = 0, func_tag = 27, jsonResp = {"ProgressInfo":{"lastSuccessRunDate":"","lastSuccessRunMode":false,"percentDone":0.0,"runEndTime":0,"runMode":0,"runStartTime":0,"runState":2,"runStatus":0,"runType":0}} 09/12/2024 23:02:20.554 (5767, T = 2842850880) [E] [PROCESS] [IP16_CH] MailList PROCESS_RUN_ABORT: CODE 11528: Database Error Database error (5): SQL call failed. DB vendor code: 1028 ===========================

          People

            Unassigned Unassigned
            MurthyRamachandrula R S N Murthy
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Git Integration

                Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.