[MDEV-21852] ANALYZE for statements: r_other_time_ms inside filesort Created: 2020-03-02  Updated: 2020-06-09  Resolved: 2020-06-04

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.5
Fix Version/s: N/A

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Not a Bug Votes: 0
Labels: None


 Description   

r_other_time_ms was introduced to collect the time a JOIN spends "between the table accesses": checking the WHERE clause, etc.

However, it is also printed for tables that serve as the input to a filesort:

analyze format=json  select char_field, count(distinct b) from t_char_1000000 group by char_field;
{
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 4048.4,
    "read_sorted_file": {
      "r_rows": 1e6,
      "filesort": {
        "sort_key": "t_char_1000000.char_field",
        "r_loops": 1,
        "r_total_time_ms": 1634.5,
        "r_used_priority_queue": false,
        "r_output_rows": 1000000,
        "r_sort_passes": 1,
        "r_buffer_size": "2047Kb",
        "r_sort_mode": "packed_sort_key,rowid",
        "table": {
          "table_name": "t_char_1000000",
          "access_type": "ALL",
          "r_loops": 1,
          "rows": 1000000,
          "r_rows": 1e6,
          "r_table_time_ms": 2267.4,
          "r_other_time_ms": 1781,
          "filtered": 100,
          "r_filtered": 100
        }
      }
    }
  }
}
Does it have any meaning there? Need to check this.



 Comments   
Comment by Sergei Petrunia [ 2020-06-04 ]

Looking at the numbers in the report

query_block.r_total_time_ms=4048.4
filesort.r_total_time_ms=1634.5
table.r_table_time_ms=2267.4
table.r_other_time_ms=1781

Note that the filesort's time is less than the table's time. This is surprising, because the time to do the filesort includes the time to read the table.
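Assuming the filesort's r_total_time_ms is supposed to cover the table scan it drives, the reported numbers fail a basic sanity check. A minimal Python sketch over the values copied from the report above (the variable names are mine, not MariaDB's):

```python
# Timings copied from the ANALYZE report in the description (milliseconds).
filesort_total_ms = 1634.5   # filesort.r_total_time_ms
table_time_ms = 2267.4       # table.r_table_time_ms
other_time_ms = 1781.0       # table.r_other_time_ms

# If the filesort wraps the table read, its total time should be at least
# the time spent reading the table -- but here it is not:
assert filesort_total_ms < table_time_ms  # the anomaly being reported

# The table-level times alone already exceed the filesort total by ~2.4s:
excess_ms = table_time_ms + other_time_ms - filesort_total_ms
print(round(excess_ms, 1))  # -> 2413.9
```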

Comment by Sergei Petrunia [ 2020-06-04 ]

... and I can't reproduce this now. In my example, filesort's r_total_time_ms is bigger than the table's r_table_time_ms.

create table t_char_1000000 (char_field char(32) , b int);
insert into t_char_1000000 select mod(seq,100), 1234 from seq_1_to_50000;

analyze format=json  select char_field, count(distinct b) from t_char_1000000 group by char_field\G
*************************** 1. row ***************************
ANALYZE: {
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 2759.069786,
    "read_sorted_file": {
      "r_rows": 50000,
      "filesort": {
        "sort_key": "t_char_1000000.char_field",
        "r_loops": 1,
        "r_total_time_ms": 2733.617565,
        "r_used_priority_queue": false,
        "r_output_rows": 50000,
        "r_sort_passes": 1,
        "r_buffer_size": "2047Kb",
        "r_sort_mode": "sort_key,packed_addon_fields",
        "table": {
          "table_name": "t_char_1000000",
          "access_type": "ALL",
          "r_loops": 1,
          "rows": 49935,
          "r_rows": 50000,
          "r_table_time_ms": 2684.348286,
          "r_other_time_ms": 46.11518914,
          "filtered": 100,
          "r_filtered": 100
        }
      }
    }
  }
}
1 row in set (2.760 sec)

analyze format=json  select char_field, count(distinct b) from t_char_1000000 where not (repeat(char_field,200) LIKE '%4a%')  group by char_field\G
*************************** 1. row ***************************
ANALYZE: {
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 3093.73568,
    "read_sorted_file": {
      "r_rows": 50000,
      "filesort": {
        "sort_key": "t_char_1000000.char_field",
        "r_loops": 1,
        "r_total_time_ms": 3068.450581,
        "r_used_priority_queue": false,
        "r_output_rows": 50000,
        "r_sort_passes": 1,
        "r_buffer_size": "2047Kb",
        "r_sort_mode": "sort_key,packed_addon_fields",
        "table": {
          "table_name": "t_char_1000000",
          "access_type": "ALL",
          "r_loops": 1,
          "rows": 49935,
          "r_rows": 50000,
          "r_table_time_ms": 2931.132145,
          "r_other_time_ms": 136.6401994,
          "filtered": 100,
          "r_filtered": 100,
          "attached_condition": "repeat(t_char_1000000.char_field,200)  not like '%4a%'"
        }
      }
    }
  }
}
1 row in set (3.095 sec)

Comment by Sergei Petrunia [ 2020-06-04 ]

Back to the initial question.
The r_other_time_ms value does make sense inside filesort. It is defined as "time spent on other activity".
What can be surprising here is that the "other activity" includes the sorting and merging themselves. These are not easy to separate, because the sorting steps overlap with reading the data.

When/if we start counting the time spent sorting the buffer (in write_keys) and doing the merge-sort passes, r_other_time_ms will get the more "conventional" meaning of "checking the WHERE, etc.". It will still likely include the time spent constructing the sort keys, because it is not feasible to count that separately.
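Under that definition, the numbers in the reproduction above are self-consistent: the table's read time plus its "other" time stays within the filesort's total, and the small remainder is attributable to the sort/merge work. A hypothetical Python sketch that walks a trimmed copy of the ANALYZE JSON and checks this (the JSON structure follows the output above; nothing here is a MariaDB API):

```python
import json

# A trimmed ANALYZE FORMAT=JSON output, with the timing values taken
# from the first reproduction in this comment thread.
analyze_output = """
{
  "query_block": {
    "r_total_time_ms": 2759.069786,
    "read_sorted_file": {
      "filesort": {
        "r_total_time_ms": 2733.617565,
        "table": {
          "r_table_time_ms": 2684.348286,
          "r_other_time_ms": 46.11518914
        }
      }
    }
  }
}
"""

doc = json.loads(analyze_output)
filesort = doc["query_block"]["read_sorted_file"]["filesort"]
table = filesort["table"]

# Time attributed to the table: reading rows plus "other activity".
table_ms = table["r_table_time_ms"] + table["r_other_time_ms"]

# Remainder of the filesort total: sorting the buffer, merge passes,
# constructing sort keys.
sort_and_merge_ms = filesort["r_total_time_ms"] - table_ms

# Consistent: the filesort total covers the table-level times.
assert sort_and_merge_ms >= 0
print(round(sort_and_merge_ms, 1))  # -> 3.2
```

With the anomalous numbers from the description, the same check would go negative, which is what made the original report look suspicious.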

Comment by Sergei Petrunia [ 2020-06-04 ]

For now, closing as not-a-bug.

Generated at Thu Feb 08 09:10:18 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.