[MDEV-27163] Very slow query with split_materialized optimizer Created: 2021-12-03  Updated: 2023-12-05

Status: Stalled
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 10.3.31
Fix Version/s: 10.3

Type: Bug Priority: Major
Reporter: SirWill Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: None
Environment:

Ubuntu 20.04.3 LTS



 Description   

We moved a database from MariaDB 10.2.39 to a new server with MariaDB version 10.3.31 and
noticed a query being very slow on the new server.

After troubleshooting for hours, we found out that the new split_materialized optimizer was causing the issue, which was enabled by default. With it enabled the query below takes over 2 minutes while it takes less than a second with it disabled.

The query:

SELECT DISTINCT 
    post.userID AS ownPosts,
    meta.participants,
    meta.totalLikes,
          thread.*
  FROM wbb1_thread thread
    LEFT JOIN wbb1_post post ON (post.threadID = thread.threadID AND post.userID = 29630)
    LEFT JOIN  (
          SELECT threadID, COUNT(DISTINCT userID) AS participants, SUM(cumulativeLikes) AS totalLikes
          FROM    wbb1_post
          WHERE `time` > UNIX_TIMESTAMP() - 2592000
          GROUP BY threadID
  ) meta ON thread.threadID = meta.threadID
  WHERE   thread.threadID IN (103361,99668,96785,113398,125293)
          ORDER BY thread.lastPostTime DESC,thread.threadID DESC;

I also noticed that removing one of the joins on the wbb1_post table will make the query to complete fast with split_materialized enabled.

Create table:

CREATE TABLE `wbb1_post` (
  `postID` int(10) NOT NULL AUTO_INCREMENT,
  `threadID` int(10) NOT NULL,
  `userID` int(10) DEFAULT NULL,
  `username` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `subject` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `message` longtext COLLATE utf8mb4_unicode_ci NOT NULL,
  `time` int(10) NOT NULL DEFAULT 0,
  `isDeleted` tinyint(1) NOT NULL DEFAULT 0,
  `isDisabled` tinyint(1) NOT NULL DEFAULT 0,
  `isClosed` tinyint(1) NOT NULL DEFAULT 0,
  `editorID` int(10) DEFAULT NULL,
  `editor` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `lastEditTime` int(10) NOT NULL DEFAULT 0,
  `editCount` mediumint(7) NOT NULL DEFAULT 0,
  `editReason` mediumtext COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `attachments` smallint(5) NOT NULL DEFAULT 0,
  `pollID` int(10) DEFAULT NULL,
  `enableHtml` tinyint(1) NOT NULL DEFAULT 0,
  `ipAddress` varchar(39) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
  `cumulativeLikes` mediumint(7) NOT NULL DEFAULT 0,
  `deleteTime` int(10) NOT NULL DEFAULT 0,
  `lastVersionTime` int(10) NOT NULL DEFAULT 0,
  `enableTime` int(10) NOT NULL DEFAULT 0,
  `hasEmbeddedObjects` tinyint(1) NOT NULL DEFAULT 0,
  `isUzbot` tinyint(1) DEFAULT 0,
  PRIMARY KEY (`postID`),
  KEY `threadID` (`threadID`,`userID`),
  KEY `threadID_2` (`threadID`,`isDeleted`,`isDisabled`,`time`),
  KEY `isDeleted` (`isDeleted`),
  KEY `isDisabled` (`isDisabled`),
  KEY `ipAddress` (`ipAddress`),
  KEY `time` (`time`),
  KEY `abbc0439063d333ac79065f1379ac5ef_fk` (`userID`),
  KEY `fe4ab714a15afd3495126fede5ad0b6c_fk` (`editorID`),
  KEY `33917154f7a46447d164b6fb12be8f36_fk` (`pollID`),
  KEY `postEnableTime` (`enableTime`),
  KEY `userToPost` (`userID`,`isDeleted`,`isDisabled`,`threadID`),
  KEY `thread_3` (`threadID`,`isDisabled`,`userID`,`time`),
  CONSTRAINT `33917154f7a46447d164b6fb12be8f36_fk` FOREIGN KEY (`pollID`) REFERENCES `wcf1_poll` (`pollID`) ON DELETE SET NULL,
  CONSTRAINT `87e5b9c2dcf7bf276097178fc13ee88c_fk` FOREIGN KEY (`threadID`) REFERENCES `wbb1_thread` (`threadID`) ON DELETE CASCADE,
  CONSTRAINT `abbc0439063d333ac79065f1379ac5ef_fk` FOREIGN KEY (`userID`) REFERENCES `wcf1_user` (`userID`) ON DELETE SET NULL,
  CONSTRAINT `fe4ab714a15afd3495126fede5ad0b6c_fk` FOREIGN KEY (`editorID`) REFERENCES `wcf1_user` (`userID`) ON DELETE SET NULL
) ENGINE=InnoDB AUTO_INCREMENT=885294 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

The table has 638543 entries.

Query explain:

MariaDB 10.2.39:
+------+-------------+------------+-------+-----------------------------------------------------------------------------+----------+---------+------------------------------------+------+----------------------------------------------+
| id   | select_type | table      | type  | possible_keys                                                               | key      | key_len | ref                                | rows | Extra                                        |
+------+-------------+------------+-------+-----------------------------------------------------------------------------+----------+---------+------------------------------------+------+----------------------------------------------+
|    1 | PRIMARY     | thread     | range | PRIMARY                                                                     | PRIMARY  | 4       | NULL                               |    5 | Using where; Using temporary; Using filesort |
|    1 | PRIMARY     | post       | ref   | threadID,threadID_2,abbc0439063d333ac79065f1379ac5ef_fk,userToPost,thread_3 | threadID | 9       | hayday_forum.thread.threadID,const |    1 | Using index                                  |
|    1 | PRIMARY     | <derived2> | ref   | key0                                                                        | key0     | 5       | hayday_forum.thread.threadID       |   10 |                                              |
|    2 | DERIVED     | wbb1_post  | range | time                                                                        | time     | 4       | NULL                               | 4231 | Using index condition; Using filesort        |
+------+-------------+------------+-------+-----------------------------------------------------------------------------+----------+---------+------------------------------------+------+----------------------------------------------+
 
MariaDB 10.3.31 (split_materialized=on):
+------+-----------------+------------+-------+-----------------------------------------------------------------------------+----------+---------+------------------------------------+------+----------------------------------------------+
| id   | select_type     | table      | type  | possible_keys                                                               | key      | key_len | ref                                | rows | Extra                                        |
+------+-----------------+------------+-------+-----------------------------------------------------------------------------+----------+---------+------------------------------------+------+----------------------------------------------+
|    1 | PRIMARY         | thread     | range | PRIMARY                                                                     | PRIMARY  | 4       | NULL                               |    5 | Using where; Using temporary; Using filesort |
|    1 | PRIMARY         | post       | ref   | threadID,threadID_2,abbc0439063d333ac79065f1379ac5ef_fk,userToPost,thread_3 | threadID | 9       | hayday_forum.thread.threadID,const |    1 | Using index                                  |
|    1 | PRIMARY         | <derived2> | ref   | key0                                                                        | key0     | 5       | hayday_forum.thread.threadID       |    2 |                                              |
|    2 | LATERAL DERIVED | wbb1_post  | ref   | threadID,threadID_2,time,thread_3                                           | threadID | 4       | hayday_forum.thread.threadID       |    4 | Using where                                  |
+------+-----------------+------------+-------+-----------------------------------------------------------------------------+----------+---------+------------------------------------+------+----------------------------------------------+



 Comments   
Comment by Sergei Petrunia [ 2022-02-27 ]

Lookgin at the EXPLAINs:

The query starts with reading table thread. We scan 5 threads:

WHERE thread.threadID IN (103361,99668,96785,113398,125293)

Then, we join with table post on

(post.threadID = thread.threadID AND post.userID = 29630)

The estimate is 1 (although one can imagine that the user 29630 has made more than one post in a thread).

Then, we join with the derived table that computes per-thread totals for messages in last 30 days:

    LEFT JOIN  (
          SELECT threadID, COUNT(DISTINCT userID) AS participants, SUM(cumulativeLikes) AS totalLikes
          FROM    wbb1_post
          WHERE `time` > UNIX_TIMESTAMP() - 2592000
          GROUP BY threadID
  ) meta ON thread.threadID = meta.threadID

The plan that is not using LATERAL DERIVED uses range(time), there are 4K messages in the last month.

The plan that is using LATERAL DERIVED reads all messages for each thread. There are 4 messages on average.

Comment by Sergei Petrunia [ 2022-02-27 ]

Don't see anything apparently wrong so far, the estimates shown in EXPLAIN do not indicate that the plan with LATERAL DERIVED would be much slower.

Trying to imagine a situation where LATERAL DERIVED be much slower...
There are only 5 threads scanned, but it could be that these have exceptionally high number of messages that were posted more than a month ago... (denote this as GUESS#1)

But this doesn't explain this observation:

I also noticed that removing one of the joins on the wbb1_post table will make the query to complete fast with split_materialized enabled.

I assume the above talks about removing this part of the query:

    LEFT JOIN wbb1_post post ON (post.threadID = thread.threadID AND post.userID = 29630)

The optimizer doesn't expect that it adds any fanout (EXPLAIN shows rows=1). One could imagine that in reality it does have large fanout (denote this as GUESS#2) in which case LATERAL DERIVED will execute the subquery more times when the extra LEFT JOIN (...) is present. If the subquery execution is expensive (as we guessed in GUESS#1), then together with GUESS#2 this would explain the observed behavior.

Comment by Sergei Petrunia [ 2022-02-27 ]

SirWill, Is it possible to get the output of the following:

ANALYZE FORMAT=JSON <the query>;
set optimizer_switch='split_materialized=off';
ANALYZE FORMAT=JSON <the query>;

Comment by SirWill [ 2022-02-28 ]

{
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 172924,
    "filesort": {
      "sort_key": "thread.lastPostTime desc, thread.threadID desc",
      "r_loops": 1,
      "r_total_time_ms": 0.0114,
      "r_used_priority_queue": false,
      "r_output_rows": 5,
      "r_buffer_size": "360",
      "temporary_table": {
        "table": {
          "table_name": "thread",
          "access_type": "range",
          "possible_keys": ["PRIMARY"],
          "key": "PRIMARY",
          "key_length": "4",
          "used_key_parts": ["threadID"],
          "r_loops": 1,
          "rows": 5,
          "r_rows": 5,
          "r_total_time_ms": 0.0514,
          "filtered": 100,
          "r_filtered": 100,
          "attached_condition": "thread.threadID in (103361,99668,96785,113398,125293)"
        },
        "table": {
          "table_name": "post",
          "access_type": "ref",
          "possible_keys": [
            "threadID",
            "threadID_2",
            "abbc0439063d333ac79065f1379ac5ef_fk",
            "userToPost",
            "thread_3"
          ],
          "key": "threadID",
          "key_length": "9",
          "used_key_parts": ["threadID", "userID"],
          "ref": ["hayday_forum.thread.threadID", "const"],
          "r_loops": 5,
          "rows": 1,
          "r_rows": 1113.2,
          "r_total_time_ms": 5.9298,
          "filtered": 100,
          "r_filtered": 100,
          "using_index": true
        },
        "table": {
          "table_name": "<derived2>",
          "access_type": "ref",
          "possible_keys": ["key0"],
          "key": "key0",
          "key_length": "5",
          "used_key_parts": ["threadID"],
          "ref": ["hayday_forum.thread.threadID"],
          "r_loops": 5566,
          "rows": 2,
          "r_rows": 1,
          "r_total_time_ms": 26.013,
          "filtered": 100,
          "r_filtered": 100,
          "materialized": {
            "lateral": 1,
            "query_block": {
              "select_id": 2,
              "r_loops": 5566,
              "r_total_time_ms": 172832,
              "table": {
                "table_name": "wbb1_post",
                "access_type": "ref",
                "possible_keys": ["threadID", "threadID_2", "time", "thread_3"],
                "key": "threadID",
                "key_length": "4",
                "used_key_parts": ["threadID"],
                "ref": ["hayday_forum.thread.threadID"],
                "r_loops": 5566,
                "rows": 4,
                "r_rows": 31585,
                "r_total_time_ms": 166924,
                "filtered": 100,
                "r_filtered": 2.6129,
                "attached_condition": "wbb1_post.`time` > 1643451787"
              }
            }
          }
        }
      }
    }
  }
}


{
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 22.168,
    "filesort": {
      "sort_key": "thread.lastPostTime desc, thread.threadID desc",
      "r_loops": 1,
      "r_total_time_ms": 0.0033,
      "r_used_priority_queue": false,
      "r_output_rows": 5,
      "r_buffer_size": "360",
      "temporary_table": {
        "table": {
          "table_name": "thread",
          "access_type": "range",
          "possible_keys": ["PRIMARY"],
          "key": "PRIMARY",
          "key_length": "4",
          "used_key_parts": ["threadID"],
          "r_loops": 1,
          "rows": 5,
          "r_rows": 5,
          "r_total_time_ms": 0.0306,
          "filtered": 100,
          "r_filtered": 100,
          "attached_condition": "thread.threadID in (103361,99668,96785,113398,125293)"
        },
        "table": {
          "table_name": "post",
          "access_type": "ref",
          "possible_keys": [
            "threadID",
            "threadID_2",
            "abbc0439063d333ac79065f1379ac5ef_fk",
            "userToPost",
            "thread_3"
          ],
          "key": "threadID",
          "key_length": "9",
          "used_key_parts": ["threadID", "userID"],
          "ref": ["hayday_forum.thread.threadID", "const"],
          "r_loops": 5,
          "rows": 1,
          "r_rows": 1113.2,
          "r_total_time_ms": 0.9507,
          "filtered": 100,
          "r_filtered": 100,
          "using_index": true
        },
        "table": {
          "table_name": "<derived2>",
          "access_type": "ref",
          "possible_keys": ["key0"],
          "key": "key0",
          "key_length": "5",
          "used_key_parts": ["threadID"],
          "ref": ["hayday_forum.thread.threadID"],
          "r_loops": 5566,
          "rows": 10,
          "r_rows": 1,
          "r_total_time_ms": 0.5913,
          "filtered": 100,
          "r_filtered": 100,
          "materialized": {
            "query_block": {
              "select_id": 2,
              "r_loops": 1,
              "r_total_time_ms": 6.531,
              "read_sorted_file": {
                "r_rows": 3881,
                "filesort": {
                  "sort_key": "wbb1_post.threadID",
                  "r_loops": 1,
                  "r_total_time_ms": 5.9207,
                  "r_used_priority_queue": false,
                  "r_output_rows": 3881,
                  "r_buffer_size": "4096Kb",
                  "table": {
                    "table_name": "wbb1_post",
                    "access_type": "range",
                    "possible_keys": ["time"],
                    "key": "time",
                    "key_length": "4",
                    "used_key_parts": ["time"],
                    "r_loops": 1,
                    "rows": 3881,
                    "r_rows": 3881,
                    "r_total_time_ms": 5.439,
                    "filtered": 100,
                    "r_filtered": 100,
                    "index_condition": "wbb1_post.`time` > 1643451960"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Comment by Julien Fritsch [ 2023-12-05 ]

Automated message:
----------------------------
Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

Comment by JiraAutomate [ 2023-12-05 ]

Automated message:
----------------------------
Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

Generated at Thu Feb 08 09:50:49 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.