[MDEV-29284] ANALYZE doesn't work with pushed derived tables - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Major
Resolution: Fixed
Affects Version/s: 10.4(EOL), 10.5, 10.6, 10.7(EOL), 10.8(EOL), 10.9(EOL), 10.10(EOL)
Fix Version/s: 10.4.31, 10.5.22, 10.6.15, 10.9.8, 10.10.6, 10.11.5, 11.0.3, 11.1.2, 11.2.1
Component/s: Storage Engine - Federated
Labels:
None

Description

(I tried on 10.6 and 10.10 but perhaps earlier versions are affected as well)

Put this into mysql-test/suite/federated/a1.test :

--source have_federatedx.inc

--source include/federated.inc

connection default;

set global federated_pushdown=1;

connection slave;

DROP TABLE IF EXISTS federated.t1;

CREATE TABLE federated.t1 (

  a  int(20) NOT NULL

);

INSERT INTO federated.t1 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

connection master;

DROP TABLE IF EXISTS federated.t1;

--replace_result $SLAVE_MYPORT SLAVE_PORT

eval

CREATE TABLE federated.t1 (

  a int(20) NOT NULL

ENGINE="FEDERATED" CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/federated/t1';

analyze format=json

select * from (select * from federated.t1 where a<3 union

               select * from federated.t1 where a>=3) T;

set global federated_pushdown=0;

source include/federated_cleanup.inc;

and run.

At the end of output see:

  "query_block": {

    "select_id": 1,

    "r_loops": 1,

    "r_total_time_ms": 0.548587957,

    "table": {

      "table_name": "<derived2>",

      "access_type": "ALL",

      "r_loops": 1,

      "rows": 20,

      "r_rows": 0,

      "r_table_time_ms": 1.660416e-4,

      "r_other_time_ms": 0.005957995,

      "filtered": 100,

      "r_filtered": 100,

      "materialized": {

        "query_block": {

          "select_id": 2,

          "table": {

            "message": "Pushed derived"

Note the {{ "r_rows": 0}}.

This is because pushdown code doesn't handle ANALYZE correctly.

Attachments

Activity

Ascending order - Click to sort in descending order

Oleg Smirnov added a comment - 2023-07-03 15:12

So what is the goal of this task - to display correct "r_rows" or to remove "r_rows" line for pushed down derived tables?

For information, there is almost no analyze data for pushed SELECTs:

analyze format=json select * from federated.t1 where a<3;

ANALYZE

  "query_block": {

    "select_id": 1,

    "table": {

      "message": "Pushed select"

Oleg Smirnov added a comment - 2023-07-03 15:12 So what is the goal of this task - to display correct "r_rows" or to remove "r_rows" line for pushed down derived tables? For information, there is almost no analyze data for pushed SELECTs: analyze format=json select * from federated.t1 where a<3; ANALYZE { "query_block": { "select_id": 1, "table": { "message": "Pushed select" } } }

Oleg Smirnov added a comment - 2023-07-04 07:04

The cause of this is that the server does not actually loop through the rows returned from the storage engine when doing ANALYZE:

derived_handler.cc
int Pushdown_derived::execute()
{
...

if ((err= handler->init_scan()))
goto error;

if (is_analyze)
{
handler->end_scan();
DBUG_RETURN(0);
}

while (!(err= handler->next_row()))
{
...
}

If just does init_scan() and then end_scan() without looping through next_row() results.

The same approach is observed at select_handler.cc for pushed SELECTs.
Removing

   if (is_analyze)

    handler->end_scan();

    DBUG_RETURN(0);

makes "r_rows" show the actual data. It is interesting what was the reasoning behind such a limitation for ANALYZE. igor, can you comment?

Oleg Smirnov added a comment - 2023-07-04 07:04 The cause of this is that the server does not actually loop through the rows returned from the storage engine when doing ANALYZE: derived_handler.cc int Pushdown_derived::execute() { ... if ((err= handler->init_scan())) goto error; if (is_analyze) { handler->end_scan(); DBUG_RETURN(0); } while (!(err= handler->next_row())) { ... } If just does init_scan() and then end_scan() without looping through next_row() results. The same approach is observed at select_handler.cc for pushed SELECTs. Removing if (is_analyze) { handler->end_scan(); DBUG_RETURN(0); } makes "r_rows" show the actual data. It is interesting what was the reasoning behind such a limitation for ANALYZE. igor , can you comment?

Sergei Petrunia added a comment - 2023-07-04 07:45

ANALYZE should execute the query. That means executing pushed down derived tables. This should be easy to implement.

It might be harder to handle ANALYZE in the select handler... ha_federated passes the query text to the backend. However, the backend then will produce ANALYZE output (either tabular or JSON). I'm not sure if the frontend is prepared to deal with this).

In any case, it seems obvious to me that for pushed derived tables, ANALYZE should read the rows from the backend.

Sergei Petrunia added a comment - 2023-07-04 07:45 ANALYZE should execute the query. That means executing pushed down derived tables. This should be easy to implement. It might be harder to handle ANALYZE in the select handler... ha_federated passes the query text to the backend. However, the backend then will produce ANALYZE output (either tabular or JSON). I'm not sure if the frontend is prepared to deal with this). In any case, it seems obvious to me that for pushed derived tables, ANALYZE should read the rows from the backend.

Oleg Smirnov added a comment - 2023-07-05 13:18

psergei, please review bb-10.4-mdev-29284. I didn't add a new test case since there is one already in federated.federatedx_create_handlers covering this.

Oleg Smirnov added a comment - 2023-07-05 13:18 psergei , please review bb-10.4-mdev-29284. I didn't add a new test case since there is one already in federated.federatedx_create_handlers covering this.

Sergei Petrunia added a comment - 2023-07-06 10:02

oleg.smirnov, please apply the below patch. Ok to push after it is applied.

diff --git a/sql/derived_handler.cc b/sql/derived_handler.cc

index e3b710b1622..cddd1200f5d 100644

--- a/sql/derived_handler.cc

+++ b/sql/derived_handler.cc

@@ -40,7 +40,6 @@

 Pushdown_derived::Pushdown_derived(TABLE_LIST *tbl, derived_handler *h)

  : derived(tbl), handler(h)

-  is_analyze= handler->thd->lex->analyze_stmt;

diff --git a/sql/sql_select.h b/sql/sql_select.h

index 9b8bc72bbf9..5c00a77638a 100644

--- a/sql/sql_select.h

+++ b/sql/sql_select.h

@@ -2545,8 +2545,6 @@ class derived_handler;

 class Pushdown_derived: public Sql_alloc

-private:

-  bool is_analyze;

 public:

   TABLE_LIST *derived;

   derived_handler *handler;

Sergei Petrunia added a comment - 2023-07-06 10:02 oleg.smirnov , please apply the below patch. Ok to push after it is applied. diff --git a/sql/derived_handler.cc b/sql/derived_handler.cc index e3b710b1622..cddd1200f5d 100644 --- a/sql/derived_handler.cc +++ b/sql/derived_handler.cc @@ -40,7 +40,6 @@ Pushdown_derived::Pushdown_derived(TABLE_LIST *tbl, derived_handler *h) : derived(tbl), handler(h) { - is_analyze= handler->thd->lex->analyze_stmt; } diff --git a/sql/sql_select.h b/sql/sql_select.h index 9b8bc72bbf9..5c00a77638a 100644 --- a/sql/sql_select.h +++ b/sql/sql_select.h @@ -2545,8 +2545,6 @@ class derived_handler; class Pushdown_derived: public Sql_alloc { -private: - bool is_analyze; public: TABLE_LIST *derived; derived_handler *handler;

Oleg Smirnov added a comment - 2023-07-07 11:58

Pushed to 10.4

Oleg Smirnov added a comment - 2023-07-07 11:58 Pushed to 10.4

People

Assignee:: Oleg Smirnov

Reporter:: Sergei Petrunia

Votes:: 0 Vote for this issue

Watchers:: 2 Start watching this issue

Dates

Created:: 2022-08-10 08:19

Updated:: 2023-09-11 10:58

Resolved:: 2023-07-07 11:58

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server