[MDEV-29284] ANALYZE doesn't work with pushed derived tables Created: 2022-08-10  Updated: 2023-09-11  Resolved: 2023-07-07

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Federated
Affects Version/s: 10.4, 10.5, 10.6, 10.7, 10.8, 10.9, 10.10
Fix Version/s: 10.4.31, 10.5.22, 10.6.15, 10.9.8, 10.10.6, 10.11.5, 11.0.3, 11.1.2, 11.2.1

Type: Bug Priority: Major
Reporter: Sergei Petrunia Assignee: Oleg Smirnov
Resolution: Fixed Votes: 0
Labels: None


 Description   

(I tried on 10.6 and 10.10 but perhaps earlier versions are affected as well)

Put this into mysql-test/suite/federated/a1.test :

--source have_federatedx.inc
--source include/federated.inc
 
connection default;
 
set global federated_pushdown=1;
 
connection slave;
 
DROP TABLE IF EXISTS federated.t1;
CREATE TABLE federated.t1 (
  a  int(20) NOT NULL
);
INSERT INTO federated.t1 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
 
connection master;
 
DROP TABLE IF EXISTS federated.t1;
 
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE federated.t1 (
  a int(20) NOT NULL
)
ENGINE="FEDERATED" CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/federated/t1';
 
analyze format=json
select * from (select * from federated.t1 where a<3 union 
               select * from federated.t1 where a>=3) T;
 
set global federated_pushdown=0;
 
source include/federated_cleanup.inc;

and run.

At the end of output see:

{
  "query_block": {
    "select_id": 1,
    "r_loops": 1,
    "r_total_time_ms": 0.548587957,
    "table": {
      "table_name": "<derived2>",
      "access_type": "ALL",
      "r_loops": 1,
      "rows": 20,
      "r_rows": 0,
      "r_table_time_ms": 1.660416e-4,
      "r_other_time_ms": 0.005957995,
      "filtered": 100,
      "r_filtered": 100,
      "materialized": {
        "query_block": {
          "select_id": 2,
          "table": {
            "message": "Pushed derived"
          }
        }
      }
    }
  }
}

Note the {{ "r_rows": 0}}.

This is because pushdown code doesn't handle ANALYZE correctly.



 Comments   
Comment by Oleg Smirnov [ 2023-07-03 ]

So what is the goal of this task - to display correct "r_rows" or to remove "r_rows" line for pushed down derived tables?

For information, there is almost no analyze data for pushed SELECTs:

analyze format=json select * from federated.t1 where a<3;
ANALYZE
{
  "query_block": {
    "select_id": 1,
    "table": {
      "message": "Pushed select"
    }
  }
}

Comment by Oleg Smirnov [ 2023-07-04 ]

The cause of this is that the server does not actually loop through the rows returned from the storage engine when doing ANALYZE:

derived_handler.cc

int Pushdown_derived::execute()
{
 ...
 
  if ((err= handler->init_scan()))
    goto error;
 
  if (is_analyze)
  {
    handler->end_scan();
    DBUG_RETURN(0);
  }
 
  while (!(err= handler->next_row()))
  {
     ...
  }

If just does init_scan() and then end_scan() without looping through next_row() results.

The same approach is observed at select_handler.cc for pushed SELECTs.
Removing

   if (is_analyze)
  {
    handler->end_scan();
    DBUG_RETURN(0);
  }

makes "r_rows" show the actual data. It is interesting what was the reasoning behind such a limitation for ANALYZE. igor, can you comment?

Comment by Sergei Petrunia [ 2023-07-04 ]

ANALYZE should execute the query. That means executing pushed down derived tables. This should be easy to implement.

It might be harder to handle ANALYZE in the select handler... ha_federated passes the query text to the backend. However, the backend then will produce ANALYZE output (either tabular or JSON). I'm not sure if the frontend is prepared to deal with this).

In any case, it seems obvious to me that for pushed derived tables, ANALYZE should read the rows from the backend.

Comment by Oleg Smirnov [ 2023-07-05 ]

psergei, please review bb-10.4-mdev-29284. I didn't add a new test case since there is one already in federated.federatedx_create_handlers covering this.

Comment by Sergei Petrunia [ 2023-07-06 ]

oleg.smirnov, please apply the below patch. Ok to push after it is applied.

diff --git a/sql/derived_handler.cc b/sql/derived_handler.cc
index e3b710b1622..cddd1200f5d 100644
--- a/sql/derived_handler.cc
+++ b/sql/derived_handler.cc
@@ -40,7 +40,6 @@
 Pushdown_derived::Pushdown_derived(TABLE_LIST *tbl, derived_handler *h)
  : derived(tbl), handler(h)
 {
-  is_analyze= handler->thd->lex->analyze_stmt;
 }
 
 
diff --git a/sql/sql_select.h b/sql/sql_select.h
index 9b8bc72bbf9..5c00a77638a 100644
--- a/sql/sql_select.h
+++ b/sql/sql_select.h
@@ -2545,8 +2545,6 @@ class derived_handler;
 
 class Pushdown_derived: public Sql_alloc
 {
-private:
-  bool is_analyze;
 public:
   TABLE_LIST *derived;
   derived_handler *handler;

Comment by Oleg Smirnov [ 2023-07-07 ]

Pushed to 10.4

Generated at Thu Feb 08 10:07:20 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.