* Design doc for spider select handler
  :PROPERTIES:
  :UPDATED:  [2025-06-23 Mon 11:39]
  :END:
** Motivation
   :PROPERTIES:
   :UPDATED:  [2025-06-26 Thu 14:46]
   :ID:       8a0b0a40-1491-4e4f-b8ba-c3b70614b68d
   :END:
Spider uses the group by handler (gbh) for direct execution of SELECT
queries.

At the end of optimization, the sql layer asks the storage engine to
try to create a gbh. The engine could choose to not create one if it
determines that it cannot execute the query.

For spider, all it needs is a query equivalent to the original query,
so that it can translate the local table names to remote table names,
execute the query on the data node, and store the results.

The first problem (let's call it SpdExtraFields) with the group by
handler is that the sql layer passes extra fields (all_fields) to
spider gbh as SELECT items. The spider gbh cannot simply ignore the
extra fields and choose to use st_select_lex::item_list, because the
temp table is created using all_fields as well.

The second problem (SpdSkipConst) is that the sql layer skips
constants (optimized or not) in temptable fields for spider to store
results to.

Both the first and second problems were addressed in MDEV-26345. The
solution was for spider to recover the SELECT items, and skip storing
constants. It was what could be done in the framework of gbh, but it
would be much better if such complexities were avoided in the first
place.

The third problem (SpdInvalidQuery) is that the optimizer turns the
query to something invalid when passed to spider for a gbh creation.

One example is the addition of semi join and tvc into the query, like
in MDEV-33560. A workaround is to disable a useful optimization
switch.

Another example is the one in the description of MDEV-32273 that uses
right join, where a JOIN use out-of-scope items, after a right join is
transformed to a left join, and optimizer further muddies the water by
transforming on the ON conditions. This is fixed in MDEV-26247 by a
detection of out-of-scope ON conditions and returning NULL on gbh
creation (i.e. not creating a gbh).

Yet another example is the one in MDEV-32907, where at gbh creation
the query is valid, but not at execution. The cause is still under
investigation.

While most of these issues have a solution or workaround, or are under
investigation:

1. There is no contract between the optimizer and spider, where the
   former promises the latter that the query will be valid at the
   creation or execution for.
2. The fact that spider has the "reverse-engineer" optimizations done
   at the sql layer to recover the original queries is not a good
   design and adds to maintenance cost.
3. Where spider can not recover the original query, but has to give up
   on direct execution by not creating a gbh affects performance.
4. Similarly, when an optimization switch needs to be disabled for the
   spider gbh to work, performance may be affected for queries that do
   not involve spider but benefits from the disabled optimization.
5. The gbh is designed for handling GROUP BY, but spider uses it for
   direct execution of any SELECT query with a simple translation of
   names. This mismatch between purpose and implementation adds
   confusion and is a ground for potential bugs. The spider use is
   closer to the purpose to the select handler.

** Comparison between group by handler (gbh) and select handler (sh)
   :PROPERTIES:
   :UPDATED:  [2025-06-26 Thu 14:06]
   :END:
Both have the same interface: a create method by the handlerton to
create the gbh/sh, init_scan() / next_row() / end_scan() for execution.

A gbh is wrapped in a Pushdown_query with an execute method calling
the execution methods mentioned in the previous paragraph, whereas an
sh has its own execute method doing the same thing.

The Pushdown_query object is JOIN::pushdown_query, while the
select_handler object is SELECT_LEX::pushdown_select.

A gbh is created at the end of optimization, and the execution happens
in do_select. An sh is created before optimization, and if one is
created the entire optimization step is skipped. The sh execution
happens in JOIN::exec_inner, before the call to do_select(), thus
earlier than pushdown_query.

The sql layer passes a Query struct to the gbh for query execution.
The query struct does not always match the original query (see
Motivation). The sql layer passes the SELECT_LEX to the sh for query
execution.

- [X] TODO: gbh stores results in the temptable, does the sh not
  require any temptable? yes, it creates a temp table to store results

- [ ] TODO: gbh has some limitations such as "all tables should be the same
engine". check whether sh has the same limitations.

- [ ] TODO: add any other similarities and differences

** The sequence gbh and a bit of history
   :PROPERTIES:
   :UPDATED:  [2025-06-25 Wed 13:49]
   :ID:       a03a8942-936a-4750-a13f-8aca2b6263f6
   :END:
The sequence engine is the original and only other engine in the
mariadb code base using gbh.

Recall that sequence engine tables are (see /server/server-usage/storage-engines/sequence-storage-engine.md in the
docs repo) virtual tables with names seq_M_to_N[_step_K] that contains
one field seq going from M to N (with step K):

#+begin_src sql
SELECT * FROM seq_1_to_15_step_3;
+-----+
| seq |
+-----+
|   1 |
|   4 |
|   7 |
|  10 |
|  13 |
+-----+
#+end_src

The sequence gbh can handle SUM and COUNT queries directly by doing a
simple mathematical calculation. Despite the name, the gbh does not
handle queries with GROUP BY. It also does not handle queries with
WHERE or any SELECT items that are not SUM or COUNT, so the result is
always one row. This is indeed much faster than a table scan, thanks
to the predictable content of a sequence table.

The sequence gbh was included in the commit that added gbh itself
(cf50e13fbd1b4d27a3542fe2751216d274eb7493), as part of MDEV-6080. The
initial implementation requires an aggregate function or GROUP BY to
warrant the use of a gbh:

#+begin_src diff
+  if ((tmp_table_param.sum_func_count || group_list) && !procedure &&
+      (one_storage_engine && one_storage_engine->create_group_by))
+  {
+    /* Check if the storage engine can intercept the query */
+    if ((storage_handler_for_group_by=
+         (one_storage_engine->create_group_by)(thd, select_lex,
+                                               &all_fields,
+                                               tables_list,
+                                               group_list, order,
+                                               conds, having)))
#+end_src

This changed when the spider gbh was added in
52ca07c2a0977f74ccfb56363e4158f0bd0ad3a0:

#+begin_src diff
modified   sql/sql_select.cc
@@ -2403,20 +2403,22 @@ bool JOIN::make_aggr_tables_info()

   /*
     All optimization is done. Check if we can use the storage engines
-    group by handler to evaluate the group by
+    group by handler to evaluate the group by.
+    Some storage engines, like spider can also do joins, group by and
+    distinct in the engine, so we do this for all queries, not only
+    GROUP BY queries.
   ,*/
-  if (tables_list && (tmp_table_param.sum_func_count || group_list) &&
-      !procedure)
+  if (tables_list && !procedure)
   {
#+end_src

Henceforth gbh was allowed to handle any SELECT query the engine
wants, even though a common misconception is that gbh is still for
aggregate functions / GROUP BY only, which is reflected by code
comments, such as this at the beginning of group_by_handler.cc:

#+begin_src c++
/*
  This file implements the group_by_handler code. This interface
  can be used by storage handlers that can intercept summary or GROUP
  BY queries from MariaDB and itself return the result to the user or
  upper level.
,*/
#+end_src

TODO: fix these comments.

** The federatedx sh
   :PROPERTIES:
   :UPDATED:  [2025-06-26 Thu 14:58]
   :ID:       c90dc87b-25ca-4e11-aa61-b67ab91761b0
   :END:
We now turn to sh. In the mariadb codebase it is implemented in the
federatedx engine.

Before MDEV-25080, the federatedx sh simply executes the full query,
assuming that is what it is asked to handle. After MDEV-25080, it
executes the query printed from the select_lex that could be a SELECT
as part of a UNION. Spider will do something similar to columnstore as
mentioned in the description of MDEV-25080, i.e. walk the query to
construct one that can be executed at the data node.

The results are stored in the temp table, which has the same number of
fields as the SELECT items (see
st_select_lex_unit::join_union_item_types).

I have run example cases mentioned in the [[id:8a0b0a40-1491-4e4f-b8ba-c3b70614b68d][Motivation]] section against
the federatedx sh, to check whether the three problems SpdExtraFields,
SpdSkipConst, SpdInvalidQuery are present there. They do not, which is
promising. See [[id:ff6ed5d2-c6b3-4ed3-b8ef-c09badc8a648][Appendix 1: federatedx testcase]] for the full testcase
which should be placed under the federated suite
~mysql-test/suite/federated/~.

For the SpdExtraFields examples, as mentioned above, the temp table
has the same number of fields as the SELECT items.

For SpdSkipConst, TMP_TABLE_ALL_COLUMNS is passed to
create_tmp_table() in sh which ensures no columns are skipped. Note
that for gbh the select_options passed is conjured from several
sources rather than hardcoded.

As a side note, group=null is passed to create_tmp_table, but that
does not means sh cannot handle group by. I tested a query with GROUP
BY and it is taken care of by the federatedx sh.

For SpdInvalidQuery, as expected, the sh creation causes the whole
optimization to be skipped, and the query remains intact throughout.
However there are reports such as MDEV-23860 suggesting that the query
can still be altered at the prepare phase. But it is certainly less
altered than at the end of the optimization phase as is the case with
gbh.

** Worries
   :PROPERTIES:
   :UPDATED:  [2025-06-25 Wed 13:49]
   :END:
A worry is that sh cannot handle view/derived tables.

For view and derived tables, in some but not all cases a gbh is
created.

Here's an example where a gbh is not created:

#+begin_src sql
create table t2 (c int);
create table t1 (c int) ENGINE=Spider
COMMENT='WRAPPER "mysql", srv "srv",TABLE "t2"';
create view v as select * from t1;
select * from v;
drop view v;
drop table t1, t2;
#+end_src

It is not created because here ht is the heap handlerton
#+begin_src c++
    handlerton *ht= tbl && tbl->table ? tbl->table->file->partition_ht() : 0;
    for (tbl= tbl->next_local; ht && tbl; tbl= tbl->next_local)
    {
      if (!tbl->table || tbl->table->file->partition_ht() != ht)
        ht= 0;
    }

    if (ht && ht->create_group_by)
    {
#+end_src

Here's an example where a gbh is created:

#+begin_src sql
create table t2 (a int, b int);
create table t1 (a int, b int) ENGINE=Spider
COMMENT='WRAPPER "mysql", srv "srv",TABLE "t2"';
insert into t1 VALUES (1,4), (1,2), (2,11);
create view v as select MIN(b), a from t1 where a=1;
explain select * from v;
explain select * from (SELECT MIN(b), a FROM t1 WHERE a=1) as v;
drop view v;
drop table t1, t2;
#+end_src

- [ ] TODO: check why the discrepancy in gbh creation.

** Initial Plan
   :PROPERTIES:
   :UPDATED:  [2025-06-26 Thu 15:00]
   :ID:       b77591cc-bf68-4444-89b3-94a80d507cb5
   :END:
I expect the spider sh, if one is implemented, to work alongside the
spider gbh, before a replacement is considered.

The spider sh could and should reuse the same query printing functions
used by the spider gbh.

Here's some things that could be done first:
- [X] Check how gbh works with the sequence engine. This will help clarify
  the purpose of gbh
  - see the new section above [[id:a03a8942-936a-4750-a13f-8aca2b6263f6][The sequence gbh and a bit of history]]
- [-] Stress test spider gbh for wrong results. From my experience, spider
  test coverage is not great, and I would not be surprised if more
  "wrong results" cases are uncovered, which would make it clearer in
  the determination of gbh vs sh. Will need to work with testers on
  this
  - [2025-06-25 Wed]: spoke to roel about it
- [X] Implement a PoC spider sh that can be used to against testcases
  showcasing issues with gbh, to verify whether these issues do not
  exist with sh. Alternatively, verify the same with the only other sh
  user federatedx engine if possible.
  - [2025-06-26 Thu]: see the new section above ([[id:c90dc87b-25ca-4e11-aa61-b67ab91761b0][The federatedx sh]])
    where the problems are verified to be nonexistent in the sh by
    testing federatedx

** Follow-up Plan
   :PROPERTIES:
   :UPDATED:  [2025-06-26 Thu 15:01]
   :END:
With the progress made in the [[id:b77591cc-bf68-4444-89b3-94a80d507cb5][Initial Plan]], I think we can proceed
with a PoC spider sh to test some basic queries.

** Appendix 1: federatedx testcase
   :PROPERTIES:
   :UPDATED:  [2025-06-26 Thu 14:54]
   :ID:       ff6ed5d2-c6b3-4ed3-b8ef-c09badc8a648
   :END:
#+begin_src sql
--source have_federatedx.inc
--source include/federated.inc
--source include/no_valgrind_without_big.inc
--source include/have_partition.inc

connection default;
set global federated_pushdown=1;

connection slave;
CREATE TABLE t1 (a int, b int, PRIMARY KEY (a, b));
INSERT INTO t1 VALUES
  (1,4), (1,2), (2,11);

connection master;
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE t1 (a int, b int, PRIMARY KEY (a, b))
ENGINE="FEDERATED" DEFAULT CHARSET=latin1
CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/test/t1';

# (rr) dbp select_lex
# $1 = 0x56337647efc0 <dbug_item_print_buf> "select min(`federated`.t1.b) AS `MIN(b)`,`federated`.t1.a AS a from t1 where `federated`.t1.a = 1"

SELECT MIN(b), a FROM t1 WHERE a=1;
# check that GROUP BY is handled by the sh
SELECT MIN(b), a FROM t1 GROUP BY a;
drop table t1;

connection slave;
drop table t1;

# MDEV-33560
--source include/have_sequence.inc
create table t1 (c int, d int, primary key (c));
insert into t1 select seq, seq from seq_1_to_100;

connection master;
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE t1 (c int, d int, PRIMARY KEY (c))
ENGINE="FEDERATED" DEFAULT CHARSET=latin1
CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/test/t1';

set in_predicate_conversion_threshold= 2;
# (rr) dbp join->select_lex
# $1 = 0x555e5e53ec60 <dbug_item_print_buf> "select t1.c AS c,t1.d AS d from t1 where t1.c in (23,37,55,89)"
# after optimize()
# (rr) dbp join->select_lex
# $2 = 0x555e5e53ec60 <dbug_item_print_buf> "select t1.c AS c,t1.d AS d from t1 where t1.c in (23,37,55,89)"
# in select handler
# (rr) dbp select
# $3 = 0x555e5e53ec60 <dbug_item_print_buf> "select t1.c AS c,t1.d AS d from t1 where t1.c in (23,37,55,89)"

select * from t1 where c in (23, 37, 55, 89);

drop table t1;

connection slave;
drop table t1;

# right join
connection slave;
CREATE TABLE t1 (a int);
CREATE TABLE t2 (a int);
CREATE TABLE t3 (a int);

INSERT INTO t1 VALUES (1);
INSERT INTO t2 VALUES (1), (2);
INSERT INTO t3 VALUES (1), (2), (3);

connection master;
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE t1 (a int)
ENGINE="FEDERATED" DEFAULT CHARSET=latin1
CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/test/t1';
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE t2 (a int)
ENGINE="FEDERATED" DEFAULT CHARSET=latin1
CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/test/t2';
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE t3 (a int)
ENGINE="FEDERATED" DEFAULT CHARSET=latin1
CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/test/t3';

# Note: these outputs were before we changed to use the `test`
# database instead of `federated`.
# (rr) dbp join->select_lex
# $1 = 0x55f1221ddc60 <dbug_item_print_buf> "select `federated`.t1.a AS a,`federated`.t2.a AS a,`federated`.t3.a AS a from (t3 left join (t1 left join t2 on(`federated`.t1.a = `federated`.t2.a)) on(`federated`.t3.a = `federated`.t1.a))"
# after optimize()
# (rr) dbp join->select_lex
# $2 = 0x55f1221ddc60 <dbug_item_print_buf> "select `federated`.t1.a AS a,`federated`.t2.a AS a,`federated`.t3.a AS a from (t3 left join (t1 left join t2 on(`federated`.t1.a = `federated`.t2.a)) on(`federated`.t3.a = `federated`.t1.a))"
# in init_scan()
# (rr) dbp select
# $3 = 0x55f1221ddc60 <dbug_item_print_buf> "select `federated`.t1.a AS a,`federated`.t2.a AS a,`federated`.t3.a AS a from (t3 left join (t1 left join t2 on(`federated`.t1.a = `federated`.t2.a)) on(`federated`.t3.a = `federated`.t1.a))"

select * from t1 left join t2 on t1.a = t2.a right join t3 on t3.a = t1.a;

drop table t1, t2, t3;

connection slave;
drop table t1, t2, t3;

# MDEV-32907
connection slave;
create table t1 (c int);
insert into t1 values (3), (NULL);

connection master;
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE t1 (c int)
ENGINE="FEDERATED" DEFAULT CHARSET=latin1
CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/test/t1';
# (rr) dbp join->select_lex
# $1 = 0x55cdf9e41c60 <dbug_item_print_buf> "select ifnull(sum(t1.c),0) AS `nvl(sum(c), 0)` from t1"
# after optimize()
# (rr) dbp join->select_lex
# $2 = 0x55cdf9e41c60 <dbug_item_print_buf> "select ifnull(sum(t1.c),0) AS `nvl(sum(c), 0)` from t1"
# in select handler
# (rr) dbp select
# $3 = 0x55cdf9e41c60 <dbug_item_print_buf> "select ifnull(sum(t1.c),0) AS `nvl(sum(c), 0)` from t1"

select nvl(sum(c), 0) from t1;

drop table t1;

connection slave;
drop table t1;

# UNION
connection slave;
CREATE TABLE t1 (a int);
CREATE TABLE t2 (a int);
CREATE TABLE t3 (a int);

INSERT INTO t1 VALUES (1);
INSERT INTO t2 VALUES (1), (2);
INSERT INTO t3 VALUES (1), (2), (3);

connection master;
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE t1 (a int)
ENGINE="FEDERATED" DEFAULT CHARSET=latin1
CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/test/t1';
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE t2 (a int)
ENGINE="FEDERATED" DEFAULT CHARSET=latin1
CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/test/t2';
--replace_result $SLAVE_MYPORT SLAVE_PORT
eval
CREATE TABLE t3 (a int)
ENGINE="FEDERATED" DEFAULT CHARSET=latin1
CONNECTION='mysql://root@127.0.0.1:$SLAVE_MYPORT/test/t3';

select a from t1 union all select max(a) from t2 union all select count(a) from t3;

drop table t1, t2, t3;

connection slave;
drop table t1, t2, t3;

# cleanup
connection default;
set global federated_pushdown=0;

source include/federated_cleanup.inc;
#+end_src