[MDEV-31725] Change visibility of some multi_update members to public Created: 2023-07-17  Updated: 2023-11-28  Resolved: 2023-09-02

Status: Closed
Project: MariaDB Server
Component/s: Data Manipulation - Update
Fix Version/s: N/A

Type: Task Priority: Critical
Reporter: Gagan Goel (Inactive) Assignee: Sergei Golubchik
Resolution: Won't Fix Votes: 0
Labels: None

Issue Links:
Blocks
blocks MCOL-4740 UPDATE returns wrong "Rows matched" o... Closed
Relates
relates to MDEV-32256 bulk multi-update / multi-delete API ... Open

 Description   

For UPDATEs involving a single table, the server call to handler::ha_direct_update_rows(ha_rows *update_rows, ha_rows *found_rows) allows ColumnStore to correctly set the number of updated rows.

However, for UPDATEs involving multi-tables, the server does not call handler::direct_update_rows(). This is causing the following UPDATE statement in ColumnStore to incorrectly report the affected number of rows to the client, even when the actual UPDATE is successful:

MariaDB [test]> CREATE TABLE mcs_1 (a INT, b INT(11), c VARCHAR(100)) engine=columnstore;
Query OK, 0 rows affected (0.287 sec)
 
MariaDB [test]> CREATE TABLE mcs_2 (a INT, b INT(11), c VARCHAR(100)) engine=columnstore;
Query OK, 0 rows affected (0.271 sec)
 
MariaDB [test]> INSERT INTO mcs_1 VALUES (33, 99, 1);
Query OK, 1 row affected (1.179 sec)
 
MariaDB [test]> INSERT INTO mcs_1 VALUES (33, 99, 2);
Query OK, 1 row affected (0.115 sec)
 
MariaDB [test]> INSERT INTO mcs_1 VALUES (33, 99, 3);
Query OK, 1 row affected (1.116 sec)
 
MariaDB [test]> INSERT INTO mcs_2 VALUES (33, 11, 1);
Query OK, 1 row affected (0.173 sec)
 
MariaDB [test]> INSERT INTO mcs_2 VALUES (33, 11, 2);
Query OK, 1 row affected (1.119 sec)
 
MariaDB [test]> INSERT INTO mcs_2 VALUES (33, 11, 3);
Query OK, 1 row affected (0.112 sec)
 
MariaDB [test]> SELECT * FROM mcs_1;
+------+------+------+
| a    | b    | c    |
+------+------+------+
|   33 |   99 | 1    |
|   33 |   99 | 2    |
|   33 |   99 | 3    |
+------+------+------+
3 rows in set (0.056 sec)
 
MariaDB [test]> UPDATE mcs_1 A, mcs_2 B SET A.b = B.b WHERE A.c = B.c LIMIT 1;
Query OK, 0 row affected (1.154 sec)
Rows matched: 0  Changed: 0  Warnings: 0
 
MariaDB [test]> SELECT * FROM mcs_1;
+------+------+------+
| a    | b    | c    |
+------+------+------+
|   33 |   11 | 1    |
|   33 |   99 | 2    |
|   33 |   99 | 3    |
+------+------+------+
3 rows in set (0.009 sec)

As can be seen, the UPDATE is successful but the number of affected rows reported to the client is incorrect.

For the above multi-table UPDATE, the server execution calls the below functions in the following order:
1. handler::rnd_init()
2. handler::rnd_end()
3. multi_update::send_eof()

multi_update::send_eof() is responsible for reporting the count of the affected rows to the client. Knowing this execution order, ColumnStore can correctly set the number of affected rows in handler::rnd_end(), provided the following members of class multi_update are made public:
multi_update::update_tables
multi_update::table_to_update
multi_update::updated
multi_update::found



 Comments   
Comment by Gagan Goel (Inactive) [ 2023-07-18 ]

Here is the server-side patch: https://github.com/MariaDB/server/pull/2697

Comment by Sergei Golubchik [ 2023-09-01 ]

where the table is actually updated in the multi-update case?

Comment by Sergei Golubchik [ 2023-09-02 ]

answering my question
tables are updated in the ha_columnstore::rnd_init(), ColumnStore parses the original query text, executes the whole multi-update when the rnd_init() is invoked and returns HA_ERR_END_OF_FILE on the first rnd_next().

So, this is basically, a hack, columnstore lies to the server that there were no rows affected and the server diligently reports this to the client. I don't think we should extend the server to encourage this kind of hacks and to make them work better. Instead, we should provide a proper multi-update API to the storage engine, so that it wouldn't need to reparse the statement and fake the empty table. Like the direct_update() does in the single-table update case.

Meanwhile, columnstore can include server headers after #define private public, this won't need any changes in the server and will be quite in line with the hacks it was doing so far.

Comment by Gagan Goel (Inactive) [ 2023-09-05 ]

serg I agree. The proper solution would be to extend the multi-table API and provide a function similar to direct_update() for the singe-table update case. Should I create a new ticket to request this change?

On a side note, I was not aware that #define private public was even possible.

Generated at Thu Feb 08 10:26:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.