[MDEV-26545] Spider does not correctly handle UDF and stored function in where conds Created: 2021-09-06 Updated: 2021-12-03 Resolved: 2021-09-22 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - Spider |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5, 10.6 |
| Fix Version/s: | 10.2.41, 10.3.32, 10.4.22, 10.5.13, 10.6.5 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Daniel YE | Assignee: | Nayuta Yanagisawa (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | bug | ||
| Description |
| Comments |
| Comment by Roel Van de Paar [ 2021-09-06 ] | ||||||||||||||||||||||||
|
DanielYe133 I tested this and received the following;
Can you check? | ||||||||||||||||||||||||
| Comment by Daniel YE [ 2021-09-06 ] | ||||||||||||||||||||||||
|
@Roel Van de Paar Removing the CRC32 should be okay. This appears in my example because we additionally added CRC32 support for partitioning in our fork. The partitioning strategy is not an influencer in this issue. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-07 ] | ||||||||||||||||||||||||
|
DanielYe133 Thank you for your report. I confirm that the bug reproduces in all the supported versions. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
I did slight a debug concerning the following query on 10.2 HEAD(0d3de06e).
The WHERE condition is appended by the second call of spider_mysql_handler::append_condition_part(). In more detail, the There seem to be several problems here.
| ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
Just skipping UDFs fixes the bug at least.
| ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
DanielYe133 What is your expectation on the UDF pushdown? Do you think an UDF should be pushed down to a data node? | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
A workaround for the bug is to set spider_skip_default_condition to 1.
| ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
I noticed that the above fix and the workaround work on 10.2 but not on 10.3 or higher. So, we need to do further debugging on 10.3. More precisely, even on 10.3, SELECTs work as expected with the above fix but Update and DELETE do not. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
Why the fix does work for 10.2? That is possibly because 10.2 does not have direct update and direct delete functionalities:
| ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
I debugged the execution of the following query on 10.3 HEAD (a6383a1).
In a direct update, a WHERE clause is appended by spider_db_append_key_where_internal(). I do not yet know why but null search keys (start_key and end_key) are passed to the function. As a result, the function append no search condition. This is the mechanism of how the bug occurs. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
I'm not yet so sure, but if Spider intentionally pass NULL search keys to spider_db_append_key_where_internal(), I think that a proper search condition should be appended by spider_mbase_handler::append_condition_part(). However, this is impossible because spider_db_mysql_util::open_item_func(), which is used in append_condition_part(), cannot properly handle the item_func corresponding to the UDF. An easy fix would be not to perform a direct UPDATE when the WHERE condition includes UDFs. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
The above would also apply to the case of DELETE. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-08 ] | ||||||||||||||||||||||||
|
> An easy fix would be not to perform a direct UPDATE when the WHERE condition includes UDFs. I should have said that "If we do not push down UDFs to data nodes, an easy fix would be not to perform a direct UPDATE when the WHERE condition includes UDFs." If we decided to push down UDFs to data nodes, we might be able to just push it down without giving up direct UPDATE and DELETE. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-09 ] | ||||||||||||||||||||||||
|
The function spider_mbase_handler::append_condition() ignores the error ER_SPIDER_COND_SKIP_NUM. This is OK if the condition pushdown is just for optimization. That is because, even if some conditions are ignored, just more rows are returned by data nodes. However, for direct UPDATE and direct DELETE cases, every condition must be pushed down to data nodes.
| ||||||||||||||||||||||||
| Comment by Daniel YE [ 2021-09-09 ] | ||||||||||||||||||||||||
|
@nayuta-yanagisawa I personally prefer skipping UDFs conditions in the case of SELECT, and disabling direct update/delete in the case of UPDATE/DELETE, even though it means it could potentially bring the server a lot more records to process. This is because, as far as I know, there is a blurry line between UDFs and SPs, and in our case the server identifies our functions as Item_func::FUNC_SP, even they do not have the characteristics of a SP (e.g. has DML queries, or takes OUT parameters). So, correct me if I am wrong, I doubt that the server is able to distinguish between a UDF and an SP. Therefore, if we decide to push down conditions containing usage of UDFs, it will likely also pushdown conditions containing SPs. This could become a problem because an SP could contain DML operations on several tables, and even if a remote has the same definition of the SP, it can fail to execute the SP due to non-existence of some of the tables. In this case it could be really error-prone. So, I would suggest pushing down UDF conditions only if we are able to make sure the remotes can execute them successfully and do not do any unexpected damage. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-09 ] | ||||||||||||||||||||||||
|
DanielYe133 Thank you for your input. I think too that we should not push down UDFs, stored functions, and stored procedures. That is because inefficient and accurate query execution is far better than efficient and faulty query execution. Spider has a tendency to over-optimize at times, which I believe is not good. In my previous comments, I wrote "UDF" where I should have written "stored function" (already corrected). I apologize if I caused any confusion. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-09 ] | ||||||||||||||||||||||||
|
DanielYe133 By the way, you can write like [~nayuta-yanagisawa] to mention me. | ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-09 ] | ||||||||||||||||||||||||
|
DanielYe133 I may not have fully understood your statement. SP here means "stored procedure", right? This seems to be no problem because stored procedures do not appear in WHERE clauses.
| ||||||||||||||||||||||||
| Comment by Daniel YE [ 2021-09-10 ] | ||||||||||||||||||||||||
|
nayuta-yanagisawa (thank you for telling me how to do it You are right. I was referring to stored procedures and indeed they do not appear in WHERE clauses. I did not fully comprehend what Item_func::FUNC_SP represents and thought it means stored procedures (judging by the name), and after looking into the code I now realize it is the type of Item_func_sp which represents stored functions.
| ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-10 ] | ||||||||||||||||||||||||
|
DanielYe133 OK. BTW, are you interested in fixing the present bug by yourself? If you want me to do it, that's fine too. > after looking into the code I now realize it is the type of Item_func_sp which represents stored functions. In my understanding, Item is something that returns a value, an expression. So, I think stored procedures aren't represented by Item_func_sp. | ||||||||||||||||||||||||
| Comment by Daniel YE [ 2021-09-10 ] | ||||||||||||||||||||||||
|
nayuta-yanagisawa Sure, I'd love to. And if you don't mind, please let me know if there is anything I need to pay attention to before doing it. And great point on this!
| ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-10 ] | ||||||||||||||||||||||||
|
DanielYe133 Great! Because of JIRA permission, I cannot assign the issue to you, but you are now in charge of this. Here is some advice:
| ||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-09-10 ] | ||||||||||||||||||||||||
|
DanielYe133 If you have any questions, please feel free to ask. | ||||||||||||||||||||||||
| Comment by Daniel YE [ 2021-09-10 ] | ||||||||||||||||||||||||
|
nayuta-yanagisawa Thank you so much for your detailed guidance! I'll inform you once I have any progress on any question. |