[MDEV-8954] unnecessary fetch of entire table Created: 2015-10-16  Updated: 2017-07-10  Resolved: 2017-07-10

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: 10.1
Fix Version/s: 10.2.6

Type: Bug Priority: Major
Reporter: Rick James (Inactive) Assignee: Jacob Mathew (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Environment:

Ubuntu


Issue Links:
Problem/Incident
causes MDEV-13282 post-fix for MDEV-8954 Closed
Sprint: 2017-02, 10.1.22

 Description   

set global optimizer_switch="engine_condition_pushdown=on"

Ouch! It is now pulling back the entire table 200 rows at a time.

Query: SELECT DISTINCT x FROM table
PRIMARY KEY (x,y,z)
Backend queries: SELECT x FROM table x > 12345 ORDER BY x,y,z LIMIT 200.

(And it is taking ​much​ longer than when ecp=OFF)

Question 1: Why fetch the table rather than executing the query?

Question 2: When you have a multi-part PK, that code is likely to be "wrong". Note that it always compared only x, not y and z. And it always said 200.



 Comments   
Comment by Elena Stepanova [ 2015-10-17 ]

rjasdf,

Please note that engine_condition_pushdown switch has been deprecated since 10.1.1. As you can see from the explanation, it is not expected to produce any optimization, only an overhead.

I will assign the report to psergey anyway, in case he wants to look into it and see if there is any problem outside the scope of engine_condition_pushdown switch. If it turns out to be solely the switch'es guilt, I recommend closing it as 'Won't fix'.

rjasdf, please also specify the version where you observed the problem.

Comment by Rick James (Inactive) [ 2015-10-20 ]

OK, engine_condition_pushdown may be a red herring. Still, I have 2
questions that are not necessarily related to that.

On Mon, Oct 19, 2015 at 5:47 PM, Elena Stepanova (JIRA) <

Comment by Sergei Petrunia [ 2015-11-11 ]

This looks like a problem inside Spider. If the query is SELECT DISTINCT x FROM table, there are no conditions that SQL layer could push (or not push) down to Spider.

I am not sure if I should attempt to fix Spider bugs.

Comment by Jacob Mathew (Inactive) [ 2017-01-06 ]

I'm attempting to reproduce the bug.

Comment by Jacob Mathew (Inactive) [ 2017-01-07 ]

I don't think that engine condition pushdown is pertinent, as the setting is deprecated and is therefore always ON.

I am working with a table with 1 local node and 1 remote node, with 3 INT columns that comprise the primary key. The table has 2000 rows, where each distinct value of column 1 appears 10 times in the table. My SELECT statement has a WHERE clause that qualifies the greatest 10 of those distinct values. Index fetch next is executed in the remote table exactly 100 times, once for each occurrence of the 100 rows that qualify the WHERE clause. This seems like correct behavior.

Comment by Jacob Mathew (Inactive) [ 2017-01-17 ]

I continued testing with a 10000-row table. All columns in every row have a nonzero value. SELECT DISTINCT C1 FROM ST WHERE C1 > 0 ORDER BY C1,C2 visits each row once. Removal of the ORDER BY, the WHERE clause, or both of them results in each row being visited many, many times. This is the bug, and it pertains to DISTINCT. Now that I understand the problem, I can reduce the table size to continue debugging it.

Comment by Jacob Mathew (Inactive) [ 2017-01-17 ]

Testing with a 15-row table and lots of breakpoints, I can see what is happening. For SELECT DISTINCT, for a table that contains 5 distinct values (in my test case 1..5) for the first column, we get 5 queries sent to the back-end spider servers that access the table rows as follows:

1. All rows with values 1..5 for the first column value.
2. All rows with values 2..5 for the first column value.
3. All rows with values 3..5 for the first column value.
4. All rows with values 4..5 for the first column value.
5. All rows with values 5..5 for the first column value.

Comment by Jacob Mathew (Inactive) [ 2017-02-02 ]

After doing more extensive testing with different spidered configurations, it is clear that the reported performance problem occurs only with the basic, unpartitioned configuration. I have augmented my fix to pertain to only this configuration.

I have also found other problems in the basic, unpartitioned configuration pertaining to table statistics. These problems cause the optimizer to choose the wrong query plan and yield incorrect query results. The problems disappear after the next background refresh of the table statistics from the back-end server.

Comment by Jacob Mathew (Inactive) [ 2017-02-07 ]

I have committed locally my changes to fix the reported problem, awaiting review.

Comment by Jacob Mathew (Inactive) [ 2017-04-26 ]

The fix was committed in: https://github.com/MariaDB/server/commit/ab841ac6c7eed993d4a69f8f205b97d43ebc3847 and merged into the 10.2-spider branch. Note that this will be released in a 10.3 release of MariaDB.

Comment by Elena Stepanova [ 2017-05-01 ]

It shouldn't really be closed until it makes to a main tree (10.3 in this case).

Comment by Jacob Mathew (Inactive) [ 2017-05-01 ]

Note that builds from the 10.2-spider branch will be given to customers. We therefore need to be able to mark bugs fixed in that branch as CLOSED.

Comment by Michael Widenius [ 2017-07-06 ]

Code needs to fixed according to review

Comment by Sergei Golubchik [ 2017-07-10 ]

after-review fixes are in MDEV-13282

Generated at Thu Feb 08 07:31:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.