Details

    • 2017-02, 10.1.22

    Description

      set global optimizer_switch="engine_condition_pushdown=on"

      Ouch! It is now pulling back the entire table 200 rows at a time.

      Query: SELECT DISTINCT x FROM table
      PRIMARY KEY (x,y,z)
      Backend queries: SELECT x FROM table x > 12345 ORDER BY x,y,z LIMIT 200.

      (And it is taking ​much​ longer than when ecp=OFF)

      Question 1: Why fetch the table rather than executing the query?

      Question 2: When you have a multi-part PK, that code is likely to be "wrong". Note that it always compared only x, not y and z. And it always said 200.

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - - edited

            rjasdf,

            Please note that engine_condition_pushdown switch has been deprecated since 10.1.1. As you can see from the explanation, it is not expected to produce any optimization, only an overhead.

            I will assign the report to psergey anyway, in case he wants to look into it and see if there is any problem outside the scope of engine_condition_pushdown switch. If it turns out to be solely the switch'es guilt, I recommend closing it as 'Won't fix'.

            rjasdf, please also specify the version where you observed the problem.

            elenst Elena Stepanova added a comment - - edited rjasdf , Please note that engine_condition_pushdown switch has been deprecated since 10.1.1 . As you can see from the explanation, it is not expected to produce any optimization, only an overhead. I will assign the report to psergey anyway, in case he wants to look into it and see if there is any problem outside the scope of engine_condition_pushdown switch. If it turns out to be solely the switch'es guilt, I recommend closing it as 'Won't fix'. rjasdf , please also specify the version where you observed the problem.

            OK, engine_condition_pushdown may be a red herring. Still, I have 2
            questions that are not necessarily related to that.

            On Mon, Oct 19, 2015 at 5:47 PM, Elena Stepanova (JIRA) <

            rjasdf Rick James (Inactive) added a comment - OK, engine_condition_pushdown may be a red herring. Still, I have 2 questions that are not necessarily related to that. On Mon, Oct 19, 2015 at 5:47 PM, Elena Stepanova (JIRA) <

            This looks like a problem inside Spider. If the query is SELECT DISTINCT x FROM table, there are no conditions that SQL layer could push (or not push) down to Spider.

            I am not sure if I should attempt to fix Spider bugs.

            psergei Sergei Petrunia added a comment - This looks like a problem inside Spider. If the query is SELECT DISTINCT x FROM table , there are no conditions that SQL layer could push (or not push) down to Spider. I am not sure if I should attempt to fix Spider bugs.

            I'm attempting to reproduce the bug.

            jacob-mathew Jacob Mathew (Inactive) added a comment - I'm attempting to reproduce the bug.

            I don't think that engine condition pushdown is pertinent, as the setting is deprecated and is therefore always ON.

            I am working with a table with 1 local node and 1 remote node, with 3 INT columns that comprise the primary key. The table has 2000 rows, where each distinct value of column 1 appears 10 times in the table. My SELECT statement has a WHERE clause that qualifies the greatest 10 of those distinct values. Index fetch next is executed in the remote table exactly 100 times, once for each occurrence of the 100 rows that qualify the WHERE clause. This seems like correct behavior.

            jacob-mathew Jacob Mathew (Inactive) added a comment - I don't think that engine condition pushdown is pertinent, as the setting is deprecated and is therefore always ON. I am working with a table with 1 local node and 1 remote node, with 3 INT columns that comprise the primary key. The table has 2000 rows, where each distinct value of column 1 appears 10 times in the table. My SELECT statement has a WHERE clause that qualifies the greatest 10 of those distinct values. Index fetch next is executed in the remote table exactly 100 times, once for each occurrence of the 100 rows that qualify the WHERE clause. This seems like correct behavior.

            I continued testing with a 10000-row table. All columns in every row have a nonzero value. SELECT DISTINCT C1 FROM ST WHERE C1 > 0 ORDER BY C1,C2 visits each row once. Removal of the ORDER BY, the WHERE clause, or both of them results in each row being visited many, many times. This is the bug, and it pertains to DISTINCT. Now that I understand the problem, I can reduce the table size to continue debugging it.

            jacob-mathew Jacob Mathew (Inactive) added a comment - I continued testing with a 10000-row table. All columns in every row have a nonzero value. SELECT DISTINCT C1 FROM ST WHERE C1 > 0 ORDER BY C1,C2 visits each row once. Removal of the ORDER BY, the WHERE clause, or both of them results in each row being visited many, many times. This is the bug, and it pertains to DISTINCT. Now that I understand the problem, I can reduce the table size to continue debugging it.

            Testing with a 15-row table and lots of breakpoints, I can see what is happening. For SELECT DISTINCT, for a table that contains 5 distinct values (in my test case 1..5) for the first column, we get 5 queries sent to the back-end spider servers that access the table rows as follows:

            1. All rows with values 1..5 for the first column value.
            2. All rows with values 2..5 for the first column value.
            3. All rows with values 3..5 for the first column value.
            4. All rows with values 4..5 for the first column value.
            5. All rows with values 5..5 for the first column value.

            jacob-mathew Jacob Mathew (Inactive) added a comment - Testing with a 15-row table and lots of breakpoints, I can see what is happening. For SELECT DISTINCT, for a table that contains 5 distinct values (in my test case 1..5) for the first column, we get 5 queries sent to the back-end spider servers that access the table rows as follows: 1. All rows with values 1..5 for the first column value. 2. All rows with values 2..5 for the first column value. 3. All rows with values 3..5 for the first column value. 4. All rows with values 4..5 for the first column value. 5. All rows with values 5..5 for the first column value.

            After doing more extensive testing with different spidered configurations, it is clear that the reported performance problem occurs only with the basic, unpartitioned configuration. I have augmented my fix to pertain to only this configuration.

            I have also found other problems in the basic, unpartitioned configuration pertaining to table statistics. These problems cause the optimizer to choose the wrong query plan and yield incorrect query results. The problems disappear after the next background refresh of the table statistics from the back-end server.

            jacob-mathew Jacob Mathew (Inactive) added a comment - After doing more extensive testing with different spidered configurations, it is clear that the reported performance problem occurs only with the basic, unpartitioned configuration. I have augmented my fix to pertain to only this configuration. I have also found other problems in the basic, unpartitioned configuration pertaining to table statistics. These problems cause the optimizer to choose the wrong query plan and yield incorrect query results. The problems disappear after the next background refresh of the table statistics from the back-end server.

            I have committed locally my changes to fix the reported problem, awaiting review.

            jacob-mathew Jacob Mathew (Inactive) added a comment - I have committed locally my changes to fix the reported problem, awaiting review.

            The fix was committed in: https://github.com/MariaDB/server/commit/ab841ac6c7eed993d4a69f8f205b97d43ebc3847 and merged into the 10.2-spider branch. Note that this will be released in a 10.3 release of MariaDB.

            jacob-mathew Jacob Mathew (Inactive) added a comment - The fix was committed in: https://github.com/MariaDB/server/commit/ab841ac6c7eed993d4a69f8f205b97d43ebc3847 and merged into the 10.2-spider branch. Note that this will be released in a 10.3 release of MariaDB.

            It shouldn't really be closed until it makes to a main tree (10.3 in this case).

            elenst Elena Stepanova added a comment - It shouldn't really be closed until it makes to a main tree (10.3 in this case).

            Note that builds from the 10.2-spider branch will be given to customers. We therefore need to be able to mark bugs fixed in that branch as CLOSED.

            jacob-mathew Jacob Mathew (Inactive) added a comment - Note that builds from the 10.2-spider branch will be given to customers. We therefore need to be able to mark bugs fixed in that branch as CLOSED.

            Code needs to fixed according to review

            monty Michael Widenius added a comment - Code needs to fixed according to review

            after-review fixes are in MDEV-13282

            serg Sergei Golubchik added a comment - after-review fixes are in MDEV-13282

            People

              jacob-mathew Jacob Mathew (Inactive)
              rjasdf Rick James (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              7 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.