Details

    Description

      Domas initial request:

      Essentially we want to be able to use:

      SELECT * FROM ( SELECT ... WHERE x=123 LIMIT 1000000 ) WHERE y=456 LIMIT 1000;

      Mostly this is to allow paging over large datasets that don't have that great selectivity - so we want to be able to return within reasonable time, even if x=123 condition matches billions of rows, but
      y=456 matches very few.
      Currently the internal subselect would get materialized by MySQL and wouldn't be used as streaming dataset, which makes it prohibitive on OLTP machines.

      In certain cases having DISTINCT not materialize would also be extremely useful for us, but that would probably somewhat different task? (we want to optimize b=Y,c=Z fetches on (a,b,c) indexes).

      Attachments

        Issue Links

          Activity

            By Petrunia:

            If inner query didn't have LIMIT:

            SELECT * FROM ( SELECT ... WHERE x=123) WHERE y=456 LIMIT 1000;

            then MariaDB 5.3 (and MySQL 5.6) is able to handle this case by merging the inner select into the outer.

            "LIMIT 1000000" will prevents the merging, and the subquery will be materialized (that is, really fully read and stored in a temporary table).

            Currently there is no support for "streaming" FROM-clause subqueries. It is generally hard, because MySQL's join execution code has this structure:

            When we run execute a JOIN of tables t1, t2, t3, in that order, the call stack will look like this:

            JOIN::exec
            sub_select(table t1)
            sub_select(table t2)
            sub_select(table t2)
            end_send()

            { send the (t1 x t2 x t3) record to output}

            changing this execution model to provide a JOIN::get_next_output_record() function is rather difficult. Kostja has made an attempt to do this for cursors and he didn't succeed.

            However, if we're looking at only handling the special case where the subquery has form of

            (SELECT ... FROM single_table WHERE ... LIMIT ...)

            that is

            • select from single base table,
            • no [outer] joins or join buffering,
            • no grouping

            then the task becomes much simpler.

            Still, the task may depend on a piece of code from MariaDB 5.3/MySQL 5.5. The piece of code is the patch changed FROM-clause subqueries from being materialized very early (even before the parent join is optimized) to being materialized right before it is read

            ratzpo Rasmus Johansson (Inactive) added a comment - - edited By Petrunia: If inner query didn't have LIMIT: SELECT * FROM ( SELECT ... WHERE x=123) WHERE y=456 LIMIT 1000; then MariaDB 5.3 (and MySQL 5.6) is able to handle this case by merging the inner select into the outer. "LIMIT 1000000" will prevents the merging, and the subquery will be materialized (that is, really fully read and stored in a temporary table). Currently there is no support for "streaming" FROM-clause subqueries. It is generally hard, because MySQL's join execution code has this structure: When we run execute a JOIN of tables t1, t2, t3, in that order, the call stack will look like this: JOIN::exec sub_select(table t1) sub_select(table t2) sub_select(table t2) end_send() { send the (t1 x t2 x t3) record to output} changing this execution model to provide a JOIN::get_next_output_record() function is rather difficult. Kostja has made an attempt to do this for cursors and he didn't succeed. However, if we're looking at only handling the special case where the subquery has form of (SELECT ... FROM single_table WHERE ... LIMIT ...) that is select from single base table, no [outer] joins or join buffering, no grouping then the task becomes much simpler. Still, the task may depend on a piece of code from MariaDB 5.3/MySQL 5.5. The piece of code is the patch changed FROM-clause subqueries from being materialized very early (even before the parent join is optimized) to being materialized right before it is read

            Regarding DISTINCT by Petrunia:

            • I think we should get to this question after having resolved difficulties
              mentioned earlier in this email
            • It is generally possible to have DISTINCT run in an on-the-fly mode but then
              there will be a question of when the optimizer should choose to do that.
            ratzpo Rasmus Johansson (Inactive) added a comment - Regarding DISTINCT by Petrunia: I think we should get to this question after having resolved difficulties mentioned earlier in this email It is generally possible to have DISTINCT run in an on-the-fly mode but then there will be a question of when the optimizer should choose to do that.

            By Domas:

            Nested-loop-only support would be nice, OTOH, grouping, when done via a loose scan, could be extremely useful too.

            ratzpo Rasmus Johansson (Inactive) added a comment - By Domas: Nested-loop-only support would be nice, OTOH, grouping, when done via a loose scan, could be extremely useful too.
            rdyas Robert Dyas added a comment -

            This looks like a really old issue that has been forgotten. I ran into it today with a client. Very useful to be able to use LIMIT in a subquery inside an IN clause.
            Any chance this is more easily fixable these days?

            rdyas Robert Dyas added a comment - This looks like a really old issue that has been forgotten. I ran into it today with a client. Very useful to be able to use LIMIT in a subquery inside an IN clause. Any chance this is more easily fixable these days?

            People

              psergei Sergei Petrunia
              ratzpo Rasmus Johansson (Inactive)
              Votes:
              3 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.