[MDEV-794] LP:861763 - Inconsistent use of column alias In ORDER BY clause that specified collation Created: 2011-09-28  Updated: 2022-10-31  Resolved: 2022-10-31

Status: Closed
Project: MariaDB Server
Component/s: Optimizer
Affects Version/s: 5.5.28
Fix Version/s: N/A

Type: Bug Priority: Minor
Reporter: Timour Katchaounov (Inactive) Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: Launchpad

Attachments: XML File LPexportBug861763.xml    

 Description   

This bug is created to track MySQL BUG#59449, and to review and
either apply or reimplement a contributed patch for this bug.



 Comments   
Comment by Timour Katchaounov (Inactive) [ 2011-10-05 ]

Re: Inconsistent use of column alias In ORDER BY clause that specified collation
Much shorter test case:

CREATE TABLE Language (
LanguageID INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
LanguageName VARCHAR(40) NOT NULL,
PRIMARY KEY (LanguageID),
KEY LanguageName (LanguageName) )
ENGINE=MyISAM
DEFAULT CHARSET=utf8
COLLATE=utf8_unicode_ci;

INSERT INTO Language VALUES
(1, 'English'),
(2, 'Spanish'),
(3, 'American Sign Language'),
(4, 'Argentinian Sign Language');

– works as expected - no collation - sorted by alias to a function result (alias has name matching a real field from a queried table)
SELECT LanguageID,
substr(LanguageName, 2) as LanguageName
FROM Language
ORDER BY LanguageName;

– broken - collated - sorted by alias to a function (coalesce) result (alias has name matching a real field from a queried table)
SELECT LanguageID,
substr(LanguageName, 2) as LanguageName
FROM Language
ORDER BY LanguageName COLLATE utf8_spanish_ci;

– broken - collated - sorted by alias to a column (alias has name matching a real field from a queried table)
SELECT LanguageID as LanguageName,
substr(LanguageName, 2) as LanguageNameSubstr
FROM Language
ORDER BY LanguageName COLLATE utf8_spanish_ci;

drop table Language;

Comment by Timour Katchaounov (Inactive) [ 2011-10-05 ]

Re: Inconsistent use of column alias In ORDER BY clause that specified collation
Analysis:
This is an architectural problem in the way ORDER BY expressions
and function arguments are resolved.

Order by expressions are resolved by the procedure find_order_in_list().
This procedure correctly defines the name resolution order to search
first in the SELECT clause and to prefer aliases. The name resolution is
implemented via the call:

/* Lookup the current GROUP/ORDER field in the SELECT clause. */
select_item= find_item_in_list(order_item, fields, &counter,
REPORT_EXCEPT_NOT_FOUND, &resolution);

Note however that find_item_in_list() works only for fields (Item_field)
and not functions.

Thus functions in the ORDER clause are not resolved against the
SELECT clause, and find_order_in_list() proceeds until it calls
Item::fix_fields() for these Items.

For functions, fix_fields() calls recursively fix_fields() for its
arguments. Finally we end up calling Item_field::fix_fields() for
the 'LanguageName' argument of e.g. the Collate function.

However Item_field::fix_fields() is designed to resolve fields
mainly in the WHERE and other clauses, where the search order
is different from the one in the ORDER clause. Specifically,
fix_fields() searches first in the FROM clause. As a result, matching
fields in the FROM clause have priority over aliases in the SELECT
clause.

This explains why the first example is resolved correctly, and the
next two - incorrectly.

Comment by Timour Katchaounov (Inactive) [ 2011-10-06 ]

Re: Inconsistent use of column alias In ORDER BY clause that specified collation
Solution:

In summary, the problem that needs to be solved is that
column references in the ORDER/GROUP BY clauses must
be resolved in a different way from column references in
the WHERE and SELECT clauses.

The implementation in find_order_in_list() is limited only
to Item_field and Item_ref. In general the approach to resolve
ORDER/GROUP BY clauses in a different way compared to the
rest seems wrong.

As discussed with Sanja, the right way to have a flexible and
generic name resolution that supports different name resolution
algorithms, is to reuse the Item::fix_fields mechanism in the
following way.

Items have to be marked correctly with their location in the query.
The name resolution algorithms for each group of clauses have to
be implemented as separate methods of the most suitable class.
Then the corresponding fix_fields() method should select the
correct name resolution algorithm based on the location of the
item in the query.

Notice that when e.g. an Item_field in an ORDER/GROUP BY clause
is inside a subquery, the name resolution algorithm must be changed
so that it first searches the subquery itself in the same way the WHERE
clause is resolved, and only if not found in the subquery, name resolution
should proceed to search the outer query, giving preference to aliases.

This is a small re-engineering task, and not a simple bug fix. My estimate
is at least 4-5 full developer days if there are no nasty surprises. Changing
name resolution is quite risky, so this task should be planned for a release
after 5.3.

Comment by Rasmus Johansson (Inactive) [ 2011-10-21 ]

Launchpad bug id: 861763

Comment by Sergei Golubchik [ 2022-10-31 ]

fixed long time ago

Generated at Thu Feb 08 06:31:23 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.