[MDEV-24250] Sorting behavior changed from 10.1 to 10.3 Created: 2020-11-19 Updated: 2020-11-23 Resolved: 2020-11-20 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Character Sets |
| Affects Version/s: | 10.3.25, 10.2, 10.3, 10.4, 10.5 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Torben Hansen | Assignee: | Sergei Golubchik |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Environment: |
Ubuntu 20.04 |
||
| Issue Links: |
|
||||||||
| Description |
|
The sorting behavior for columns with CRLF (\n\r) values seem to have changed from MariaDB 10.1 to 10.3 Simple example table
Results on MariaDB 10.1 (Ubuntu 18.04 LTS)
Results on MariaDB 10.3 (Ubuntu 20.04 LTS)
The ordering for column c has obviously changed, since rows with a CRLF value are now shown before rows with an empty value. I did not find any documentation describing this change in sorting behavior. Is this a bug or can this behavior be changed "back" by a setting? |
| Comments |
| Comment by Alice Sherepa [ 2020-11-20 ] | ||||||||||||||||||||||||
|
Thanks! I repeated on 10.2-10.5:
| ||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2020-11-20 ] | ||||||||||||||||||||||||
|
This is correct behavior. In MariaDB most collations have what SQL standard calls "PAD" behavior, that is when two strings are compared they are padded with spaces to the max string length. And as '\r' < ' ' you also get '\r' < ''. You can use the NOPAD collation if you want strings not to be padded on comparison:
| ||||||||||||||||||||||||
| Comment by Torben Hansen [ 2020-11-22 ] | ||||||||||||||||||||||||
|
Thanks for your feedback. Since I'm using a DBAL, I have no real influence on how queries are created, so I can not easily add
as suggested. My initial question is/was, if this behavior somehow changed between MariaDB 10.1 and 10.3? If so, this should at least be documented somewhere. | ||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2020-11-23 ] | ||||||||||||||||||||||||
|
It was changed in commit https://github.com/MariaDB/server/commit/6fb94c3e430d as a fix for My previous comment described the correct string comparison behavior. It was implemented before 10.1. That is, 10.1 also pads strings with spaces and '\n' is sorted before '' there. What I omitted — in binary character set strings are padded with '\0', not with spaces. In that case _binary '\n' > _binary ''. Now 10.1 had a bug where BLOB and TEXT columns were always padding with '\0', even though TEXT columns have a proper character set. It was a bug in comparison and sorting of TEXT columns and it was fixed in 10.2.5 | ||||||||||||||||||||||||
| Comment by Torben Hansen [ 2020-11-23 ] | ||||||||||||||||||||||||
|
|