[MDEV-10302] Strings beginning with control characters are "less than" empty string Created: 2016-06-29 Updated: 2016-06-30 Resolved: 2016-06-29 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Character Sets |
| Affects Version/s: | 5.5.47, 10.0.25 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Minor |
| Reporter: | Michael Balzer | Assignee: | Sergei Golubchik |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | None | ||
| Environment: |
openSuSE 42, CentOS 7 |
||
| Description |
|
Any non-empty string is expected to compare greater than the empty string. Contrary to this, SELECT ... WHERE textfield > '' does not match rows whose text begins with a control character such as newline or tab. This affects multiple collations (tested with the standard utf8 and latin1 collations); only binary comparison works as expected. Reproduce / test:
Any control character at the beginning of the text triggers this behaviour. Workaround: use BINARY or *_bin collations, or compare textfield <> ''. |
| Comments |
| Comment by Sergei Golubchik [ 2016-06-29 ] |
|
This is expected behavior. MariaDB implements what the SQL standard calls "PAD SPACE" collations. This means that when two strings are compared, the shorter one is padded with spaces to the length of the longer one. That's why you get these results. Binary collation does not pad with spaces (in a binary collation a string is merely a sequence of bytes; it has no concept of a "space" or a "letter"). We plan to add support for NO PAD collations in 10.2, see |
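The padding rule described above can be illustrated with a small sketch. This is a simplified model, not MariaDB's implementation: it assumes characters are ordered by code point (real collations compare collation weights), and the function names `pad_space_compare` and `no_pad_compare` are hypothetical. Under that assumption, tab (0x09) and newline (0x0A) sort below the space character (0x20), so a string starting with either one compares as less than the space-padded empty string.

```python
def pad_space_compare(a: str, b: str) -> int:
    """Compare two strings under a simplified PAD SPACE rule.

    The shorter string is padded with spaces to the length of the longer
    one, then the padded strings are compared by code point.
    Returns -1, 0, or 1. (Assumption: code point order stands in for
    real collation weights.)
    """
    width = max(len(a), len(b))
    a_padded = a.ljust(width, " ")
    b_padded = b.ljust(width, " ")
    return (a_padded > b_padded) - (a_padded < b_padded)


def no_pad_compare(a: str, b: str) -> int:
    """Compare two strings without padding, by code point only."""
    return (a > b) - (a < b)


if __name__ == "__main__":
    # Each sample is compared against the empty string, as in the report.
    for sample in ["\tabc", "\nabc", "abc", "   ", ""]:
        print(
            repr(sample),
            "pad_space:", pad_space_compare(sample, ""),
            "no_pad:", no_pad_compare(sample, ""),
        )
```

In this model a string with a leading tab or newline yields -1 against `''` under padding, while the unpadded comparison yields 1, which matches the difference the reporter observed between the padded collations and binary comparison. A string consisting only of spaces compares equal to `''` under padding.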
| Comment by Michael Balzer [ 2016-06-30 ] |
|
Thanks for your explanation and pointers, Sergei! |