Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-10302

Strings beginning with control characters are "less than" empty string

Details

    • Bug
    • Status: Closed (View Workflow)
    • Minor
    • Resolution: Not a Bug
    • 5.5.47, 10.0.25
    • N/A
    • Character Sets
    • None
    • openSuSE 42, CentOS 7

    Description

      Any non empty string is expected to be larger than the empty string. Contrary to this, SELECT ... WHERE textfield > '' fails where the text data begins with a control character like newline or tab.

      This affects multiple collations (tested with utf8 and latin1 standards), only binary comparison works as expected.

      Reproduce / test:

      select '\ntext' > '', strcmp('\ntext', '');
      +---------------+----------------------+
      | '\ntext' > '' | strcmp('\ntext', '') |
      +---------------+----------------------+
      |             0 |                   -1 |
      +---------------+----------------------+
      

      Any control character at the beginning of the text triggers this behaviour.

      Workaround: use BINARY or *_bin collations or compare textfield <> ''.

      Attachments

        Issue Links

          Activity

            This is expected behavior. MariaDB implements what SQL standard calls "PAD SPACE" collations. It means that when you compare two strings, the shorter is padded with spaces to the length of the longer one. That's why you get these results. Binary collation does not pad with spaces (in a binary collation the string is merely a sequence of bytes, it does not have a concept of a "space" or a "letter").

            We plan to add support for NO PAD collations in 10.2, see MDEV-9711

            serg Sergei Golubchik added a comment - This is expected behavior. MariaDB implements what SQL standard calls "PAD SPACE" collations. It means that when you compare two strings, the shorter is padded with spaces to the length of the longer one. That's why you get these results. Binary collation does not pad with spaces (in a binary collation the string is merely a sequence of bytes, it does not have a concept of a "space" or a "letter"). We plan to add support for NO PAD collations in 10.2, see MDEV-9711

            Thanks for your explanation and pointers, Sergei!

            balzer@expeedo.de Michael Balzer added a comment - Thanks for your explanation and pointers, Sergei!

            People

              serg Sergei Golubchik
              balzer@expeedo.de Michael Balzer
              Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.