Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-15624

Changing the default character set to utf8mb4 changes query evaluation in a very surprising way

    Details

    • Type: Bug
    • Status: Closed (View Workflow)
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.5, 10.0, 10.1, 10.2, 10.3, 10.2.13
    • Fix Version/s: 5.5.60, 10.0.35, 10.1.33, 10.2.15, 10.3.6
    • Component/s: Character Sets
    • Labels:
    • Environment:
      Darwin crest.fritz.box 17.4.0 Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 2017; root:xnu-4570.41.2~1/RELEASE_X86_64 x86_64

      Description

      Changing the default character set to utf8mb4 changes query evaluation in a very surprising way. (Please feel free to set a better title).

      This is actually a followup bug report to this: https://bitbucket.org/zzzeek/sqlalchemy/issues/4222/query-yields-different-result-via-sqla (see there for more detail)

      Here's the problem: This query: ```select seq, replace(uuid(), "-", "") from seq_0_to_9;``` has a completely different result, wether you connect to mysql with a character set of utf8 or utf8mb4.

      Here's an example:

      % mysql --default-character-set=utf8mb4 yeepa_test --execute 'select seq, replace(uuid(), "-", "") from seq_0_to_9;'
      +-----+----------------------------------+
      | seq | replace(uuid(), "-", "")         |
      +-----+----------------------------------+
      |   0 | 49734e8c2d5111e881616c4008b0d40e |
      |   1 | 49734e8c2d5111e881616c4008b0d40e |
      |   2 | 49734e8c2d5111e881616c4008b0d40e |
      |   3 | 49734e8c2d5111e881616c4008b0d40e |
      |   4 | 49734e8c2d5111e881616c4008b0d40e |
      |   5 | 49734e8c2d5111e881616c4008b0d40e |
      |   6 | 49734e8c2d5111e881616c4008b0d40e |
      |   7 | 49734e8c2d5111e881616c4008b0d40e |
      |   8 | 49734e8c2d5111e881616c4008b0d40e |
      |   9 | 49734e8c2d5111e881616c4008b0d40e |
      +-----+----------------------------------+
      

      This returns 10 UUIDs that are all the same. *I believe this to be wrong*

      Calling that same query with utf8 as the format yields 10 different uuids - which I believe to be correct:

      % mysql --default-character-set=utf8 yeepa_test --execute 'select seq, replace(uuid(), "-", "") from seq_0_to_9;' 
      +-----+----------------------------------+
      | seq | replace(uuid(), "-", "")         |
      +-----+----------------------------------+
      |   0 | 797f1cf02d5111e881616c4008b0d40e |
      |   1 | 797f1d042d5111e881616c4008b0d40e |
      |   2 | 797f1d0e2d5111e881616c4008b0d40e |
      |   3 | 797f1d0f2d5111e881616c4008b0d40e |
      |   4 | 797f1d182d5111e881616c4008b0d40e |
      |   5 | 797f1d222d5111e881616c4008b0d40e |
      |   6 | 797f1d232d5111e881616c4008b0d40e |
      |   7 | 797f1d2c2d5111e881616c4008b0d40e |
      |   8 | 797f1d2d2d5111e881616c4008b0d40e |
      |   9 | 797f1d362d5111e881616c4008b0d40e |
      +-----+----------------------------------+
      

      This also happens on on at least on current Fedora. (I can look up the details tomorrow)

      On a further note, this result is

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                bar Alexander Barkov
                Reporter:
                dwt Martin Häcker
              • Votes:
                0 Vote for this issue
                Watchers:
                5 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: