Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-8872

Performance regressions with utf8mb4 vs utf8 in WordPress

Details

    • Bug
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Duplicate
    • 10.0.21-galera
    • N/A
    • Character Sets
    • None

    Description

      When switching tables from utf8 to utf8mb4, we're seeing fairly severe performance regressions - queries that used to run in microseconds, now taking 20 seconds to run.

      We've run some tests with query and index variations, or there's a summary here.

      For a direct comparison, see:
      utf8: https://travis-ci.org/tollmanz/utf8mb4-query-time/jobs/82864656
      utf8mb4: https://travis-ci.org/tollmanz/utf8mb4-query-time/jobs/82864660

      Here's the WordPress tracking ticket. Comment #2 notes that reducing the meta_key column length to 191 (to match the length of the prefix index) fixes the performance issue, but the utf8 character set doesn't have the same problem.

      The same behaviour occurs in MySQL, as well.

      Attachments

        Issue Links

          Activity

            pento Gary Pendergast created issue -
            bar Alexander Barkov made changes -
            Field Original Value New Value
            Assignee Alexander Barkov [ bar ]
            elenst Elena Stepanova made changes -
            Fix Version/s 10.0 [ 16000 ]
            bar Alexander Barkov made changes -
            otto Otto Kekäläinen made changes -
            bar Alexander Barkov made changes -
            Fix Version/s N/A [ 14700 ]
            Fix Version/s 10.0 [ 16000 ]
            Resolution Not a Bug [ 6 ]
            Status Open [ 1 ] Closed [ 6 ]
            serg Sergei Golubchik made changes -
            Workflow MariaDB v3 [ 71842 ] MariaDB v4 [ 149667 ]
            monty Michael Widenius made changes -
            monty Michael Widenius made changes -
            psergei Sergei Petrunia made changes -
            monty Michael Widenius made changes -
            Assignee Alexander Barkov [ bar ] Sergei Petrunia [ psergey ]
            Resolution Not a Bug [ 6 ]
            Status Closed [ 6 ] Stalled [ 10000 ]
            monty Michael Widenius made changes -
            Comment [ The usage of comparisons of indexed column with another column from a different character set is becoming more and more frequent and we should fix this in the cases it is possible.

            The current case that is causing performance issues (index can not be used) is

            WHERE utf8bm3_index=utf8mb4_column
            as utft8bm3 cannot handle all utf8mb4 characters.
            Note that WHERE utf8bm4_index=utf8mb3_column works.

            The idea is to do allow comparison of indexed column to other column with = for different character sets as long as the collation is not making it impossible (for example having an index with a case-sensitive collation comparing to a column that is not case sensitive).
            If the character set conversion from compared column to indexed_column fails because a non existing character, then we will assume that there is no matching row in the table.


            ]
            psergei Sergei Petrunia made changes -
            Comment [ Further work on the issue from the above comment is done in MDEV-32113. ]
            monty Michael Widenius made changes -
            Resolution Duplicate [ 3 ]
            Status Stalled [ 10000 ] Closed [ 6 ]
            monty Michael Widenius made changes -
            julien.fritsch Julien Fritsch made changes -

            People

              psergei Sergei Petrunia
              pento Gary Pendergast
              Votes:
              8 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.