  1. MariaDB Server
  2. MDEV-23301

my_tosort_unicode: unnecessary max_char check while utf8-mb3 processing

Details

    Description

      The my_tosort_unicode function checks the input character against the uni_plane maxchar value here (strings/ctype-utf8.c):

      static inline void
      my_tosort_unicode(MY_UNICASE_INFO *uni_plane, my_wc_t *wc, uint flags)
      {
        if (*wc <= uni_plane->maxchar)
        {
          MY_UNICASE_CHARACTER *page;
          if ((page= uni_plane->page[*wc >> 8]))
            *wc= (flags & MY_CS_LOWER_SORT) ?
                 page[*wc & 0xFF].tolower :
                 page[*wc & 0xFF].sort;
        }
        else
        {
          *wc= MY_CS_REPLACEMENT_CHARACTER;
        }
      }
      

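      But utf8-mb3 encodes only code points up to U+FFFF (the Basic Multilingual Plane, so a decoded character always fits in 2 bytes), and there are no uni_planes with a maxchar less than 65535, so this check is not required on the utf8-mb3 path.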

      Getting rid of this check results in a small performance gain (tested on amd64 and aarch64 with the sysbench read-only test).
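
      As a rough illustration of the proposed simplification, the sketch below drops the maxchar branch and relies on the caller passing only BMP code points. The type definitions are simplified stand-ins written for this example, not copies of the server headers; the function name my_tosort_unicode_bmp, the MY_CS_LOWER_SORT value, and the demo_* data are assumptions made here. The assert only documents the precondition and compiles away under NDEBUG.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the server types (assumed layout, sketch only). */
typedef unsigned long my_wc_t;

typedef struct
{
  uint32_t toupper;
  uint32_t tolower;
  uint32_t sort;
} MY_UNICASE_CHARACTER;

typedef struct
{
  my_wc_t maxchar;
  const MY_UNICASE_CHARACTER **page;   /* 256 pages of 256 characters */
} MY_UNICASE_INFO;

/* Flag value assumed for this sketch. */
#define MY_CS_LOWER_SORT 32768

/*
  Hypothetical variant without the maxchar branch.
  Precondition: *wc <= 0xFFFF and uni_plane->maxchar >= 0xFFFF,
  which is what the description says holds for utf8-mb3.
*/
static inline void
my_tosort_unicode_bmp(const MY_UNICASE_INFO *uni_plane, my_wc_t *wc,
                      unsigned int flags)
{
  const MY_UNICASE_CHARACTER *page;
  assert(*wc <= 0xFFFF);               /* debug-only precondition check */
  if ((page= uni_plane->page[*wc >> 8]))
    *wc= (flags & MY_CS_LOWER_SORT) ?
         page[*wc & 0xFF].tolower :
         page[*wc & 0xFF].sort;
}

/* Tiny demo plane: only page 0 exists, and only 'a' has real data. */
static MY_UNICASE_CHARACTER demo_page0[256];
static const MY_UNICASE_CHARACTER *demo_pages[256];
static MY_UNICASE_INFO demo_plane= { 0xFFFF, demo_pages };

static void demo_plane_init(void)
{
  demo_page0['a'].toupper= 'A';
  demo_page0['a'].tolower= 'a';
  demo_page0['a'].sort= 'A';
  demo_pages[0]= demo_page0;
}
```

      With the demo plane initialised, 'a' maps to its sort weight 'A' by default and to 'a' when MY_CS_LOWER_SORT is set, while a code point whose page pointer is NULL is left unchanged. Code points above 0xFFFF must never reach this variant, since the replacement-character fallback is gone.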

          People

            Assignee: Unassigned
            Reporter: Georgy Kirichenko