  1. MariaDB ColumnStore
  2. MCOL-3663

Performance regression in LIKE & NOT LIKE

Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 1.4.1
    • Fix Version/s: 1.4.3
    • Component/s: None
    • Labels: None
    • Sprint: 2020-1, 2020-2

    Description

      LIKE and NOT LIKE perform significantly worse in 1.4 than in 1.2. The working assumption is that these predicates fall back to a non-select-handler execution path. See the attached image for an example.

          Activity

            dleeyh Daniel Lee (Inactive) added a comment

            Reopened per latest test result.

            David.Hall David Hall (Inactive) added a comment

            LIKE and NOT LIKE are implemented internally using regular expressions. In 1.2.6, while fixing other things, we changed from using POSIX regex to using Boost regex. Who woulda thought Boost would be 4 times slower? Switched back to POSIX regex.

            In C++11, there's std::regex, which may or may not be better to use.
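To illustrate the approach described in the comment above, here is a minimal sketch of matching a LIKE pattern through POSIX regex. It is not the ColumnStore implementation: `likeToRegex` and `LikeMatcher` are hypothetical names, the translation rules ('%' to ".*", '_' to ".") are an assumption about how such a mapping is typically done, and ESCAPE clauses are not handled.

```cpp
#include <regex.h>

#include <stdexcept>
#include <string>

// Translate a SQL LIKE pattern into an anchored POSIX extended regex.
// Assumption: '%' maps to ".*", '_' maps to ".", and every other
// character is matched literally. No ESCAPE clause support.
std::string likeToRegex(const std::string& like)
{
    // Characters that are special in POSIX ERE and need a backslash.
    static const std::string meta = "^.[$()|*+?{\\";
    std::string re = "^";
    for (char c : like)
    {
        if (c == '%')
            re += ".*";
        else if (c == '_')
            re += ".";
        else
        {
            if (meta.find(c) != std::string::npos)
                re += '\\';
            re += c;
        }
    }
    re += "$";
    return re;
}

// Hypothetical RAII wrapper: compile once, match many rows.
class LikeMatcher
{
public:
    explicit LikeMatcher(const std::string& like)
    {
        // REG_NOSUB: only a yes/no answer is needed, no capture offsets.
        if (regcomp(&re_, likeToRegex(like).c_str(), REG_EXTENDED | REG_NOSUB) != 0)
            throw std::runtime_error("regcomp failed");
    }
    ~LikeMatcher() { regfree(&re_); }
    LikeMatcher(const LikeMatcher&) = delete;
    LikeMatcher& operator=(const LikeMatcher&) = delete;

    bool matches(const std::string& value) const
    {
        return regexec(&re_, value.c_str(), 0, nullptr, 0) == 0;
    }

private:
    regex_t re_;
};
```

Compiling the pattern once per query and reusing it for every row keeps the per-row cost down to a single `regexec` call. NOT LIKE would simply negate the result of `matches`.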

            David.Hall David Hall (Inactive) added a comment

            Google uses RE2 (https://github.com/google/re2). We may want to benchmark that.
            drrtuy Roman added a comment (edited)

            std::regex is as bad as boost::regex, so we don't want it unless I missed some magic performance knob, which I doubt.

            We surely want to test RE2 with this test in mind. It is worth noting that the test is outdated.
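A comparison like the one discussed above could be set up with a small micro-benchmark. The sketch below is only a starting point under stated assumptions: the helper names (`makeRows`, `countPosix`, `countStd`, `elapsedMs`) and the synthetic rows are hypothetical, and it covers POSIX regex and std::regex only. An RE2 or Boost variant would follow the same shape but needs the external library, so it is left out here.

```cpp
#include <regex.h>

#include <chrono>
#include <cstddef>
#include <regex>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical synthetic data: every third row matches '%express%packages%'.
std::vector<std::string> makeRows(std::size_t n)
{
    std::vector<std::string> rows;
    rows.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        rows.push_back(i % 3 == 0 ? "quickly express ironic packages sleep"
                                  : "furiously regular deposits haggle");
    return rows;
}

// Count matching rows with POSIX regex (compile once, match per row).
std::size_t countPosix(const std::vector<std::string>& rows, const char* pattern)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        throw std::runtime_error("regcomp failed");
    std::size_t hits = 0;
    for (const auto& row : rows)
        hits += (regexec(&re, row.c_str(), 0, nullptr, 0) == 0);
    regfree(&re);
    return hits;
}

// Count matching rows with std::regex using the same extended grammar.
std::size_t countStd(const std::vector<std::string>& rows, const char* pattern)
{
    const std::regex re(pattern, std::regex::extended | std::regex::nosubs);
    std::size_t hits = 0;
    for (const auto& row : rows)
        hits += std::regex_search(row, re);
    return hits;
}

// Wall-clock time of a callable in milliseconds.
template <typename F>
double elapsedMs(F&& f)
{
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

To compare engines, wrap each counter in `elapsedMs` over the same rows and pattern, for example `elapsedMs([&] { countPosix(rows, "^.*express.*packages.*$"); })`, and check that both counters return the same number of hits before trusting the timings. Results on synthetic rows may not carry over to real `o_comment` data.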

            dleeyh Daniel Lee (Inactive) added a comment

            Build verified: 1.4.3-1 BB nightly

            engine commit: 8588678

            MariaDB [tpch10]> select count(*) from orders;
            +-----------+
            | count(*)  |
            +-----------+
            | 150000000 |
            +-----------+
            1 row in set (1.690 sec)

            MariaDB [tpch10]> select count(*) from orders o where o_comment like '%express%packages%';
            +----------+
            | count(*) |
            +----------+
            |  1610254 |
            +----------+
            1 row in set (26.867 sec)

            MariaDB [tpch10]> select count(*) from orders o where o_comment like '%express%packages%';
            +----------+
            | count(*) |
            +----------+
            |  1610254 |
            +----------+
            1 row in set (16.448 sec)

            People

              dleeyh Daniel Lee (Inactive)
              LinuxJedi Andrew Hutchings (Inactive)