Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-34620

Large N-way OR causes a lot of index_merge variants to be created and discarded

    XMLWordPrintable

Details

    • Bug
    • Status: Stalled (View Workflow)
    • Critical
    • Resolution: Unresolved
    • 10.5, 10.6, 10.10(EOL), 11.0(EOL)
    • 10.6
    • None

    Description

      This is a public part of TODO-4799 (which is not public as it has customer data).

      A totally synthetic example from there:

      CREATE TABLE `t1` (
        `a1` varchar(128) NOT NULL,
        `a2` varchar(128) NOT NULL,
        `a3` varchar(64) NOT NULL,
        `a4` varchar(128) NOT NULL,
        filler varchar(100),
        PRIMARY KEY (`a1`,`a2`,`a3`,`a4`),
        KEY `key1` (`a3`)
      );
      

      insert into t1 select 
        A.seq,
        B.seq,
        C.seq,
        D.seq,
        'filler-data'
      from 
        seq_1_to_10 A,
        seq_1_to_10 B,
        seq_1_to_10 C,
        seq_1_to_10 D
      ;
      

      explain format=json
      delete from t1
      where
         a4='a4-line1' and a2='a2-line1' and a1='a1-line1' and a3='a3-line1'
      or a4='a4-line2' and a2='a2-line2' and a1='a1-line2' and a3='a3-line2'
      or a4='a4-line3' and a2='a2-line3' and a1='a1-line3' and a3='a3-line3'
      ...
      or a4='a4-line997' and a2='a2-line997' and a1='a1-line997' and a3='a3-line997'
      or a4='a4-line998' and a2='a2-line998' and a1='a1-line998' and a3='a3-line998'
      or a4='a4-line999' and a2='a2-line999' and a1='a1-line999' and a3='a3-line999'
      or a4='a4-line1000' and a2='a2-line1000' and a1='a1-line1000' and a3='a3-line1000';
      

      (see attachments for full files)

      gives

      "query_block": {
          "select_id": 1,
          "table": {
            "delete": 1,
            "table_name": "t1",
            "access_type": "range",
            "possible_keys": ["PRIMARY", "key1"],
            "key": "PRIMARY",
            "key_length": "456",
            "used_key_parts": ["a1", "a2", "a3", "a4"],
            "rows": 1000,
            "attached_condition": ...
      

      This is fine but during the range optimization one can see that many SEL_IMERGE objects were created.
      Does the optimizer actually estimate many SEL_IMERGE options?

      It considers a range scan on both of the indexes:

      "range_scan_alternatives": [
        PK: range [
          "(a1,a2,a3,a4) = (line1-a1...)"
          ...
          "(a1,a2,a3,a4) = (line999-a1...)"
          ]
        key1: <the same>
      }
      

      Then, there are three variants for index_merge.
      1. A 2-way index_merge where the last OR disjunct is a separate scan.
      2. A 3-way index_merge with last two OR disjuncts are separate scans.
      3. A 2-way index_merge where the last two OR disjuncts are one separate scan.

      First:

      analyzing_index_merge_union {
        indexes_to_merge : [
          { 
            PK: range [
              "(a1,a2,a3,a4) = (line1-a1...)
              ...
              "(a1,a2,a3,a4) = (line999-a1...)
              ]
            key1: <the same>
          },
          {
            PK: range [ "(a1,a2,a3,a4) = (line1000-a1...) ]
            key1: same, line1000.
          }
        ]
      

      Second:

        indexes_to_merge : [
          { 
            PK: range [
              "(a1,a2,a3,a4) = (line1-a1...)
              ...
              "(a1,a2,a3,a4) = (line998-a1...)
              ]
            key1: <the same>
          },
          {
            PK: range [ "(a1,a2,a3,a4) = (line999-a1...) ]
            key1: same, line999.
          }
          {
            PK: range [ "(a1,a2,a3,a4) = (line1000-a1...) ]
            key1: same, line1000.
          }
        ]
      

      Third:

        indexes_to_merge : [
          { 
            PK: range [
              "(a1,a2,a3,a4) = (line1-a1...)
              ...
              "(a1,a2,a3,a4) = (line998-a1...)
              ]
            key1: <the same>
          },
          {
            PK: range [
              "(a1,a2,a3,a4) = (line1000-a1...)
              "(a1,a2,a3,a4) = (line999-a1...)
              ]
            key1: <the same>
          }
        ]
      }
      

      Attachments

        Issue Links

          Activity

            People

              psergei Sergei Petrunia
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.