Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-21068

Test MRR scans over partitioned tables

Details

    • Task
    • Status: Closed (View Workflow)
    • Major
    • Resolution: Fixed
    • 10.3.23
    • Tests
    • None

    Description

      This patch https://github.com/mariadb/server/commit/8eeb689e9fc57afe19a8dbff354b5f9f167867a9
      has added MRR support to partitioning engine. If the partitions ' table engine supports MRR, the ha_partition engine will also support it.

      The patch was targeting only the Spider engine, but actually it enabled MRR for partitioned DS-MRR engines (InnoDB,MyISAM and Aria). MRR support in MyRocks is not in MariaDB yet, but it in the upstream MyRocks so we will get in the next merge.

      MDEV-20611 fixes the basic issues with partitioned DS-MRR. It would be nice to have more test coverage.

      The test should have:

      • A partitioned table that supports [DS-]MRR. Nice to try different partitioning schemes.
      • The table must have an index.
      • The query must use MRR or BKA+MRR to read the data from the partitioned table.
      • AFAIU, there are two different cases:
        • The lookup tuples do not cover all partitioning fields. In this case, for each lookup tuple, ha_partition needs to search for it in all partitions (well, in all partitions that were left after partition pruning)
        • The lookup tuples cover partitioned fields. In this case, ha_partition will split the list of ranges into sub-lists that cover individual partitions.

      Things to cover:

      • non-trivial pushed Index Condition
      • Not-exists optimization (Use LEFT JOIN with WHERE inner_table.not_null_col IS NULL
      • BKAH (hashed BKA).

      Attachments

        Issue Links

          Activity

            elenst Elena Stepanova added a comment - - edited

            First test run done, no new failures, moderately affected by MDEV-14667, MDEV-21085 and some legacy wrong result failures which are reproducible before MDEV-20611.

            Line coverage for the patch from the first run (branch coverage collection hangs, possibly related to MDEV-20694, or something else)

             
            ===== File: sql/ha_partition.cc =====
             
               5478 :      25198794 :        if ((tmp= (*file)->ha_index_end()))
               5479 :             0 :          error= tmp;
               5480 :               :      }
               5481 :       2088492 : +    else if ((*file)->inited == RND)
               5482 :               : +    {
               5483 :               : +      // Possible due to MRR
               5484 :               : +      int tmp;
               5485 :             0 : +      if ((tmp= (*file)->ha_rnd_end()))
               5486 :             0 : +        error= tmp;
               5487 :               : +    }
               5488 :      27287374 :    } while (*(++file));
               5489 :       4221504 :    destroy_record_priority_queue();
               5490 :       8443001 :    DBUG_RETURN(error);
             
               6526 :       2732361 :      else if (unlikely((error= handle_unordered_next(table->record[0], FALSE))))
               6527 :         35527 :        DBUG_RETURN(error);
               6528 :               :  
               6529 :       2732359 : +    if (!(m_mrr_mode & HA_MRR_NO_ASSOCIATION))
               6530 :               : +    {
               6531 :       2401029 : +      *range_info=
               6532 :       2401029 : +        ((PARTITION_KEY_MULTI_RANGE *) m_range_info[m_last_part])->ptr;
               6533 :               : +    }
               6534 :               :    }
               6535 :     186298496 :    DBUG_RETURN(0);
               6536 :               :  }
             
            ===== File: sql/multi_range_read.cc =====
             
               1589 :               :    }
               1590 :               :  
               1591 :       3260248 :    uint add_len= share->key_info[keyno].key_length + primary_file->ref_length; 
               1592 :       3260248 : +  if (get_disk_sweep_mrr_cost(keyno, rows, *flags, bufsz, add_len,
               1593 :               : +                              &dsmrr_cost))
               1594 :             0 :      return TRUE;
               1595 :               : +
               1596 :               :    bool force_dsmrr;
               1597 :               :    /* 
               1598 :               :      If mrr_cost_based flag is not set, then set cost of DS-MRR to be minimum of
             
               1681 :               :    @param rows               E(Number of rows to be scanned)
               1682 :               :    @param flags              Scan parameters (HA_MRR_* flags)
               1683 :               :    @param buffer_size INOUT  Buffer size
               1684 :               : +                            IN: Buffer of size 0 means the function
               1685 :               : +                            will determine the best size and return it.
               1686 :               : +  @param extra_mem_overhead Extra memory overhead of the MRR implementation
               1687 :               : +                            (the function assumes this many bytes of buffer
               1688 :               : +                             space will not be usable by DS-MRR)
               1689 :               :    @param cost        OUT    The cost
               1690 :               :  
               1691 :               :    @retval FALSE  OK
             
               1694 :               :  */
               1695 :               :  
               1696 :       3260262 :  bool DsMrr_impl::get_disk_sweep_mrr_cost(uint keynr, ha_rows rows, uint flags,
               1697 :               : +                                         uint *buffer_size,
               1698 :               : +                                         uint extra_mem_overhead,
               1699 :               : +                                         Cost_estimate *cost)
               1700 :               :  {
               1701 :               :    ulong max_buff_entries, elem_size;
               1702 :               :    ha_rows rows_in_full_step;
             
               1706 :               :  
               1707 :       6520524 :    elem_size= primary_file->ref_length + 
               1708 :       3260262 :               sizeof(void*) * (!MY_TEST(flags & HA_MRR_NO_ASSOCIATION));
               1709 :               :  
               1710 :       3260262 : +  if (!*buffer_size)
               1711 :               : +  {
               1712 :               : +    /*
               1713 :               : +      We are requested to determine how much memory we need.
               1714 :               : +      Request memory to finish the scan in one pass but do not request
               1715 :               : +      more than @@mrr_buff_size.
               1716 :               : +    */
               1717 :       5281504 : +    *buffer_size= (uint) MY_MIN(extra_mem_overhead + elem_size*(ulong)rows,
               1718 :               : +                                MY_MAX(table->in_use->variables.mrr_buff_size,
               1719 :       2640752 : +                                       extra_mem_overhead));
               1720 :               : +  }
               1721 :               : +
               1722 :       3260262 : +  if (elem_size + extra_mem_overhead > *buffer_size)
               1723 :             8 :      return TRUE; /* Buffer has not enough space for even 1 rowid */
               1724 :               :  
               1725 :       3260254 : +  max_buff_entries = (*buffer_size - extra_mem_overhead) / elem_size;
               1726 :               : +
               1727 :               :    /* Number of iterations we'll make with full buffer */
               1728 :       3260254 :    n_full_steps= (uint)floor(rows2double(rows) / max_buff_entries);
               1729 :               :    
             
            ===== File: sql/multi_range_read.h =====
             
                631 :               :    
                632 :               :    bool choose_mrr_impl(uint keyno, ha_rows rows, uint *flags, uint *bufsz, 
                633 :               :                         Cost_estimate *cost);
                634 :               : +  bool get_disk_sweep_mrr_cost(uint keynr, ha_rows rows, uint flags,
                635 :               : +                               uint *buffer_size, uint extra_mem_overhead,
                636 :               : +                               Cost_estimate *cost);
                637 :               :    bool check_cpk_scan(THD *thd, TABLE_SHARE *share, uint keyno, uint mrr_flags);
                638 :               :  
                639 :               :    bool setup_buffer_sharing(uint key_size_in_keybuf, key_part_map key_tuple_map);
             
            ===== File: storage/innobase/handler/ha_innodb.cc =====
             
               6466 :         60912 :  	DBUG_ENTER("ha_innobase::clone");
               6467 :               :  
               6468 :               :  	ha_innobase*	new_handler = static_cast<ha_innobase*>(
               6469 :         30456 : +		handler::clone(m_prebuilt->table->name.m_name, mem_root));
               6470 :               :  
               6471 :         30456 :  	if (new_handler != NULL) {
               6472 :         30456 :  		DBUG_ASSERT(new_handler->m_prebuilt != NULL);
             
            ===== File: storage/maria/ha_maria.cc =====
             
               1000 :       4864300 :  {}
               1001 :               :  
               1002 :               :  
               1003 :         96627 : +handler *ha_maria::clone(const char *name __attribute__((unused)),
               1004 :               : +                         MEM_ROOT *mem_root)
               1005 :               :  {
               1006 :               : +  ha_maria *new_handler=
               1007 :         96627 : +    static_cast <ha_maria *>(handler::clone(file->s->open_file_name.str,
               1008 :         96627 : +                                            mem_root));
               1009 :         96627 :    if (new_handler)
               1010 :               :    {
               1011 :         96627 :      new_handler->file->state= file->state;
             
            ===== File: storage/myisam/ha_myisam.cc =====
             
                702 :       3851265 :     can_enable_indexes(1)
                703 :       3851099 :  {}
                704 :               :  
                705 :         73231 : +handler *ha_myisam::clone(const char *name __attribute__((unused)),
                706 :               : +                          MEM_ROOT *mem_root)
                707 :               :  {
                708 :               : +  ha_myisam *new_handler=
                709 :         73231 : +    static_cast <ha_myisam *>(handler::clone(file->filename, mem_root));
                710 :         73231 :    if (new_handler)
                711 :         73231 :      new_handler->file->state= file->state;
                712 :         73231 :    return new_handler;
            

            Which of course doesn't mean much, apart from the obviously missing parts

            sql/ha_partition.cc:
               5485 : +      if ((tmp= (*file)->ha_rnd_end()))
               5486 : +        error= tmp;
            

            and

               1592 :       3260248 : +  if (get_disk_sweep_mrr_cost(keyno, rows, *flags, bufsz, add_len,
               1593 :               : +                              &dsmrr_cost))
               1594 :             0 :      return TRUE;
            

            elenst Elena Stepanova added a comment - - edited First test run done, no new failures, moderately affected by MDEV-14667 , MDEV-21085 and some legacy wrong result failures which are reproducible before MDEV-20611 . Line coverage for the patch from the first run (branch coverage collection hangs, possibly related to MDEV-20694 , or something else)   ===== File: sql/ha_partition.cc =====   5478 : 25198794 : if ((tmp= (*file)->ha_index_end())) 5479 : 0 : error= tmp; 5480 : : } 5481 : 2088492 : + else if ((*file)->inited == RND) 5482 : : + { 5483 : : + // Possible due to MRR 5484 : : + int tmp; 5485 : 0 : + if ((tmp= (*file)->ha_rnd_end())) 5486 : 0 : + error= tmp; 5487 : : + } 5488 : 27287374 : } while (*(++file)); 5489 : 4221504 : destroy_record_priority_queue(); 5490 : 8443001 : DBUG_RETURN(error);   6526 : 2732361 : else if (unlikely((error= handle_unordered_next(table->record[0], FALSE)))) 6527 : 35527 : DBUG_RETURN(error); 6528 : : 6529 : 2732359 : + if (!(m_mrr_mode & HA_MRR_NO_ASSOCIATION)) 6530 : : + { 6531 : 2401029 : + *range_info= 6532 : 2401029 : + ((PARTITION_KEY_MULTI_RANGE *) m_range_info[m_last_part])->ptr; 6533 : : + } 6534 : : } 6535 : 186298496 : DBUG_RETURN(0); 6536 : : }   ===== File: sql/multi_range_read.cc =====   1589 : : } 1590 : : 1591 : 3260248 : uint add_len= share->key_info[keyno].key_length + primary_file->ref_length; 1592 : 3260248 : + if (get_disk_sweep_mrr_cost(keyno, rows, *flags, bufsz, add_len, 1593 : : + &dsmrr_cost)) 1594 : 0 : return TRUE; 1595 : : + 1596 : : bool force_dsmrr; 1597 : : /* 1598 : : If mrr_cost_based flag is not set, then set cost of DS-MRR to be minimum of   1681 : : @param rows E(Number of rows to be scanned) 1682 : : @param flags Scan parameters (HA_MRR_* flags) 1683 : : @param buffer_size INOUT Buffer size 1684 : : + IN: Buffer of size 0 means the function 1685 : : + will determine the best size and return it. 1686 : : + @param extra_mem_overhead Extra memory overhead of the MRR implementation 1687 : : + (the function assumes this many bytes of buffer 1688 : : + space will not be usable by DS-MRR) 1689 : : @param cost OUT The cost 1690 : : 1691 : : @retval FALSE OK   1694 : : */ 1695 : : 1696 : 3260262 : bool DsMrr_impl::get_disk_sweep_mrr_cost(uint keynr, ha_rows rows, uint flags, 1697 : : + uint *buffer_size, 1698 : : + uint extra_mem_overhead, 1699 : : + Cost_estimate *cost) 1700 : : { 1701 : : ulong max_buff_entries, elem_size; 1702 : : ha_rows rows_in_full_step;   1706 : : 1707 : 6520524 : elem_size= primary_file->ref_length + 1708 : 3260262 : sizeof(void*) * (!MY_TEST(flags & HA_MRR_NO_ASSOCIATION)); 1709 : : 1710 : 3260262 : + if (!*buffer_size) 1711 : : + { 1712 : : + /* 1713 : : + We are requested to determine how much memory we need. 1714 : : + Request memory to finish the scan in one pass but do not request 1715 : : + more than @@mrr_buff_size. 1716 : : + */ 1717 : 5281504 : + *buffer_size= (uint) MY_MIN(extra_mem_overhead + elem_size*(ulong)rows, 1718 : : + MY_MAX(table->in_use->variables.mrr_buff_size, 1719 : 2640752 : + extra_mem_overhead)); 1720 : : + } 1721 : : + 1722 : 3260262 : + if (elem_size + extra_mem_overhead > *buffer_size) 1723 : 8 : return TRUE; /* Buffer has not enough space for even 1 rowid */ 1724 : : 1725 : 3260254 : + max_buff_entries = (*buffer_size - extra_mem_overhead) / elem_size; 1726 : : + 1727 : : /* Number of iterations we'll make with full buffer */ 1728 : 3260254 : n_full_steps= (uint)floor(rows2double(rows) / max_buff_entries); 1729 : :   ===== File: sql/multi_range_read.h =====   631 : : 632 : : bool choose_mrr_impl(uint keyno, ha_rows rows, uint *flags, uint *bufsz, 633 : : Cost_estimate *cost); 634 : : + bool get_disk_sweep_mrr_cost(uint keynr, ha_rows rows, uint flags, 635 : : + uint *buffer_size, uint extra_mem_overhead, 636 : : + Cost_estimate *cost); 637 : : bool check_cpk_scan(THD *thd, TABLE_SHARE *share, uint keyno, uint mrr_flags); 638 : : 639 : : bool setup_buffer_sharing(uint key_size_in_keybuf, key_part_map key_tuple_map);   ===== File: storage/innobase/handler/ha_innodb.cc =====   6466 : 60912 : DBUG_ENTER("ha_innobase::clone"); 6467 : : 6468 : : ha_innobase* new_handler = static_cast<ha_innobase*>( 6469 : 30456 : + handler::clone(m_prebuilt->table->name.m_name, mem_root)); 6470 : : 6471 : 30456 : if (new_handler != NULL) { 6472 : 30456 : DBUG_ASSERT(new_handler->m_prebuilt != NULL);   ===== File: storage/maria/ha_maria.cc =====   1000 : 4864300 : {} 1001 : : 1002 : : 1003 : 96627 : +handler *ha_maria::clone(const char *name __attribute__((unused)), 1004 : : + MEM_ROOT *mem_root) 1005 : : { 1006 : : + ha_maria *new_handler= 1007 : 96627 : + static_cast <ha_maria *>(handler::clone(file->s->open_file_name.str, 1008 : 96627 : + mem_root)); 1009 : 96627 : if (new_handler) 1010 : : { 1011 : 96627 : new_handler->file->state= file->state;   ===== File: storage/myisam/ha_myisam.cc =====   702 : 3851265 : can_enable_indexes(1) 703 : 3851099 : {} 704 : : 705 : 73231 : +handler *ha_myisam::clone(const char *name __attribute__((unused)), 706 : : + MEM_ROOT *mem_root) 707 : : { 708 : : + ha_myisam *new_handler= 709 : 73231 : + static_cast <ha_myisam *>(handler::clone(file->filename, mem_root)); 710 : 73231 : if (new_handler) 711 : 73231 : new_handler->file->state= file->state; 712 : 73231 : return new_handler; Which of course doesn't mean much, apart from the obviously missing parts sql/ha_partition.cc: 5485 : + if ((tmp= (*file)->ha_rnd_end())) 5486 : + error= tmp; and 1592 : 3260248 : + if (get_disk_sweep_mrr_cost(keyno, rows, *flags, bufsz, add_len, 1593 : : + &dsmrr_cost)) 1594 : 0 : return TRUE;
            elenst Elena Stepanova added a comment - - edited

            Comparison test between the patched build and pre-patch build (352e7667) only hit MDEV-21085 (unrelated).

            elenst Elena Stepanova added a comment - - edited Comparison test between the patched build and pre-patch build (352e7667) only hit MDEV-21085 (unrelated).

            sql/ha_partition.cc:
               5485 : +      if ((tmp= (*file)->ha_rnd_end()))
               5486 : +        error= tmp;
            

            We don't have storage engines that can return an error from rnd_end.

             
               1592 :       3260248 : +  if (get_disk_sweep_mrr_cost(keyno, rows, *flags, bufsz, add_len,
               1593 :               : +                              &dsmrr_cost))
               1594 :             0 :      return TRUE;
            

            This can fire under this condition:

              if (elem_size + extra_mem_overhead > *buffer_size)
                return TRUE; /* Buffer has not enough space for even 1 rowid */
            

            so this needs a table with a long clustered PK.

            psergei Sergei Petrunia added a comment - sql/ha_partition.cc: 5485 : + if ((tmp= (*file)->ha_rnd_end())) 5486 : + error= tmp; We don't have storage engines that can return an error from rnd_end. 1592 : 3260248 : + if (get_disk_sweep_mrr_cost(keyno, rows, *flags, bufsz, add_len, 1593 : : + &dsmrr_cost)) 1594 : 0 : return TRUE; This can fire under this condition: if (elem_size + extra_mem_overhead > *buffer_size) return TRUE; /* Buffer has not enough space for even 1 rowid */ so this needs a table with a long clustered PK.

            Round 10 (there were several rounds in between with nothing worth mentioning).
            More unrelated positives due to MDEV-21183, possibly a duplicate of something filed earlier, hard to tell.
            Restarting tests again, now without compressed columns.

            elenst Elena Stepanova added a comment - Round 10 (there were several rounds in between with nothing worth mentioning). More unrelated positives due to MDEV-21183 , possibly a duplicate of something filed earlier, hard to tell. Restarting tests again, now without compressed columns.

            This will be tested further as part of general optimizer & server testing. Closing.

            Roel Roel Van de Paar added a comment - This will be tested further as part of general optimizer & server testing. Closing.

            People

              Roel Roel Van de Paar
              psergei Sergei Petrunia
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.