Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-17492

Benchmark AHI to find cases where it's useful

Details

    Description

      AHI is probably a subject to removal. It's clear that managing a hash table costs some performance but it's not clear where it could be beneficial.

      One example of harm was found in https://jira.mariadb.org/browse/MDEV-16796 where TRUNCATE was slowed down.

      axel please find cases where AHI improves server speed. Then we can measure whether it's needed at all or not.

      Attachments

        Issue Links

          Activity

            The adaptive hash index is disabled for temporary tables in MariaDB Server 10.2+.

            Starting with MariaDB Server 10.2, the btr_search_latch has been partitioned into btr_search_latches[] of innodb_adaptive_hash_index_parts (btr_ahi_parts) elements, default 8, ranging from 1 to 512. The adaptive hash index itself is still global; only the latch has been partitioned by essentially index_id (which is still global; in MySQL 5.7, on my suggestion, some previsions were implemented to allow each tablespace to have their own domain of index_id):

            UNIV_INLINE
            rw_lock_t*
            btr_get_search_latch(const dict_index_t* index)
            {
            	ut_ad(index != NULL);
             
            	ulint	ifold = ut_fold_ulint_pair(static_cast<ulint>(index->id),
            					   static_cast<ulint>(index->space));
             
            	return(btr_search_latches[ifold % btr_ahi_parts]);
            }
            

            So, all requests for a particular index will always use the same btr_search_latches[] element.

            marko Marko Mäkelä added a comment - The adaptive hash index is disabled for temporary tables in MariaDB Server 10.2+. Starting with MariaDB Server 10.2, the btr_search_latch has been partitioned into btr_search_latches[] of innodb_adaptive_hash_index_parts ( btr_ahi_parts ) elements, default 8, ranging from 1 to 512. The adaptive hash index itself is still global; only the latch has been partitioned by essentially index_id (which is still global; in MySQL 5.7, on my suggestion, some previsions were implemented to allow each tablespace to have their own domain of index_id): UNIV_INLINE rw_lock_t* btr_get_search_latch( const dict_index_t* index) { ut_ad(index != NULL);   ulint ifold = ut_fold_ulint_pair( static_cast <ulint>(index->id), static_cast <ulint>(index->space));   return (btr_search_latches[ifold % btr_ahi_parts]); } So, all requests for a particular index will always use the same btr_search_latches[] element.
            axel Axel Schwenke added a comment - - edited

            I tested now a lot more workloads to find cases where the AHI increases performance. The biggest improvements could be found for strictly read-only workloads. But even then it depends a lot of the specific queries. From the sysbench OLTP query mix only the point-select workload benefits from the AHI. Depending on thread count I see up to 7% improvement when the AHI is enabled. None of the range queries profits from the AHI. Details are in attached spreadsheet AHI_point-select.ods

            I also tested a modified read-only workload where rows are accessed via secondary key. Now point selects don't benefit from the AHI. Range queries benefit from the AHI at low thread counts but are considerably slower at thread counts higher than 8. The reason is the btr_search_latch that becomes extremely busy. This can be seen in detail in attached spreadsheet AHI_secondary_range.ods

            axel Axel Schwenke added a comment - - edited I tested now a lot more workloads to find cases where the AHI increases performance. The biggest improvements could be found for strictly read-only workloads. But even then it depends a lot of the specific queries. From the sysbench OLTP query mix only the point-select workload benefits from the AHI. Depending on thread count I see up to 7% improvement when the AHI is enabled. None of the range queries profits from the AHI. Details are in attached spreadsheet AHI_point-select.ods I also tested a modified read-only workload where rows are accessed via secondary key. Now point selects don't benefit from the AHI. Range queries benefit from the AHI at low thread counts but are considerably slower at thread counts higher than 8. The reason is the btr_search_latch that becomes extremely busy. This can be seen in detail in attached spreadsheet AHI_secondary_range.ods
            axel Axel Schwenke added a comment -

            Another tested workload is TPC-C, using the Lua implementation for sysbench 1.0. Since this workload includes heavy writes (about 50% of the queries are writes) the btr_search_latch becomes the bottleneck. Even partitioning the AHI does not help with that. Here are the TPC-C queries per scond for different thread counts using AHI with 16/1 partitions vs. no AHI:

            threads 16-AHI  1-AHI   no-AHI
            1       8489.5  9057.8  9599.2
            2       15614   17754   18439
            4       29015   34259   35078
            8       46936   62416   64824
            16      67218   96819   101095
            32      103217  145655  154564
            64      111819  157915  177283
            128     114516  161011  175250
            256     113827  148542  162780
            

            axel Axel Schwenke added a comment - Another tested workload is TPC-C, using the Lua implementation for sysbench 1.0. Since this workload includes heavy writes (about 50% of the queries are writes) the btr_search_latch becomes the bottleneck. Even partitioning the AHI does not help with that. Here are the TPC-C queries per scond for different thread counts using AHI with 16/1 partitions vs. no AHI: threads 16-AHI 1-AHI no-AHI 1 8489.5 9057.8 9599.2 2 15614 17754 18439 4 29015 34259 35078 8 46936 62416 64824 16 67218 96819 101095 32 103217 145655 154564 64 111819 157915 177283 128 114516 161011 175250 256 113827 148542 162780
            axel Axel Schwenke added a comment - - edited

            Now here is a final set of results, testing a mixture of point-selects (which benefit from AHI) and writes (which lead to contention of the btr_search_latch). The dataset and queries are from sysbench OLTP, the writes are updates on indexed/non-indexed columns. I vary the percentage of writes between 1% and 20%.

            For details, look at the attached spread sheet AHI_vs_writes.ods.

            Some observations:

            • even with 1% writes the btr_search_latch contention is visible in the case of a single AHI partition
            • with 2% or more writes, the configuration with AHI disabled starts to shine
            • at 5% writes a single AHI partition delivers only little more than half the througput compared to no AHI - at least for thread counts up to 64
            • at very high thread counts the difference flattens out. This is due to other locks becoming busy (log_sys_write_mutex, LOCK_table_cache)

            I'm trying to draw a conclusion: the AHI has the potential to increase performance by up to 10% - but only with perfectly matching workload. Further more this workload must be (almost) read-only. Even 2% writes is enough to cancel any positive effect of the AHI. I'd say it's OK to remove the AHI from InnoDB. It's causing more trouble than it's worth.

            axel Axel Schwenke added a comment - - edited Now here is a final set of results, testing a mixture of point-selects (which benefit from AHI) and writes (which lead to contention of the btr_search_latch). The dataset and queries are from sysbench OLTP, the writes are updates on indexed/non-indexed columns. I vary the percentage of writes between 1% and 20%. For details, look at the attached spread sheet AHI_vs_writes.ods . Some observations: even with 1% writes the btr_search_latch contention is visible in the case of a single AHI partition with 2% or more writes, the configuration with AHI disabled starts to shine at 5% writes a single AHI partition delivers only little more than half the througput compared to no AHI - at least for thread counts up to 64 at very high thread counts the difference flattens out. This is due to other locks becoming busy (log_sys_write_mutex, LOCK_table_cache) I'm trying to draw a conclusion: the AHI has the potential to increase performance by up to 10% - but only with perfectly matching workload. Further more this workload must be (almost) read-only. Even 2% writes is enough to cancel any positive effect of the AHI. I'd say it's OK to remove the AHI from InnoDB. It's causing more trouble than it's worth.

            We recently encountered a case where a customer experienced a 50% performance drop after upgrading from 10.4 to 10.6.19. Specifically, their long-running queries saw execution times double, increasing from 1 hour to over 2 hours.

            Upon investigation, we found the root cause: AHI (Adaptive Hash Index) was disabled by default in 10.6, whereas it was enabled by default in 10.4.
            This seems to show that there are still scenarios, perhaps less common, where this feature has a significant impact on performance, so I decided to reopen the ticket

            patrizio.tamorri Patrizio Tamorri added a comment - We recently encountered a case where a customer experienced a 50% performance drop after upgrading from 10.4 to 10.6.19. Specifically, their long-running queries saw execution times double, increasing from 1 hour to over 2 hours. Upon investigation, we found the root cause: AHI (Adaptive Hash Index) was disabled by default in 10.6, whereas it was enabled by default in 10.4. This seems to show that there are still scenarios, perhaps less common, where this feature has a significant impact on performance, so I decided to reopen the ticket

            People

              axel Axel Schwenke
              kevg Eugene Kosov (Inactive)
              Votes:
              1 Vote for this issue
              Watchers:
              9 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.