[MDEV-17492] Benchmark AHI to find cases where it's useful Created: 2018-10-18  Updated: 2023-10-12  Resolved: 2019-09-03

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Fix Version/s: N/A

Type: Task Priority: Critical
Reporter: Eugene Kosov (Inactive) Assignee: Axel Schwenke
Resolution: Fixed Votes: 1
Labels: None

Attachments: File AHI_point-select.ods     File AHI_secondary_range.ods     File AHI_vs_writes.ods    
Issue Links:
Relates
relates to MDEV-12121 Introduce build option WITH_INNODB_AH... Closed
relates to MDEV-20487 Set innodb_adaptive_hash_index=OFF by... Closed
relates to MDEV-16796 TRUNCATE TABLE slowdown with innodb_f... Closed

 Description   

AHI is probably a subject to removal. It's clear that managing a hash table costs some performance but it's not clear where it could be beneficial.

One example of harm was found in https://jira.mariadb.org/browse/MDEV-16796 where TRUNCATE was slowed down.

axel please find cases where AHI improves server speed. Then we can measure whether it's needed at all or not.



 Comments   
Comment by Marko Mäkelä [ 2018-10-20 ]

For the record, there was this change in MySQL 5.7:

commit 00ec81a9efc1108376813f15935b52c451a268cf
Author: Marko Mäkelä <marko.makela@oracle.com>
Date: Thu Jun 11 13:19:50 2015 +0300

Bug#21198396 REINTRODUCE ADAPTIVE HASH INDEX FIELD PREFIXES
TO SPEED UP SEQUENTIAL INSERTS

This is a regression from the fix of
Bug#16852278 SIMPLIFY RECORD COMPARISONS
which removed the n_bytes fields from the adaptive hash index.

For certain workloads, such as LOAD DATA INFILE of sorted data,
the ability of the adaptive hash index to build indexes on
byte prefixes of arbitrary binary fields is essential.
In the reported test case, there is an integer primary key,
and we end up building adaptive hash indexes on the
most significant bytes of this key.

A more appropriate fix (which I was not allowed to consider at that time) would have been to investigate why the sorted inserts (into a table that did not have any secondary indexes) end up performing AHI lookups at all, and to fix that. InnoDB should remember and take advantage of the last insert position, both in the cursor (table handle) and in each index page.

It could be useful to test the described scenario (unfortunately no more details are available) with both values of innodb_adaptive_hash_index. It could have been a BIGINT UNSIGNED key (8 bytes). Without the above patch in 5.7 (that code was removed by me earlier during the 5.7 development), the AHI would not be used much, or at all.

Comment by Marko Mäkelä [ 2019-06-25 ]

wlad is says that on Windows, testing with 16 to 64 concurrent users, with the thread pool enabled, the AHI is not helping much. This seems to be in line with a benchmark by Mark Callaghan from 5 years ago.

Because of the above mentioned MySQL 5.7 change, I tried a similar INSERT benchmark:

--source include/have_innodb.inc
--source include/have_sequence.inc
CREATE TABLE t1 (a BIGINT UNSIGNED PRIMARY KEY, b CHAR(255) NOT NULL DEFAULT '', c CHAR(255) NOT NULL DEFAULT '') ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
INSERT INTO t1 (a) SELECT * FROM seq_1_to_10000000;
DROP TABLE t1;

invoked like this:

for ahi in 1 0 1 0 1 0
do ./mtr --mysqld=--innodb-log-file-size=100m --mysqld=--innodb-adaptive-hash-index=$ahi --mysqld=--skip-innodb-buffer-pool-dump-at-shutdown --mysqld=--innodb-buffer-pool-size=128m --mysqld=--innodb-page-size=4k main.ahi
done

There was some variation in the execution times, but the adaptive hash index was not systematically faster. I also rebuilt the server without the adaptive hash index altogether:

cmake -DWITH_INNODB_AHI=OFF
make -j$(nproc)

The execution time remained similar.

I believe that corruption reports, such as MDEV-18815, are a strong enough reason to disable the adaptive hash index by default. Those corruptions could have started to occur when MySQL 5.6 and MariaDB Server 10.0 introduced atomic memory access for the field buf_fix_count. That could have broken the special logic around buf_fix_count==0. I can remember some cases of AHI corruption during the development MySQL 5.6 already.

Theoretically, disabling the adaptive hash index could cause some loss of performance in a special case. Here is some analysis. But is this worth the added code complexity and hard-to-chase bugs?

Comment by Marko Mäkelä [ 2019-06-25 ]

For the record: Traditionally, the adaptive hash index was always enabled in InnoDB. The option innodb_adaptive_hash_index for disabling it was introduced in the MySQL 5.1 series. Later, it became possible to change it with SET GLOBAL while the server is running.

I would suggest to disable the adaptive hash index by default. I would also deprecate&ignore the parameter and remove the code in a future version. Perhaps we could change the build option to default to WITH_INNODB_AHI=OFF in 10.5 already?

Comment by Axel Schwenke [ 2019-07-02 ]

I've run a sysbench OLTP benchmark on 10.4.6 with 3 different AHI configurations:

01 ... AH enabled, 16 partitions
02 ... AHI enabled, not partitioned
03 ... AHI disabled

# oltp_read_only
#thd    01      02      03
1       13098   13220   13413
2       27399   27366   27430
4       51195   51209   50803
8       94001   94098   93907
16      159700  160839  158330
32      260426  258744  256135
64      261776  261027  256641
128     253985  253562  250109
256     254862  257903  255052

-> rather small advantage with the AHI enabled

# oltp_read_write
#thd    01      02      03
1       11421   11653   11536
2       22414   22724   22912
4       43003   43383   43386
8       82690   82587   83548
16      128420  130178  129297
32      222172  230177  229942
64      236455  239085  241364
128     229690  229087  236008
256     215202  212033  213638

-> undecided

# oltp_write_only
#thd    01      02      03
1       13523   12232   14716
2       26829   24014   29676
4       48891   43479   56952
8       86071   69671   103918
16      159986  138336  168802
32      231250  210607  240942
64      268309  236845  292276
128     275251  245588  300034
256     258803  231385  288288

-> visible improvement from disabling the AHI, partitioned AHI behaves better. Very probably a locking issue.

# oltp_update_index
#thd    01      02      03
1       6732.9  6288.1  8263.1
2       15069   13911   17247
4       27481   23956   32595
8       46390   36278   58606
16      88046   68900   93487
32      130322  110882  141056
64      150297  129456  162672
128     148600  128495  161833
256     137251  117105  153698

-> same conclusion as for write-only workload

# oltp_update_non_index
#thd    01      02      03
1       11668   11157   11417
2       22639   21999   22474
4       41038   42108   42215
8       75529   74748   77345
16      114840  117617  116145
32      149631  148641  150975
64      165583  166864  172958
128     167827  169183  175871
256     159341  161391  169384

-> much smaller effect as for the other update-only workload. This is probably due to the nature of those updates. update-index just increments an INT column where update-non-index overwrites a CHAR(120) column. This second UPDATE moves much more data around which kind of masks most effects from the AHI.

I'm going to do more tests with TPCC.

Comment by Manjot Singh (Inactive) [ 2019-07-17 ]

Just a note that unless you have 32+ cores, many threads, and many tables, in my experience 16 is high. I have rarely seen more than 4 partitions in production use.

Comment by Axel Schwenke [ 2019-07-17 ]

manjot - thanks for your feedback. As a matter of fact I have done systematic testing how the number of AHI partitions affects performance. I couldn't find any drawback from having many AHI partitions (unlike having many buffer pool instances). It is true that going from say 8 AHI partitions to 16 didn't have much positive effect on performance. But I've never seen any negative effect as well.

I have some more results in the pipeline. From this it looks that the main problem with the AHI is the btr_search_latch that protects shared AHI data structures. This latch becomes hot when the workload contains writes. Partitioning the AHI can help to reduce (but not remove) contention on this latch. Conclusion: unless your workload is strictly read-only, the AHI should be partitioned.

Comment by Marko Mäkelä [ 2019-07-19 ]

The adaptive hash index is disabled for temporary tables in MariaDB Server 10.2+.

Starting with MariaDB Server 10.2, the btr_search_latch has been partitioned into btr_search_latches[] of innodb_adaptive_hash_index_parts (btr_ahi_parts) elements, default 8, ranging from 1 to 512. The adaptive hash index itself is still global; only the latch has been partitioned by essentially index_id (which is still global; in MySQL 5.7, on my suggestion, some previsions were implemented to allow each tablespace to have their own domain of index_id):

UNIV_INLINE
rw_lock_t*
btr_get_search_latch(const dict_index_t* index)
{
	ut_ad(index != NULL);
 
	ulint	ifold = ut_fold_ulint_pair(static_cast<ulint>(index->id),
					   static_cast<ulint>(index->space));
 
	return(btr_search_latches[ifold % btr_ahi_parts]);
}

So, all requests for a particular index will always use the same btr_search_latches[] element.

Comment by Axel Schwenke [ 2019-08-27 ]

I tested now a lot more workloads to find cases where the AHI increases performance. The biggest improvements could be found for strictly read-only workloads. But even then it depends a lot of the specific queries. From the sysbench OLTP query mix only the point-select workload benefits from the AHI. Depending on thread count I see up to 7% improvement when the AHI is enabled. None of the range queries profits from the AHI. Details are in attached spreadsheet AHI_point-select.ods

I also tested a modified read-only workload where rows are accessed via secondary key. Now point selects don't benefit from the AHI. Range queries benefit from the AHI at low thread counts but are considerably slower at thread counts higher than 8. The reason is the btr_search_latch that becomes extremely busy. This can be seen in detail in attached spreadsheet AHI_secondary_range.ods

Comment by Axel Schwenke [ 2019-08-27 ]

Another tested workload is TPC-C, using the Lua implementation for sysbench 1.0. Since this workload includes heavy writes (about 50% of the queries are writes) the btr_search_latch becomes the bottleneck. Even partitioning the AHI does not help with that. Here are the TPC-C queries per scond for different thread counts using AHI with 16/1 partitions vs. no AHI:

threads 16-AHI  1-AHI   no-AHI
1       8489.5  9057.8  9599.2
2       15614   17754   18439
4       29015   34259   35078
8       46936   62416   64824
16      67218   96819   101095
32      103217  145655  154564
64      111819  157915  177283
128     114516  161011  175250
256     113827  148542  162780

Comment by Axel Schwenke [ 2019-09-03 ]

Now here is a final set of results, testing a mixture of point-selects (which benefit from AHI) and writes (which lead to contention of the btr_search_latch). The dataset and queries are from sysbench OLTP, the writes are updates on indexed/non-indexed columns. I vary the percentage of writes between 1% and 20%.

For details, look at the attached spread sheet AHI_vs_writes.ods.

Some observations:

  • even with 1% writes the btr_search_latch contention is visible in the case of a single AHI partition
  • with 2% or more writes, the configuration with AHI disabled starts to shine
  • at 5% writes a single AHI partition delivers only little more than half the througput compared to no AHI - at least for thread counts up to 64
  • at very high thread counts the difference flattens out. This is due to other locks becoming busy (log_sys_write_mutex, LOCK_table_cache)

I'm trying to draw a conclusion: the AHI has the potential to increase performance by up to 10% - but only with perfectly matching workload. Further more this workload must be (almost) read-only. Even 2% writes is enough to cancel any positive effect of the AHI. I'd say it's OK to remove the AHI from InnoDB. It's causing more trouble than it's worth.

Generated at Thu Feb 08 08:36:52 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.