Running the query on upstream, the estimate is still greater than the number of rows in the table:

explain select c1 from t1 where c1 > 5 limit 10;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t1 range i i 9 NULL 13492 Using where; Using index
In the function ha_rocksdb::info we see that a call is made to update_stats, which sets stats.records to the appropriate value when a primary key is available. Here is the code snippet:

for (uint i = 0; i < m_tbl_def->m_key_count; i++) {
  if (is_pk(i, table, m_tbl_def)) {
    stats.data_file_length = m_pk_descr->m_stats.m_actual_disk_size;
    stats.records = m_pk_descr->m_stats.m_rows;
  } else {
    stats.index_file_length += m_key_descr_arr[i]->m_stats.m_actual_disk_size;
  }
}
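To make the effect of that loop concrete, here is a minimal stand-alone sketch. The KeyStats struct and the compute_table_stats() helper are made-up names for illustration, not MyRocks code: only the primary key contributes to records and data_file_length, while every secondary key only grows index_file_length.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-key statistics; the real code reads
// m_key_descr_arr[i]->m_stats instead.
struct KeyStats {
  bool is_pk;                 // true for the primary key
  uint64_t rows;              // estimated row count for this key
  uint64_t actual_disk_size;  // on-disk size of this key's data
};

struct TableStats {
  uint64_t records = 0;
  uint64_t data_file_length = 0;
  uint64_t index_file_length = 0;
};

// Mirrors the loop in ha_rocksdb::update_stats: the PK sets records and
// data_file_length; secondary keys only add to index_file_length.
TableStats compute_table_stats(const std::vector<KeyStats> &keys) {
  TableStats stats;
  for (const KeyStats &k : keys) {
    if (k.is_pk) {
      stats.data_file_length = k.actual_disk_size;
      stats.records = k.rows;
    } else {
      stats.index_file_length += k.actual_disk_size;
    }
  }
  return stats;
}
```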
So for our example we have a primary key, and stats.records gets set from it. Continuing in ha_rocksdb::info, we have:

if (stats.records == 0 ||
    (rocksdb_force_compute_memtable_stats &&
     rocksdb_debug_optimizer_n_rows == 0))
{

rocksdb_debug_optimizer_n_rows is 0 by default, and rocksdb_force_compute_memtable_stats is TRUE by default, so we enter this IF branch.
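The shape of that condition is easy to check in isolation. In this sketch the parameters are made-up stand-ins for the server variables, with the defaults described above (force_compute = TRUE, debug_n_rows = 0); it shows the branch is taken even when stats.records is already non-zero:

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for rocksdb_force_compute_memtable_stats (default TRUE) and
// rocksdb_debug_optimizer_n_rows (default 0).
bool takes_memtable_branch(uint64_t records, bool force_compute,
                           uint64_t debug_n_rows) {
  return records == 0 || (force_compute && debug_n_rows == 0);
}
```

With the defaults, takes_memtable_branch(10000, true, 0) is true, so the memtable estimate is added on top of a non-zero on-disk row count.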
Inside the IF branch we have:

rdb->GetApproximateMemTableStats(m_pk_descr->get_cf(), r,
                                 &memtableCount, &memtableSize);

and then we add the memtable estimates on top of the existing stats:

stats.records += memtableCount;
stats.data_file_length += memtableSize;
So stats.records was already 10K, and on top of that we add the value of memtableCount returned by GetApproximateMemTableStats.
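Putting the two steps together shows how the estimate can exceed the real table size. This is a toy model with illustrative numbers (add_memtable_estimate() is a made-up name): if the rows are already counted in the on-disk statistics and the memtable estimate reports them again, the final estimate is far above the real row count.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the accumulation in ha_rocksdb::info: the memtable
// estimate is simply added on top of the on-disk row count, so rows
// represented in both places are counted twice.
uint64_t add_memtable_estimate(uint64_t on_disk_records,
                               uint64_t memtable_count) {
  return on_disk_records + memtable_count;  // stats.records += memtableCount
}
```

With an on-disk estimate of 10000 and a memtable estimate of 19800, the result is 29800 rows reported for a 10K-row table.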
So I added this printout patch: https://gist.github.com/spetrunia/82c494ad7f26b93e6774bdd875c7fa3b
and ran this test (the testcase from this bug, with some output suppressed): https://gist.github.com/spetrunia/73a1616827dccfbbae9008381cbb6080
I get this output, with 20K rows reported:
TEST RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------
worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 16000..16019
create table t1 (id1 bigint, id2 bigint, c1 bigint, c2 bigint, c3 bigint, c4 bigint, c5 bigint, c6 bigint, c7 bigint, primary key (id1, id2), index i(c1, c2))engine=rocksdb;
explain select c1 from t1 where c1 > 5 limit 10;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t1 range i i 9 NULL 19800 Using where; Using index
rocksdb._e [ pass ] 3989
However, looking into mysqld.err I see that we got the 20K number from GetApproximateMemTableStats:
AAA: GetApproximateSizes() call #1, value 0
AAA: GetApproximateMemTableStats() call#1 returns 0
AAA: GetApproximateSizes() call #1, value 0
AAA: GetApproximateMemTableStats() call#1 returns 20000
AAA: GetApproximateMemTableStats() call#2 returns 20000
AAA: GetApproximateMemTableStats() call#2 returns 20000
After changing the unique key to a non-unique one, I see GetApproximateMemTableStats return as many as 30K rows:
AAA: GetApproximateSizes() call #1, value 0
AAA: GetApproximateMemTableStats() call#1 returns 0
AAA: GetApproximateSizes() call #1, value 0
AAA: GetApproximateMemTableStats() call#1 returns 13103
AAA: GetApproximateMemTableStats() call#2 returns 30000
AAA: GetApproximateMemTableStats() call#2 returns 30000
This is fairly easy to reproduce.
The EXPLAIN output in the test returns either about 14K or 30K:
explain select c1 from t1 where c1 > 5 limit 10;
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE t1 range i i 9 NULL 29700 Using where; Using index
In the comment dated 2018-07-16, Varun wrote:
"In the function ha_rocksdb::info we see there is a call made to update_stats, that sets the stats.records to the appropriate value when there is a primary key available."
I'm debugging and I don't see anything wrong:
- ha_rocksdb::info calls ha_rocksdb::update_stats, which sets stats.records from m_pk_descr->m_stats.m_rows. The latter comes from the on-disk data statistics.
- then, ha_rocksdb::info has this piece:
if (stats.records == 0) {
  rdb->GetApproximateSizes(m_pk_descr->get_cf(), &r, 1, &sz,
                           include_flags);
That is, if the on-disk data statistics reported zero rows, we make a call to get the current estimate.
- then, we add the memtable estimates (which we either get from GetApproximateMemTableStats or take from the cached value).
So far I don't see how either the on-disk or the memtable data estimate could be counted twice.
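The flow described in the two bullets can be sketched as follows. estimate_records() is a made-up helper name (the real code calls rdb->GetApproximateSizes and GetApproximateMemTableStats): the on-disk estimate is recomputed only when the stored statistics say zero, and the memtable count is added exactly once, so neither source is added twice.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the flow in ha_rocksdb::info as described above: the
// approximate-size estimate only replaces records when the on-disk
// statistics reported zero, and the memtable estimate is added once.
uint64_t estimate_records(uint64_t on_disk_records,
                          uint64_t approximate_size_records,
                          uint64_t memtable_count) {
  uint64_t records = on_disk_records;
  if (records == 0)
    records = approximate_size_records;  // rdb->GetApproximateSizes(...)
  return records + memtable_count;       // stats.records += memtableCount
}
```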
I can't reproduce it with the latest 10.3 branch (commit 7f73f5e4e5bdae80561ff2596f5142965397a1b1):

MariaDB [test]> select count(*) from t1;
+----------+
| count(*) |
+----------+
|    10000 |
+----------+
1 row in set (0.756 sec)

MariaDB [test]> explain select c1 from t1 where c1 > 5 limit 10;
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id   | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|    1 | SIMPLE      | t1    | range | i             | i    | 9       | NULL | 9900 | Using where; Using index |
+------+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.001 sec)
Here is what I see on buildbot while working on MDEV-15253:
CURRENT_TEST: rocksdb.rocksdb_range2
--- D:/winx64-packages/build/src/storage/rocksdb/mysql-test/rocksdb/r/rocksdb_range2.result 2018-11-14 10:37:57.000000000 +0000
+++ D:\winx64-packages\build\src\storage\rocksdb\mysql-test\rocksdb\r\rocksdb_range2.reject 2018-11-14 15:00:03.924782500 +0000
@@ -8,7 +8,7 @@
 10000
 explain select c1 from t1 where c1 > 5 limit 10;
 id select_type table type possible_keys key key_len ref rows Extra
-1 SIMPLE t1 index i i 18 NULL # Using where; Using index
+1 SIMPLE t1 range i i 9 NULL # Using where; Using index
 drop table t1;

https://buildbot.askmonty.org/buildbot/builders/winx64-packages/builds/10524/steps/test/logs/stdio