[MDEV-32785] background innodb stats recalculation causing incorrect rowcounts during multiple row insert statement Created: 2023-11-12  Updated: 2023-11-30

Status: Open
Project: MariaDB Server
Component/s: Storage Engine - InnoDB
Affects Version/s: 11.0.3
Fix Version/s: 11.0

Type: Bug Priority: Major
Reporter: Rex Johnston Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Relates
relates to MDEV-28327 InnoDB persistent statistics fail to ... Closed
relates to MDEV-32690 After CREATE INDEX, InnoDB reports th... Confirmed
relates to MDEV-32901 innodb.mdev-14846 fails in 11.0 Closed
relates to MDEV-31793 LooseScan join on derived table crashes In Progress

 Description   

During a multi-row table insert being executed in one thread, another thread executing dict_stats_update_persistent() is observed to increment ib_table->stat_n_rows for the table being inserted into.

Steps to reproduce.

launch the server normally, repeatedly execute

DROP TABLE IF EXISTS `comments`;
CREATE TABLE `comments` (
    `id` INTEGER,
    `id2` INTEGER,
    `article_id` INTEGER,
    `article_id2` INTEGER,
    `author_id` INTEGER,
    `votes` INTEGER,
    `body` TEXT,
    `published` BOOLEAN,
    PRIMARY KEY (`id`),
    KEY `comments_article_id` (`article_id`, `article_id2`),
    KEY `comments_top_comments_sort` (`votes`, `id`)
);
 
DROP TABLE IF EXISTS `comments_i18n`;
CREATE TABLE `comments_i18n` (
    `id` INTEGER AUTO_INCREMENT,
    `locale` VARCHAR(255),
    `model` VARCHAR(255),
    `foreign_key` INTEGER,
    `field` VARCHAR(255),
    `content` TEXT,
    PRIMARY KEY (`id`),
    UNIQUE KEY `comments_I18N_LOCALE_FIELD` (`locale`, `model`, `foreign_key`, `field`),
    KEY `comments_I18N_FIELD` (`model`, `foreign_key`, `field`)
);
 
INSERT INTO `comments` (`id`, `id2`, `article_id`, `article_id2`, `author_id`, `votes`, `body`, `published`) VALUES
    (1, 2, 1, 2, 1, 1, 'Lorem ipsum dolor sit amet', 1),
    (2, 3, 1, 2, 1, 2, 'Lorem ipsum dolor sit amet', 1),
    (3, 4, 1, 2, 2, 3, 'Lorem ipsum dolor sit amet', 1),
    (4, 5, 1, 2, 2, 4, 'Lorem ipsum dolor sit amet', 0),
    (5, 6, 2, 3, 1, 10, 'Lorem ipsum dolor sit amet', 1),
    (6, 7, 2, 3, 1, 9, 'Lorem ipsum dolor sit amet', 1),
    (7, 8, 2, 3, 2, 8, 'Lorem ipsum dolor sit amet', 1),
    (8, 9, 2, 3, 2, 7, 'Lorem ipsum dolor sit amet', 0);
 
INSERT INTO `comments_i18n` (`locale`, `model`, `foreign_key`, `field`, `content`) VALUES
    ('de_DE', 'Comments', 1, 'body', 'Franz jagt im komplett verwahrlosten Taxi quer durch Bayern'),
    ('de_DE', 'Comments', 2, 'body', 'Der schnelle braune Fuchs springt über den faulen Hund'),
    ('de_DE', 'Comments', 5, 'body', 'Fix Schwyz! quäkt Jürgen blöd vom Paß'),
    ('de_DE', 'Comments', 6, 'body', 'Quod erat demonstrandum.');
 
set optimizer_trace='enabled=on';
 
SELECT
    TopComments.id AS TopComments__id,
    TopComments.article_id AS TopComments__article_id,
    TopComments.body AS TopComments__body,
    TopComments.votes AS TopComments__votes,
    TopComments_body_translation.id AS TopComments_body_translation__id,
    TopComments_body_translation.content AS TopComments_body_translation__content
FROM
    comments TopComments
INNER JOIN
    comments_i18n TopComments_body_translation ON (
        TopComments_body_translation.model = 'Comments'
        AND TopComments_body_translation.field = 'body'
        AND TopComments_body_translation.locale = 'de_DE'
        AND TopComments.id = TopComments_body_translation.foreign_key
    )
WHERE (
    TopComments.article_id in (1, 2)
    AND TopComments.id in (
        SELECT
            __ranked__TopComments.id AS id
        FROM (
            SELECT
                TopComments.id AS TopComments__id,
                TopComments.article_id AS TopComments__article_id,
                TopComments.body AS TopComments__body,
                TopComments.votes AS TopComments__votes,
                TopComments.id AS id,
                (
                    ROW_NUMBER() OVER (
                        PARTITION BY
                            TopComments.article_id
                        ORDER BY
                            TopComments.votes DESC,
                            TopComments.id ASC
                    )
                ) AS __row_number,
                TopComments_body_translation.id AS TopComments_body_translation__id,
                TopComments_body_translation.content AS TopComments_body_translation__content
            FROM
                comments TopComments
            INNER JOIN
                comments_i18n TopComments_body_translation ON (
                    TopComments_body_translation.model = 'Comments'
                    AND TopComments_body_translation.field = 'body'
                    AND TopComments_body_translation.locale = 'de_DE'
                    AND TopComments.id = TopComments_body_translation.foreign_key
                )
            WHERE
                TopComments.article_id in (1, 2)
        )
        __ranked__TopComments
        WHERE
            __ranked__TopComments.__row_number <= 3
    )
)
ORDER BY
    TopComments.votes DESC,
    TopComments.id ASC;
 
DROP TABLE `comments`;
DROP TABLE `comments_i18n`;

perhaps with

for (( i = 1; i <= 1000; i ++ ))
do
    echo "attempt $i"
    mariadb test < see_above.sql
    if [ $? -ne 0 ]; then break; fi
    sleep 0.05
done

A separate bug (to be fixed in MDEV-31793) will cause the server to crash when given an incorrect row count for the table comments. There are 8 rows inserted above.

Running the server like this

rr record sql/mariadbd

will not trigger the bug, but

rr record -h sql/mariadbd

will eventually.

setting a watch point on ib_table->stat_n_rows for the table comments we see this value incremented here

#0  dict_table_n_rows_inc (table=0x50472c05c020) at /home/rex/src/mariadb/server.11.0/storage/innobase/include/dict0dict.inl:339
#1  0x0000558b42259274 in row_insert_for_mysql (mysql_rec=0x50472c0653b0 "\200\001", prebuilt=0x50472c0dad70, ins_mode=ROW_INS_NORMAL) at /home/rex/src/mariadb/server.11.0/storage/innobase/row/row0mysql.cc:1400
#2  0x0000558b4208786f in ha_innobase::write_row (this=0x50472c060be0, record=0x50472c0653b0 "\200\001") at /home/rex/src/mariadb/server.11.0/storage/innobase/handler/ha_innodb.cc:7821
#3  0x0000558b41c6e197 in handler::ha_write_row (this=0x50472c060be0, buf=0x50472c0653b0 "\200\001") at /home/rex/src/mariadb/server.11.0/sql/handler.cc:7827
#4  0x0000558b418041be in write_record (thd=0x50472c000d58, table=0x50472c06f6d8, info=0x3b36249c35d0, sink=0x0) at /home/rex/src/mariadb/server.11.0/sql/sql_insert.cc:2213
#5  0x0000558b41800b32 in mysql_insert (thd=0x50472c000d58, table_list=0x50472c0a1830, fields=..., values_list=..., update_fields=..., update_values=..., duplic=DUP_ERROR, ignore=false, result=0x0)
    at /home/rex/src/mariadb/server.11.0/sql/sql_insert.cc:1156
#6  0x0000558b4185797d in mysql_execute_command (thd=0x50472c000d58, is_called_from_prepared_stmt=false) at /home/rex/src/mariadb/server.11.0/sql/sql_parse.cc:4581
#7  0x0000558b4186394f in mysql_parse (thd=0x50472c000d58,
    rawbuf=0x50472c057ec0 "INSERT INTO `comments` (`id`, `id2`, `article_id`, `article_id2`, `author_id`, `votes`, `body`, `published`) VALUES\n    (1, 2, 1, 2, 1, 1, 'Lorem ipsum dolor sit amet', 1),\n    (2, 3, 1, 2, 1, 2, 'Lorem i
psum dolor sit amet', 1),\n    (3, 4, 1, 2, 2, 3, 'Lorem ipsum dolor sit amet', 1),\n    (4, 5, 1, 2, 2, 4, 'Lorem ipsum dolor sit amet', 0),\n    (5, 6, 2, 3, 1, 10, 'Lorem ipsum dolor sit amet', 1),\n    (6, 7, 2, 3, 1, 9, 'Lorem i
psum dolor sit amet', 1),\n    (7, 8, 2, 3, 2, 8, 'Lorem ipsum dolor sit amet', 1),\n    (8, 9, 2, 3, 2, 7, 'Lorem ipsum dolor sit amet', 0)", length=571, parser_state=0x3b36249c3f30)
    at /home/rex/src/mariadb/server.11.0/sql/sql_parse.cc:8027

in the rr trace recorded this value transitions from 0 to 1, 1 to 2, 2 to 3 in thread 2

then 3 to 4 at

#0  dict_stats_update_persistent (table=0x50472c05c020) at /home/rex/src/mariadb/server.11.0/storage/innobase/dict/dict0stats.cc:3022
#1  0x0000558b4240ea3f in dict_stats_update (table=0x50472c05c020, stats_upd_option=DICT_STATS_RECALC_PERSISTENT) at /home/rex/src/mariadb/server.11.0/storage/innobase/dict/dict0stats.cc:4076
#2  0x0000558b42415d19 in dict_stats_process_entry_from_recalc_pool (thd=0x1e9d400795a8) at /home/rex/src/mariadb/server.11.0/storage/innobase/dict/dict0stats_bg.cc:343
#3  0x0000558b42415fa6 in dict_stats_func () at /home/rex/src/mariadb/server.11.0/storage/innobase/dict/dict0stats_bg.cc:382
#4  0x0000558b424ca4fc in tpool::thread_pool_generic::timer_generic::run (this=0x558b4510b930) at /home/rex/src/mariadb/server.11.0/tpool/tpool_generic.cc:344
#5  0x0000558b424ca635 in tpool::thread_pool_generic::timer_generic::execute (arg=0x558b4510b930) at /home/rex/src/mariadb/server.11.0/tpool/tpool_generic.cc:364
#6  0x0000558b424cf1c1 in tpool::task::execute (this=0x558b4510b970) at /home/rex/src/mariadb/server.11.0/tpool/task.cc:37
#7  0x0000558b424c8c07 in tpool::thread_pool_generic::worker_main (this=0x558b44e479d0, thread_var=0x558b44e48010) at /home/rex/src/mariadb/server.11.0/tpool/tpool_generic.cc:583

in thread 11

then (back in thread 2/dict_table_n_rows_inc() ) from 4 to 5, 5 to 6, 6 to 7, 7 to 8, finally 8 to 9.



 Comments   
Comment by Marko Mäkelä [ 2023-11-13 ]

MDEV-28327 fixed a bug related to the MDEV-515 bulk insert (set foreign_key_checks=0,unique_checks=0) that caused bogus statistics to be reported during or after an insert into an empty table or partition. After this fix, the statistics should reliably reflect either the empty or the bulk-loaded table contents, unless the bulk load is immediately followed by some other modifications on the table.

Before MDEV-28327 was fixed, the InnoDB persistent statistics option STATS_AUTO_RECALC=1 or innodb_stats_auto_recalc=ON never was aware of transactions. The statistics are essentially computed using the READ UNCOMMITTED isolation level. Therefore, it is perfectly normal to get "lies, damn lies, and statistics" if any concurrent transactions are modifying the tables. Even the fact whether the purge of committed history was executed may be observed in the statistics when using the setting innodb_stats_include_delete_marked=ON.

Generated at Thu Feb 08 10:34:00 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.