[MDEV-25581] MS FTS Bug 103523 also applicable to MariaDB Server (Was: Test MS Bug 103523 against MD010521-bb-10.5-thiru) Created: 2021-05-03  Updated: 2023-04-24  Resolved: 2022-06-13

Status: Closed
Project: MariaDB Server
Component/s: Full-text Search
Affects Version/s: 10.2, 10.3, 10.4, 10.5
Fix Version/s: 10.6.9, 10.7.5, 10.8.4, 10.9.2

Type: Bug Priority: Critical
Reporter: Roel Van de Paar Assignee: Thirunarayanan Balathandayuthapani
Resolution: Fixed Votes: 1
Labels: None

Issue Links:
Problem/Incident
causes MDEV-30996 insert.. select in presence of full t... Closed
Relates
relates to MDEV-15237 "Can't write; duplicate key in table"... Closed
relates to MDEV-27913 innodb_ft_cache_size max possible val... Closed

 Description   


 Comments   
Comment by Roel Van de Paar [ 2021-05-03 ]

Steps taken:

  • Build using instrumented build (bb-10.5-thiru) as debug build.
  • Setup init + slap in same way as described in bug. Slap runs correctly:

    10.5.10 b0ef1e45d453b210e6480758aea348ebc13606b9 (Debug)

    10.5.10>show processlist;
    +-------+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
    | Id    | User | Host      | db   | Command | Time | State    | Info                                                                                                 | Progress |
    +-------+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
    |     5 | root | localhost | test | Query   |    0 | starting | show processlist                                                                                     |    0.000 |
    |     6 | root | localhost | NULL | Sleep   |  449 |          | NULL                                                                                                 |    0.000 |
    | 39317 | root | localhost | test | Sleep   |  272 |          | NULL                                                                                                 |    0.000 |
    | 60050 | root | localhost | test | Query   |    5 | Commit   | insert into t1 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    | 60051 | root | localhost | test | Query   |    5 | Commit   | insert into t1 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    +-------+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
    5 rows in set (0.000 sec)
    

  • For the time being, Innodb_fts_optimize_queue_count does not increase:

    10.5.10 b0ef1e45d453b210e6480758aea348ebc13606b9 (Debug)

    10.5.10>show global status like 'innodb_fts%';
    +---------------------------------+-------+
    | Variable_name                   | Value |
    +---------------------------------+-------+
    | Innodb_fts_optimize_queue_count | 0     |
    +---------------------------------+-------+
    1 row in set (0.002 sec)
    

Comment by Roel Van de Paar [ 2021-05-03 ]
  • Additional attempt with all 4 processes in parallel.

    10.5.10 b0ef1e45d453b210e6480758aea348ebc13606b9 (Debug)

    10.5.10>show processlist; 
    +------+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
    | Id   | User | Host      | db   | Command | Time | State    | Info                                                                                                 | Progress |
    +------+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
    |    5 | root | localhost | NULL | Sleep   |   37 |          | NULL                                                                                                 |    0.000 |
    |    6 | root | localhost | NULL | Sleep   |   37 |          | NULL                                                                                                 |    0.000 |
    |    8 | root | localhost | NULL | Sleep   |   37 |          | NULL                                                                                                 |    0.000 |
    |    7 | root | localhost | NULL | Sleep   |   37 |          | NULL                                                                                                 |    0.000 |
    |  969 | root | localhost | test | Query   |    0 | starting | show processlist                                                                                     |    0.000 |
    | 9300 | root | localhost | test | Query   |    1 | Commit   | insert into t3 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    | 9301 | root | localhost | test | Query   |    1 | Commit   | insert into t3 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    | 9496 | root | localhost | test | Query   |    0 | Commit   | insert into t4 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    | 9497 | root | localhost | test | Query   |    0 | Commit   | insert into t4 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    | 9500 | root | localhost | test | Query   |    0 | Commit   | insert into t1 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    | 9501 | root | localhost | test | Query   |    0 | Commit   | insert into t1 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    | 9504 | root | localhost | test | Query   |    0 | Commit   | insert into t2 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    | 9505 | root | localhost | test | Query   |    0 | Commit   | insert into t2 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `c |    0.000 |
    +------+------+-----------+------+---------+------+----------+------------------------------------------------------------------------------------------------------+----------+
    13 rows in set (0.000 sec)
    

    Which gives a small increase, then decrease of innodb_fts (and subsequent similar fluctuations)

    10.5.10 b0ef1e45d453b210e6480758aea348ebc13606b9 (Debug)

    10.5.10>show global status like 'innodb_fts%';
    +---------------------------------+-------+
    | Variable_name                   | Value |
    +---------------------------------+-------+
    | Innodb_fts_optimize_queue_count | 3     |
    +---------------------------------+-------+
    1 row in set (0.001 sec)
     
    10.5.10>show global status like 'innodb_fts%';
    +---------------------------------+-------+
    | Variable_name                   | Value |
    +---------------------------------+-------+
    | Innodb_fts_optimize_queue_count | 2     |
    +---------------------------------+-------+
    1 row in set (0.003 sec)
    

Comment by Roel Van de Paar [ 2021-05-03 ]

Whilst Innodb_fts_optimize_queue_count seems stable at about 3 (with 4 parallel background threads of mysqlslap, each with concurrency=2) memory use indeed keeps growing/increases:

/test/MD010521-bb-10.5-thiru$ ps --sort -rss -eo pid,pmem,rss,vsz,comm | grep -E 'PID|2194347'
    PID %MEM   RSS    VSZ COMMAND
2194347  0.4 595740 6609272 mysqld
...4-7 minutes...
/test/MD010521-bb-10.5-thiru$ ps --sort -rss -eo pid,pmem,rss,vsz,comm | grep -E 'PID|2194347'
    PID %MEM   RSS    VSZ COMMAND
2194347  0.5 630544 6580544 mysqld
...5-10 minutes...
/test/MD010521-bb-10.5-thiru$ ps --sort -rss -eo pid,pmem,rss,vsz,comm | grep -E 'PID|2194347'
    PID %MEM   RSS    VSZ COMMAND
2194347  0.6 745260 6580544 mysqld

This continual growth can also be visually observed MB by MB in htop; 560MB.....580.....600.....620.....640..... etc. (ended test at 770MB)

Comment by Roel Van de Paar [ 2021-05-03 ]

Optimized builds also show the issue, and provides faster memory increases.

/test/MD010521-bb-10.5-thiru_opt$ ps --sort -rss -eo pid,pmem,rss,vsz,comm | grep -E 'PID|3107079'
    PID %MEM   RSS    VSZ COMMAND
3107079  0.5 629716 4648156 mysqld
...few minutes...
/test/MD010521-bb-10.5-thiru_opt$ ps --sort -rss -eo pid,pmem,rss,vsz,comm | grep -E 'PID|3107079'
    PID %MEM   RSS    VSZ COMMAND
3107079  0.8 992436 6462472 mysqld

Still, Innodb_fts_optimize_queue_count is again steady at 3 in this case also:

10.5.10 b0ef1e45d453b210e6480758aea348ebc13606b9 (Optimized)

10.5.10-opt>show status like 'innodb_fts%';
+---------------------------------+-------+
| Variable_name                   | Value |
+---------------------------------+-------+
| Innodb_fts_optimize_queue_count | 3     |
+---------------------------------+-------+
1 row in set (0.001 sec)

(And it looks like the GLOBAL and SESSION Innodb_fts_optimize_queue_count provide the exact same number at all times; likely just the same value)

Comment by Roel Van de Paar [ 2021-05-03 ]

$ cat slap
#!/bin/bash
 
./bin/mysqlslap -S/test/MD010521-bb-10.5-thiru_opt/socket.sock -u root --create-schema='test' --query='insert into t1 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `col10`, `col11`, `col12`, `col13`, `col14`, `col15`, `col16`, `col17`, `col18`, `col19`, `col20`, `col21`, `col22`, `col23`, `col24`, `col25`, `col26`, `col27`, `col28`, `col29`, `col30`) VALUES ( MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()))' --concurrency=2 --iterations=1000000 &
 
./bin/mysqlslap -S/test/MD010521-bb-10.5-thiru_opt/socket.sock -u root --create-schema='test' --query='insert into t2 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `col10`, `col11`, `col12`, `col13`, `col14`, `col15`, `col16`, `col17`, `col18`, `col19`, `col20`, `col21`, `col22`, `col23`, `col24`, `col25`, `col26`, `col27`, `col28`, `col29`, `col30`) VALUES ( MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()))' --concurrency=2 --iterations=1000000 &
 
./bin/mysqlslap -S/test/MD010521-bb-10.5-thiru_opt/socket.sock -u root --create-schema='test' --query='insert into t3 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `col10`, `col11`, `col12`, `col13`, `col14`, `col15`, `col16`, `col17`, `col18`, `col19`, `col20`, `col21`, `col22`, `col23`, `col24`, `col25`, `col26`, `col27`, `col28`, `col29`, `col30`) VALUES ( MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()))' --concurrency=2 --iterations=1000000 &
 
./bin/mysqlslap -S/test/MD010521-bb-10.5-thiru_opt/socket.sock -u root --create-schema='test' --query='insert into t4 ( `col01`, `col02`, `col03`, `col04`, `col05`, `col06`, `col07`, `col08`, `col09`, `col10`, `col11`, `col12`, `col13`, `col14`, `col15`, `col16`, `col17`, `col18`, `col19`, `col20`, `col21`, `col22`, `col23`, `col24`, `col25`, `col26`, `col27`, `col28`, `col29`, `col30`) VALUES ( MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()), MD5(RAND()))' --concurrency=2 --iterations=1000000 &
 
$ cat slap_prep.sql 
create table t1 ( `id` int unsigned NOT NULL AUTO_INCREMENT, `col01` text, `col02` text, `col03` text, `col04` text, `col05` text, `col06` text, `col07` text, `col08` text, `col09` text, `col10` text, `col11` text, `col12` text, `col13` text, `col14` text, `col15` text, `col16` text, `col17` text, `col18` text, `col19` text, `col20` text, `col21` text, `col22` text, `col23` text, `col24` text, `col25` text, `col26` text, `col27` text, `col28` text, `col29` text, `col30` text, PRIMARY KEY (`id`), FULLTEXT KEY (`col01`), FULLTEXT KEY (`col02`), FULLTEXT KEY (`col03`), FULLTEXT KEY (`col04`), FULLTEXT KEY (`col05`), FULLTEXT KEY (`col06`), FULLTEXT KEY (`col07`), FULLTEXT KEY (`col08`), FULLTEXT KEY (`col09`), FULLTEXT KEY (`col10`), FULLTEXT KEY (`col11`), FULLTEXT KEY (`col12`), FULLTEXT KEY (`col13`), FULLTEXT KEY (`col14`), FULLTEXT KEY (`col15`), FULLTEXT KEY (`col16`), FULLTEXT KEY (`col17`), FULLTEXT KEY (`col18`), FULLTEXT KEY (`col19`), FULLTEXT KEY (`col20`), FULLTEXT KEY (`col21`), FULLTEXT KEY (`col22`), FULLTEXT KEY (`col23`), FULLTEXT KEY (`col24`), FULLTEXT KEY (`col25`), FULLTEXT KEY (`col26`), FULLTEXT KEY (`col27`), FULLTEXT KEY (`col28`), FULLTEXT KEY (`col29`), FULLTEXT KEY (`col30`)) engine=innodb;
 
create table t2 ( `id` int unsigned NOT NULL AUTO_INCREMENT, `col01` text, `col02` text, `col03` text, `col04` text, `col05` text, `col06` text, `col07` text, `col08` text, `col09` text, `col10` text, `col11` text, `col12` text, `col13` text, `col14` text, `col15` text, `col16` text, `col17` text, `col18` text, `col19` text, `col20` text, `col21` text, `col22` text, `col23` text, `col24` text, `col25` text, `col26` text, `col27` text, `col28` text, `col29` text, `col30` text, PRIMARY KEY (`id`), FULLTEXT KEY (`col01`), FULLTEXT KEY (`col02`), FULLTEXT KEY (`col03`), FULLTEXT KEY (`col04`), FULLTEXT KEY (`col05`), FULLTEXT KEY (`col06`), FULLTEXT KEY (`col07`), FULLTEXT KEY (`col08`), FULLTEXT KEY (`col09`), FULLTEXT KEY (`col10`), FULLTEXT KEY (`col11`), FULLTEXT KEY (`col12`), FULLTEXT KEY (`col13`), FULLTEXT KEY (`col14`), FULLTEXT KEY (`col15`), FULLTEXT KEY (`col16`), FULLTEXT KEY (`col17`), FULLTEXT KEY (`col18`), FULLTEXT KEY (`col19`), FULLTEXT KEY (`col20`), FULLTEXT KEY (`col21`), FULLTEXT KEY (`col22`), FULLTEXT KEY (`col23`), FULLTEXT KEY (`col24`), FULLTEXT KEY (`col25`), FULLTEXT KEY (`col26`), FULLTEXT KEY (`col27`), FULLTEXT KEY (`col28`), FULLTEXT KEY (`col29`), FULLTEXT KEY (`col30`)) engine=innodb;
 
create table t3 ( `id` int unsigned NOT NULL AUTO_INCREMENT, `col01` text, `col02` text, `col03` text, `col04` text, `col05` text, `col06` text, `col07` text, `col08` text, `col09` text, `col10` text, `col11` text, `col12` text, `col13` text, `col14` text, `col15` text, `col16` text, `col17` text, `col18` text, `col19` text, `col20` text, `col21` text, `col22` text, `col23` text, `col24` text, `col25` text, `col26` text, `col27` text, `col28` text, `col29` text, `col30` text, PRIMARY KEY (`id`), FULLTEXT KEY (`col01`), FULLTEXT KEY (`col02`), FULLTEXT KEY (`col03`), FULLTEXT KEY (`col04`), FULLTEXT KEY (`col05`), FULLTEXT KEY (`col06`), FULLTEXT KEY (`col07`), FULLTEXT KEY (`col08`), FULLTEXT KEY (`col09`), FULLTEXT KEY (`col10`), FULLTEXT KEY (`col11`), FULLTEXT KEY (`col12`), FULLTEXT KEY (`col13`), FULLTEXT KEY (`col14`), FULLTEXT KEY (`col15`), FULLTEXT KEY (`col16`), FULLTEXT KEY (`col17`), FULLTEXT KEY (`col18`), FULLTEXT KEY (`col19`), FULLTEXT KEY (`col20`), FULLTEXT KEY (`col21`), FULLTEXT KEY (`col22`), FULLTEXT KEY (`col23`), FULLTEXT KEY (`col24`), FULLTEXT KEY (`col25`), FULLTEXT KEY (`col26`), FULLTEXT KEY (`col27`), FULLTEXT KEY (`col28`), FULLTEXT KEY (`col29`), FULLTEXT KEY (`col30`)) engine=innodb;
 
create table t4 ( `id` int unsigned NOT NULL AUTO_INCREMENT, `col01` text, `col02` text, `col03` text, `col04` text, `col05` text, `col06` text, `col07` text, `col08` text, `col09` text, `col10` text, `col11` text, `col12` text, `col13` text, `col14` text, `col15` text, `col16` text, `col17` text, `col18` text, `col19` text, `col20` text, `col21` text, `col22` text, `col23` text, `col24` text, `col25` text, `col26` text, `col27` text, `col28` text, `col29` text, `col30` text, PRIMARY KEY (`id`), FULLTEXT KEY (`col01`), FULLTEXT KEY (`col02`), FULLTEXT KEY (`col03`), FULLTEXT KEY (`col04`), FULLTEXT KEY (`col05`), FULLTEXT KEY (`col06`), FULLTEXT KEY (`col07`), FULLTEXT KEY (`col08`), FULLTEXT KEY (`col09`), FULLTEXT KEY (`col10`), FULLTEXT KEY (`col11`), FULLTEXT KEY (`col12`), FULLTEXT KEY (`col13`), FULLTEXT KEY (`col14`), FULLTEXT KEY (`col15`), FULLTEXT KEY (`col16`), FULLTEXT KEY (`col17`), FULLTEXT KEY (`col18`), FULLTEXT KEY (`col19`), FULLTEXT KEY (`col20`), FULLTEXT KEY (`col21`), FULLTEXT KEY (`col22`), FULLTEXT KEY (`col23`), FULLTEXT KEY (`col24`), FULLTEXT KEY (`col25`), FULLTEXT KEY (`col26`), FULLTEXT KEY (`col27`), FULLTEXT KEY (`col28`), FULLTEXT KEY (`col29`), FULLTEXT KEY (`col30`)) engine=innodb;

Comment by Roel Van de Paar [ 2021-05-03 ]

Memory use can easily grow up to 3GB+

/test/MD010521-bb-10.5-thiru_opt$ ps --sort -rss -eo pid,pmem,rss,vsz,comm | grep -E 'PID|4108773'
    PID %MEM   RSS    VSZ COMMAND
4108773  2.6 3297560 8947608 mysqld

Comment by Roel Van de Paar [ 2021-05-03 ]

No related memory issues observed with ASAN build (optimized).

Comment by Thirunarayanan Balathandayuthapani [ 2021-05-17 ]

Problem should exist from 10.2+ version. So changing affected version and fix version

Comment by Thirunarayanan Balathandayuthapani [ 2021-05-27 ]

Analysis:

1) fts_add_doc_by_id() requesting fts sync of the table(fts_sync) once the cache size reaches 1/10th of innodb_ft_cache_size
2) fts_sync() releases cache lock when writing the word node. This is done to avoid block concurrent DMLs on the same table.
By doing this, Insert will keep on increasing the cache memory and SYNC will take more time to complete the row present in cache.

Comment by Thirunarayanan Balathandayuthapani [ 2022-02-11 ]

Pasting the conversation related to the fts sync issue.

marko  12 minutes ago
This is a producer-consumer problem, right? Can we do similar to other producer-consumer systems, such as the change buffer? If something wants to insert into the cache and the cache is already too full, try to increase the effort to apply the cache? Or is it not possible to add more than 1 consumer thread? (For the change buffer, merges can occur for arbitrarily many pages at nearly the same time, from multiple threads.)
 
thiru  9 minutes ago
producer here is a DML thread. Won't it affect the DML performance if it has to do the extra job of applying ?
 
marko  9 minutes ago
Sure, but if the cache is full, what else can we do?
 
thiru  8 minutes ago
we can increase the consumer thread. That is feasible solution.
 
thiru  6 minutes ago
DML can apply the log if it reaches 3/4 of cache. we can play around with the threshold later. We should suggest the user to increase the cache size too if they're expecting to use more fulltext index

Above ideas should be helpful in solving this issue.

Comment by Marko Mäkelä [ 2022-03-02 ]

I am reducing the priority of this, because fixing this is nontrivial and MDEV-27913 might provide an adequate work-around. This is a producer-consumer problem where the consumer cannot keep up with the producers.

Comment by Marko Mäkelä [ 2022-05-24 ]

The prototype patch looks nice, removing a lot of code. We discussed that there is more room for cleanup, and the acquisition of dict_sys.latch could be removed. However, it might be tricky to fix this in an earlier major version than 10.6, where the DDL and the dictionary cache handling were greatly simplified.

Comment by Marko Mäkelä [ 2022-06-02 ]

I posted some comments. It looks good, except that the commit message does not explicitly say what is replacing the asynchronous message FTS_MSG_SYNC_TABLE (I suppose it is the function fts_sync()) nor how we will guarantee that wsrep_sst_method=rsync (MDEV-25975) is going to work.

Comment by Matthias Leich [ 2022-06-03 ]

origin/bb-10.6-MDEV-25581 14a4d1af11f8f947a8bb4f14f2036ded9207c632 2022-06-02T19:05:47+05:30
behaved well in RQG testing.

Comment by Matthias Leich [ 2022-06-07 ]

origin/bb-10.6-MDEV-25581 613e6d20dcd1c5e0e6d9093d07b566ef8d3b7c9a 2022-06-07T11:35:36+05:30
behaved well in RQG testing.

Generated at Thu Feb 08 09:38:46 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.