[MDEV-24088] Assertion in InnoDB's FTS code may be triggered by repeated words fed to simple_parser plugin Created: 2020-11-02 Updated: 2022-01-25 Resolved: 2022-01-24 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | Full-text Search, Storage Engine - InnoDB |
| Affects Version/s: | 10.2, 10.3, 10.4, 10.5, 10.6 |
| Fix Version/s: | 10.2.42, 10.3.33, 10.4.23, 10.5.14, 10.6.6 |
| Type: | Bug | Priority: | Major |
| Reporter: | Rinat Ibragimov (Inactive) | Assignee: | Sergei Golubchik |
| Resolution: | Fixed | Votes: | 1 |
| Labels: | consistency, corruption, upstream |
| Description |
|
It's possible to trigger an assertion in debug builds by executing the following statements:
Here is what the stack trace looks like:
|
| Comments |
| Comment by Alice Sherepa [ 2020-11-02 ] |
|
Thank you for the report! I reproduced it as described on 10.2-10.5.
| Comment by Marko Mäkelä [ 2021-06-08 ] |
|
I found the stack trace in this ticket while searching for fts_commit. While working on
In If you think about that above code, it should be glaringly obvious that we are doing things completely wrong.
The consistency breakage could explain some strange errors in fulltext search, such as a duplicate FTS_DOC_ID. I think that the logic needs to be moved earlier. This is not trivial. On a quick read, it would seem that for the SQL layer it needs to be done at or before innobase_commit_ordered(). That function cannot currently return an error. Besides, fts_commit() might not be fast, like the function comment promises.
Update: This problem was filed separately as MDEV-24608.
| Comment by Marko Mäkelä [ 2021-07-26 ] |
|
serg, a fix exists that would involve changing the fulltext parser API.
| Comment by Sergei Golubchik [ 2022-01-09 ] |
|
marko, please, take a look at the suggested fix, commit 11a0b10d130
| Comment by Marko Mäkelä [ 2022-01-14 ] |
|
serg, your fix looks nice and simple to me. thiru, can you review it? I think that you are more familiar with the fulltext subsystem.
| Comment by Thirunarayanan Balathandayuthapani [ 2022-01-18 ] |
|
serg Your patch stores the wrong position of the token while using simple parser.
Same test case without simple parser:
Token position should store the delta offset from the previous token present in the record.
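To make "the delta offset from the previous token" concrete, here is a minimal Python sketch. It is only an illustration and not InnoDB's code: the function names `to_deltas` and `from_deltas` are hypothetical, and it assumes the positions of one token within one record are given as ascending absolute offsets.

```python
def to_deltas(positions):
    """Convert ascending absolute offsets into offsets relative to the previous one.

    Hypothetical helper for illustration; the first delta is measured from 0.
    """
    deltas = []
    previous = 0
    for pos in positions:
        deltas.append(pos - previous)
        previous = pos
    return deltas


def from_deltas(deltas):
    """Rebuild the absolute offsets from the deltas produced by to_deltas()."""
    absolute = []
    running = 0
    for delta in deltas:
        running += delta
        absolute.append(running)
    return absolute


# Round trip over made-up offsets of one repeated word.
offsets = [3, 10, 11]
assert from_deltas(to_deltas(offsets)) == offsets
```

Under this scheme the stored number depends on where the previous occurrence was, which is why a parser that reports positions differently ends up with different stored values.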
| Comment by Sergei Golubchik [ 2022-01-21 ] |
|
The position it stores isn't "wrong" as such, it's just different. The position is not part of the API, so the number stored in internal InnoDB tables is irrelevant and does not affect MATCH results. If it were a well-defined number, like "the delta offset from the previous token present in the record", then it would've been wrong. But as long as it's some internal, unimportant number that doesn't affect results, it doesn't matter what it is.
| Comment by Thirunarayanan Balathandayuthapani [ 2022-01-24 ] |
|
serg, your patch is good to go. The simple parser has the MYSQL_FTPARSER_SIMPLE_MODE.
The position is used only inside InnoDB, for proximity search queries. The simple parser doesn't support phrase search queries at all.
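As a rough illustration of why positions matter only to proximity search, the sketch below checks whether two words occur within a given distance of each other. It is an assumption-laden example and not InnoDB's algorithm: `within_distance` is a hypothetical name, and each word is assumed to come with a list of absolute offsets in the same document.

```python
def within_distance(positions_a, positions_b, max_distance):
    """Return True if any occurrence of word A lies within max_distance of word B.

    Hypothetical helper: positions are absolute offsets in one document.
    """
    return any(
        abs(a - b) <= max_distance
        for a in positions_a
        for b in positions_b
    )


# Made-up offsets: word A at 0 and 40, word B at 10.
print(within_distance([0, 40], [10], 10))
```

A plain MATCH that only asks whether a word is present never consults these offsets, which is consistent with the observation that the stored value does not affect its results.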