[MDEV-33383] fts query crashes in fts_query_calculate_ranking() - Jira

Details

Type: Bug
Status: Closed (View Workflow)
Priority: Critical
Resolution: Fixed
Affects Version/s: 10.5.23
Fix Version/s: 10.5.25, 10.6.18, 10.11.8, 11.0.6, 11.2.4, 11.1.5, 11.4.2
Component/s: Full-text Search, Storage Engine - InnoDB
Labels:
None
Environment:
linux

Description

Description:
fts query crashes in fts_query_calculate_ranking().

#0 0x00007f9d2b85b387 in raise () from /lib64/libc.so.6
#1 0x00007f9d2b85ca78 in abort () from /lib64/libc.so.6
#2 0x0000000001246471 in my_server_abort ()
at sql/signal_handler.cc:303
#3 0x0000000002233faa in my_abort ()
at mysys/my_init.cc:258
#4 0x000000000259ea5e in ut_dbg_assertion_failed (expr=expr@entry=0x30e5f02 "ret == 0",
file=file@entry=0x38fcf38 "storage/innobase/fts/fts0que.cc", line=line@entry=3285)
at storage/innobase/ut/ut0dbg.cc:99
#5 0x0000000002783269 in fts_query_calculate_ranking (ranking=0x7f9c163c8150, query=0x7f9d1f92ca10)
at storage/innobase/fts/fts0que.cc:3285
#6 fts_query_prepare_result (result=0x7f9c162033a0, query=0x7f9d1f92ca10)
at storage/innobase/fts/fts0que.cc:3436
#7 fts_query_get_result (result=<optimized out>, query=0x7f9d1f92ca10)
at storage/innobase/fts/fts0que.cc:3475
#8 fts_query (trx=trx@entry=0x7f9d24202728, index=index@entry=0x7f9c21eac888, flags=flags@entry=9,
query_str=query_str@entry=0x7f9c1623d2a0 "雪人", query_len=query_len@entry=6,
result=result@entry=0x7f9d1f92ed60, limit=<optimized out>)
at storage/innobase/fts/fts0que.cc:3793
#9 0x00000000023bf647 in ha_innobase::ft_init_ext (this=0x7f9c1629fc30, flags=9, keynr=3, key=<optimized out>)
at storage/innobase/handler/ha_innodb.cc:11267

Suggested fix:
The root cause is that compare function for rbtree is wrong.

/** Compare two fts_ranking_t doc_ids.
@return < 0 if n1 < n2, 0 if n1 == n2, > 0 if n1 > n2 */
static inline int fts_ranking_doc_id_cmp(const void p1, /!< in: id1 */
const void p2) /!< in: id2 */
{
const fts_ranking_t *rk1 = (const fts_ranking_t *)p1;
const fts_ranking_t *rk2 = (const fts_ranking_t *)p2;

return ((int)(rk1->doc_id - rk2->doc_id)); --wrong！！！
}

It's totally wrong to compare two integers as below, no matter signed or
unsigned.

Attachments

Issue Links

relates to

MDEV-33401 fts_savepoint_t::tables seems to duplicate trx_t::mod_tables

Open

MDEV-33431 Latching order violation reported fil_system.sys_space.latch and ibuf_pessimistic_insert_mutex

Closed

MDEV-33438 mariadb-backup fails to preserve innodb_encrypt_tables=1

Closed

Activity

Ascending order - Click to sort in descending order

Marko Mäkelä added a comment - 2024-02-05 16:14

I see that fts_ranking_doc_id_cmp() is only used in one compilation unit (always passed as a function pointer to rbt_create(), yet defined in another. It does not seem to make sense to specify a function pointer as inline. The pointer will always be assigned to ib_rbt_tree::compare.

The other big problem that I see there is that fts_ranking_t::doc_id is doc_id_t, which is a 64-bit unsigned integer. Comparison by subtraction is completely fine, but truncating a 64-bit result to a 32-bit int definitely is not. If the difference is 1U<<31 or more, the return value would be incorrect.

The function comment does not promise to normalize the result to -1, 0, 1, and the callers in ut0rbt.cc are checking the result for negative, positive, or 0.

An answer in https://stackoverflow.com/questions/66844378/fast-saturating-integer-conversion suggests an interesting trick for converting uint64_t to uint32_t without any conditional branches. Any x > ~0U would be converted to ~0U:

uint32_t saturating_conversion(uint64_t x)

  return uint32_t(((-(x >> 32)) | (x << 32)) >> 32);

This is almost what we need here. We would want to convert signed integers in a similar way. In C+26 there would be saturate_cast, but we will be stuck to C+11 for quite a while. I am sure that the above can be tweaked for 2’s complement 64-bit arithmetics.

Marko Mäkelä added a comment - 2024-02-05 16:14 I see that fts_ranking_doc_id_cmp() is only used in one compilation unit (always passed as a function pointer to rbt_create() , yet defined in another. It does not seem to make sense to specify a function pointer as inline . The pointer will always be assigned to ib_rbt_tree::compare . The other big problem that I see there is that fts_ranking_t::doc_id is doc_id_t , which is a 64-bit unsigned integer. Comparison by subtraction is completely fine, but truncating a 64-bit result to a 32-bit int definitely is not. If the difference is 1U<<31 or more, the return value would be incorrect. The function comment does not promise to normalize the result to -1, 0, 1, and the callers in ut0rbt.cc are checking the result for negative, positive, or 0. An answer in https://stackoverflow.com/questions/66844378/fast-saturating-integer-conversion suggests an interesting trick for converting uint64_t to uint32_t without any conditional branches. Any x > ~0U would be converted to ~0U : uint32_t saturating_conversion(uint64_t x) { return uint32_t(((-(x >> 32)) | (x << 32)) >> 32); } This is almost what we need here. We would want to convert signed integers in a similar way. In C+ 26 there would be saturate_cast , but we will be stuck to C +11 for quite a while. I am sure that the above can be tweaked for 2’s complement 64-bit arithmetics.

Marko Mäkelä added a comment - 2024-02-05 16:16

The line numbers in the stack trace do not match mariadb-10.5.23, but I assume that the code is the following:

		ret = rbt_search(query->word_freqs, &parent, &word);

		/* It must exist. */

		ut_a(ret == 0);

Marko Mäkelä added a comment - 2024-02-05 16:16 The line numbers in the stack trace do not match mariadb-10.5.23 , but I assume that the code is the following: ret = rbt_search(query->word_freqs, &parent, &word); /* It must exist. */ ut_a(ret == 0);

Marko Mäkelä added a comment - 2024-02-06 08:53

I think that I figured out how to implement a signed saturating conversion using 2’s complement arithmetics. My trick is to temporarily convert negative values to non-negative ones, and then to apply a saturating unsigned conversion to 1 bit less, so that we can restore the sign bit. I first implemented it for 16-to-8-bit conversion so that I could test the full range:

#include <cstdio>

#include <cstdint>

#include <cinttypes>

int8_t saturating_conversion(int16_t x)

  const int16_t sign(x >> 15), u(x ^ sign);

  return int8_t(((((-(u >> 7)) | (u << 9)) >> 9) & uint8_t(~0U) >> 1) ^

                int8_t(sign));

int main()

  for (uint16_t x= 0; ++x; )

    printf("% " PRIi16 " : %" PRIi8 "\n", x, saturating_conversion(x));

Then I ported this to 64-bit and 32-bit integers and tested a few values:

#include <cstdio>

#include <cstdint>

#include <cinttypes>

int32_t saturating_conversion(int64_t x)

  const int64_t sign{x >> 63}, u{x ^ sign};

  return int32_t(((((-(u >> 31)) | (u << 33)) >> 33) & (~0U >> 1)) ^

                 int32_t(sign));

int main()

  const int64_t tab[]=

    { 1000000000000, 8589934592, 4294967296, 4294967295, 4294967294 };

  for (int64_t x : tab)

    printf("% " PRIi64 " : %" PRIi32 "\n", x, saturating_conversion(x));

  for (int64_t x : tab)

    printf("% " PRIi64 " : %" PRIi32 "\n", -x, saturating_conversion(-x));

  for (int64_t x= 1LL << 33, d= 1LL<<32; x > 0; x-= d, d >>= 1, d++)

    printf("% " PRIi64 " : %" PRIi32 "\n", x, saturating_conversion(x));

  for (int64_t x= -1LL << 33, d= 1LL<<32; x < 0; x+= d, d >>= 1, d++)

    printf("% " PRIi64 " : %" PRIi32 "\n", x, saturating_conversion(x));

A translation of the 64-to-32-bit signed saturating convresion to the AMD64 ISA is not that long, and it could be at least 2 instructions less when inlined:

	movq	%rdi, %rdx

	sarq	$63, %rdx

	xorq	%rdx, %rdi

	movq	%rdi, %rax

	salq	$32, %rdi

	sarq	$32, %rax

	negq	%rax

	orq	%rdi, %rax

	sarq	$32, %rax

	xorl	%edx, %eax

ret

I see that a saturating conversion is missing in fts_trx_row_doc_id_cmp(), fts_ranking_doc_id_cmp(), and fts_doc_id_cmp(). The latter is only being passed to ib_vector_sort(), which is an inline wrapper for qsort(3), which indeed expects an int comparison result. Can you post a pull request for fixing all these?

I think that the call ib_vector_sort(doc_ids, fts_doc_id_cmp) deserves to be a non-inline function by itself, because it is being called from 3 different compilation units. Or it should be replaced with simple std::sort(), which would allow an inline comparison to be used.

As an alternative fix, we could replace the homebrew red-black tree rbt_ with std::map and the homebrew ib_vector with std::vector. It could make debugging easier too.

Marko Mäkelä added a comment - 2024-02-06 08:53 I think that I figured out how to implement a signed saturating conversion using 2’s complement arithmetics. My trick is to temporarily convert negative values to non-negative ones, and then to apply a saturating unsigned conversion to 1 bit less, so that we can restore the sign bit. I first implemented it for 16-to-8-bit conversion so that I could test the full range: #include <cstdio> #include <cstdint> #include <cinttypes> int8_t saturating_conversion(int16_t x) { const int16_t sign(x >> 15), u(x ^ sign); return int8_t(((((-(u >> 7)) | (u << 9)) >> 9) & uint8_t(~0U) >> 1) ^ int8_t(sign)); } int main() { for (uint16_t x= 0; ++x; ) printf ( "% " PRIi16 " : %" PRIi8 "\n" , x, saturating_conversion(x)); } Then I ported this to 64-bit and 32-bit integers and tested a few values: #include <cstdio> #include <cstdint> #include <cinttypes> int32_t saturating_conversion(int64_t x) { const int64_t sign{x >> 63}, u{x ^ sign}; return int32_t(((((-(u >> 31)) | (u << 33)) >> 33) & (~0U >> 1)) ^ int32_t(sign)); } int main() { const int64_t tab[]= { 1000000000000, 8589934592, 4294967296, 4294967295, 4294967294 }; for (int64_t x : tab) printf ( "% " PRIi64 " : %" PRIi32 "\n" , x, saturating_conversion(x)); for (int64_t x : tab) printf ( "% " PRIi64 " : %" PRIi32 "\n" , -x, saturating_conversion(-x)); for (int64_t x= 1LL << 33, d= 1LL<<32; x > 0; x-= d, d >>= 1, d++) printf ( "% " PRIi64 " : %" PRIi32 "\n" , x, saturating_conversion(x)); for (int64_t x= -1LL << 33, d= 1LL<<32; x < 0; x+= d, d >>= 1, d++) printf ( "% " PRIi64 " : %" PRIi32 "\n" , x, saturating_conversion(x)); } A translation of the 64-to-32-bit signed saturating convresion to the AMD64 ISA is not that long, and it could be at least 2 instructions less when inlined: movq %rdi, %rdx sarq $63, %rdx xorq %rdx, %rdi movq %rdi, %rax salq $32, %rdi sarq $32, %rax negq %rax orq %rdi, %rax sarq $32, %rax xorl %edx, %eax ret I see that a saturating conversion is missing in fts_trx_row_doc_id_cmp() , fts_ranking_doc_id_cmp() , and fts_doc_id_cmp() . The latter is only being passed to ib_vector_sort() , which is an inline wrapper for qsort(3) , which indeed expects an int comparison result. Can you post a pull request for fixing all these? I think that the call ib_vector_sort(doc_ids, fts_doc_id_cmp) deserves to be a non-inline function by itself, because it is being called from 3 different compilation units. Or it should be replaced with simple std::sort() , which would allow an inline comparison to be used. As an alternative fix, we could replace the homebrew red-black tree rbt_ with std::map and the homebrew ib_vector with std::vector . It could make debugging easier too.

Marko Mäkelä added a comment - 2024-02-06 09:06

Note: In the above I assumed that the difference of the doc_id values would always be less than 1ULL << 63. If we used std::map and std::sort and let them compare the unsigned 64-bit values directly, these problems would not exist.

Marko Mäkelä added a comment - 2024-02-06 09:06 Note: In the above I assumed that the difference of the doc_id values would always be less than 1ULL << 63 . If we used std::map and std::sort and let them compare the unsigned 64-bit values directly, these problems would not exist.

Marko Mäkelä added a comment - 2024-02-06 13:48

I played a bit with https://godbolt.org to find out that the following type of code would compile to conditional ALU instructions on ARMv7, ARMv8, IA-32 and AMD64, already starting with GCC 4.8.5, which is the oldest compiler that we currently target:

int cmp2(unsigned long long a, unsigned long long b) { return a < b ? -1 : a > b; }

Only for POWER, even the most recent GCC would generate conditional branches. I guess that it is an ISA limitation.

As an alternative, I will try to see if we can simply replace qsort with std::sort and the homebrew red-black tree with std::map.

Marko Mäkelä added a comment - 2024-02-06 13:48 I played a bit with https://godbolt.org to find out that the following type of code would compile to conditional ALU instructions on ARMv7, ARMv8, IA-32 and AMD64, already starting with GCC 4.8.5, which is the oldest compiler that we currently target: int cmp2(unsigned long long a, unsigned long long b) { return a < b ? -1 : a > b; } Only for POWER, even the most recent GCC would generate conditional branches. I guess that it is an ISA limitation. As an alternative, I will try to see if we can simply replace qsort with std::sort and the homebrew red-black tree with std::map .

Marko Mäkelä added a comment - 2024-02-06 16:53

Replacing the ib_rbt_t with C++11 std::multimap turns out to be a quite intrusive change. I reviewed and each comparison function that is passed to rbt_create(). One more strange one is fts_query_compare_rank(), which never returns a negative value when the ranks are equal and the doc_id are inequal. According to my interpretation of the function comment, it should return negative when r1->doc_id < r2->doc_id. This should affect the result of ha_innobase::ft_read().

Marko Mäkelä added a comment - 2024-02-06 16:53 Replacing the ib_rbt_t with C++11 std::multimap turns out to be a quite intrusive change. I reviewed and each comparison function that is passed to rbt_create() . One more strange one is fts_query_compare_rank() , which never returns a negative value when the ranks are equal and the doc_id are inequal. According to my interpretation of the function comment, it should return negative when r1->doc_id < r2->doc_id . This should affect the result of ha_innobase::ft_read() .

Marko Mäkelä added a comment - 2024-02-08 15:25

I filed ~~MDEV-33431~~ for an unrelated failure that mleich reported. The analysis of another unrelated failure will require more time. The crash on corrupted data could be something that was fixed in ~~MDEV-13542~~ a long time ago:

2024-02-07 14:42:34 0 [Warning] InnoDB: btr_pcur_open_low level: 0 called from file: /data/Server/10.5-MDEV-33383/storage/innobase/row/row0row.cc line: 1303 table: `test`.`t8` index: `k` error: Table is encrypted but decrypt failed.

mariadbd: /data/Server/10.5-MDEV-33383/storage/innobase/include/btr0pcur.inl:154: ulint btr_pcur_get_low_match(const btr_pcur_t*): Assertion `btr_cursor->low_match != ((ulint)(-1))' failed.

I want to find the cause of the corruption, and I think I can full well do it. We have rr replay trace of the server that was being backed up, as well as rr replay of applying the log, and this crash on the startup of the backup. We also have a copy of the backup before the log was applied.

Marko Mäkelä added a comment - 2024-02-08 15:25 I filed MDEV-33431 for an unrelated failure that mleich reported. The analysis of another unrelated failure will require more time. The crash on corrupted data could be something that was fixed in MDEV-13542 a long time ago: 2024-02-07 14:42:34 0 [Warning] InnoDB: btr_pcur_open_low level: 0 called from file: /data/Server/10.5-MDEV-33383/storage/innobase/row/row0row.cc line: 1303 table: `test`.`t8` index: `k` error: Table is encrypted but decrypt failed. mariadbd: /data/Server/10.5-MDEV-33383/storage/innobase/include/btr0pcur.inl:154: ulint btr_pcur_get_low_match(const btr_pcur_t*): Assertion `btr_cursor->low_match != ((ulint)(-1))' failed. I want to find the cause of the corruption, and I think I can full well do it. We have rr replay trace of the server that was being backed up, as well as rr replay of applying the log, and this crash on the startup of the backup. We also have a copy of the backup before the log was applied.

Marko Mäkelä added a comment - 2024-02-09 13:45

I diagnosed the root cause of corrupted backup and filed it as ~~MDEV-33438~~.

Marko Mäkelä added a comment - 2024-02-09 13:45 I diagnosed the root cause of corrupted backup and filed it as MDEV-33438 .

People

Assignee:: Marko Mäkelä

Reporter:: Shaohua Wang

Votes:: 1 Vote for this issue

Watchers:: 4 Start watching this issue

Dates

Created:: 2024-02-05 09:15

Updated:: 2024-02-20 09:31

Resolved:: 2024-02-12 16:38

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server