Details
-
Bug
-
Status: Confirmed (View Workflow)
-
Major
-
Resolution: Unresolved
-
10.4(EOL), 10.5, 10.6, 10.11, 11.4
Description
This came up while I was reviewing MDEV-34348 (#3490) by bnestere.
The function compare_number_of_records() seems to be what motivates the definition of the interface qsort_cmp2, which is also known as qsort2_cmp and queue_compare.
Many implementations of the qsort_cmp2 interface do not actually use any "context" parameter. It would be more efficient to have optional parameters at the end of the parameter list. Then, for example, sortcmp2() could pass its actual used parameters more efficiently. In the System V AMD64 ABI it could just pass through its first two parameters rdi and rsi and replace its third parameter rdx before tail-calling a.k.a. jumping to sortcmp().
Similar functions include cmp_longlong(), cmp_double(), cmp_row(), cmp_decimal() and cmp_timestamp() in item_cmpfunc.cc, as well as subselect_rowid_merge_engine::cmp_keys_by_cur_rownum() in item_subselect.cc.
For example, the following currently has to shuffle the parameter registers before the tail-call:
static int cmp_row(void *cmp_arg, cmp_item_row *a, cmp_item_row *b) |
{
|
return a->compare(b); |
}
|
With the optimization that I am suggesting, the entire function translates into a mere jump in ABIs where this for member function calls is being passed in the same way as the first parameter of a non-member function. Another example is simple_raw_key_cmp(), which I converted for use in https://godbolt.org:
#include <string.h>
|
int simple_raw_key_cmp(const void* arg, const void* key1, const void* key2) |
{
|
return memcmp(key1, key2, *(const unsigned *) arg); |
}
|
int simpler_raw_key_cmp(const void* key1, const void* key2, const void* arg) |
{
|
return memcmp(key1, key2, *(const unsigned *) arg); |
}
|
For the System V AMD64 ABI, my suggestion saves 3 instructions. The original code is lucky to be able to use the return value register rax as a scratchpad:
simple_raw_key_cmp(void const*, void const*, void const*):
|
mov rax, rdi
|
mov rdi, rsi
|
mov rsi, rdx
|
mov edx, DWORD PTR [rax]
|
jmp memcmp
|
simpler_raw_key_cmp(void const*, void const*, void const*):
|
mov edx, DWORD PTR [rdx]
|
jmp memcmp
|
Note: With my suggestion, the functions Ordered_key::cmp_keys_by_row_data_and_rownum and simple_str_key_cmp() would take a penalty (added instructions for shuffling the registers). Therefore, some performance evaluation will be necessary.
Attachments
Issue Links
- relates to
-
MDEV-34348 MariaDB is violating clang-16 -Wcast-function-type-strict
- Closed