[MDEV-26271] Crashes in net_field_length since commit 295f3e4cfb4a8f132f36d53475efc92f2487aa97 Created: 2021-07-29 Updated: 2021-09-03 Resolved: 2021-09-03 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | libmariadb |
| Affects Version/s: | 10.6.3 |
| Fix Version/s: | N/A |
| Type: | Bug | Priority: | Major |
| Reporter: | Antonio Rojas | Assignee: | Georg Richter |
| Resolution: | Not a Bug | Votes: | 0 |
| Labels: | need_feedback | ||
| Environment: |
Linux x86_64 |
||
| Issue Links: |
|
||||||||
| Description |
|
Since commit 295f3e4cfb4a8f132f, mariadb crashes (and has severe performance issues) when scanning the DigiKam database. The initial scan (with ~5000 pics) used to take <1 second with MariaDB 10.5, now it takes 10 seconds to reach 10% and then crashes
Downstream report: https://bugs.kde.org/show_bug.cgi?id=440296 |
| Comments |
| Comment by Vladislav Vaintroub [ 2021-07-29 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
What's DigiKam? Kan you provide an isolated repro case , which only involved mariadb database, and mariadb client? | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Antonio Rojas [ 2021-08-03 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The database is managed wia Qt's mysql plugin - I don't know how to create a repro case. I can provide an affected database if needed. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-03 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Maybe Digikam folks know how to create a repro case- that's their software after all . I mean, if someone determined that MariaDB is a problem, with exact git hash, I'd think there was some analysis before that. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Antonio Rojas [ 2021-08-03 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The git hash was found by bisection, no knowledge of the internals is needed for that. I'll ask downstream if some digikam dev can help providing a test case. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-03 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thank you. While I would not exclude all possibility that the error is our (or, mine, as pointed by git hash), although on the first glance I'd tend to think it is more likely surrounding code. Also with the alleged slowness, a repro, that can be easily reproduced, using just mariadb, rather than layers on layers of someone else's code would be very helpful. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Antonio Rojas [ 2021-08-07 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
In the meantime, another Qt application (mythtv) has been reported to be broken by the same commit, so everything points to a bad interaction between the metadata cache and Qt's mysql driver code. The mythtv bug report contains a minimal test case (using only Qt): https://github.com/MythTV/mythtv/issues/373#issuecomment-891723028 | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-07 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Someone there will hopefully figure out, how to translate that into minimal test case (using only MariaDB), or, fix the bug inside their driver, if there is a bug there. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-07 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Also, reading all the QT errors, that is server, not the client error, and whether the result set metadata is cached or not, has no influence on that from https://github.com/MythTV/mythtv/issues/373#issuecomment-890342273
Fails, because someone is using an unquoted keyword offset, which was added in MariaDB server 10.6.0 (see https://mariadb.com/kb/en/reserved-words/ ) and so on. So, there is some unrelated stuff, which seems to be collectively blamed as "bad interaction between metadata cache and QT " | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2021-08-07 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
wlad, may be a tcpdump capture could help? They can collect it without digging into exactly how two pieces of software interact | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-07 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I do not think it would be too helpful, in this case. It does not crash parsing something, it is a memory overwrite. I'd usually think a problem such as memory overwrite is easiest to analyze with ASAN or Valgrind, or MSAN or AppVerifier, or some other memory checker. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Daniel Black [ 2021-08-09 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Came across https://bugreports.qt.io/browse/QTBUG-95071, because of | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Sergei Golubchik [ 2021-08-09 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
arojas, if you're bisecting, that is, you compile MariaDB anyway, you can, perhaps, compile with -DWITH_ASAN=1 ? | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Antonio Rojas [ 2021-08-09 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
> Also, reading all the QT errors, This error comes from mythtv (code is here [1] and here [2]) and fails because the Qt prepared query returns null. Running the same query on mysql directly works. This is what the minimal test case is testing, and I've bisected it to the same commit 295f3e4cfb4a8f132f (more specifically, to commit 6a763b90006c559 in mariadb-connector-c) and confirmed it by building the 10.6.3 release with that connector-c commit reverted, which fixes all issues. [1] https://github.com/MythTV/mythtv/blob/master/mythtv/programs/mythtv-setup/main.cpp#L556 | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Antonio Rojas [ 2021-08-09 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
> Came across https://bugreports.qt.io/browse/QTBUG-95071, because of I'm aware (I'm the reporter of that bug) - all testing here has been done with a Qt patched to workaround that version check issue, to make sure prepared statements are used. In fact, with an unpatched Qt everything works (for the wrong reasons: the two bugs cancel each other out). | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-09 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
That's a newly built 10.6, with mysql 8.0 client . It returns NULL, because timezone support was not installed. It would return the same with prepared or unprepared statement, with any client , in fact | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Antonio Rojas [ 2021-08-09 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
IIUC, this test should work regardless of timezone support, and it works for me from mysql:
Now, with the Qt test case (using the same query):
| |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-09 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
This is how to execute a prepared statement using executables that are build with MariaDB - that is , let's leave the Qt examples aside, as we do not build Qt (or Digikam, or MythTV) here. Prepared statement from command line can be executed with
ps-protocol is documented here
and here is the result (note - no NULLs) on pristine database
and here is how to check that prepared statement is used - allow general_log with the server, and dump the general log - in this case it will have something like that . Note the Prepare and Execute in the output.
Now do the same with Etc/UTC
As expected, without loaded timezones, it returns NULL Now load timezones.
Now , repeat the SELECT with timezone name
Make sure that it is 10.6 that I'm running.
What does it all prove ? It proves that there is no bug in SELECT CONVERT_TZ(NOW(), 'SYSTEM', 'Etc/UTC'); | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Thiago Macieira [ 2021-08-13 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Qt developer here attempting to debug this issue. This problem appears to happen exclusively with connector-c 3.2 and the 10.6 server. If I try the older connector with 10.6 or the new connector with 10.5, no problem happens. Here's an ltrace of the calls Qt made:
mysql_fetch_field call comes from QMYSQLResultPrivate::bindInValues. In this code:
I can confirm by debugging with both connectors that:
As you can see, Qt stores the value length + 1 in the variable, so it starts as 20. With a hardware watchpoint, I can see where the value changes. With connector 3.1:
But with connector 3.2:
So when Qt goes to convert the fetched data into its representable form, in QMYSQLResult::data():
We see:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Thiago Macieira [ 2021-08-13 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The byte pattern is actually obvious... It is the time stamp, expressed in bytes: 2021 7 8 13 19 31 | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-13 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Did you try the obvious, e.g loading the timezone tables? | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-13 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I'm wondering why would Qt try to interpret every data it fetches as a string, but I leave up to you guys to figure out. I think the general rule for datetime is to fetch that into some MYSQL_TIME https://dev.mysql.com/doc/c-api/8.0/en/c-api-prepared-statement-data-structures.html | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Thiago Macieira [ 2021-08-13 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I don't see how it would be MYSQL_TIME, since the field length is 19 and the sizeof(MYSQL_TIME) is 40. I'll attempt to implement something bound to MYSQL_TIME at a later time. Anyway, I've found it. This was it:
The overriding was actually writing to the MYSQL_FIELD structure returned by mysql_fetch_field. That function returns a non-const pointer and our code had been using it to store some data for 15 years. After I changed all the variables to const MYSQL_FIELD *, the problem went away. Working now on cleaning up the patch for submission. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Thiago Macieira [ 2021-08-13 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Testcase:
| |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-13 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Indeed, this is a problem. User of this API is not supposed to change MYSQL_FIELD * returned by mysql_fetch_field. What's the point? It is internally used by the server to know how to decipher the data in prepared statements. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-08-13 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The "field length" is not the data length used for storage, nor in the client-server protocol, nor in client structs. Just like int(10) in "show create table t" does not represent 10-byte integers. It is "display length", the length in characters for textual representation. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Thiago Macieira [ 2021-08-14 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
Because the returned pointer wasn't pointing to const data, so whoever wrote the code to support binding results to output variables with the MySQL 4.1.8 API made a shortcut. It happened some time between Qt 3.3 (2004) and 4.5 (2009), for which there is no public Git history, so I can't pinpoint the exact commit. But you can see the oldest implementation at https://github.com/qt/qt/blob/v4.5.1/src/sql/drivers/mysql/qsql_mysql.cpp#L331-L376. So that developer realised that the output type had to be forced to string for some of the column types, they modified fieldInfo->type. Not an excuse, but it's likely what happened. I recommend adding const to data structures that the user shouldn't modify, when you have a chance to clean the API and apply const-correctness. | |||||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Vladislav Vaintroub [ 2021-09-03 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||||
|
I'm closing as not a bug. I can't stop wondering how did that in Qt ever work, it could not have in my opinion. If all Qt wants are strings, there is a wonderful and easy API, mysql_query which gives strings. You do not need prepared statement for that.. Const is a good suggestion, but I'm not sure 20 year old APIs can be fixed like that. |