[CONCPP-99] Invalid queries generated when rewriteBatchedStatements is used in batch mode Created: 2022-04-26 Updated: 2022-10-06 Resolved: 2022-07-24 |
|
| Status: | Closed |
| Project: | MariaDB Connector/C++ |
| Component/s: | General |
| Affects Version/s: | 1.0.1 |
| Fix Version/s: | 1.0.2, 1.1.2 |
| Type: | Task | Priority: | Major |
| Reporter: | M. Tarek Spotts | Assignee: | Lawrin Novitsky |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
||||||||
| Issue Links: |
|
||||||||
| Description |
|
When enabling the `rewriteBatchedStatements` option in the C++ connector, the resulting queries are invalid, even for a trivial table. Given a simple INSERT operation such as:
The resulting batch operation will fail with an error such as this:
At the very least, the statements are missing parentheses before the first value in the VALUES clause (should read `VALUES (1`, not `VALUES1`.) Below is a minimal example to reproduce the error. Sample app code
The app can be compiled the same way as the sample task app, e.g.,
App output
Expected output General query log
Workarounds tried
We then tried enabling useServerPrepStmts (with rewriteBatchedStatements still false). This was to try to force the usage of server side prepared statements to see if this would remove the unnecessary round trips. The queries themselves will succeed in this mode as well, but once again, we see what appears to be continuous round-trips to the DBMS instead of a true batched operation:
Essentially, with useServerPrepStmts mode the repetitive Query log entries are just replaced with repetitive Execute calls on the statement handler. At the end of the day it still appears to be ten separate round trips to the DBMS to perform 10 trivial row inserts. Expected server side log pattern
Notice in the NodeJS version that only one Execute operation is done on the INSERT statement. This demonstrates a benefit of batching, e.g., fewer round trips to the server, since the ten trivial inserts are all done in one call. I am including the sample code used to run the Node.JS version as an attachment, since NodeJS code is not directly germane to this bug report about the C++ connector; but running such code may still be helpful in comparing the General Log output of the two implementations. Server info |
| Comments |
| Comment by Lawrin Novitsky [ 2022-05-20 ] | |
|
Thank you for your report. Apparently this was not tested, but on other hand I don't think we advertised rewriteBatchedStatements as a supported. Yeah, I wrote in README, that list of options is (undefined) subset of C/J options, and for the table of C/C++ supported options I made remark, that it's not complete. Probably I have to re-write it to be more specific. | |
| Comment by M. Tarek Spotts [ 2022-05-20 ] | |
|
Thank you for the reply. There were several things that had caused me to think this was supported. In the readme file it said "Not complete list of supported options - mostly newly added or with new aliases...". This led me to think that there may be other options that are available but simply are not listed in the docs. When I saw that, I figured the best place to check whether the option was supported was to look in the source. In doing so, I found several things that led me to think that executing batches was tested and supported:
I am not sure if these tests actually pass (or if they do, how they are passing), but there are ASSERTs in place and a series of tests that make it look like batching is actually being tested. All of this led to a lot of confusion for me, because, with those types of tests present in the codebase at least, it does make it seem that inserting or updating rows using either a regular or prepared statement is supposed to work. In any case, thank you for looking into this. | |
| Comment by Lawrin Novitsky [ 2022-05-20 ] | |
|
The tests are about batch feature(addBatch/executeBatch), that is part of JDBC. rewriteBatchedStatements is option controlling the way in which bacth is executed. | |
| Comment by M. Tarek Spotts [ 2022-05-20 ] | |
|
Thank you for your answers. In my 'workarounds' section I did try just using the batch commands, but without rewriteBatchedStatements. Those queries did succeed, but in the General Log it appeared that the driver was really just sending the queries one by one instead of saving some round trips. The behaviour I saw in the NodeJS driver is that it did not rewrite the queries, but its batch implementation did only one statement prepare and one statement exec was done even though we were inserting 10 rows. I think that the payload for multiple operations is sent alongside the prepared statement as long as the total payload would be less than max_packet_size. Do we know what the "point" is of the `addBatch`/`executeBatch` is in its current state in the C++ driver? Is it just to be compliant with the JDBC interface? Or is the current batching capability going to save some network round-trips and maybe I was just misunderstanding the General Log results? | |
| Comment by Lawrin Novitsky [ 2022-05-22 ] | |
|
In the current state it's more like API compliance. Lots of optimizations is possible. At least some of them will be the result of this ticket. | |
| Comment by Lawrin Novitsky [ 2022-07-24 ] | |
|
rewriteBatchedStatements option support For insert queries, rewrites batch of staments to execute in a single | |
| Comment by Lawrin Novitsky [ 2022-10-06 ] | |
|
I've changed type to "Task" as it's more like adding support of the feature, rather then fixing it. Also, linked related task, that implemented another mode of batch execution. |