[MDEV-22041] Allow to use binary row format even for COM_QUERY Created: 2020-03-25 Updated: 2024-01-15 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Binary Protocol |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Major |
| Reporter: | Diego Dupin | Assignee: | Ralf Gebhardt |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | None | ||
| Description |
|
There are two different row formats for result sets: TEXT and BINARY. When retrieving numeric values in text format there is a lot of overhead:
The differences mainly concern:
-When tested, representative workload TPC-C data is 40208 bytes for text vs 31325 for binary row (without header).-
Proposal: Add an additional capability, MARIADB_BINARY_RESULT. If supported, the client will send this capability flag during the handshake. The server will afterwards send result sets for COM_QUERY in binary format. |
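A minimal sketch of how the proposed negotiation could work, assuming the flag is handled like any other capability bit. The bit position used for `MARIADB_BINARY_RESULT` is a placeholder (the issue does not assign one), and `negotiate` and `com_query_row_format` are hypothetical helper names, not server or connector functions.

```python
# Placeholder bit position: the issue does not assign a value to this flag.
MARIADB_BINARY_RESULT = 1 << 40


def negotiate(client_caps: int, server_caps: int) -> int:
    """Keep only the capabilities that both sides announced."""
    return client_caps & server_caps


def com_query_row_format(negotiated_caps: int) -> str:
    """Pick the row format for COM_QUERY result sets under the proposal."""
    if negotiated_caps & MARIADB_BINARY_RESULT:
        return "BINARY"
    return "TEXT"  # current behaviour, kept as the fallback


if __name__ == "__main__":
    new_client = MARIADB_BINARY_RESULT | 0x1
    new_server = MARIADB_BINARY_RESULT | 0x2
    old_server = 0x2
    print(com_query_row_format(negotiate(new_client, new_server)))
    print(com_query_row_format(negotiate(new_client, old_server)))
```

Falling back to TEXT when either side lacks the flag keeps older clients and servers working unchanged.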
| Comments |
| Comment by Vladislav Vaintroub [ 2020-03-26 ] | |||||||||||||||||
|
numeric: text vs native: "0" takes 2 bytes as text vs 8 bytes for binary. | |||||||||||||||||
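The size comparison can be checked with a short sketch, assuming a BIGINT column: the text row format sends the decimal digits as a length-encoded string, while the binary row format sends a fixed 8-byte little-endian value. The helper names are illustrative only, and the text encoder handles only strings short enough for a 1-byte length prefix.

```python
import struct


def text_encoded(value: int) -> bytes:
    """Text row format: 1-byte length prefix followed by the ASCII digits.

    Sketch only: assumes the string is shorter than 251 bytes, so the
    length-encoded prefix fits in a single byte.
    """
    digits = str(value).encode("ascii")
    if len(digits) >= 251:
        raise ValueError("sketch only handles 1-byte length prefixes")
    return bytes([len(digits)]) + digits


def binary_encoded_bigint(value: int) -> bytes:
    """Binary row format for BIGINT: always 8 bytes, little-endian."""
    return struct.pack("<q", value)


if __name__ == "__main__":
    for v in (0, 12345, 9223372036854775807):
        print(v, len(text_encoded(v)), len(binary_encoded_bigint(v)))
```

For small values the text form is shorter (2 bytes for `0`); the binary form only wins once the number needs eight or more characters.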
| Comment by Vladislav Vaintroub [ 2020-03-26 ] | |||||||||||||||||
|
If there are no NULLs, the null bitmap wastes column_count/8 + 1 bytes of space.
Parsing integers is not that bad, and in most languages that do not support memcpy, making an int out of a 4-byte buffer is also parsing.
parsing int, binary protocol style
parsing int, text style
That's not too bad, compared to the alleged horrors of parsing I've heard about. For fixed-size ints, e.g. in dates, you can use better routines:
parsing 4 digit int
There is some insignificant overhead compared to shifts and ORs, but it is minuscule.
As for the size of the data, what do the rows look like in the TPC-C queries? What are the DDL and the result sets? It would be nice if you could break it down for the result sets. | |||||||||||||||||
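The code samples from this comment did not survive the export. The following is a rough reconstruction of the three routines it names, not the original code; all function names are illustrative. `null_bitmap_size` uses the binary-row formula `(column_count + 7 + 2) / 8` as I understand it from the protocol documentation, which is close to the estimate given in the comment.

```python
def parse_int32_binary(buf: bytes, pos: int = 0) -> int:
    """Binary style: build a signed 32-bit int from 4 little-endian bytes."""
    raw = (buf[pos]
           | (buf[pos + 1] << 8)
           | (buf[pos + 2] << 16)
           | (buf[pos + 3] << 24))
    # Reinterpret as signed when the top bit is set.
    return raw - (1 << 32) if raw & 0x80000000 else raw


def parse_int_text(buf: bytes, pos: int, length: int) -> int:
    """Text style: accumulate ASCII digits, with an optional leading '-'."""
    negative = buf[pos] == 0x2D  # ord('-')
    start = pos + 1 if negative else pos
    result = 0
    for b in buf[start:pos + length]:
        result = result * 10 + (b - 0x30)  # 0x30 == ord('0')
    return -result if negative else result


def parse_4_digits(buf: bytes, pos: int) -> int:
    """Fixed-width case, e.g. the year of 'YYYY-MM-DD': no loop, no sign."""
    return ((buf[pos] - 0x30) * 1000 + (buf[pos + 1] - 0x30) * 100
            + (buf[pos + 2] - 0x30) * 10 + (buf[pos + 3] - 0x30))


def null_bitmap_size(column_count: int) -> int:
    """Bytes used by the null bitmap of a binary row (assumed formula)."""
    return (column_count + 7 + 2) // 8


if __name__ == "__main__":
    print(parse_int32_binary((1234).to_bytes(4, "little", signed=True)))
    print(parse_int_text(b"-42", 0, 3))
    print(parse_4_digits(b"2020-04-17", 0))
    print(null_bitmap_size(8))
```

The text routine costs one multiply and one add per digit, against three shifts and three ORs for the binary one, which is the small overhead the comment refers to.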
| Comment by Diego Dupin [ 2020-04-17 ] | |||||||||||||||||
|
I've run TPC-C again because, on reflection, that much difference surprised me. Row data bytes represent 26% of read exchanges, and the size is very similar using the binary protocol compared to text (1% difference). A surprise is that metadata represents 65% (!) of read bytes. To give an idea of what this 1% represents: OK packets represent 6% of read exchanges (4.5% of all exchanges come from the OK packet's text info, e.g. OkPacket = "Rows matched: 1 Changed: 1 Warnings: 0"). | |||||||||||||||||
| Comment by Vladislav Vaintroub [ 2020-04-17 ] | |||||||||||||||||
|
The OK packet text should maybe be optionally suppressed; it is really just for the command line client. | |||||||||||||||||
| Comment by Georg Richter [ 2020-04-19 ] | |||||||||||||||||
|
Python benchmark:
num_fetchloop: Mean +- std dev: 839 us +- 34 us | |||||||||||||||||
| Comment by Sergei Golubchik [ 2020-08-16 ] | |||||||||||||||||
|
Wouldn't EXECUTE IMMEDIATE be an exact replacement of COM_QUERY with binary protocol? That is, a protocol command EXECUTE IMMEDIATE, not a COM_QUERY that sends an "EXECUTE IMMEDIATE" string, of course. | |||||||||||||||||
| Comment by Georg Richter [ 2020-08-16 ] | |||||||||||||||||
|
Serg, only if EXECUTE IMMEDIATE doesn't have the same limitations as prepare (MDEV-16708). |