[CONPY-266] Make executemany() faster on non-MariaDB servers Created: 2023-06-24  Updated: 2023-06-28

Status: Open
Project: MariaDB Connector/Python
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: New Feature Priority: Major
Reporter: markus makela Assignee: Georg Richter
Resolution: Unresolved Votes: 1
Labels: None

Issue Links:
Relates
relates to CONPY-165 Optimize executemany() for non MariaD... Open
relates to MXS-4646 Convert LOAD DATA LOCAL INFILE into x... Closed

 Description   

The MariaDB Python connector appears to use COM_STMT_BULK_EXECUTE for performing executemany() calls. This optimization is only enabled for MariaDB versions 10.2 or newer. For older versions or non-MariaDB servers (e.g. Xpand), the fallback code is to convert it into multiple execute() calls which performs very poorly.

The MySQL Python connector has an optimization that converts executemany() calls with INSERT statements into a multi-valued INSERT statment. This is one way of improving bulk operation speed but it requires both parsing and SQL string creation.

Instead, a pipelined approach could be taken. The MariaDB/MySQL protocol is capable of pipelined execution where multiple requests can be sent without first reading their responses. One example of this is the mysql_send_query() API function in the MariaDB Connector/C. The Python connector could be extended to support mysql_send_query() or the COM_STMT_EXECUTE APIs in Connector/C could be improved (here's a pull request that adds the support for it in Connector/C). If the latter approach is taken, the executemany() call can always use the binary protocol and the only difference would be how much overhead the non-optimized COM_STMT_EXECUTE adds compared to the optimized COM_STMT_BULK_EXECUTE.



 Comments   
Comment by Georg Richter [ 2023-06-28 ]

INSERT/REPLACE statements should be optimized (see CONPY-165). All other statements should be executed in a loop: This will stop the execution at first error (while a batch would execute all before fetching an error).

Comment by markus makela [ 2023-06-28 ]

One case where pipelined execution would still be fine is if it's a INSERT IGNORE statement.

Generated at Thu Feb 08 03:31:29 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.