[ODBC-31] High CPU usage and low network utilization when processing a huge result set Created: 2016-03-15  Updated: 2016-08-19  Resolved: 2016-08-19

Status: Closed
Project: MariaDB Connector/ODBC
Component/s: None
Affects Version/s: 1.0.5
Fix Version/s: 2.0.12

Type: Bug Priority: Major
Reporter: Hong Dai Thanh Assignee: Lawrin Novitsky
Resolution: Fixed Votes: 1
Labels: None
Environment:

The client machine is a Windows 7 machine (4 cores 3.4 GHz, 8GB RAM) with ODBC driver version 1.0.5, connecting to Windows Server 2008 R2 (virtual machine) hosting MariaDB server.


Issue Links:
Problem/Incident
is caused by CONC-196 alloc_root is about 4 times slower (c... Closed

 Description   

We test by loading the example employee database from MySQL (https://dev.mysql.com/doc/employee/en/) to MariaDB server, then run the test on the client machine with MariaDB and MySQL ODBC driver.

We found that the following query takes extremely long time to execute:

SELECT s.emp_no, s.salary, e.emp_no, e.first_name, e.last_name, e.gender FROM salaries s, employees e WHERE s.emp_no = e.emp_no;

Using MariaDB ODBC driver, the query above takes 123.7 seconds to complete, compared to MySQL ODBC driver, which does so in only 15.8 seconds.

Details on the connection string and execution time:

>> Connection String : Driver={MySQL ODBC 5.3 Unicode Driver};Server=192.168.xxx.xxx;Port=3306;Database=employees;UID=root;PWD=123;
>> Query String : SELECT s.emp_no, s.salary, e.emp_no, e.first_name, e.last_name, e.gender FROM salaries s, employees e WHERE s.emp_no = e.emp_no;
>> Number of Loops : 1
>> Make Connection...  SUCCESSFUL.
>> Begin querying 1 time(s)... 
15834, 
>> Close Connection...  SUCCESSFUL.
 
>> Connection String : Driver={MariaDB ODBC 1.0 Driver};Server=192.168.xxx.xxx;Port=3306;Database=employees;UID=root;PWD=123;
>> Query String : SELECT s.emp_no, s.salary, e.emp_no, e.first_name, e.last_name, e.gender FROM salaries s, employees e WHERE s.emp_no = e.emp_no;
>> Number of Loops : 1
>> Make Connection...  SUCCESSFUL.
>> Begin querying 1 time(s)... 
123740, 
>> Close Connection...  SUCCESSFUL.

Although it only shows one run of the query, we can reproduce this issue any time.

Checking the performance on the client machine, the test program runs at 100% of a single core (our machine has 4 cores, so it shows up as 25%), and the network utilization drops to only 1-2 percent of the 100 Mbps link. In contrast, MySQL ODBC driver utilizes around 60% of the link, with the same backend.



 Comments   
Comment by Lawrin Novitsky [ 2016-03-15 ]

Thank you for your report.

Long time of execution on huge resultsets is server limitation, and not connectors issue. And if you change MariaDB server to MySQL, the picture will be the same.
With MySQL connector you can mitigate this problem with PREFETCH=n connection option(n can be few hundreds of thousands). It will help to get first results faster, but total time to fetch all rows will probably be longer.
As for different network consumption - it's a bit puzzling for me at the moment. Both connectors are supposed to store complete results on the client, by default. Your observations look like if our connector fetched the result row by row.

But still I don't see what can we fix based on your bug report

Comment by Hong Dai Thanh [ 2016-03-16 ]

>> Long time of execution on huge resultsets is server limitation, and not connectors issue. And if you change MariaDB server to MySQL, the picture will be the same.

The bug report is about the low performance of the MariaDB ODBC driver on Windows. With the same backend (MariaDB server), MySQL driver manages to get all the rows within 16 seconds, while MariaDB ODBC driver takes 123 seconds (2+ minutes).

This is clearly a connector's issue. I have edited the description to clarify it, since the number is only shown in the output of the test program.

>> As for different network consumption - it's a bit puzzling for me at the moment. Both connectors are supposed to store complete results on the client, by default. Your observations look like if our connector fetched the result row by row.

Our test program is implemented by fetching the result set row by row with SQLFetch (rowset size = 1). However, I think this should not prevent the driver to fetch more rows and cache it on the client side.

We actually tested Sybase, MySQL, SQL Server, Postgres (with their respective drivers) with the same query, and none of them have any trouble fetching the whole result set in less than 20 seconds.

Comment by Lawrin Novitsky [ 2016-03-16 ]

Ok, no it's got sense. I will investigate the issue. Given your information about network load it makes even more probable that for some reason connector does not store result on the client, but fetches it row by row. Even though it shouldn't be the case with default options.
Thank you for your report once again.

Comment by Richard Lane [ 2016-07-11 ]

Was able to recreate this on a Windows 7 enterprise client connecting to a MariaDB 10.1.14 server.
Using the MariaDB ODBC connector 2.0.11, the following SQL Statement took on order of 150 seconds to complete. All but 2 seconds was in SQLExecute():
SELECT s.emp_no, s.salary, e.emp_no, e.first_name, e.last_name, e.gender FROM salaries s, employees e WHERE s.emp_no = e.emp_no;
Using MySQL ODBC connector 5.3.6, the same select statement to the same MariaDB database took < 10 seconds.

Executing statements with same number of records returned on single tables returned within a resonable time. This statement requiring Join's is taking orders of magnitude longer.

Also observed with MySQL connector, used memory went up quickly during the 10 seconds, but using MariaDB connector, used memory grew slowly during the 150 seconds and constantly used 25% of 4 CPUs across the entire interval.

Just for comparison, also compared the performance of MySQL vs MariaDB Linux ODBC connector and that was even more different. 3 seconds for MySQL Linux ODBC connector vs 755 seconds for MariaDB connector. Not as worried about this since unlikely that Unix ODBC connector will be used by us.

Comment by Lawrin Novitsky [ 2016-07-12 ]

> Executing statements with same number of records returned on single tables returned within a resonable time. This statement requiring Join's is taking orders of magnitude longer.

You mean MariaDB Connector and server? this is interesting. Do I understand correctly, that you simply do SQLPrepare() + SQLExecute()? Then for connector it should be fairly indifferent, if the query on single table, or on joined tables. Does mysql connector+mariadb server demonstrate some difference between queries on single table and on joined tables?

Comment by Richard Lane [ 2016-07-12 ]

I have a MariaDB 10.1.14 Server with Galera Cluster (3 nodes) installed. Only difference from these experiments is the ODBC connector used at my Windows 7 PC to query the database. I have defined the following two DSNs in ODBC in Windows:
MDB_Employees (MariaDB ODBC 2.0 Driver)
MYSQL_Employees (MySQL ODBC 5.3 ANSI Driver)
Both point to the exact same IP:port identifying the MariaDB 10.1.14 server described above (so target MariaDB is exactly the same). The MariaDB server has an employees database from opensource.

I have a Windows 32-bit client (since Windows ODBC is 32-bit) that allows me to select the DSN to query and the SQL statement to execute. The app connects to the DSN then does an SQLPrepare() followed by SQLExecute(). I time the SQLExecute() statement.
Here are the results from querying various tables using each of the DSNs above:

MYSQL_Employees MDB_Employees SQL Statement
------------------------- --------------------- -------------------------------------------------------------------------
0.22 seconds 2.27 seconds SELECT * FROM titles; (returns 2844047 rows)
1.48 seconds 1.18 seconds SELECT * FROM salaries; (returns 443308 rows)
0.20 seconds 1.45 seconds SELECT * FROM employees; (returns 300024 rows)
2.03 seconds 164.47 seconds SELECT s.emp_no, s.salary, e.emp_no, e.first_name, e.last_name, e.gender FROM salaries s, employees e WHERE s.emp_no = e.emp_no; (returns 2844047 rows)

As can be seen above, simple queries from single tables is comparable, however the select statement that pulls certain rows from two tables is what is taking orders of magnitude longer.

This is easily repeatable using the above select statement using the Windows MariaDB ODBC connector.

Comment by Lawrin Novitsky [ 2016-07-26 ]

So far it is not clear whether the problem in the c/odbc or in the underlying connector/C library. I changed the testcase to force mysql c/odbc also to use prepared statements, and also tested both connectors against mysql server. result is the same, thus I think we can exclude server from possible reasons.
Curious that if to remove either first_name or last_name from the query, it works ok(like SELECT * FROM employees
This kinda suggests that c/c can be blamed. However it is still possible that c/odbc doesn't do something, that would help c/c not to have the issue with initial query and any other query, like it does not have with changed query or 'SELECT * FROM employees'.
call of mysql_stmt_store_result takes 99% of time. This is from mysql C API, i.e. from c/c
This is more like to tell you, that I am on your bug finally.

Comment by Georg Richter [ 2016-08-03 ]

The problem was caused by unnecessary extra loops in alloc_root() function of Connector/C.
Fixed in C/C 2.3.1 and 3.0.1

Comment by Lawrin Novitsky [ 2016-08-19 ]

This is C/C issue, and it is fixed there. Next C/ODBC release(2.0.12) will be linked against latest C/C 2.3 version, and will have this fix. Closing

Generated at Thu Feb 08 03:25:43 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.