[MDEV-28410] probleme de performance powerbi requetant mariaDB Created: 2022-04-25  Updated: 2024-01-13

Status: Open
Project: MariaDB Server
Component/s: Information Schema, Platform Windows
Affects Version/s: 10.1.34, 10.6.7
Fix Version/s: None

Type: Bug Priority: Major
Reporter: JM974 Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: information_schema, performance, powerbi
Environment:

windows server 2016


Attachments: PNG File show engines.png    
Issue Links:
Relates
relates to ODBC-410 An option for catalog functions to us... Open

 Description   

Bonjour,
un applicatif métier nous impose un moteur mariaDB 10.1.34 installé sous un windows serveur 2016 64bits.
L'applicatif en lui même n'a aucun problème de performance.
L'architecture de l'applicatif est : 1 client = 1 base.
Nous avons environ 450 clients, donc l'instance mariaDB contient 450 bases.

Par contre, nous avons démarré un projet de reporting basé sous powerbi.
La version de powerbi est la version desktop de mars 2022 64bits et on utilise le connecteur ODBC pour mariaDB version 3.1.15
La connexion à la base ne pose pas de problème, mais dès lors qu'on souhaite visualiser ou importer une table dans powerbi, cela prend environ 10 min alors que les tables sont très petites (<1000lignes)
Dans le moteur mariaDB, on voit que powerbi genere des requetes dans la base information_schema et ce sont ces requetes qui posent problème.

Voici l'une des requêtes, qui prend environ 10 min, 1vCPU et environ 4Go de RAM avant de retourner aucune données:
select
fkcol.REFERENCED_COLUMN_NAME as PK_COLUMN_NAME,
fkcol.TABLE_SCHEMA AS FK_TABLE_SCHEMA,
fkcol.TABLE_NAME AS FK_TABLE_NAME,
fkcol.COLUMN_NAME as FK_COLUMN_NAME,
fkcol.ORDINAL_POSITION as ORDINAL,
concat(fkcon.CONSTRAINT_SCHEMA, '_', fkcon.CONSTRAINT_NAME) as FK_NAME
from
INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS fkcon
inner join
INFORMATION_SCHEMA.KEY_COLUMN_USAGE fkcol
on fkcon.CONSTRAINT_SCHEMA = fkcol.CONSTRAINT_SCHEMA
and fkcon.CONSTRAINT_NAME = fkcol.CONSTRAINT_NAME
where fkcol.REFERENCED_TABLE_SCHEMA = '******' and fkcol.REFERENCED_TABLE_NAME = '******'
and fkcol.ORDINAL_POSITION = fkcol.POSITION_IN_UNIQUE_CONSTRAINT
order by concat(fkcon.CONSTRAINT_SCHEMA, '_', fkcon.CONSTRAINT_NAME), fkcol.ORDINAL_POSITION;

J'ai montée une autre instance en 10.1.34 avec seulement la base qui m'intéressait, et dans cette configuration, le requêtage est instantané.

Est ce que c'est un bug? un problème de configuration? un problème d'architecture de base ?

Merci pour votre aide!

JM



 Comments   
Comment by Marko Mäkelä [ 2022-04-25 ]

Désolé, je ne parle pas français. Je vais continuer en anglais.

The MariaDB 10.1 series was scheduled to reach its end of life on October 17, 2020. The last release in the series (10.1.48) was released a couple of weeks after that. Version 10.1.34 was released in June 2018, almost 4 years ago.

That said, it is possible that such an INFORMATION_SCHEMA query would be slow also in a supported version of MariaDB Server. Is that query slow only on the first execution after the server was started up, or also on repeated execution? One possible source of slowness could be opening the tables.

Can you collect some profiling data for the duration of that slow query? sudo perf record -g $(pgrep -x mysqld), interrupt it with ctrl-c, and then post the output of sudo perf report. Can you try to import the data to a later major version of MariaDB? Many InnoDB performance fixes were implemented the 10.5 and 10.6 series. However, I suspect that the reason for this slowness could be outside InnoDB.

Comment by JM974 [ 2022-04-25 ]

Hi Marko,

first of all, thanks for your help!

About mariaDB version, I know about its EOL, but unfortunately, I'm stuck with this version because of the software editor. I tried to contact him in order to try to upgrade the mariaDB server. I'm waiting for an answer ....

About repeated queries, it changes nothing. It seems that there not enough cache on the server or, a query on information_schema has to open all tables in all 450 databases, that's implies a lots of I/O, and then, slow query.

About profiling data, I'm on windows environnement and unable to perform your script ... Is it possible to run it on Windows?

More informations, maybe usefull:
I think we dont use innoDB engine but MyISAM, maybe the reason of the pb of performance?

Thanks a lot for your help.

Regards,

JM

Comment by JM974 [ 2022-04-25 ]

I'll try your idea: install latest version of mariadb on another server, and try slow queries on it.

I'll send you the result.

regards

JM

Comment by Marko Mäkelä [ 2022-04-25 ]

JM974, thank you. I realized that you did specify the environment, but somehow I gave my standard advice. Sorry about that. wlad is our Windows expert and he could give some profiling hints.

I believe that InnoDB is the only MariaDB storage engine that supports FOREIGN KEY constraints. The data for INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS ought to be fetched from the InnoDB storage engine. The task MDEV-16417 has been filed, to store the metadata outside of InnoDB, and in MDEV-22361 the FOREIGN KEY constraints should be implemented for all storage engines, in the SQL layer. There hasn’t been any progress on either task lately.

It is of course possible that you are using MyISAM and there are no FOREIGN KEY constraints (INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS reports no records). But could the query then take extremely long to execute?

If a bad query execution plan is being chosen, that should be revealed by prefixing the SELECT clause with EXPLAIN. In later versions of MariaDB, there is also an optimizer_trace feature that could indicate where the time is being spent.

Comment by Sergei Golubchik [ 2022-05-05 ]

Closing as Won't fix, as it's 10.1.

Ask on the mailing list or zulip, you might get some help there.

https://mariadb.com/kb/en/mailing-lists/
https://mariadb.com/kb/en/meta/irc-chat-servers-and-zulip-instance/

Comment by JM974 [ 2022-05-06 ]

Hi,

I created a new VM with 10.6.7 MariaDB engine installed.
I imported all the 450 databases but the problem remains : I got the same performance problem with powerbi, so it doesnt seem to be a pb with mariadb engine version.

Is it possible to re-open the ticket as I still got the problem?

Thx

JM

Comment by Sergei Golubchik [ 2022-05-06 ]

Yes, of course. Thanks!

Comment by JM974 [ 2022-05-09 ]

Hi again,

I succeded to bypass the performance issue with this:
I flushed all grants (deleted select grant in global privileges) on the read-only user created in the mariadb DB and add only select grant on 5/10 databases needed to generate the powerbi reporting.
The time to query decreaded from 10 min to 5 sec ...

The pb of the ODBC connector / powerbi is that even if we query 1 table in 1 database, if the user has select global privilege, the mariadb server parses all databases through a query on information_schema to fin KEY_COLOMN_USAGE/REFERENTIAL_CONSTRAINTS.
And all these datas are totally useless.

Is there a way to disable this feature?

thx again.

JM

Comment by JM974 [ 2022-05-09 ]

I tried mariadb AND mysql ODBC connectors.

And the results are the same.

thx

Comment by Sebastian Hölscher [ 2022-05-24 ]

Hi JM,
I'm using mysql 8 and got the same issue with powerbi.
Just found this problem here because I was thinking about trying mariadb.

We found out how to disable these queries in powerbi, so the reports refresh fast again.
You can deactivate it by unchecking the "Include relationship columns" under advanced options.
Or directly in the M-Query Options by setting the `CreateNavigationProperties` to `false` (https://docs.microsoft.com/en-US/powerquery-m/mysql-database)
Hopefully, this helps you as well.

Comment by JM974 [ 2022-05-27 ]

Hi Sebastian,

thank you so much for the tip. it works perfectly well and solves my problem!

However, I have to user mysql connector in order to see the "include relationship columns" parameter.
The mariadb connector doesn't have this feature ...

Thx again.

regards
JM

Comment by Lawrin Novitsky [ 2024-01-13 ]

MySQL's driver used to have option not to use INFORMATION_SCHEMA for catalog functions, but to process output of different SHOW commands (with sometimes lower data quality as a trade off). The name, out of top of my head, was NO_I_S.
We could add the similar option, if I_S queries cannot be made faster. I don't quite like the idea, but that can be done faster, than server gets fix for this. I will also take a look if maybe the query can be optimized.
I've created ODBC ticket and linked to this issue.

Generated at Thu Feb 08 10:00:32 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.