[MDEV-21159] Queries take more time due to Character set on Spider 3.3.15. Created: 2019-11-27  Updated: 2020-06-01  Resolved: 2020-06-01

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - Spider
Affects Version/s: 10.4.10
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: VINAY KUMAR MOLGURAM Assignee: Kentoku Shiba (Inactive)
Resolution: Incomplete Votes: 0
Labels: need_feedback


 Description   

We've been testing Spider Engine with Mariadb 10.2 and Mariadb10.4. Use case is a read-only with multiple data nodes.

MariaDb 10.2.25 with Spider 3.2.37:
character_set_server = utf8
collation_server = utf8_bin

MariaDb 10.4.10 with Spider 3.3.15:
character_set_server = utf8
collation_server = utf8_bin
– Added below to ensure it uses utf8.
spider_remote_access_charset = utf8
spider_use_table_charset = 1

Data nodes also has utf8 as default charset.

On MariaDB 10.2, the queries runs quick and fine.

But the same query on MariaDB 10.4 takes more time and realized it is trying to change the charset for each columns with _utf8 and, _latin in the query which is creating the huge temporary tables on the data node.

Sample query executed from Spider node

   set @proc_dt = CURRENT_DATE;
   CREATE TEMPORARY TABLE tmp_ids
         ( INDEX(ue_type)
         , INDEX(ue_id))
         SELECT u.type AS ue_type
                , u.id AS ue_id
                , u.id_id AS rf_id
                , MAX(lastUpdate) AS last_chk_dtm
                , MAX(creation) as create_dtm
                , MAX(lastchange) as last_upd_dtm
           FROM TableEntity u
          WHERE DATE(COALESCE(u.lastchange,creation)) BETWEEN @proc_dt - INTERVAL '1' DAY AND @proc_dt
           AND u.type <> 'l.product'
          GROUP BY 1,2,3
          LIMIT 10 ;

On MariaDb 10.2.25 with Spider 3.2.37:
We can see this query being run as:

 'select `id`,`creation`,`lastUpdate`,`type`,`Id_id`,`lastChange` from `TableEntity` where 
((cast((coalesce(`lastChange` , `creation`)) as date) between \'2019-11-26\'  and  \'2019-11-27\')
 and (`type` <> \'l.product\')) limit 10000000 lock in share mode'

On MariaDb 10.4.10 with Spider 3.3.15:
We can see this query being run as:

'select t0.`type` `ue_type`,t0.`id` `ue_id`,t0.`Id_id` `rf_id`,max(t0.`lastUpdate`) 
`last_chk_dtm`,max(t0.`creation`) `create_dtm`,max(t0.`lastChange`) `last_upd_dtm` from 
`TableEntity` t0 where ((cast((coalesce(t0.`lastChange` , t0.`creation`)) as date) between 
_latin1\'2019-11-26\'  and  _latin1\'2019-11-27\') and (t0.`type` <> _utf8\'l.product\')) group by 
t0.`type`,t0.`id`,t0.`Id_id` order by t0.`type`,t0.`id`,t0.`Id_id` limit 10 lock in share mode'

Can you please review and let me know if this is a expected behavior of Spider 3.3.15 or this can be tuned with any configuration variables.



 Comments   
Comment by Kentoku Shiba (Inactive) [ 2020-04-15 ]

Creating the huge temporary tables on the data node seems to be caused by "group by" clause.
vmolguram Would you please try to create index for "group by" clause on data nodes?

Generated at Thu Feb 08 09:05:02 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.