[MDEV-17802] Check how optimizer behaves for TPC-DS queries Created: 2018-11-22  Updated: 2023-12-20

Status: Stalled
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: None

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Unresolved Votes: 0
Labels: benchmarking

Attachments: File adopt-queries-for-mariadb-part1.diff     File adopt-queries-for-mariadb-v2.diff     File mariadb-vs-mysql-nov24.ods     Text File nov23-out34-run-log.txt     Text File nov23-out34-run-time.txt     File query_0.sql    
Issue Links:
Relates
relates to MDEV-17807 Incorrect usage of CUBE/ROLLUP and OR... Open

 Description   

We had looked at some of TPC-DS queries on a case-by-case basis before.

It will be interesting to do a more comprehensive test. Possible goals

  • How does MariaDB compare across versions and with other databases?
  • Does EITS help?
  • What is the effect of the new optimizer defaults
  • etc

References

References:



 Comments   
Comment by Sergei Petrunia [ 2018-11-22 ]

Fo example, there is a fairly nice post here: https://blog.pgaddict.com/posts/performance-since-postgresql-7-4-to-9-4-tpc-ds

Comment by Sergei Petrunia [ 2018-11-22 ]

The data generator will produce csv files with empty spaces for NULLs. There is a compiler #define to make it produce data suitable for MySQL/MariaDB:

cd v2.8.0rc4/tools
mv makefile makefile.original
sed  's/^\(CFLAGS.*\)$/\1 -D_MYSQL/' < makefile.original > makefile
make

Generating ~1G of data:

mkdir tmp
./dsdgen -scale 1 -dir ./tmp
du -sh ./tmp/

Loading the data

#!/bin/bash
 
DATADIR="/home/psergey/tpcds/try2/v2.8.0rc4/tools/tmp/"
 
echo "call_center
catalog_page 
catalog_returns 
catalog_sales 
customer_address 
customer 
customer_demographics 
date_dim 
dbgen_version 
household_demographics 
income_band 
inventory 
item 
promotion 
reason 
ship_mode 
store 
store_returns 
store_sales 
time_dim 
warehouse 
web_page 
web_returns 
web_sales 
web_site" | while read a; do
  echo "load data local infile '${DATADIR}$a.dat' into table $a fields terminated by '|' enclosed by '\"';"
done

Comment by Sergei Petrunia [ 2018-11-22 ]

Generating queries:

mkdir tmp-queries
echo 'define _END = "";' >> ../query_templates/netezza.tpl
./dsqgen \
  -scale 1 \
  -dialect netezza \
  -verbose y \
  -qualify y  \
  -directory ../query_templates \
  -input ../query_templates/templates.lst \
  -output_dir ./tmp-queries/
ls -lah ./tmp-queries/query_0.sql

Output of the first run: - query_0.sql

Comment by Sergei Petrunia [ 2018-11-22 ]

The queries are not in MySQL's dialect of SQL. Made some obvious fixes, but there are issues that can't be fixed easily:

Query 80: We dont support this (parse error):

 group by rollup (channel, id)

changing it to

 group by channel, id with rollup

produces:
ERROR 1221 (HY000): Incorrect usage of CUBE/ROLLUP and ORDER BY

Query 86: same as above:

 group by rollup(i_category,i_class)
 order by
   lochierarchy desc,
   case when lochierarchy = 0 then i_category end,
   rank_within_parent

Query 27: the same:

 group by (i_item_id, s_state)
 order by i_item_id
         ,s_state

Query 36:

   ,grouping(i_category)+grouping(i_class) as lochierarchy
   ...
 group by rollup(i_category,i_class)
 order by
 
Query 70 - the same
Query 97 - full outer join
Query 51 - full outer join

Query 22: group by rollup(...) order by ...
Query 77: group by rollup (channel, id) order by channel ,id
Query 18: group by rollup (i_item_id, ca_country, ...) order by ca_country, ca_state ...
Query 14: Derived table must have alias, and then group by rollup issue.

Comment by Sergei Petrunia [ 2018-11-22 ]

adopt-queries-for-mariadb-part1.diff - Basic fixes to get SQL inline with MySQL/MariaDB dialect.

Comment by Sergei Petrunia [ 2018-11-22 ]

Filed the cause of us being unable to run some queries as MDEV-17807. (Yes, one can use a derived table to work around this limitation.)

Comment by Sergei Petrunia [ 2018-11-23 ]

adopt-queries-for-mariadb-v2.diff - Got all queries to run, except the 12 queries listed above. Queries that produce errors are marked with MARIADB-EXPECTED-ERROR.

Names of the failing queries:

query14.tpl
query18.tpl
query22.tpl
query27.tpl
query36.tpl
query51.tpl
query5.tpl
query70.tpl
query77.tpl
query80.tpl
query86.tpl
query97.tpl

Comment by Sergei Petrunia [ 2018-11-23 ]

nov23-out34-run-log.txt nov23-out34-run-time.txt
The run takes about 30 min (excluding the data load)
Heaviest queries:

 Query Name  	 Milliseconds 	% of total time
 query72.tpl 	902,748	49.67
  query2.tpl  	106,285	5.85
 query78.tpl 	99,782	5.49
 query23.tpl 	71,419	3.93
 query39.tpl 	57,444	3.16
 query14.tpl 	57,121	3.14
  query4.tpl  	42,685	2.35
 query47.tpl 	37,982	2.09
 query31.tpl 	36,645	2.02
 query21.tpl 	28,829	1.59
 query11.tpl 	27,422	1.51
 query74.tpl 	26,401	1.45
 query87.tpl 	21,437	1.18
 query38.tpl 	19,223	1.06
 query59.tpl 	18,353	1.01
 query88.tpl 	17,906	0.99
 query67.tpl 	15,604	0.86
 query57.tpl 	12,986	0.71
  query6.tpl  	12,442	0.68
 query46.tpl 	11,281	0.62
 query54.tpl 	10,685	0.59
 query68.tpl 	10,546	0.58
...

Comment by Sergei Petrunia [ 2018-11-24 ]

Attempted to run on MySQL 8.0. A few queries produce errors not observed with MariaDB:

 
query38.tpl - INTERSECT
query8.tpl  - INTERSECT
query87.tpl - EXCEPT
query54.tpl - CAST (AS INT) -- fixed by changing to AS SIGNED INTEGER

Comment by Sergei Petrunia [ 2018-11-24 ]

MariaDB 10.3 vs MySQL 8.0 comparison - mariadb-vs-mysql-nov24.ods. Total time is about the same. The query execution time varies (for some queries, MariaDB is better, for others, MySQL).

Need to run it again and check everything before making any conclusions.

Comment by Sergei Petrunia [ 2018-11-29 ]

Re-ran the benchmark on 10.4 over re-generated data. Execution time is very different.
I think, the reason is random generator seed. Both dsdgen and dsqgen have RNGSEED parameter.
We'll need to make use of it:

  • Run with the same value for reproducible results
  • Try different values to see how stable the plan is.
Comment by Sergei Petrunia [ 2018-12-03 ]

The idea to use pt-query-digest to collect query time histograms was not a good one, as they have fixed bounds:

# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms
# 100ms
#    1s
#  10s+  ################################################################
# Tables

Comment by Sergei Petrunia [ 2019-03-04 ]

A collection of scripts to run the benchmark: https://github.com/spetrunia/tpcds-run-tool

First results:
https://www.slideshare.net/SergeyPetrunya/lessons-for-the-optimizer-from-running-the-tpcds-benchmark

random seed generator for dataset matters a lot. Need to run more tests to analyze the impact (are there "heavy" RNGSEED parameter values that cause all databases to be slow? Or some databases are slow and some are fast?) I think, the latter is the case but I need to have the data from multiple controlled runs to be able to tell this with certainty.

Comment by Julien Fritsch [ 2023-12-05 ]

Automated message:
----------------------------
Since this issue has not been updated since 6 weeks, it's time to move it back to Stalled.

Generated at Thu Feb 08 08:39:13 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.