[MCOL-4957] Functions on TIMESTAMP columns in projection, e.g. year(timestamp_col) slows down SELECT 50x. Created: 2022-01-05 Updated: 2022-06-02 Resolved: 2022-02-22 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | ExeMgr, PrimProc |
| Affects Version/s: | 6.2.2 |
| Fix Version/s: | 6.3.1 |
| Type: | Bug | Priority: | Major |
| Reporter: | Roman | Assignee: | Daniel Lee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Sprint: | 2021-16, 2021-17 |
| Description |
|
While testing with Uber's taxi rides dataset [1] [2], I found a serious performance problem in time-related functions. Here is the query, which I ran twice against the develop-6 HEAD to get the timings.
CPU cores were fully utilized, but a poor man's profiling technique shows that OS spinlocks are burning the CPU. Here are two related call stacks:
With a hacky patch [3] that removes both localtime_r() calls and the mutex, I was able to reduce the timings to the following. The first is a cold-cache run:
*The suggested solution is to reduce the number of localtime_r() calls to one and remove the mutex in the Func class.*
1. https://tech.marksblogg.com/billion-nyc-taxi-rides-redshift.html
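The suggested solution above can be illustrated with a minimal sketch. This is not the actual ColumnStore patch: the helper names `sessionUtcOffset` and `yearFromEpoch` are hypothetical, and the sketch assumes that a single fixed UTC offset is acceptable for the whole scan (DST transitions inside the data range are ignored). It calls localtime_r() once to obtain the offset and then extracts the year per row with integer arithmetic only, so no lock is taken on the hot path.

```cpp
#include <cstdint>
#include <ctime>

// Hypothetical helper: UTC offset in seconds, obtained with ONE
// localtime_r() call per query instead of one call per row.
// Assumption: tm_gmtoff is available (glibc/BSD extension).
inline long sessionUtcOffset(std::time_t probe)
{
    std::tm local{};
    localtime_r(&probe, &local);
    return local.tm_gmtoff;
}

// Hypothetical helper: year of a Unix timestamp shifted by a fixed offset.
// Uses the well-known days-to-civil-date conversion; no libc time calls
// and no mutex, so it is safe to call concurrently from many threads.
inline int yearFromEpoch(std::int64_t epochSeconds, long offsetSeconds)
{
    const std::int64_t secs = epochSeconds + offsetSeconds;
    std::int64_t days = secs / 86400;
    if (secs % 86400 < 0)
        --days;                      // floor division for pre-1970 values
    days += 719468;                  // move the origin to 0000-03-01
    const std::int64_t era = (days >= 0 ? days : days - 146096) / 146097;
    const std::int64_t doe = days - era * 146097;              // day of 400-year era
    const std::int64_t yoe =
        (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365; // year of era
    const std::int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    const std::int64_t mp = (5 * doy + 2) / 153;               // March = 0
    const std::int64_t month = mp < 10 ? mp + 3 : mp - 9;
    // Years here start in March, so January and February belong to the next one.
    return static_cast<int>(yoe + era * 400 + (month <= 2 ? 1 : 0));
}
```

A caller would compute `sessionUtcOffset()` once before the scan and pass the result to `yearFromEpoch()` for every row. Whether a fixed offset is acceptable depends on the session time zone semantics, which this sketch does not cover.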
|
| Comments |
| Comment by Roman [ 2022-01-05 ] |
|
Here is the DDL for the table I used.
| Comment by Gagan Goel (Inactive) [ 2022-02-16 ] |
|
For QA:
DDL for the table:
The dataset to be loaded into the table is available at: https://www.dropbox.com/sh/4xm5vf1stnf7a0h/AADRRVLsqqzUNWEPzcKnGN_Pa?dl=0
Download the following 5 compressed csv files and extract them using gunzip: trips_xaa.csv.gz, trips_xab.csv.gz, trips_xac.csv.gz, trips_xad.csv.gz, trips_xae.csv.gz
Load the 5 csv files into the table using cpimport like so:
cpimport -s "," test trips trips_xaa.csv
Repeat for the other csv files.
Execute the following query to compare the query runtime before and after the fix:
| Comment by Daniel Lee (Inactive) [ 2022-02-22 ] |
|
Build verified: 6.2.4-1 (b3993)
After the fix, the query is 11x faster.
Before the fix (6.2.2-1):
|