[MCOL-4691] Major Regression: Selects with aggregates 2x slower in 5.x than in 1.2 (due to collation support) Created: 2021-04-21 Updated: 2023-10-27 Resolved: 2023-10-27 |
|
| Status: | Closed |
| Project: | MariaDB ColumnStore |
| Component/s: | PrimProc |
| Affects Version/s: | None |
| Fix Version/s: | Icebox |
| Type: | Bug | Priority: | Critical |
| Reporter: | Gregory Dorman (Inactive) | Assignee: | Leonid Fedorov |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | performance |
| Attachments: |
|
| Issue Links: |
|
| Sub-Tasks: |
|
| Sprint: | 2021-7, 2021-8, 2021-9, 2021-10, 2021-11, 2021-12 |
| Description |
|
First pointed out by Quinnstreet, now confirmed by drrtuy. This requires profiling both releases and identifying where the extra time is spent. The tests also show a lack of user scaling in both releases, but that is not the focus of this ticket (there is a related ticket for that). This ticket is only about the raw difference between releases. drrtuy has the environment and a reproduction. tail_num is VARCHAR(6). The table's charset is utf8. In 1.2 tests
in 5.2.2 tests
|
| Comments |
| Comment by Alexander Barkov [ 2021-05-12 ] |
|
Performance comparison of this query:
for various collations in develop-5:
Note, this query (notice the SMALLINT column taxi_in):
takes 0.124 seconds. |
| Comment by Roman [ 2021-06-17 ] |
|
My results comparing 1.2.5 and 5ebac6772 are different for SMALLINT in a single-node setup.
|
| Comment by Roman [ 2021-06-17 ] |
|
These are the results for VARCHAR(6) with utf8 and the default collation, taken in a single-node setup.
|
| Comment by Roman [ 2021-06-17 ] |
|
There are a number of CHARSET_INFO::strnncollsp calls in develop-5 at commit 5ebac6772 that were added as part of the effort to make operations collation-aware. Without these calls, the GROUP BY timings are even faster than in 1.2.5.
At first glance, the profiling results show that the string length calculation is the only visible difference between the quick hack patch and the original develop-5 at 5ebac6772. |
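To illustrate why a collation-aware comparison can cost more than a plain binary one, here is a minimal sketch. It is not the ColumnStore or MariaDB code: Python is used only for illustration, all function names are hypothetical, and the fixed-width, NUL-padded 8-byte slot is an assumption made for the example. It shows the two extra steps such a path may need per value: finding the string length, then comparing under PAD SPACE rules (the kind of comparison strnncollsp performs), where the shorter operand is treated as if padded with spaces.

```python
SPACE = 0x20  # byte used for padding under PAD SPACE comparison rules


def compare_binary(a: bytes, b: bytes) -> int:
    """Byte-wise comparison; trailing spaces are significant."""
    return (a > b) - (a < b)


def slot_length(slot: bytes) -> int:
    """Length of the string in a NUL-padded slot (hypothetical layout).

    This scan is the extra per-value work: the length must be known
    before the collation-aware comparison can be called.
    """
    pos = slot.find(b"\x00")
    return len(slot) if pos == -1 else pos


def compare_pad_space(a: bytes, b: bytes) -> int:
    """Compare as if the shorter operand were padded with spaces."""
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            return -1 if a[i] < b[i] else 1
    # The common prefix is equal: compare the tail of the longer operand
    # against spaces. `sign` is the result if that tail sorts after spaces.
    longer, sign = (a, 1) if len(a) > len(b) else (b, -1)
    for byte in longer[n:]:
        if byte != SPACE:
            return sign if byte > SPACE else -sign
    return 0


def compare_slots_collated(x: bytes, y: bytes) -> int:
    """Collation-aware path: compute both lengths, then compare."""
    return compare_pad_space(x[:slot_length(x)], y[:slot_length(y)])


# Hypothetical 8-byte slots holding a VARCHAR(6)-sized value.
left = b"N123".ljust(8, b"\x00")
right = b"N123 ".ljust(8, b"\x00")
print(compare_binary(left, right))          # slots differ byte-wise
print(compare_slots_collated(left, right))  # equal under PAD SPACE
```

The binary path is a single comparison over the raw slot, while the collated path adds a length scan per operand plus a byte loop, which is consistent with the string length calculation showing up in the profile.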
| Comment by Roman [ 2021-06-18 ] |
|
Here are the results for develop 96f2a55 VARCHAR(6)/latin1/latin1_nopad_bin:
|
| Comment by David Hall (Inactive) [ 2022-05-25 ] |
|
While all of this slowdown is caused by our support for charsets and collations, some improvement is expected from MCOL-5043 and |