[MDEV-23042] View execution slower than underlying query Created: 2020-06-29 Updated: 2020-07-01 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Views |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major |
| Reporter: | Bernd Jänichen | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | performance | ||
| Attachments: |
|
| Description |
|
Execution of a query on a view is up to 3 times slower in comparison to the underlying query even if there's no data within underlying tables. Root cause is function check_duplicate_names() in sql/sql_view.cc which tests for duplicate column-names within views or derived queries. It is invoked in selects accessing a view though this check has already been made at view-creation time. However the implementation in check_duplicate_names() can be improved to gain better performance: mariadbViewPerformance.patch Testing of performance improvement can be done with attached SQL file test.sql
|
| Comments |
| Comment by Vladislav Vaintroub [ 2020-06-29 ] |
|
without looking why VIEWs do the check multiple times, only about the proposed patch - for a short list, the STL map will be slower due to necessary allocations for temporary objects, and O(N^2) search like today would beat O(N*logN) in proposed patch. I'd guess, in most cases those Lists are tiny. |
| Comment by Bernd Jänichen [ 2020-06-30 ] |
|
Good point, had only huge lists in mind, though for short-lists i guess the difference of using STL maps compared to a nested loop will be in micro-secs. |
| Comment by Bernd Jänichen [ 2020-07-01 ] |
|
So i've done extensive testing and optimizations for the map lookups see attached test-patch. You're right that for short-lists the performance is lower compared to the original implementation Starting with 70-90 elements performance of the lookup begins to be better than the original implementation. These numbers are not representative as i tested it with a single linux-machine only but it shows that I do understand that for OLTP use-cases a number of 4000 elements/columns per view makes no sense To avoid minimal performance loss in OLTP scenarios the map-function will be used only if the threshold of 100 elements is exceeded. |
| Comment by Vladislav Vaintroub [ 2020-07-01 ] |
|
Yes, hybrid approach (O(N^2) for small list, O(N*logN) for large ones) is probably better that either one |