[MCOL-5477] Improve Disk Join Step to handle corner cases for large data Created: 2023-04-17  Updated: 2023-09-22  Resolved: 2023-08-15

Status: Closed
Project: MariaDB ColumnStore
Component/s: None
Affects Version/s: None
Fix Version/s: 23.10.0

Type: New Feature Priority: Major
Reporter: Denis Khalikov Assignee: Denis Khalikov
Resolution: Fixed Votes: 0
Labels: rm_big_data

Sprint: 2023-8
Assigned for Review: Roman
Assigned for Testing: Kirill Perov

 Description   

In the current design, the Disk Join Step (DJS) distributes rows into buckets based on a hashing result for each row. When a bucket exceeds the memory limit defined in session variables, DJS redistributes that bucket into smaller buckets using a hash with a different seed. If all rows in the bucket are the same, they produce the same hash value under any seed, so they cannot be distributed into different buckets as designed, and DJS goes into endless recursion trying to do so. The current task is to handle these corner cases by updating the bucket selection algorithm and the bucket distribution algorithm.
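
The failure mode and one possible guard can be illustrated with a minimal sketch. This is not the ColumnStore implementation or the fix that was shipped: the function names, the `fanout` and `max_depth` parameters, the row-count limit standing in for the memory limit, and the use of blake2b as the seeded hash are all assumptions made for illustration. The sketch re-partitions an oversized bucket with a new seed, but stops recursing when every key in the bucket is identical, since no seed can separate such rows.

```python
import hashlib


def seeded_hash(key, seed):
    # Stable seeded hash. blake2b is only a stand-in for the engine's real hash.
    data = f"{seed}:{key!r}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def partition(keys, max_rows, fanout=4, seed=0, max_depth=32):
    """Split keys into buckets of at most max_rows where that is possible."""
    if len(keys) <= max_rows:
        return [keys]
    # Guard (hypothetical): identical keys hash alike under every seed, so
    # re-partitioning cannot shrink this bucket. Return it as an oversized
    # leaf instead of recursing forever. max_depth is a second safety net.
    if max_depth == 0 or all(k == keys[0] for k in keys):
        return [keys]
    sub_buckets = [[] for _ in range(fanout)]
    for k in keys:
        sub_buckets[seeded_hash(k, seed) % fanout].append(k)
    result = []
    for bucket in sub_buckets:
        if bucket:
            # A different seed at each level gives colliding keys a new
            # chance to land in different sub-buckets.
            result.extend(partition(bucket, max_rows, fanout, seed + 1, max_depth - 1))
    return result


buckets = partition([7] * 10 + [1, 2, 3], max_rows=4)
```

Without the identical-keys check, the ten rows with key 7 would land in the same sub-bucket at every level and the recursion would never reach the size limit. With it, they should come back as a single oversized bucket that the caller has to process by other means, for example by streaming it in chunks.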


Generated at Thu Feb 08 02:58:10 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.