[MDEV-31955] Make optimizer handle LCASE(varchar_col)=... Created: 2023-08-18  Updated: 2023-12-22

Status: Open
Project: MariaDB Server
Component/s: Optimizer
Fix Version/s: 11.5

Type: Task Priority: Minor
Reporter: Sergei Golubchik Assignee: Oleg Smirnov
Resolution: Unresolved Votes: 0
Labels: optimizer-feature

Attachments: File lcase-discrepancies.sql     File uca_1400_ucase.sql    
Issue Links:
Relates
relates to MDEV-31496 Make optimizer handle UCASE(varchar_c... Closed

 Description   

extend MDEV-31496 to handle LCASE()



 Comments   
Comment by Oleg Smirnov [ 2023-09-11 ]

The attached lcase-discrepancies.sql shows characters for which the condition

(t1.a=t2.a) = (LCASE(t1.a)=LCASE(t2.a))

is not fulfilled. That means it is not possible to remove LCASE() for strings containing those characters.

Comment by Oleg Smirnov [ 2023-09-12 ]

bar suggested looking into the UCA 1400 collation. The attached script uca_1400_ucase.sql shows there are discrepancies in UCASE() comparisons while there are none for LCASE().
Since there are no discrepancies for LCASE(), it is possible to implement the optimization for columns having the uca_1400_ai_ci collation.

Another suggestion is having a constraint

CHECK(a=LCASE(a))

on the target column. The optimizer might check if there is such a constraint and apply the optimization. So the users willing this optimization applied must first add the constraints for target columns.

Comment by Oleg Smirnov [ 2023-09-13 ]

One more complication: it is not enough for UCA 1400 to run one-character checks as they are in the script. Two- or three-character combinations may produce different weights for a single character depending on its position, so additional checks are required.

Generated at Thu Feb 08 10:27:43 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.