[MDEV-6045] MySQL Bug#11829861 - SUBSTRING_INDEX() RESULTS "OMIT" CHARACTER WHEN USED INSIDE LOWER()
Created: 2014-04-08  Updated: 2014-06-22  Resolved: 2014-06-02

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.3.12, 5.5.37, 10.0.10
Fix Version/s: 5.5.39, 10.0.12, 5.3.13

Type: Bug Priority: Major
Reporter: Sergey Vojtovich Assignee: Alexander Barkov
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
PartOf
is part of MDEV-4784 (merge test cases from 5.6), Status: Stalled

 Description   

revno: 3402.50.240
committer: Chaithra Gopalareddy <chaithra.gopalareddy@oracle.com>
branch nick: mysql-5.6
timestamp: Thu 2012-02-23 15:38:33 +0530
message:
  Bug#11829861 - SUBSTRING_INDEX() RESULTS "OMIT" CHARACTER WHEN USED
        INSIDE LOWER()
 
  PROBLEM
  The output of SUBSTRING_INDEX() could have missing characters
  when wrapped in a string conversion function such as LOWER().
  Example:
    SET @user_at_host = 'root@mytinyhost-PC.local';
    SELECT LOWER(SUBSTRING_INDEX(@user_at_host, '@', -1));
    Actual result:   mytinyhost-pc. ocal
    Expected result: mytinyhost-pc.local
 
  ANALYSIS:
  In the function Item_func_substr_index::val_str(), the final
  evaluated string (Item_func_substr_index::tmp_value) is marked
  as constant after the first evaluation. (The reason for this
  is given in Bug#14676.)
 
  Once the string is evaluated, we try to convert it to lower case.
  While doing so, we call the function "copy_if_not_alloced".
  This function does a copy or allocation, based on the
  "alloced length"s of the strings passed. Since "tmp_value" is
  marked as constant, the "Alloced length" for that string becomes
  zero, thereby forcing an allocation and then a subsequent
  copy, which results in the missing character.
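
The decision described above can be modelled in a few lines. This is a minimal sketch, not the server code: StrModel, CopyAction, decide_copy, mark_as_const and model_example are hypothetical names, and the "return the source when it already owns enough memory" check is an assumption inferred from the analysis in this commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for the server's String class.
struct StrModel {
    std::size_t length;          // bytes of valid data
    std::size_t alloced_length;  // bytes owned by this object; 0 for a constant string
};

enum class CopyAction { ReuseSource, AllocateAndCopy };

// Assumed first decision of copy_if_not_alloced: a source that already
// owns enough memory is handed back unchanged, anything else is copied.
CopyAction decide_copy(const StrModel &from, std::size_t from_length) {
    if (from.alloced_length >= from_length)
        return CopyAction::ReuseSource;
    return CopyAction::AllocateAndCopy;
}

// Marking a string as constant is modelled as forgetting its allocation.
StrModel mark_as_const(StrModel s) {
    s.alloced_length = 0;
    return s;
}

// Usage: the same 19-byte value takes different paths depending on the flag.
void model_example() {
    const StrModel owned{19, 32};
    assert(decide_copy(owned, 19) == CopyAction::ReuseSource);
    assert(decide_copy(mark_as_const(owned), 19) == CopyAction::AllocateAndCopy);
}
```

In this model, only the constant-marked value reaches the allocate-and-copy path, which is the path the rest of the analysis is concerned with.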
 
  What we need to note here is that the source string (tmp_value)
  passed to "copy_if_not_alloced" points to an address inside the
  destination string, which is the original string. The source and
  destination of the copy therefore overlap, hence the missing
  letters.
 
  Code Snippets:
  Item_str_conv::val_str(str)   // conversion to lower case
  {
    res= Item_func_substr_index::val_str(str);
    // res is actually pointing to an address inside str
    res= copy_if_not_alloced(str, res, res->length());
  }
  copy_if_not_alloced(to, from, from_length)
  {
    if (to->realloc(from_length))
      return from;                         // Actually an error
    if ((to->str_length= min(from->str_length, from_length)))
      memcpy(to->Ptr, from->Ptr, to->str_length);   // to and from overlap
  }
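
The overlap hazard in the snippet above can be reproduced safely outside the server. The following is a sketch under stated assumptions: backward_copy and copy_tail_to_front are hypothetical helpers, and the back-to-front copy only stands in for one possible memcpy strategy, since memcpy on overlapping ranges has undefined behaviour and the exact corruption seen in the bug report depends on the platform.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical copy routine that walks from the last byte to the first.
// It models one way a memcpy implementation may behave; the real memcpy
// gives no guarantee at all when source and destination overlap.
void backward_copy(char *dst, const char *src, std::size_t n) {
    while (n-- > 0)
        dst[n] = src[n];
}

// Copies the tail of buf (starting at offset) to the front of the same
// buffer, which is the situation when the SUBSTRING_INDEX() result points
// into the buffer that is also the destination of the copy.
std::string copy_tail_to_front(std::string buf, std::size_t offset, bool overlap_safe) {
    assert(offset <= buf.size());
    const std::size_t n = buf.size() - offset;
    if (n > 0) {
        if (overlap_safe)
            std::memmove(&buf[0], &buf[offset], n);   // defined for overlapping ranges
        else
            backward_copy(&buf[0], &buf[offset], n);  // overwrites bytes before reading them
    }
    buf.resize(n);
    return buf;
}
```

Calling copy_tail_to_front("root@mytinyhost-PC.local", 5, true) yields the intact host part, while the same call with overlap_safe set to false returns a string of the same length with damaged contents.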
 
  If we did not mark "tmp_value" as const, we would have
  returned from "copy_if_not_alloced" much earlier, avoiding
  the overwriting.
 
  So the fix is to not mark "tmp_value" as const, as there is
  no need for it. As for Bug#14676, we fix it by allocating a
  temporary buffer to hold the delimiter. We were seeing that
  problem because "tmp_value" was used both to get the delimiter
  and to return the evaluated string.
 
  Also, there is one more bug in this function, associated with
  Bug#42404: SUBSTRING_INDEX() returns inconsistent results when
  the delimiter is present at offset "0", the count is negative,
  and its absolute value is greater than the number of times the
  delimiter is present in the string.
 
  Currently, if the delimiter is present at offset "0", we skip
  setting "tmp_value" (which contains the final evaluated string)
  and instead return the previously set "tmp_value". This was the
  reason for the inconsistent results stated in the problem
  description. With this fix, we return the original string if the
  count is non-zero at the end of the loop.
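
The rule in the last sentence can be illustrated with a small reference model of SUBSTRING_INDEX(). This is a sketch only, assuming single-byte characters and non-overlapping delimiter matches; substring_index here is a hypothetical free function, not the server's Item_func_substr_index, and it ignores character sets entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Reference model: returns the part of s before the count-th delimiter
// (count > 0) or after the count-th delimiter from the right (count < 0).
// If the delimiter occurs fewer than |count| times, s is returned whole.
std::string substring_index(const std::string &s, const std::string &delim, long count) {
    if (s.empty() || delim.empty() || count == 0)
        return "";
    if (count > 0) {
        std::size_t pos = 0;
        while (true) {
            const std::size_t hit = s.find(delim, pos);
            if (hit == std::string::npos)
                return s;                      // ran out of delimiters
            if (--count == 0)
                return s.substr(0, hit);
            pos = hit + delim.size();
        }
    }
    std::size_t end = s.size();                // matches must end at or before this index
    while (true) {
        if (end < delim.size())
            return s;                          // count still non-zero: original string
        const std::size_t hit = s.rfind(delim, end - delim.size());
        if (hit == std::string::npos)
            return s;
        if (++count == 0)
            return s.substr(hit + delim.size());
        end = hit;
    }
}

// Usage: delimiter at offset 0 with a negative count larger than the
// number of occurrences gives back the original string.
void substring_index_example() {
    assert(substring_index(".mysql.com", ".", -3) == ".mysql.com");
}
```

The negative branch mirrors the described fix: when the scan reaches the start of the string while the count is still non-zero, the whole input is returned instead of a value left over from an earlier step.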



 Comments   
Comment by Alexander Barkov [ 2014-04-21 ]

Pushed into 5.3 and 5.5.
