[MDEV-23079] A fully UTF-8 supported database can't store encrypted string Created: 2020-07-03  Updated: 2020-08-05  Resolved: 2020-08-05

Status: Closed
Project: MariaDB Server
Component/s: Character Sets
Affects Version/s: 10.3.22
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: Athens Yan Assignee: Alexander Barkov
Resolution: Not a Bug Votes: 0
Labels: Compatibility
Environment:

Ubuntu 20.04



 Description   

I try to use the following Perl script to insert an encrypted text into my table, however error message "Incorrect string value: '\xF5\xE5\x83\xFF\x8A\xE8...' for column `mydb`.`test`.`encrypted_text` at row 1" is returned.

It seems that my fully UTF-8 supported database and table cannot store encrypted string characters. However, it is worked flawlessly on older version of MariaDB, such as MariaDB version 10.3.17 on CentOS 8.

#/usr/bin/perl
 
use strict;
use DBI;
use Crypt::CBC;
 
my $dsn;
my $dbh;
my $sth;
my $sql;
 
sub _encrypt_str {
  my ($plaintext, $key) = @_;
  my ($ok, $cipher, $encrypted);
 
  $encrypted = '';
  $ok = 0;
 
  $cipher = Crypt::CBC->new(-key => $key, -cipher => 'Rijndael');
  if ($cipher) {
    $encrypted = $cipher->encrypt($plaintext);
    $ok = 1;
  }
 
  return ($ok, $encrypted);
}
 
 
sub _generateRandomStr {
  my ($option, $max_len) = @_;
  my ($max_ascii_value, $char, $stop_run, $cnt, $result, @ascii_list);
 
  $option = ($option eq '')? 'A' : uc($option);
  $max_len = ($max_len + 0 <= 0)? 10 : $max_len;
 
  if ($option eq 'N') {
    for (my $i = 48; $i <= 57; $i++) {
      push @ascii_list, $i;
    }
 
    $max_ascii_value = 57;
  }
  elsif ($option eq 'S') {
    for (my $i = 65; $i <= 90; $i++) {
      push @ascii_list, $i;
    }
 
    for (my $i = 97; $i <= 122; $i++) {
      push @ascii_list, $i;
    }
 
    $max_ascii_value = 122;
  }
  else {
    for (my $i = 48; $i <= 57; $i++) {
      push @ascii_list, $i;
    }
 
    for (my $i = 65; $i <= 90; $i++) {
      push @ascii_list, $i;
    }
 
    for (my $i = 97; $i <= 122; $i++) {
      push @ascii_list, $i;
    }
 
    $max_ascii_value = 122;
  }
 
  $result = '';
  $stop_run = $cnt = 0;
  while (!$stop_run) {
    my $this_ascii = sprintf("%.0f", rand($max_ascii_value));
    my $valid_value = 0;
 
    foreach my $ascii (@ascii_list) {
      if ($ascii == $this_ascii) {
        $valid_value = 1;
      }
      last if ($valid_value);
    }
 
    if ($valid_value) {
      $char = chr($this_ascii);
      $result .= $char;
    }
 
    if (length($result) >= $max_len) {
      $stop_run = 1;
    }
 
    if ($cnt >= 90000) {
      $stop_run = 1;
    }
    else {
      $cnt++;
    }
  }
 
  return $result;
}
 
#***************************************************************
# Note: The table is created as follows:
# 
# create table `test` (encrypted_text longtext) engine=InnoDB default charset=utf8mb4;
#***************************************************************
 
$dsn = "DBI:mysql:database=mydb;host=localhost;";
$dbh = DBI->connect($dsn, 'myusername', 'mypassword', {AutoCommit => 1, PrintError => 0, RaiseError => 0});
$sth = $dbh->prepare("use mydb");
$sth->execute();
$sth->finish();
 
my $seed = _generateRandomStr('A', 32);
my ($ok, $encrypted_text) = _encrypt_str('It is my secret message', $seed);
 
$sql = <<__SQL;
INSERT INTO test
(encrypted_text)
VALUES
( ? )
__SQL
 
$sth = $dbh->prepare($sql);
if ($sth->execute($encrypted_text)) {
  print "Save successfully\n";
}
else {
  print "Error: " . $sth->errstr;
}
$sth->finish();
$dbh->disconnect();

Here is my table definition:

+-------+--------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                           |
+-------+--------------------------------------------------------------------------------------------------------+
| test  | CREATE TABLE `test` (
  `encrypted_text` longtext DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
+-------+--------------------------------------------------------------------------------------------------------+



 Comments   
Comment by Alexander Barkov [ 2020-08-05 ]

Your script inserts binary data into an utf8mb4 longtext column, which cannot insert any arbitrary sequence of bytes: it has constraints on wellformedness of the inserted data.

\xF5\xE5\x83\xFF\x8A\xE8... is not a valid utf8mb4 string. The server cannot insert it into the column of this type.

Please use longblob instead.

It's hard to say why it worked in earlier versions. Possibly, with the older version the client started with the latin1 character set by default.

Generated at Thu Feb 08 09:19:41 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.