[CONPY-238] Querying mysql.slow_log table might cause UnicodeDecodeError
Created: 2022-11-21  Updated: 2022-11-27

Status: Open
Project: MariaDB Connector/Python
Component/s: None
Affects Version/s: 1.0.7
Fix Version/s: None

Type: Bug
Priority: Minor
Reporter: Haenige3
Assignee: Georg Richter
Resolution: Unresolved
Votes: 0
Labels: None
Environment:

10.4.27-MariaDB-1:10.4.27+maria~deb10-log mariadb.org binary distribution


Python Version: 3.7.3

 Description   

I tried to query the mysql.slow_log table and got this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xed in position 6795: invalid continuation byte   
 
SystemError: <class 'UnicodeDecodeError'> returned a result with an error set
 
The above exception was the direct cause of the following exception:
 
Traceback (most recent call last):
  File "./query_slow_log.py", line 13, in <module>
    main()
  File "./query_slow_log.py", line 9, in main
    cur.fetchall()
SystemError: <method 'fetchall' of 'mariadb.connection.cursor' objects> returned a result with an error set

Here is the test script:

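#!/usr/bin/env python3

import mariadb

def main():
    # Connect with default settings (no explicit host, user or charset).
    conn = mariadb.connect()
    cur = conn.cursor()
    cur.execute("SELECT * FROM mysql.slow_log")
    # The error is raised here, while the rows are converted to Python objects.
    cur.fetchall()
    conn.close()

if __name__ == '__main__':
    main()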

Is it possible that the content of the slow_log's sql_text column is encoded as 'latin-1'? Since Connection.character_set is always utf8mb4, that could explain the error. If that's the case, is there anything I can do about it?
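To illustrate the error message above: byte 0xed is a complete character in latin-1 but starts a three-byte sequence in UTF-8, so decoding fails when the next byte is not a continuation byte. The sketch below uses a made-up byte string and a hypothetical helper, decode_lenient, which is not part of MariaDB Connector/Python. It only helps if the raw bytes can be obtained in the first place, and falling back to latin-1 is an assumption about the data, not a confirmed diagnosis.

```python
def decode_lenient(raw: bytes) -> str:
    """Decode as UTF-8, falling back to latin-1 (hypothetical helper)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this cannot fail,
        # but the result is only meaningful if the data really is latin-1.
        return raw.decode("latin-1")


sample = b"caf\xed bar"  # made-up data: 0xed followed by a space
text = decode_lenient(sample)
print(ascii(text))  # ascii() keeps the output printable on any terminal
```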



 Comments   
Comment by Georg Richter [ 2022-11-26 ]

This looks like a server bug: the system tables are defined as utf8mb3, so there shouldn't be a Unicode error, since utf8mb3 is a subset of utf8mb4.

Would it be possible to extract the failing record, e.g. by iterating with fetchone()?
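One way to do that iteration is sketched below. The helper name find_failing_rows and the max_failures guard are illustrative, not part of the connector. The sketch assumes that the cursor moves on to the next row after a failed fetchone(); if it does not, the guard stops the loop instead of letting it spin forever.

```python
def find_failing_rows(cursor, max_failures=10):
    """Return (position, error) pairs for rows that could not be fetched.

    Works with any DB-API style cursor that exposes fetchone().
    """
    failures = []
    position = 0
    while len(failures) < max_failures:
        try:
            row = cursor.fetchone()
        except (UnicodeDecodeError, SystemError) as exc:
            # Record the 0-based row position and keep going.
            failures.append((position, repr(exc)))
            position += 1
            continue
        if row is None:  # end of the result set
            break
        position += 1
    return failures

# Intended use, after cur.execute("SELECT * FROM mysql.slow_log"):
#     for position, error in find_failing_rows(cur):
#         print(position, error)
```

Adding an ORDER BY to the query would make the reported positions reproducible between runs.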

Comment by Haenige3 [ 2022-11-27 ]

In general, it seems that the values causing these errors are in columns of type mediumblob or longblob.

The only sample I have at the moment contains a longblob which might contain sensitive information. Is there a way to share the sample privately?
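If sharing the whole value is not an option, a reduced report might still be useful. The sketch below is only an idea: describe_invalid_utf8 is a hypothetical helper, and it assumes the raw bytes of the value are available on the client (for example by selecting HEX(column) and converting the result with bytes.fromhex, which has not been tried here). It reports the offset of the first invalid byte and a short hex window around it, which should still be checked for sensitive content before posting.

```python
def describe_invalid_utf8(raw: bytes, context: int = 4):
    """Return details about the first invalid UTF-8 byte, or None if valid."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Only expose a few bytes on either side of the failing position.
        start = max(exc.start - context, 0)
        end = min(exc.start + context + 1, len(raw))
        return {
            "position": exc.start,
            "reason": exc.reason,
            "window_hex": raw[start:end].hex(),
        }
    return None


report = describe_invalid_utf8(b"abc\xed def", context=2)  # made-up data
print(report)
```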
