Uploaded image for project: 'MariaDB Connector/C'
  1. MariaDB Connector/C
  2. CONC-608

mysql_stmt_fetch() returns 1, but mysql_stmt_errno() returns 0

Details

    • Bug
    • Status: Open (View Workflow)
    • Major
    • Resolution: Unresolved
    • 3.3.1
    • 3.1, 3.3
    • Prepared Statements
    • None
    • Linux

    Description

      It appears some SELECT ... FOR UPDATE statements sometimes fail during mysql_stmt_fetch() but no mysql_stmt_errno() is set and mysql_stmt_error() is blank. It looks like we don't need to restart the connection or anything so the connection didn't fail or similar.

      I am using Percona XtraDB Cluster 5.7 (MySQL 5.7 + Galera) that is geo-distributed with a max latency between nodes of 10ms. The requests it occurs on are likely to be run on different nodes simultaneously, though I'm not sure if this is relevant to the issue or not, just a guess.

      I haven't seen if I could write a specific test case to reproduce this.

      The error is output from:
      https://github.com/Monetra/mstdlib/blob/master/sql/mysql/mstdlib_sql_mysql.c#L864

      Looking at the MariaDB Connector/C code, I can see mysql_stmt_fetch() calls out to methods->db_stmt_fetch (mthd_stmt_fetch_row) and methods->db_stmt_fetch_to_bind (mthd_stmt_fetch_to_bind).

      mthd_stmt_fetch_to_bind() can only return 0 or MYSQL_DATA_TRUNCATED(101), so the problem isn't there.

      mthd_stmt_fetch_row points to stmt->fetch_row_func, which appears to be either stmt_unbuffered_fetch() or stmt_cursor_fetch(). As far as I know, we aren't requesting cursors (and don't use STMT_ATTR_PREFETCH_ROWS which from what I know would create a cursor), so I'm pretty sure that means we're taking the stmt_unbuffered_fetch() path.

      In stmt_unbuffered_fetch() I do see a check for pkt_len == packet_error where it will return (1) as an error but never sets an errno or error message, so I'm pretty sure this is what we are hitting.

      This then means why are we getting a packet error?

      Attachments

        Issue Links

          Activity

            georg Georg Richter added a comment -

            Hi Brad,

            That no error was set is a bug, but it would be more interesting to know why a packet error shows up while fetching a result set. Can you reproduce this issue? If yes, would it be possible to attach a tcp dump?

            georg Georg Richter added a comment - Hi Brad, That no error was set is a bug, but it would be more interesting to know why a packet error shows up while fetching a result set. Can you reproduce this issue? If yes, would it be possible to attach a tcp dump?
            bradh352 Brad House added a comment -

            I'll see if I can write a test case for this and reproduce at any consistent rate. If so, I should be able to get a packet capture. Right now the issue happens extremely infrequently (less than once per day), but I've seen it happen across multiple tables that have similar usage patterns.

            The gist of the usage pattern is there are multiple nodes querying a task list table periodically (30s - 1min) in a SERIALIZABLE transaction, and it claims the tasks by either updating the records or deleting the records after the "select ... for update". The task list can contain 0 or more records, but can return thousands of records during times of high load.

            I've never seen this occur in any other situation, such as a "select ... for update" with a single row. Nor a multi-row select outside of a transaction.

            Is it possible that in a serializable transaction with "select ... for update" that the select can be aborted before all rows are output on the server side? Basically, might it detect its resultset has already been invalidated by another transaction and stop sending records and instead send some sort of error code indicating we need to rollback?

            bradh352 Brad House added a comment - I'll see if I can write a test case for this and reproduce at any consistent rate. If so, I should be able to get a packet capture. Right now the issue happens extremely infrequently (less than once per day), but I've seen it happen across multiple tables that have similar usage patterns. The gist of the usage pattern is there are multiple nodes querying a task list table periodically (30s - 1min) in a SERIALIZABLE transaction, and it claims the tasks by either updating the records or deleting the records after the "select ... for update". The task list can contain 0 or more records, but can return thousands of records during times of high load. I've never seen this occur in any other situation, such as a "select ... for update" with a single row. Nor a multi-row select outside of a transaction. Is it possible that in a serializable transaction with "select ... for update" that the select can be aborted before all rows are output on the server side? Basically, might it detect its resultset has already been invalidated by another transaction and stop sending records and instead send some sort of error code indicating we need to rollback?
            georg Georg Richter added a comment -

            I dont know if it fails for same reason, but it is another example where the server returns an error without error code.

            georg Georg Richter added a comment - I dont know if it fails for same reason, but it is another example where the server returns an error without error code.

            People

              georg Georg Richter
              bradh352 Brad House
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.