Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-19237

Skip sending metadata when possible for binary protocol

Details

    Description

      This task goal is to permit not sending metadata every time

      What MySQL has done in version 8

      MySQL has changed the protocol to permit skipping metadata for version 8.
      see https://dev.mysql.com/doc/refman/8.0/en/c-api-optional-metadata.html
      and https://dev.mysql.com/worklog/task/?id=8134

      In description MySQL task indicate some big improvement:

      • METADATA_FULL : 3.48w TPS, Net send 113M
      • METADATA_IGNORE: 13.8w TPS, Net send 30M

      MySQL implementation send metadata (or not) according to resultset_metadata variable.
      Associate capability is CLIENT_OPTIONAL_RESULTSET_METADATA (1UL << 25)

      When resultset_metadata = 0 (NONE) no metadata is send for COM_STMT_PREPARE/COM_STMT_EXECUTE/COM_QUERY

      Exchange format change

      Column count packet

      There is a "new metadata_follows flag"

      * int<lenenc> column count
      * int<1> metadata_follows flag (0x00: No metadata, 0x01: metadata follows)
      

      When metadata_follows = 0x00, protocol skip column informations packet (continuing by an EOF_packet if CLIENT_DEPRECATE_EOF capability is not set or rows if CLIENT_DEPRECATE_EOF is set).
      be careful not to rely on mysql documentation that indicate them in a wrong order

      COM_PREPARE response

      COM_STMT_PREPARE_OK:

      * int<1> 0x00 COM_STMT_PREPARE_OK header
      * int<4> statement id
      * int<2> number of columns in the returned result set (or 0 if statement does not return result set)
      * int<2> number of prepared statement parameters ('?' placeholders)
      * string<1> -not used-
      * int<2> number of warnings
      * if capabilities & CLIENT_OPTIONAL_RESULTSET_METADATA
      **  int<1> metadata_follows flag (0x00: No metadata, 0x01: metadata follows)
      }
      

      if metadata is set to 0x00, column definition packet that decribe each parameter and each column are skipped.

      Requirement:

      Connectors need metadata to parse resultset.
      Proposal is to offer this functionality, but in a more modular way.

      What do connectors need:

      For first execute of a prepared statement:

      For next execute:

      There are two issues with not sending metadata each time:

      • Metadata can change over time. Server already automatically reprepares internally, so it knows to resend the metadata even if the client asked not to send the metadata every time.
      • commands like "SELECT ?" will have metadata type changing according to parameter. Metadata must then be send each time.

      COM_STMT_PREPARE still need to have the possibility to ask for metadata (at least java need that possibility to have metadata (example) without any COM_STMT_EXECUTE execution)

      For production environment where DDL never change an option can be set on client level to use that same functionality for COM_QUERY.
      Client can then cache metadata and ask for COM_QUERY without metadata.

      MySQL doesn't permits modularity, so client request have to add this information.
      That can be either by:

      • create new commands COM_STMT_PREPARE_WITHOUT_META, COM_STMT_EXECUTE_WITHOUT_META and COM_QUERY_WITHOUT_META
      • changing existing COM_STMT_PREPARE, COM_STMT_EXECUTE and COM_QUERY format.

      Attachments

        Issue Links

          Activity

            wlad Vladislav Vaintroub added a comment - - edited

            I think EOF packet is not needed after column count 0xFC0000. It is not easy to imagine any of the existing clients would be able to cope with 0 metadata packets

            wlad Vladislav Vaintroub added a comment - - edited I think EOF packet is not needed after column count 0xFC0000. It is not easy to imagine any of the existing clients would be able to cope with 0 metadata packets
            serg Sergei Golubchik added a comment - - edited

            An approach that won't need need new COM commands and no application side changes could be:

            • send metadata on COM_STMT_PREPARE
            • send metadata on COM_STMT_EXECUTE iff it has changes since the last time metadata was sent for this prepared statement.

            Practically it'll mean that metadata will only be sent on COM_STMT_PREPARE.

            Connector will need to cache the metadata and update the cache every time metadata is received.

            Note: multiple result sets. A connector should be able to cache multiple metadata sets per one prepared statement.


            This approach will work automatically for any application as soon as the connector and the server are upgraded. One can reduce the transferred data even more with application-aware changes. Meaning, application tells the server "don't send any metadata, I know it already or don't care". I would suggest to move this application-aware optimization into a separate MDEV.

            serg Sergei Golubchik added a comment - - edited An approach that won't need need new COM commands and no application side changes could be: send metadata on COM_STMT_PREPARE send metadata on COM_STMT_EXECUTE iff it has changes since the last time metadata was sent for this prepared statement. Practically it'll mean that metadata will only be sent on COM_STMT_PREPARE. Connector will need to cache the metadata and update the cache every time metadata is received. Note: multiple result sets. A connector should be able to cache multiple metadata sets per one prepared statement. This approach will work automatically for any application as soon as the connector and the server are upgraded. One can reduce the transferred data even more with application-aware changes. Meaning, application tells the server "don't send any metadata, I know it already or don't care". I would suggest to move this application-aware optimization into a separate MDEV.
            wlad Vladislav Vaintroub added a comment - - edited

            I think with multiple result sets, it is rather peculiar.

            A prepared statement can produce multiple result sets. It happens (only) if it is a prepared CALL,
            which has SELECTS (SHOW, or CHECKSUM TABLE), either normal , or SQL-level PREPARED or EXECUTE IMMEDIATE. The user does not know how many result sets will come from such CALL. The number of result sets can change from one execution to another. The user does not get result set metadata of the internals SELECTs as response from COM_STMT_PREPARE, and I guess, these result sets metadata can't really be cached.

            However CALL can also have a single designated result set, which might be OK to cache on the client - output parameters set.

            With this, I think client should only be able to cache a single result set, and the server can have at most one "skip-metadata" result set per COM_STMT_EXECUTE.

            Note : semicolon-batch can't be prepared, and the protocol has no provision to report multiple results sets during COM_STMT_PREPARE.

            wlad Vladislav Vaintroub added a comment - - edited I think with multiple result sets, it is rather peculiar. A prepared statement can produce multiple result sets. It happens (only) if it is a prepared CALL, which has SELECTS (SHOW, or CHECKSUM TABLE), either normal , or SQL-level PREPARED or EXECUTE IMMEDIATE. The user does not know how many result sets will come from such CALL. The number of result sets can change from one execution to another. The user does not get result set metadata of the internals SELECTs as response from COM_STMT_PREPARE, and I guess, these result sets metadata can't really be cached. However CALL can also have a single designated result set, which might be OK to cache on the client - output parameters set. With this, I think client should only be able to cache a single result set, and the server can have at most one "skip-metadata" result set per COM_STMT_EXECUTE. Note : semicolon-batch can't be prepared, and the protocol has no provision to report multiple results sets during COM_STMT_PREPARE.
            wlad Vladislav Vaintroub added a comment - - edited

            sanja, could you please review ?
            You will notice constant MDEV_23913_FIXED which is currently set to 0, and this is about MDEV-23913, where metadata changes from prepare to execute, and that was not expected to change.

            Currently the detection of change is always based checksumming fields to be sent, but when MDEV-23913 is fixed, and MDEV_23913_FIXED is removed, it will be more efficient. Then change of the metadata will be almost always derived from whether reprepare ran, and fallbacks to checksumming will only be in degenerate cases, like SELECT ? or SELECT @user_var, or SELECT GREATEST(@a,@b) basically, when the type or the length/decimals etc of the result depends on user variables or a parameter

            wlad Vladislav Vaintroub added a comment - - edited sanja , could you please review ? You will notice constant MDEV_23913_FIXED which is currently set to 0, and this is about MDEV-23913 , where metadata changes from prepare to execute, and that was not expected to change. Currently the detection of change is always based checksumming fields to be sent, but when MDEV-23913 is fixed, and MDEV_23913_FIXED is removed, it will be more efficient. Then change of the metadata will be almost always derived from whether reprepare ran, and fallbacks to checksumming will only be in degenerate cases, like SELECT ? or SELECT @user_var, or SELECT GREATEST(@a,@b) basically, when the type or the length/decimals etc of the result depends on user variables or a parameter

            Address Review comments, add vertical space and doxygenize function comments

            wlad Vladislav Vaintroub added a comment - Address Review comments, add vertical space and doxygenize function comments

            People

              wlad Vladislav Vaintroub
              diego dupin Diego Dupin
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.