Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-13621

CONNECT engine table_type=JDBC UPDATE containing single or double quote chars produces wrong result

Details

    Description

      In testing with the Microsoft JDBC environemnt,

      INSERTing a value like the following DOES work: Bob's
      INSERTing a value like the following DOES work: size is 24"

      However, trying to UPDATE a column and specifying those values does NOT work...
      for UPDATE with a string like Bob's you get an error, and for UPDATE with size is 24" you don't get an error but it is saved as \" and the next query returns that.

      The escaping is the same for INSERT and UPDATE, but it doesn't seem to work correctly for UPDATE.

      Attachments

        Issue Links

          Activity

            Insert works differently: CONNECT make an INSERT statement (prepared when accepted by the driver) and fills the values before executing it.

            For UPDATE and DELETE, the original query is retrieved from MariaDB and edited to change the table name. The issue is that the query retrieved using the thd_query_string funtion is not identical to the original one. For instance, when executing:

            update mynumb set v = 'Fred\'s' where n = 1;
            update mynumb set v = 'size is 28"' where n = 2;
            

            The thd_query_string funtion returns:

            update mynumb set v = 'Fred\\'s' where n = 1;
            update mynumb set v = 'size is 28\"' where n = 2;
            

            I think this is a MariaDB problem (bug?), not a CONNECT one.

            Note that when connected via JDBC to a MySQL or MariaDB server, the update is done correctly, probably because the parser knows how to eliminate the spurious backslash.

            Meanwhile a turnaround seems to replace escaping by doubling the character:

            update mynumb set v = 'Fred''s' where n = 1;
            update mynumb set v = "size is 28""" where n = 2;
            

            Providing the remote server accepts it.

            bertrandop Olivier Bertrand added a comment - Insert works differently: CONNECT make an INSERT statement (prepared when accepted by the driver) and fills the values before executing it. For UPDATE and DELETE, the original query is retrieved from MariaDB and edited to change the table name. The issue is that the query retrieved using the thd_query_string funtion is not identical to the original one. For instance, when executing: update mynumb set v = 'Fred\'s' where n = 1; update mynumb set v = 'size is 28"' where n = 2; The thd_query_string funtion returns: update mynumb set v = 'Fred\\'s' where n = 1; update mynumb set v = 'size is 28\"' where n = 2; I think this is a MariaDB problem (bug?), not a CONNECT one. Note that when connected via JDBC to a MySQL or MariaDB server, the update is done correctly, probably because the parser knows how to eliminate the spurious backslash. Meanwhile a turnaround seems to replace escaping by doubling the character: update mynumb set v = 'Fred''s' where n = 1; update mynumb set v = "size is 28""" where n = 2; Providing the remote server accepts it.
            rdyas Robert Dyas added a comment -

            Given the two bug reports I've submitted where INSERT works correctly and UPDATE does not, would it be possible for UPDATE to be changed to work more like INSERT? ("CONNECT makes an INSERT statement (prepared when accepted by the driver) and fills the values before executing it.")

            It seems that if UPDATE were to bind variables using a prepared statement it would avoid multiple problems 1) this one with different servers requiring different escaping of special characters (the driver's bind would deal with this) and 2) issues with certain characters not encoding correctly (bug filed where INSERT does it right and UPDATE does it wrong with many foreign characters, see MDEV-13504).

            rdyas Robert Dyas added a comment - Given the two bug reports I've submitted where INSERT works correctly and UPDATE does not, would it be possible for UPDATE to be changed to work more like INSERT? ("CONNECT makes an INSERT statement (prepared when accepted by the driver) and fills the values before executing it.") It seems that if UPDATE were to bind variables using a prepared statement it would avoid multiple problems 1) this one with different servers requiring different escaping of special characters (the driver's bind would deal with this) and 2) issues with certain characters not encoding correctly (bug filed where INSERT does it right and UPDATE does it wrong with many foreign characters, see MDEV-13504 ).

            Yes. However, apparently, JDBC does not have the SELECT ... FOR UPDATE statement. Therefore, constructing a prepared statement for UPDATE or DELETE would require to re-parse the statement and I don't think this can be done from a storage engine. Or to make it like for INSERT from the used columns.

            The issue is that this prepared statement must also include the WHERE clause. This does not exists for INSERT. It can be retrieved by the cond_push feature but there is no garanty it is allways completely retrieved. This could lead to a somewhat destructing statement that would update more rows than wished.

            All these problems would be solved if CONNECT could get the original query unchanged.

            bertrandop Olivier Bertrand added a comment - Yes. However, apparently, JDBC does not have the SELECT ... FOR UPDATE statement. Therefore, constructing a prepared statement for UPDATE or DELETE would require to re-parse the statement and I don't think this can be done from a storage engine. Or to make it like for INSERT from the used columns. The issue is that this prepared statement must also include the WHERE clause. This does not exists for INSERT. It can be retrieved by the cond_push feature but there is no garanty it is allways completely retrieved. This could lead to a somewhat destructing statement that would update more rows than wished. All these problems would be solved if CONNECT could get the original query unchanged.
            rdyas Robert Dyas added a comment -

            RE "It can be retrieved by the cond_push feature but there is no granularity it is always completely retrieved." Can you explain more or provide an example?

            pstmt = conn.prepareStatement("UPDATE `Foreign_Table_Name` SET `col1` = ? WHERE `id` = ? ");
            pstmt.setString(1, "somevalue not escped");
            pstmt.setString(2, "some PK value");
            int rowsAffected = pstmt.executeUpdate();

            So if the UPDATE given to MariaDB has a WHERE clause specified, it would always be in cond_push correct? And if not, then that is correct as well (i.e. not specified, so applies to all rows).

            rdyas Robert Dyas added a comment - RE "It can be retrieved by the cond_push feature but there is no granularity it is always completely retrieved." Can you explain more or provide an example? pstmt = conn.prepareStatement("UPDATE `Foreign_Table_Name` SET `col1` = ? WHERE `id` = ? "); pstmt.setString(1, "somevalue not escped"); pstmt.setString(2, "some PK value"); int rowsAffected = pstmt.executeUpdate(); So if the UPDATE given to MariaDB has a WHERE clause specified, it would always be in cond_push correct? And if not, then that is correct as well (i.e. not specified, so applies to all rows).
            bertrandop Olivier Bertrand added a comment - - edited

            As it is, the cond_push function does not handle all the where clause possibilities that can be fairly complex with functions, like, subselect, and applied to other tables in case of join.

            And how to handle in a prepared statement the IN, IS NULL and IS NOT NULL clauses?

            I don't say it is impossible but it represents quite a new implementation, possibly buggy, while retrieving the unchanged original query statement is a simple fix for all this.

            bertrandop Olivier Bertrand added a comment - - edited As it is, the cond_push function does not handle all the where clause possibilities that can be fairly complex with functions, like, subselect, and applied to other tables in case of join. And how to handle in a prepared statement the IN, IS NULL and IS NOT NULL clauses? I don't say it is impossible but it represents quite a new implementation, possibly buggy, while retrieving the unchanged original query statement is a simple fix for all this.
            rdyas Robert Dyas added a comment -

            Not every variable needs to be bound in a prepared statement, you can use a hybrid approach like this:

            pstmt = conn.prepareStatement("UPDATE `foreignTable` SET col1 = ? WHERE id = 12345 ");
            pstmt.setString(1, "somevalue not escped");
            

            To bind NULL values in a SET clause, this will work:

            pstmt.setNull(1, Types.NULL);
            

            For WHERE clause I believe you do need to specify IS NULL or IS NOT NULL which is easy with hybrid approach.

            If you could make it so the SET portion of the clause is bound but the WHERE clause is as it is now, that would be a nice improvement if it didn't introduce new bugs.

            rdyas Robert Dyas added a comment - Not every variable needs to be bound in a prepared statement, you can use a hybrid approach like this: pstmt = conn.prepareStatement( "UPDATE `foreignTable` SET col1 = ? WHERE id = 12345 " ); pstmt.setString( 1 , "somevalue not escped" ); To bind NULL values in a SET clause, this will work: pstmt.setNull( 1 , Types.NULL); For WHERE clause I believe you do need to specify IS NULL or IS NOT NULL which is easy with hybrid approach. If you could make it so the SET portion of the clause is bound but the WHERE clause is as it is now, that would be a nice improvement if it didn't introduce new bugs.

            The WHERE clause can contain quotes and double quotes as well.

            bertrandop Olivier Bertrand added a comment - The WHERE clause can contain quotes and double quotes as well.
            rdyas Robert Dyas added a comment -

            Absolutely, but in our use cases most queries are primary key based matching integers, or ranges matching dates, or simple text queries (seldom quotes or extended characters). No doubt the WHERE clause is an issue, but the SET clause handling the extended chars, single and double quotes is at least 10x more important for our use cases. So a partial solution is a huge step forward.

            Is the "hybrid" approach above possible? (i.e. bind the SET variables and do WHERE like now)

            rdyas Robert Dyas added a comment - Absolutely, but in our use cases most queries are primary key based matching integers, or ranges matching dates, or simple text queries (seldom quotes or extended characters). No doubt the WHERE clause is an issue, but the SET clause handling the extended chars, single and double quotes is at least 10x more important for our use cases. So a partial solution is a huge step forward. Is the "hybrid" approach above possible? (i.e. bind the SET variables and do WHERE like now)

            Thanks to Sergei information, I found how to retrieve the original query without additional escapes.

            This is the perfect fix for all our problems.

            bertrandop Olivier Bertrand added a comment - Thanks to Sergei information, I found how to retrieve the original query without additional escapes. This is the perfect fix for all our problems.
            rdyas Robert Dyas added a comment -

            Will this work even if the original (by "original" I mean the one the user submits) query includes sub selects in the where clause that reference tables not in the remote system?

            Will the following work? (Bob's has two single quotes)

            UPDATE remoteT1 SET col1 = 'Bob''s fence is 24" high' WHERE id = (select distinct id from localT2 where a > 17)
            

            The above is not our immediate use case, but super great if supported.

            rdyas Robert Dyas added a comment - Will this work even if the original (by "original" I mean the one the user submits) query includes sub selects in the where clause that reference tables not in the remote system? Will the following work? (Bob's has two single quotes) UPDATE remoteT1 SET col1 = 'Bob' 's fence is 24" high' WHERE id = (select distinct id from localT2 where a > 17 ) The above is not our immediate use case, but super great if supported.

            Of course it won't work and reply table localT2 does not exist.

            Supporting it would require a complete change. Something like replacing the query by something like:

            SELECT ... FOR UPDATE
            

            I don't even dream of trying implementing this which, by the way, would not work in our case. Remember that the WHERE clause is parsed and must be understood locally but is applied by the remote server.

            bertrandop Olivier Bertrand added a comment - Of course it won't work and reply table localT2 does not exist. Supporting it would require a complete change. Something like replacing the query by something like: SELECT ... FOR UPDATE I don't even dream of trying implementing this which, by the way, would not work in our case. Remember that the WHERE clause is parsed and must be understood locally but is applied by the remote server.
            rdyas Robert Dyas added a comment -

            Well, the following works today (I just tested it):

            select * from MyConnectTable where id in (select distinct Customer_ID from MyLocalNonConnect)
            

            But an UPDATE statement with the same WHERE clause fails.

            In theory (I realize not how it works today) couldn't sub queries be processed by MariaDB locally (select distinct Customer_ID from MyLocalNonConnect) and the VALUES from the result be passed to the remote server? To be clear this is not a feature request - just thinking.

            rdyas Robert Dyas added a comment - Well, the following works today (I just tested it): select * from MyConnectTable where id in (select distinct Customer_ID from MyLocalNonConnect) But an UPDATE statement with the same WHERE clause fails. In theory (I realize not how it works today) couldn't sub queries be processed by MariaDB locally (select distinct Customer_ID from MyLocalNonConnect) and the VALUES from the result be passed to the remote server? To be clear this is not a feature request - just thinking.

            I am trying to keep things simple and working, even warning that UPDATE and DELETE are supported "in a simplified way".

            For instance, how could CONNECT decide whether the sub-query applies to a local or remote table?

            bertrandop Olivier Bertrand added a comment - I am trying to keep things simple and working, even warning that UPDATE and DELETE are supported "in a simplified way". For instance, how could CONNECT decide whether the sub-query applies to a local or remote table?
            rdyas Robert Dyas added a comment -

            I can reopen this discussion in another MDEV at a later date, as I don't want to slow down the fixing of the single and double quote issue itself... those are important core use cases as we can't control what a user types in, and many of our customers have product descriptions using both characters (e.g. Gold 8" long pendant from Kyle's Jewlery). And stability is most important at this stage.

            One final thought ... but since I have no idea about the internal workings of MariaDB and how queries interact with storage engines, all I can do is speculate on what a solution might look like. But one solution regarding how CONNECT could decide if sub-query is local or remote --> don't decide If you could take any subquery and pass it back up to the top MariaDB server (local table or remote), get the results as a derived table with a single column, then use those in a hard-coded query... seems like that would handle a lot of use cases (not correlated sub queries, but others).

            So this:

            UPDATE remoteTable SET col1 = 'abc' WHERE id IN (SELECT id from localTable where a > 7)
            

            takes the sub-query and passes it to mariadb top level to execute, and the results of that are used to re-write the sub-query to something like this:

            UPDATE remoteTable SET col1 = 'abc' WHERE id IN (45,27,1234,5554,67)
            

            Right now sub-query in update is purely a speculative use case for us whereas the single and double quote handling (and the Ntext datatype support) are real-life immediate needs. The rest seems to be working very well, and stability of the Java connect stuff is excellent.

            rdyas Robert Dyas added a comment - I can reopen this discussion in another MDEV at a later date, as I don't want to slow down the fixing of the single and double quote issue itself... those are important core use cases as we can't control what a user types in, and many of our customers have product descriptions using both characters (e.g. Gold 8" long pendant from Kyle's Jewlery). And stability is most important at this stage. One final thought ... but since I have no idea about the internal workings of MariaDB and how queries interact with storage engines, all I can do is speculate on what a solution might look like. But one solution regarding how CONNECT could decide if sub-query is local or remote --> don't decide If you could take any subquery and pass it back up to the top MariaDB server (local table or remote), get the results as a derived table with a single column, then use those in a hard-coded query... seems like that would handle a lot of use cases (not correlated sub queries, but others). So this: UPDATE remoteTable SET col1 = 'abc' WHERE id IN (SELECT id from localTable where a > 7 ) takes the sub-query and passes it to mariadb top level to execute, and the results of that are used to re-write the sub-query to something like this: UPDATE remoteTable SET col1 = 'abc' WHERE id IN ( 45 , 27 , 1234 , 5554 , 67 ) Right now sub-query in update is purely a speculative use case for us whereas the single and double quote handling (and the Ntext datatype support) are real-life immediate needs. The rest seems to be working very well, and stability of the Java connect stuff is excellent.

            People

              bertrandop Olivier Bertrand
              rdyas Robert Dyas
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.