Uploaded image for project: 'MariaDB Server'
  1. MariaDB Server
  2. MDEV-22760

Bulk INSERT...ON DUPLICATE KEY UPDATE updates only a fraction of rows

    XMLWordPrintable

Details

    • Bug
    • Status: Confirmed (View Workflow)
    • Major
    • Resolution: Unresolved
    • 10.4.13, 10.4(EOL), 10.5
    • 10.5
    • Server
    • None
    • Windows 10 64-bit

    Description

      Consider the following table:

      CREATE TABLE `demo_upsert` (
          `ID` BIGINT(20) NOT NULL AUTO_INCREMENT,
          `Url` TEXT NOT NULL COLLATE 'utf8mb4_unicode_ci',
          `LastListID` BIGINT(20) NULL DEFAULT NULL,
          `Statistics` BIGINT(20) NOT NULL DEFAULT '0',
          PRIMARY KEY (`ID`) USING BTREE,
          UNIQUE INDEX `Url` (`Url`) USING HASH
      )
      COLLATE='utf8mb4_unicode_ci'
      ENGINE=InnoDB
      ;
      

      When running the following program:

      #include <cassert>
      #include <string>
      #include <random>
      #include <iostream>
      #include "mariadb/mysql.h"
      #include "Secret.h" // for database credentials
      using namespace std;
       
      string randStr(const int len, mt19937_64& rng) {
        string ans;
        for (int i = 0; i < len; i++) {
          ans += ('a' + rng() % 26);
        }
        return ans;
      }
       
      int main() {
        MYSQL mysql;
        mysql_library_init(0, nullptr, nullptr);
        mysql_thread_init();
        MYSQL* pmsRet = mysql_init(&mysql);
        assert(pmsRet);
        pmsRet = mysql_real_connect(&mysql, ConnProp::_cpHost, ConnProp::_cpUser, ConnProp::_cpPasswd,
          ConnProp::_cpDB, ConnProp::_cpPort, nullptr, CLIENT_COMPRESS | CLIENT_LOCAL_FILES | CLIENT_MULTI_STATEMENTS
          | CLIENT_MULTI_RESULTS);
        assert(pmsRet);
        const int lastListID = 10;
        mt19937_64 rng;
        for (int i = 0; i < 3; i++) {
          string query;
          query = "INSERT INTO demo_upsert (Url, LastListID) VALUES ";
          for (int j = 0; j < 15000; j++) {
            if (j != 0) {
              query += ", ";
            }
            query += "('" + randStr(3, rng) + "', " + to_string(lastListID) + ")";
          }
          query += " ON DUPLICATE KEY UPDATE LastListID=" + to_string(lastListID);
          int iRet = mysql_real_query(&mysql, query.c_str(), query.size());
          assert(iRet == 0);
          int64_t nAffected = mysql_affected_rows(&mysql);
          cout << nAffected << ' ';
          query = "UPDATE demo_upsert SET LastListID=NULL";
          iRet = mysql_real_query(&mysql, query.c_str(), query.size());
          assert(iRet == 0);
          nAffected = mysql_affected_rows(&mysql);
          cout << nAffected << endl;
        }
        return 0;
      }
      

      I get for example the following output:
      10130 10130
      11627 5814
      11620 5811

      It indicates the following:
      Initially, when the table is empty, bulk INSERT...ON DUPLICATE KEY UPDATE seems to insert correctly 10130 distinct values (out of 15000 non-distinct) because the following UPDATE query sets exactly that amount to NULL.
      However, the next bulk INSERT...ON DUPLICATE KEY UPDATE sets LastListID to non-null for only 5814 distinct rows out of ~10K expected distinct rows out of 15K non-distinct. We infer this because the following UPDATE reports affected only 5814 rows, meaning that the previous INSERT...ON DUPLICATE KEY UPDATE called UPDATE only for that number of rows.

      I cannot reproduce the problem from HeidiSQL 11.0.0.5989. I could only reproduce it when using Connector/C versions 3.1.7 and 3.1.8 (I didn't try the older versions). The problem appears for both a direct and prepared statement.

      MariaDB server versions tried: 10.4.13 and 10.4.12.

      Attachments

        Issue Links

          Activity

            People

              serg Sergei Golubchik
              rserge Serge Rogatch
              Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

                Created:
                Updated:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.