MariaDB Server / MDEV-38113

Vector select is not always returning the same result set

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 11.8.5
    • Fix Version/s: None
    • Component/s: Vector search
    • Environment:
      Distributor ID: Ubuntu
      Description: Ubuntu 22.04.5 LTS
      Release: 22.04
      Codename: jammy

    Description

      The number of rows returned by the query below is not always the same:

      SELECT run_errorString_embeddings.run_id, VEC_DISTANCE_EUCLIDEAN(embedding, VEC_FromText(@myvec)) as distance FROM run_errorString_embeddings ORDER BY distance limit 1000;
      

      Running this query multiple times gives the following results:

      495 rows in set (0.219 sec)
      1000 rows in set (0.509 sec)
      479 rows in set (0.252 sec)
      501 rows in set (0.263 sec)
      1000 rows in set (0.476 sec)
      481 rows in set (0.244 sec)
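
      Since some runs return 1000 rows, the table holds at least 1000 rows, so every run would be expected to return exactly 1000. A minimal sketch for checking this without scrolling through the result set, assuming @myvec is still set to the same vector text in the session; the aliases returned_rows, total_rows and nearest are illustrative only. Whether the optimizer picks the same plan when the query is wrapped in a derived table has not been confirmed here.

      ```sql
      -- Number of rows the vector search actually returns for LIMIT 1000.
      -- Assumes @myvec holds the same vector text as in the query above.
      SELECT COUNT(*) AS returned_rows
      FROM (
        SELECT run_id,
               VEC_DISTANCE_EUCLIDEAN(embedding, VEC_FromText(@myvec)) AS distance
        FROM run_errorString_embeddings
        ORDER BY distance
        LIMIT 1000
      ) AS nearest;

      -- Total number of rows in the table, for comparison.
      SELECT COUNT(*) AS total_rows FROM run_errorString_embeddings;
      ```

      If returned_rows varies between runs while total_rows stays at or above 1000, that reproduces the behaviour described here.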
      

      The table is defined as follows:

      CREATE TABLE `run_errorString_embeddings` (
      	`run_id` INT(10) UNSIGNED NOT NULL,
      	`attributes` LONGTEXT NOT NULL COLLATE 'utf8mb4_bin',
      	`embedding` VECTOR(1024) NOT NULL,
      	`vector_added` TINYINT(1) NOT NULL DEFAULT '0',
      	PRIMARY KEY (`run_id`) USING BTREE,
      	VECTOR INDEX `vect_embeddings` (`embedding`),
      	CONSTRAINT `attributes` CHECK (json_valid(`attributes`))
      )
      COLLATE='utf8mb4_bin'
      ENGINE=InnoDB
      ;
      

          People

            Assignee: Sergei Golubchik (serg)
            Reporter: Maikel Punie
            Votes: 1
            Watchers: 2