  1. MariaDB Server
  2. MDEV-34356

a helper to configure vector search

Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 11.8
    • None
    • None

    Description

      Vector search has a set of configurable parameters. Some of them affect the speed of the search and the recall. Others affect the speed of inserting and the recall. Some affect memory and storage consumption, others don't. Usually, the longer an insert (or a search) takes, the higher the recall is, so there's always a tradeoff. There is also a second tradeoff: at the same recall, one can make the search slower and the inserting faster, or vice versa, the inserting slower and the search faster. The optimal parameter values depend on the actual data and are practically impossible to guess. We need a way to help users tune their vector search parameters.

      Luckily, it's rather easy to do, conceptually, even if very slow. The user needs to load the data (the larger the sample, the better) and collect actual queries, that is, query vectors (again, the more the better). After that, the tool will perform the search using the provided queries on the provided data without using an index, to obtain the exact set of nearest neighbors. It could then tune the index for the optimal recall and speed, as directed by the user.
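
To make the procedure above concrete, here is a minimal sketch in Python of the three steps: an exact, index-free scan to get the true nearest neighbors, a recall measurement against an approximate result, and a choice of the fastest configuration that meets a user-supplied recall target. Everything in it is an assumption for illustration only: the function names, the Euclidean distance, and the "effort" setting are hypothetical and do not refer to actual MariaDB parameters or to how the tool would be implemented.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_neighbors(data, query, k):
    """Index-free full scan: ids of the k rows of `data` closest to `query`."""
    ranked = sorted(range(len(data)), key=lambda i: euclidean(data[i], query))
    return ranked[:k]


def recall_at_k(exact_ids, approx_ids):
    """Fraction of the exact neighbors that the approximate search also returned."""
    if not exact_ids:
        return 1.0
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)


def pick_config(measurements, min_recall):
    """Pick the fastest configuration whose measured recall meets the target.

    `measurements` is a list of (config, mean_recall, mean_search_seconds)
    tuples; returns None if no configuration reaches `min_recall`.
    """
    good_enough = [m for m in measurements if m[1] >= min_recall]
    if not good_enough:
        return None
    return min(good_enough, key=lambda m: m[2])[0]


# Toy data: five 2-d vectors and one query vector.
data = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5]]
query = [0.2, 0.1]

truth = exact_neighbors(data, query, k=3)

# Pretend an indexed search returned these ids (hypothetical result).
approx = [0, 1, 3]
recall = recall_at_k(truth, approx)

# Hypothetical measurements for three values of a made-up "effort" setting.
measurements = [
    ({"effort": 10}, 0.80, 0.001),
    ({"effort": 40}, 0.95, 0.004),
    ({"effort": 160}, 0.99, 0.015),
]
best = pick_config(measurements, min_recall=0.9)
```

A real helper would run the full scan and the indexed search inside the server and average recall and timing over all collected query vectors; the sketch only shows the shape of the comparison.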

      This could be a separate tool, a set of stored routines, or a mix of both.

            People

              serg Sergei Golubchik
              serg Sergei Golubchik
              Votes: 0
              Watchers: 3
