  1. MariaDB Server
  2. MDEV-36338

vector search with Cosine Distance is slow

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Duplicate
    • None
    • N/A
    • Vector search
    • None

    Description

      Running ann-benchmarks with the nytimes-256-angular dataset is slow.
      A CPU perf profile is attached.
      A pstack capture is also attached; it shows that the PatternedSimdBloomFilter size is too large.

      Attachments

        1. m1.jpg (1.38 MB)
        2. m2.jpg (474 kB)

          Activity

            myx myx added a comment (edited)

            We should reset ef_power to 0.6 in MHNSW_Share::acquire:

            diff --git a/sql/vector_mhnsw.cc b/sql/vector_mhnsw.cc
            index d8a63a7558c..14eec69bbd2 100644
            --- a/sql/vector_mhnsw.cc
            +++ b/sql/vector_mhnsw.cc
            @@ -715,7 +715,7 @@ int MHNSW_Share::acquire(MHNSW_Share **ctx, TABLE *table, bool for_update)
                 if (table->file->has_transactions())
                   mysql_rwlock_rdlock(&(*ctx)->commit_lock);
               }
            -
            +  (*ctx)->ef_power = 0.6;
               if ((*ctx)->start)
                 return 0;
            

            myx myx added a comment

            ctx->diameter also needs to be reset in MHNSW_Share::acquire.

            serg Sergei Golubchik added a comment

            ctx->diameter must not be reset; it is the largest distance between any two vectors in the index.
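To illustrate why such a value is a property of the index rather than of a single query, here is a minimal sketch. It assumes the diameter is tracked as a running maximum of observed cosine distances; `IndexStats` and `observe` are hypothetical names for illustration and are not taken from sql/vector_mhnsw.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine distance: 1 - cos(angle between a and b).
// Assumes both vectors are non-zero and have the same dimension.
double cosine_distance(const std::vector<double>& a,
                       const std::vector<double>& b) {
    assert(a.size() == b.size());
    double dot = 0.0, norm_a = 0.0, norm_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    return 1.0 - dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// Hypothetical stand-in for per-index shared state (not MHNSW_Share).
struct IndexStats {
    double diameter = 0.0;  // largest distance observed so far

    // The value only ever grows as more vector pairs are compared.
    void observe(double distance) {
        diameter = std::max(diameter, distance);
    }
};
```

Under this assumption, resetting the value whenever the shared state is acquired would discard what earlier comparisons learned about the data, and the next query would start from an underestimate.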

            The bloom filter size is indeed far from optimal. It will be fixed in MDEV-35897.
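For context on what "optimal" usually means here, the sketch below applies the textbook sizing formula for a classic bloom filter: m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. This is an assumption for illustration only. PatternedSimdBloomFilter is a different filter variant and may be sized differently, and the function names below are hypothetical; the actual change is tracked in MDEV-35897.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Bits needed by a classic bloom filter holding `n` items with a target
// false-positive probability `p` (0 < p < 1).
std::size_t bloom_bits(std::size_t n, double p) {
    assert(n > 0 && p > 0.0 && p < 1.0);
    const double ln2 = std::log(2.0);
    return static_cast<std::size_t>(
        std::ceil(-static_cast<double>(n) * std::log(p) / (ln2 * ln2)));
}

// Number of hash functions that minimizes the false-positive rate
// for a filter of `bits` bits holding `n` items.
std::size_t bloom_hashes(std::size_t bits, std::size_t n) {
    assert(n > 0);
    const double k = static_cast<double>(bits) / static_cast<double>(n)
                     * std::log(2.0);
    return static_cast<std::size_t>(std::max(1.0, std::round(k)));
}
```

For example, 1000 expected items at a 1% false-positive rate need roughly 9.6 bits per item. A filter sized for far more items than a search actually visits costs memory and cache misses without improving accuracy.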


            People

              Unassigned Unassigned
              myx myx