  1. MariaDB Server
  2. MDEV-36317

Vector search with cosine distance returns results with very low recall

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • 11.8
    • Vector search
    • None

    Description

      Vector search with cosine distance returns results with a very low recall rate.
      abs2 in FVector is an in-memory value, so when a vector is loaded from disk it needs to be initialized: abs2 = 1.0f.
      The proposed fix is as follows:

      diff --git a/sql/vector_mhnsw.cc b/sql/vector_mhnsw.cc
      index d8a63a7558c..91256a31910 100644
      --- a/sql/vector_mhnsw.cc
      +++ b/sql/vector_mhnsw.cc
      @@ -820,7 +820,7 @@ int FVectorNode::load_from_record(TABLE *graph)
         FVector *vec_ptr= FVector::align_ptr(tref() + tref_len());
         memcpy(vec_ptr->data(), v->ptr(), v->length());
         vec_ptr->postprocess(ctx->vec_len);
      -
      +  if (ctx->metric == COSINE) vec_ptr->abs2 = 1.0f;
         longlong layer= graph->field[FIELD_LAYER]->val_int();
         if (layer > 100) // 10e30 nodes at M=2, more at larger M's
           return my_errno= HA_ERR_CRASHED;
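
To illustrate why a wrong cached norm can hurt recall, here is a minimal sketch. It is not the MariaDB code. It assumes that distances are computed from cached squared norms via ||a-b||^2 = ||a||^2 + ||b||^2 - 2(a.b), and that vectors under the cosine metric are unit length, so the correct cached value is 1. The names CachedVec, load_vec, ranking_is_correct and the stale value 5.0f are hypothetical and chosen only for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vector with a cached squared norm.
// This is NOT the MariaDB FVector; names and layout are assumptions.
struct CachedVec {
  float abs2;               // cached squared norm, kept in memory only
  std::vector<float> dims;  // components as stored on disk
};

// Plain dot product of two vectors of equal length.
float dot(const CachedVec &a, const CachedVec &b) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < a.dims.size(); i++)
    sum += a.dims[i] * b.dims[i];
  return sum;
}

// Squared Euclidean distance computed from the cached norms.
float distance(const CachedVec &a, const CachedVec &b) {
  return a.abs2 + b.abs2 - 2.0f * dot(a, b);
}

// Simulates loading a unit-length vector from disk. Without the fix the
// cached norm keeps an assumed stale value; with it, abs2 is reset to 1.
CachedVec load_vec(const std::vector<float> &stored, bool apply_fix) {
  CachedVec v;
  v.abs2 = 5.0f;  // assumed stale value, for illustration only
  v.dims = stored;
  if (apply_fix)
    v.abs2 = 1.0f;
  return v;
}

// Returns true when the vector that is really closer to the query
// also gets the smaller computed distance.
bool ranking_is_correct(bool apply_fix) {
  CachedVec query{1.0f, {1.0f, 0.0f}};
  CachedVec closer = load_vec({0.8f, 0.6f}, apply_fix);  // loaded from disk
  CachedVec farther{1.0f, {0.0f, 1.0f}};                 // already in memory
  return distance(query, closer) < distance(query, farther);
}
```

In this sketch, the loaded vector is the true nearest neighbour, but with the stale norm it is ranked behind the farther one. A graph search that relies on such distances would skip good candidates, which is consistent with the low recall reported here.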
      

            People

              serg Sergei Golubchik
              myx myx
              Votes: 0
              Watchers: 3