  1. MariaDB Server
  2. MDEV-36317

Vector search with cosine distance returns results with very low recall

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: 11.8
    • Component/s: Vector search
    • Labels: None

    Description

      Vector search with cosine distance returns results with a very low recall rate.
      abs2 in FVector is an in-memory value, so when a vector is loaded from disk it needs to be initialized to 1.0f.
      The proposed fix is as follows:

      diff --git a/sql/vector_mhnsw.cc b/sql/vector_mhnsw.cc
      index d8a63a7558c..91256a31910 100644
      --- a/sql/vector_mhnsw.cc
      +++ b/sql/vector_mhnsw.cc
      @@ -820,7 +820,7 @@ int FVectorNode::load_from_record(TABLE *graph)
         FVector *vec_ptr= FVector::align_ptr(tref() + tref_len());
         memcpy(vec_ptr->data(), v->ptr(), v->length());
         vec_ptr->postprocess(ctx->vec_len);
      -
      +  if (ctx->metric == COSINE) vec_ptr->abs2 = 1.0f;
         longlong layer= graph->field[FIELD_LAYER]->val_int();
         if (layer > 100) // 10e30 nodes at M=2, more at larger M's
           return my_errno= HA_ERR_CRASHED;
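
The reasoning behind the patch can be illustrated with a small standalone sketch. It assumes that distances are computed from cached squared norms as |a|^2 + |b|^2 - 2*(a.b), which is a common technique; the report does not show how vector_mhnsw.cc actually uses abs2. The names CachedVec, load and distance are hypothetical and are not MariaDB's code. For unit-length vectors (as used for cosine distance) the cached norm must be 1, and a stale value after loading reorders the neighbors.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an in-memory vector node; NOT MariaDB's FVector.
struct CachedVec {
  std::vector<float> data;  // components, as they would be read from disk
  float abs2;               // cached squared norm, kept in memory only
};

static float dot(const std::vector<float> &a, const std::vector<float> &b) {
  float s = 0.0f;
  for (std::size_t i = 0; i < a.size(); i++) s += a[i] * b[i];
  return s;
}

// Scale a vector to unit length, as one would for cosine distance.
static std::vector<float> normalized(std::vector<float> v) {
  const float n = std::sqrt(dot(v, v));
  for (float &x : v) x /= n;
  return v;
}

// Simulate loading a node: the components are copied, but the cached norm
// keeps whatever value `stale` holds unless `fix_abs2` resets it to 1.
static CachedVec load(const std::vector<float> &unit, bool fix_abs2, float stale) {
  CachedVec v{unit, stale};
  if (fix_abs2) v.abs2 = 1.0f;  // assumption: mirrors the idea of the patch
  return v;
}

// Squared Euclidean distance from cached norms: |a|^2 + |b|^2 - 2*(a.b).
static float distance(const CachedVec &a, const CachedVec &b) {
  return a.abs2 + b.abs2 - 2.0f * dot(a.data, b.data);
}
```

With a query along (1, 0), a vector near (0.9, 0.1) ranks closer than (0, 1) when abs2 is reset on load. If the near vector is loaded with a stale abs2 such as 5.0 and the far one with 0.0, the order flips, which is the kind of ranking error that would lower recall.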
      

          Activity

            myx myx created issue -
            myx myx made changes -
            Field Original Value New Value
            alice Alice Sherepa made changes -
            Assignee: (none) -> Sergei Golubchik [ serg ]
            serg Sergei Golubchik made changes -
            Description: code block markup changed from ``` to {code:diff}; wording unchanged (see Description above)
            serg Sergei Golubchik made changes -
            Status: Open [ 1 ] -> Needs Feedback [ 10501 ]
            serg Sergei Golubchik made changes -
            Priority: Minor [ 4 ] -> Major [ 3 ]
            serg Sergei Golubchik made changes -
            Fix Version/s: (none) -> 11.8 [ 29921 ]
            serg Sergei Golubchik made changes -
            Status: Needs Feedback [ 10501 ] -> Open [ 1 ]
            serg Sergei Golubchik made changes -
            Component/s: (none) -> Vector search [ 20205 ]
            serg Sergei Golubchik made changes -

            People

              Assignee: Sergei Golubchik (serg)
              Reporter: myx
              Votes: 0
              Watchers: 3

              Dates

                Created:
                Updated:
