Details

    Description

      Add a VEC_DISTANCE() function that takes two binary strings (each a vector of floats, both of the same length) and computes the Euclidean distance between the two multi-dimensional points, much like ST_DISTANCE() does.

      It will also work on the VECTOR(N) data type, once we have one.
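As a rough illustration of the computation described above, here is a minimal Python sketch. The binary layout is an assumption: the issue does not specify one, so the sketch treats each string as packed little-endian 32-bit floats. The names `pack_vector` and `vec_distance` are hypothetical helpers, not the server's implementation.

```python
import math
import struct


def pack_vector(values):
    """Encode floats as a binary string (assumed layout: little-endian 32-bit floats)."""
    return struct.pack(f"<{len(values)}f", *values)


def vec_distance(a: bytes, b: bytes) -> float:
    """Euclidean distance between two vectors stored as binary strings."""
    if len(a) != len(b) or len(a) % 4 != 0:
        raise ValueError("vectors must have the same length, in whole 4-byte floats")
    n = len(a) // 4
    xs = struct.unpack(f"<{n}f", a)
    ys = struct.unpack(f"<{n}f", b)
    # Square root of the sum of squared per-dimension differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)))


print(vec_distance(pack_vector([0.0, 0.0]), pack_vector([3.0, 4.0])))  # 5.0
```

Rejecting strings of different lengths mirrors the requirement that both vectors have the same number of dimensions.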

          Activity

            serg Sergei Golubchik created issue -
            serg Sergei Golubchik made changes -
            Field Original Value New Value
            Labels vector
            serg Sergei Golubchik made changes -
            Fix Version/s 11.5 [ 29506 ]
            serg Sergei Golubchik made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            serg Sergei Golubchik made changes -
            Status In Progress [ 3 ] In Testing [ 10301 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 11.6 [ 29515 ]
            Fix Version/s 11.5 [ 29506 ]

            piki Patrick Reynolds added a comment -

            PlanetScale intends to support three or more different distance functions, so we need a syntax for that. Our proposal is VEC_L2Distance, VEC_IPDistance, and VEC_CosineDistance. This is similar to spatial types, which have separate functions for simple, Hausdorff, and Frechet distance.

            We may in the future also want VEC_HammingDistance for one-bit vectors.
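To make the one-bit case concrete, here is a small Python sketch of a Hamming distance over bit-packed vectors. It assumes eight dimensions per byte, which the comment does not specify, and `hamming_distance` is a hypothetical name used only for illustration.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length bit-packed vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # XOR leaves a 1 in every position where the inputs differ; count those bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


print(hamming_distance(bytes([0b00001111]), bytes([0b00000000])))  # 4
```

Unlike the float-based distances, the result is an integer between 0 and the number of dimensions.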
            serg Sergei Golubchik made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Labels vector Preview_11.6 vector
            ralf.gebhardt Ralf Gebhardt made changes -
            Labels Preview_11.6 vector vector
            rdyas Robert Dyas added a comment -

            Presumably all vector distance functions return a single float between 0 and 1 with 1.0 being a perfect match.

            serg Sergei Golubchik added a comment - - edited

            No, not at all. If the distance is L2 (Euclidean distance, |x - y|), then the result is a non-negative number, where 0 means a perfect match. That's how Euclidean distance is defined.

            If the distance is cosine (1-cos(∠xy)), then 0 is a perfect match and 2 is the farthest possible. Again, that's just what a cosine distance is.
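The ranges described above can be checked with a short Python sketch. It works on plain lists of floats, and the function names `l2_distance` and `cosine_distance` are illustrative only; they are not the server's functions.

```python
import math


def l2_distance(x, y):
    """Euclidean distance |x - y|: non-negative, 0 for identical vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_distance(x, y):
    """1 - cos(angle between x and y): ranges from 0 to 2.

    Undefined for zero-length vectors (this sketch would divide by zero).
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))   # 0.0, same direction
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))   # 1.0, orthogonal
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # 2.0, opposite direction
print(l2_distance([1.0, 0.0], [-1.0, 0.0]))      # 2.0, unbounded in general
```

In both cases a smaller value means a closer match, which is the opposite of a similarity score where 1.0 would be best.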

            rdyas Robert Dyas added a comment -

            Thank you for the clarification.

            serg Sergei Golubchik made changes -
            Fix Version/s 11.7 [ 29815 ]
            Fix Version/s 11.6 [ 29515 ]

            serg Sergei Golubchik added a comment -

            As we're modelling function names after GIS, we'll use ST_DISTANCE() and ST_DISTANCE_SPHERE() as role models. That means we'll have VEC_DISTANCE_EUCLIDEAN(), VEC_DISTANCE_COSINE(), etc.

            Maybe VEC_DISTANCE() should mean Euclidean, but that would likely be too confusing.
            serg Sergei Golubchik made changes -
            Component/s Vector search [ 20205 ]
            Component/s Server [ 13907 ]
            Fix Version/s 11.7.1 [ 29913 ]
            Fix Version/s 11.7 [ 29815 ]
            Resolution Fixed [ 1 ]
            Status In Testing [ 10301 ] Closed [ 6 ]

            People

              serg Sergei Golubchik
              serg Sergei Golubchik
              Votes: 2
              Watchers: 19
