Details

    Description

      For the first iteration of Vector search, we will implement HNSW algorithm.

      The implementation will only support Euclidean distance initially.

      Basic plan:
      Graph construction will be done according to HNSW paper.

      Storage wise, we'll store the graph as part of a subtable (MDEV-33404).

      The table's definition will be something along these lines:

        CREATE TABLE i (
          level int unsigned not null,
          src varbinary(255) not null,
          dst varbinary(255) not null,
          index (level,src),
          index (level,dst));
      

      For each link in the graph, there will be a corresponding entry in the table.

      • src and dst will store handler::position, a quick link to the actual vector blob in the main table.

      The index (level,src) will allow for quick jumping between nodes.
      To go deeper in search, one just needs to decrement the level and search using the same "src" value.

      If src is found on level n, then it is also found on level n - 1 and so on. Level 0 is the base level with all the nodes.

      Performance considerations:

      • Storing the vector in the subtable might be required. Looking up the blob value in the base table might be too costly.

      Attachments

        Issue Links

          Activity

            serg Sergei Golubchik created issue -
            serg Sergei Golubchik made changes -
            Field Original Value New Value
            serg Sergei Golubchik made changes -
            Description basic HNSW implementation basic HNSW implementation. Euclidean distance only.
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            cvicentiu Vicențiu Ciorbaru made changes -
            Description basic HNSW implementation. Euclidean distance only. For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            Basic plan:
            Store the graph as part of a subtable
            cvicentiu Vicențiu Ciorbaru made changes -
            Description For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            Basic plan:
            Store the graph as part of a subtable
            For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            Basic plan:
            Store the graph as part of a subtable
            cvicentiu Vicențiu Ciorbaru made changes -
            Description For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            Basic plan:
            Store the graph as part of a subtable
            For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            *Basic plan:*
            Store the graph as part of a subtable (MDEV-33404).

            The table's definition will be something along these lines:
            {code:sql}
              CREATE TABLE i (
                level int unsigned not null,
                src varbinary(255) not null,
                dst varbinary(255) not null,
                index (level,src),
                index (level,dst));
            {code}

            For each link in the graph, there will be a corresponding entry in the table.
            * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table.

            The index (level,src) will allow for quick jumping between nodes.
            To go deeper in search, one just needs to increment the level.

            If src is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes.
            cvicentiu Vicențiu Ciorbaru made changes -
            Description For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            *Basic plan:*
            Store the graph as part of a subtable (MDEV-33404).

            The table's definition will be something along these lines:
            {code:sql}
              CREATE TABLE i (
                level int unsigned not null,
                src varbinary(255) not null,
                dst varbinary(255) not null,
                index (level,src),
                index (level,dst));
            {code}

            For each link in the graph, there will be a corresponding entry in the table.
            * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table.

            The index (level,src) will allow for quick jumping between nodes.
            To go deeper in search, one just needs to increment the level.

            If src is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes.
            For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            *Basic plan:*
            Store the graph as part of a subtable (MDEV-33404).

            The table's definition will be something along these lines:
            {code:sql}
              CREATE TABLE i (
                level int unsigned not null,
                src varbinary(255) not null,
                dst varbinary(255) not null,
                index (level,src),
                index (level,dst));
            {code}

            For each link in the graph, there will be a corresponding entry in the table.
            * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table.

            The index (level,src) will allow for quick jumping between nodes.
            To go deeper in search, one just needs to increment the level.

            If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes.
            cvicentiu Vicențiu Ciorbaru made changes -
            Description For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            *Basic plan:*
            Store the graph as part of a subtable (MDEV-33404).

            The table's definition will be something along these lines:
            {code:sql}
              CREATE TABLE i (
                level int unsigned not null,
                src varbinary(255) not null,
                dst varbinary(255) not null,
                index (level,src),
                index (level,dst));
            {code}

            For each link in the graph, there will be a corresponding entry in the table.
            * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table.

            The index (level,src) will allow for quick jumping between nodes.
            To go deeper in search, one just needs to increment the level.

            If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes.
            For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            *Basic plan:*
            Store the graph as part of a subtable (MDEV-33404).

            The table's definition will be something along these lines:
            {code:sql}
              CREATE TABLE i (
                level int unsigned not null,
                src varbinary(255) not null,
                dst varbinary(255) not null,
                index (level,src),
                index (level,dst));
            {code}

            For each link in the graph, there will be a corresponding entry in the table.
            * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table.

            The index {{(level,src)}} will allow for quick jumping between nodes.
            To go deeper in search, one just needs to decrement the level and search using the same "src" value.

            If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes.
            cvicentiu Vicențiu Ciorbaru made changes -
            Description For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            *Basic plan:*
            Store the graph as part of a subtable (MDEV-33404).

            The table's definition will be something along these lines:
            {code:sql}
              CREATE TABLE i (
                level int unsigned not null,
                src varbinary(255) not null,
                dst varbinary(255) not null,
                index (level,src),
                index (level,dst));
            {code}

            For each link in the graph, there will be a corresponding entry in the table.
            * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table.

            The index {{(level,src)}} will allow for quick jumping between nodes.
            To go deeper in search, one just needs to decrement the level and search using the same "src" value.

            If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes.
            For the first iteration of Vector search, we will implement HNSW algorithm.

            The implementation will only support Euclidean distance initially.

            *Basic plan:*
            Graph construction will be done according to HNSW paper.

            Storage wise, we'll store the graph as part of a subtable (MDEV-33404).

            The table's definition will be something along these lines:
            {code:sql}
              CREATE TABLE i (
                level int unsigned not null,
                src varbinary(255) not null,
                dst varbinary(255) not null,
                index (level,src),
                index (level,dst));
            {code}

            For each link in the graph, there will be a corresponding entry in the table.
            * {{src}} and {{dst}} will store {{handler::position}}, a quick link to the actual vector blob in the main table.

            The index {{(level,src)}} will allow for quick jumping between nodes.
            To go deeper in search, one just needs to decrement the level and search using the same "src" value.

            If {{src}} is found on level {{n}}, then it is also found on level {{n - 1}} and so on. Level {{0}} is the base level with all the nodes.

            *Performance considerations:*
            * Storing the vector in the subtable might be required. Looking up the blob value in the base table might be too costly.
            cvicentiu Vicențiu Ciorbaru made changes -
            Assignee Vicențiu Ciorbaru [ cvicentiu ]
            cvicentiu Vicențiu Ciorbaru made changes -
            Fix Version/s 11.5 [ 29506 ]
            cvicentiu Vicențiu Ciorbaru made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 11.6 [ 29515 ]
            Fix Version/s 11.5 [ 29506 ]
            serg Sergei Golubchik made changes -
            Priority Major [ 3 ] Critical [ 2 ]
            serg Sergei Golubchik made changes -
            Issue Type Task [ 3 ] New Feature [ 2 ]
            serg Sergei Golubchik made changes -
            Status In Progress [ 3 ] In Testing [ 10301 ]
            ralf.gebhardt Ralf Gebhardt made changes -
            Labels vector
            serg Sergei Golubchik made changes -
            Fix Version/s 11.7 [ 29815 ]
            Fix Version/s 11.6 [ 29515 ]
            serg Sergei Golubchik made changes -
            serg Sergei Golubchik made changes -
            Labels vector
            serg Sergei Golubchik made changes -
            Component/s Vector search [ 20205 ]
            serg Sergei Golubchik made changes -
            Fix Version/s 11.7.1 [ 29913 ]
            Fix Version/s 11.7 [ 29815 ]
            Resolution Fixed [ 1 ]
            Status In Testing [ 10301 ] Closed [ 6 ]
            myx myx made changes -
            myx myx made changes -

            People

              cvicentiu Vicențiu Ciorbaru
              serg Sergei Golubchik
              Votes:
              3 Vote for this issue
              Watchers:
              12 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Git Integration

                  Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.