[MXS-3624] Use indexed virtual columns if such exist Created: 2021-06-18  Updated: 2023-12-15

Status: Open
Project: MariaDB MaxScale
Component/s: nosqlprotocol
Affects Version/s: 6
Fix Version/s: Icebox

Type: New Feature Priority: Major
Reporter: Johan Wikman Assignee: Joe Cotellese
Resolution: Unresolved Votes: 0
Labels: None

Issue Links:
Blocks
is blocked by MDEV-6017 Add support for Indexes on Expressions In Review
Relates
relates to MXS-4801 Add manual support for indexes Open
Epic Link: MongoDB Protocol Support

 Description   

The JSON documents are stored in a table containing one "real" column doc whose type is JSON and one virtual column id.

Now, it is possible to add additional virtual columns to the table. Suppose, for instance, that the table will contain information about cars and that documents often will be searched based on the car make. Then it would be possible to create a virtual column and an index on it:

ALTER TABLE cars ADD make VARCHAR(32) AS (JSON_VALUE(doc, '$.make'));
CREATE INDEX cars_make_ix ON cars(make);

However, that index is only used, if the virtual column is explicitly referred to as in

SELECT * FROM cars WHERE make = 'Toyota'

but not if you use the JSON function expression that defines the column

SELECT * FROM cars WHERE JSON_VALUE(doc, '$.make') = 'Toyota'

If nosqlprotocol is aware of what indexes have been created on a particular table, then it could, while converting a MongoDB find document into the equivalent SQL, explicitly refer to existing (indexed) virtual columns, instead of using the JSON function expressions.

That way, the benefit of indexes could be brought to users of nosqlprotocol before the server optimizer is extended to deal with this situation.



 Comments   
Comment by Johan Wikman [ 2022-06-08 ]

This should not be done, as there is MDEV-6017 that provides what is needed. A MaxScale based solution would be inferior in many respects.

Comment by Johan Wikman [ 2023-09-26 ]

Here are some numbers that show how useful this will be. In a test related to the internal caching an index was created manually and NoSQL was tweaked to use it.

The setup was to have some 10000 documents of the following kind

    {
      "Year": 2020,
      "Make": "Audi",
      "Model": "Q3",
      "Category": "SUV",
      "createdAt": "2020-01-27T20:44:17.665Z",
      "updatedAt": "2020-01-27T20:44:17.665Z",
      "_id": 1
    },

and then to delete all cars whose Make is Toyota (there's 579 of them) using a query like

db.runCommand({delete: "cars", deletes: [{q: { Make: "Toyota" }, limit: 0}]});

Without an index that took roughly 120ms but with an index on Make only 20ms. It's quite clear (obviously) that indexes will make NoSQL faster and MDEV-6017 provides a straightforward way to support them.

Generated at Thu Feb 08 04:22:47 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.