[MDEV-22308] Feature request: return slices of original tables for join queries - Jira

Details

Type: Task
Status: Open (View Workflow)
Priority: Minor
Resolution: Unresolved
Fix Version/s: None
Component/s: None
Labels:
- connect-engine
- json
- sql

Description

TLDR: optionally return multiple result tables to any query that joins tables, each being the relevant slice of the joined tables, so that the client can use the relations naturally while displaying, editing and then returning it to the database.

To me - and I guess many of you around here - relational databases are the ideal structure to store and query data from multiple perspectives. My assesment is that the growing dominance of NoSQL solutions is not fuelled by them being a better alternative, but by their preceived simplicity - at least at first sight. Isn’t it simpler to just save “movie” objects to a database, than mingle with multiple tables and then joining them just to get your data back? Well, it is, until you want to look at your data from a different perspecive, say female directors of european movies... that’s when you start apreciating the elegance of SQL.

This feature proposal is aimed at making relational databases better suited to modern development enviroments and fedarated data storage by sticking to to their data model, but making it’s usage more flexible and efficient, thus more desirable. Think rich client side frameworks like Vue.js and the Connect engine in MariaDB.

Example: simple database of movies, with the tables below, with movies and people joined to designate actors in movies, and movie status, a person’s sex and their role (director, actor/actress, cinematographer, effects, best boy, etc.) expanded by the dictionary table. Interactive schema at: https://catch23.co/spg-movies/

movie (id, title, status)
person (id, name, sex)
movie_person (mid, pid, role)
dictionary (kind, id, name)

Imagine a web client which needs to edit a few movies along with their cast. Wether using REST with multiple requests to various endpoints or GraphQL returning a hierachical object meant to be in the ideal format for the client to display you’ll have a server side component running either multiple SQL queries to get the data or a single one with all tables joined, and then transformed into the required format. Then you’ll have the web client mingle with that hierarchical format while editing it, and returning it, just to be dissected by a server side component and transformed back to relational tables. Suboptimal, when clients are perfectly capable of using the data as is in its original relational form.

Proposal

Have an option for SELECT queries to make them return multiple result tables instead of one with the data joined together and with values repeated. So if you join together a few movies of a given director, along with all actors and the relevant dictionary items, you would not have a movie’s id and title or dictionary items like “director” repeated for each cast member.

SELECT
*
FROM
movie m, person p, movie_person mp
WHERE
m.id = mp.mid AND mp.pid = p.id
AND m.title in (“The Unbelievable Truth (1989)”, “Trust (1990)”, “Amateur (1994)”)

Advatages

1: Client applications could be much more flexible in displaying or using the results set in different ways. A simple re-sorting for example would not require another query to be run.

2: Queries from multiply tables could still be updateable easily from client apps if they select the keys from all linked tables too. With a simple SELECT if you find an error in a name's spelling you have to form a different query to update that person's name. With the other approach you can provide update functioality automatically in client app development components.

3: Communication is more compact too. A simple SELECT has to transfer each movies' title castmember-count times, with M-SELECT only once. guess it would ease the load on the server too: less wainting for the net interface to eat all those repeated titles. This would be especially helpful with expternat data sources connected with the Connect engine if the data source doesn’t have the indexing or caching capabilities of an SQL database, thus running multiple queries is expensive.

4: Stored procedures could return more tables too, and this is really helpful in some situations.

Implementation

I presume the MySQL protocol already has the facility to return multiple datasets for a single query string as it happens when you use the multi query capability and join together a few queries with semicolons. Perhaps it would need an extension to make it more efficient to return records destined for diferent datasets intertwined.

The query plan would be the same but as it’s executed data from different source tables would be pushed into separate destination tables. There would be no column name collisions.

Data sources behind the Connect engine could be more efficient if they would get the whole request, instead of getting multiple queries going at the same elements. Let’s say an LDAP backend containing department logos could return the users and the logos of their departments in one go, basedon one LDAP request without repeating the logos or without the need to run multiple queries to avoid this.

PS: this idea is also is a work item for MySQL suggested by me a few years ago: https://dev.mysql.com/worklog/task/?id=358

Attachments

Activity

People

Assignee:: Unassigned

Reporter:: Andris Kardos

Votes:: 0 Vote for this issue

Watchers:: 1 Start watching this issue

Dates

Created:: 2020-04-19 20:54

Updated:: 2020-09-13 09:17

Git Integration

Error rendering 'com.xiplink.jira.git.jira_git_plugin:git-issue-webpanel'. Please contact your Jira administrators.

MariaDB Server