[MDEV-24001] Implement hidden PK for RBR of no-unique key table Created: 2020-10-21 Updated: 2023-12-22 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Replication |
| Fix Version/s: | 11.5 |
| Type: | Task | Priority: | Major |
| Reporter: | Andrei Elkin | Assignee: | Andrei Elkin |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | beginner-friendly | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||
| Description |
|
Performance of row event applying on no-unique key table is inferior and degrades |
| Comments |
| Comment by Sergei Golubchik [ 2020-11-06 ] | ||||||
|
what will be a user interface for that? | ||||||
| Comment by Marko Mäkelä [ 2021-01-04 ] | ||||||
|
I remember first hearing of this problem in 2012 from a cloud service provider. I understood that they had an internal patch for this. This ticket was filed on my request. In the discussion thread, I had written my suggestion for the interface:
The above is not a user interface, which serg asked about. I think that we could introduce an SQL_mode for disabling this feature on DDL time, if someone really wants to create or alter a table that does not have any primary key, not even a hidden one. I note that we missed the opportunity to introduce such modes earlier, for example in | ||||||
| Comment by Andrei Elkin [ 2021-01-05 ] | ||||||
|
marko, sachin.setiya.007: to the user interfaces part, maybe something like | ||||||
| Comment by Marko Mäkelä [ 2021-01-05 ] | ||||||
|
Elkin, I do not think that storage engines should have any say in whether the SQL layer considers something to be a hidden column. A hidden (virtual) column was already implemented in As far as I understand, O(log n) operations can become O(n) or worse even when using a heap-organized table, such as ENGINE=MyISAM. InnoDB is internally creating a hidden primary key (GEN_CLUST_INDEX(DB_ROW_ID)) that is almost entirely hidden from the SQL layer. Such InnoDB-internal hidden primary key would never be created if the SQL layer automatically created a hidden primary key (in the .frm file and in TABLE_SHARE). I think that it could be feasible to unconditionally and automatically create such a hidden primary key. But, I have no idea how this would affect cross-version replication (replicating from a table that lacks a primary key to one that has a hidden primary key in TABLE_SHARE, or vice versa). For that, we might want to have some configuration parameter. | ||||||
| Comment by Marko Mäkelä [ 2021-01-23 ] | ||||||
|
A blog post that appeared soon after the MySQL 8.0.23 release suggests that one could explicitly create a hidden primary key:
I think that we should aim for something fool-proof, automatically creating such hidden primary keys based on a SET variable, such as SQL_mode. As far as I understand, the original problem statement was that a DBaaS provider gets severe replication performance problems when a user does something like this:
On the primary server, the UPDATE would be a table scan. But, on replicas, due to a missing primary key (or UNIQUE INDEX on NOT NULL columns), in row-based replication, applying each change would require a table scan (visiting n² rows instead of n rows). We would like the hidden primary key to be created automatically, so that row-based replication can take advantage of it. | ||||||
| Comment by Andrei Elkin [ 2022-04-26 ] | ||||||
|
serg to your question, MDEV-21181 reporter suggests to this matter. |