Details
-
New Feature
-
Status: Stalled (View Workflow)
-
Blocker
-
Resolution: Unresolved
-
1.5.3
-
None
-
2021-3, 2021-4, 2021-5, 2021-6, 2021-7, 2021-8, 2021-9, 2021-10, 2021-11, 2021-12, 2021-16, 2021-17, 2022-22, 2022-23, 2023-4, 2023-5, 2023-6, 2023-7, 2023-8, 2023-10, 2023-11, 2023-12
Description
When using storagemanager/S3 with Columnstore we want to avoid requiring additional third party hardware (NFS) or software (GlusterFS) for HA.
The subject is touched upon (without proposing a resolution) in https://docs.google.com/document/d/1USO3iXosBIv-jFOQNd820KSXdNPkOnlGDOPcfRkw1rA/edit#heading=h.2bu0ywfefwgs
Previous discussions centered around run-time synchronization of this object using plain network (like we do for Extent map). Another idea was offered recently (see below).
An adjacent part of the effort has to be "reconstruction from data" in case of crash.
++++++++++++++++++++++++++++++++++ proposal from Todd +++++++++++++++++++++++++++++++++++++++++++++++++
Maybe we should consider storing the metadata in a system database table. The current system uses files that are simple JSON and we already have JSON functionality within the server that we can take advantage of this. For example:
CREATE TABLE `columnstore_info`.`columnstore_meta` (
|
`id` int(11) NOT NULL AUTO_INCREMENT, |
`node` varchar(6) NOT NULL DEFAULT '', |
`path` varchar(128) NOT NULL DEFAULT '', |
`name` varchar(128) NOT NULL DEFAULT '', |
`metadata` longtext NOT NULL,
|
PRIMARY KEY (`id`)
|
) ENGINE=Aria AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 PAGE_CHECKSUM=1; |
Sample data here:
MariaDB [(none)]> select * from columnstore_info.columnstore_meta\G
|
*************************** 1. row *************************** |
id: 1 |
node: data1
|
path: systemFiles/dbrm
|
name: tablelocks.meta
|
metadata: {
|
"version": "1", |
"revision": "1", |
"objects": |
[
|
{
|
"offset": "0", |
"length": "4", |
"key": "c22873a7-77ef-4f95-b29e-de0a3a06145b_0_4_data1~systemFiles~dbrm~tablelocks" |
}
|
]
|
}
|
*************************** 2. row *************************** |
id: 2 |
node: data1
|
path: 000.dir/000.dir/003.dir/233.dir/000.dir |
name: FILE000.cdf.meta
|
metadata: {
|
"version": "1", |
"revision": "1", |
"objects": |
[
|
{
|
"offset": "0", |
"length": "2097152", |
"key": "d1250ef6-c4ae-4efb-b1a3-784c98f1f230_0_2097152_data1~000.dir~000.dir~003.dir~233.dir~000.dir~FILE000.cdf" |
}
|
]
|
}
|
2 rows in set (0.000 sec) |
MariaDB JSON functions handle all the CRUD functionality that we would need.
MariaDB [(none)]> SELECT node, path, name, json_value(metadata,'$.objects[0].key') as `key` from columnstore_info.columnstore_meta; |
+-------+-----------------------------------------+------------------+----------------------------------------------------------------------------------------------------------+
|
| node | path | name | key |
|
+-------+-----------------------------------------+------------------+----------------------------------------------------------------------------------------------------------+
|
| data1 | systemFiles/dbrm | tablelocks.meta | c22873a7-77ef-4f95-b29e-de0a3a06145b_0_4_data1~systemFiles~dbrm~tablelocks |
|
| data1 | 000.dir/000.dir/003.dir/233.dir/000.dir | FILE000.cdf.meta | d1250ef6-c4ae-4efb-b1a3-784c98f1f230_0_2097152_data1~000.dir~000.dir~003.dir~233.dir~000.dir~FILE000.cdf | |
+-------+-----------------------------------------+------------------+----------------------------------------------------------------------------------------------------------+
|
2 rows in set (0.000 sec) |
Instead of writing to files on disk, we should be sticking this right into the database. When written on the primary, this could be synced to the replicas via normal binlog traffic. (Same as our DDL)
Attachments
Issue Links
1.
|
Design and implement run-time sharing of S3 Journal and S3 metadata |
|
Stalled | Denis Khalikov |
2.
|
Generate metadata from S3 data. |
|
Needs Feedback | Denis Khalikov |