[MDEV-122] Data mapping between HBase and SQL Created: 2012-01-27  Updated: 2021-01-18  Resolved: 2013-03-21

Status: Closed
Project: MariaDB Server
Component/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: Rasmus Johansson (Inactive) Assignee: Sergei Petrunia
Resolution: Won't Fix Votes: 0
Labels: None

Issue Links:
Duplicate
is duplicated by MDEV-5845 SE for hadoop and hbase Open
Relates
relates to MDEV-377 Name support for dynamic columns Closed
relates to MDEV-431 Cassandra storage engine Closed

 Description   

The spec is at http://kb.askmonty.org/en/hbase-storage-engine/



 Comments   
Comment by Sergei Petrunia [ 2012-01-30 ]

== Data mapping from HBase to SQL ==

An HBase table consists of rows, which are identified by a row key. Each row
has an arbitrary (potentially very large) number of columns. Looking at the
HBase documentation, one can conclude that HBase specifically targets two use
cases:

  • a column family has few columns
  • a column family has so many columns that the "get column by name" operation
    is specifically optimized (TODO: exact link!)

One can see two ways to map that to SQL tables:

=== Per-row mapping ===

Let each row in the HBase table be mapped to a row from the SQL point of view:

SELECT * FROM hbase_table;

row-id column1 column2
------ ------- -------
row1 data data
row2 data data
row3 data data

This is the most straightforward mapping. However, accessing some of the HBase
data will be difficult. For example, in HBase it is natural to have this query:

given a row-id, get a list of columns in the row (maybe with their values)

In this mapping, the result of this query will be one row, and there is no
convenient way to return an arbitrarily long list of columns in one row.

Table DDL could look like this:

create table hbase_tbl_rows (
  row_id binary(MAX_HBASE_ROWID_LEN),
  columns blob,
  key (row_id)
) engine=hbase_row;

  1. TODO: Does HBase have a MAX_HBASE_ROWID_LEN limit? What is it?

=== Per-cell mapping ===

The HBase shell has a 'scan' command; here's an example of its output:

hbase(main):007:0> scan 'testtable'
ROW COLUMN+CELL
myrow-1 column=colfam1:q1, timestamp=1297345476469, value=value-1
myrow-2 column=colfam1:q2, timestamp=1297345495663, value=value-2
myrow-2 column=colfam1:q3, timestamp=1297345508999, value=value-3

Here, one HBase row produces multiple rows in the output. Each output row
represents one (row_id, column) combination, so rows with multiple columns
(and multiple revisions of column data) are easily represented.

create table hbase_tbl_cells (
  row_id binary(MAX_HBASE_ROWID_LEN),
  column_name binary(N),
  timestamp TIMESTAMP,
  value binary(N)
) engine=hbase_cell;

  1. TODO: Does HBase have a MAX_HBASE_ROWID_LEN limit? What is it?
  • NOTE: It is nice to have the table DDL independent of the HBase table. This
    saves us from the need to synchronize table DDLs between HBase and MySQL
    (which NDB Cluster had to do, and they ended up implementing a rather
    complex subsystem for it).

== Consistency and transactionality ==

  • HBase has single-record transactions. Does this mean that the HBase engine
    will have MyISAM-like characteristics? E.g. if we fail in the middle of
    an UPDATE, there is no way to go back?
Comment by Rasmus Johansson (Inactive) [ 2012-01-30 ]

Finalize the description and any questions we might have. Let's then send it for review.

Comment by Sergei Petrunia [ 2012-01-30 ]

== Data mapping from HBase to SQL ==

An HBase table consists of rows, which are identified by a row key. Each row
has an arbitrary (potentially very large) number of columns. Looking at the
HBase documentation, one can conclude that HBase specifically targets two use
cases:

  • a column family has few columns
  • a column family has so many columns that the "get column by name" operation
    is specifically optimized (TODO: exact link!)

One can see two ways to map that to SQL tables:

=== Per-row mapping ===

Let each row in the HBase table be mapped to a row from the SQL point of view:

SELECT * FROM hbase_table;

row-id column1 column2
------ ------- -------
row1 data data
row2 data data
row3 data data

This is the most straightforward mapping. However, accessing some of the HBase
data will be difficult. For example, in HBase it is natural to have this query:

given a row-id, get a list of columns in the row (maybe with their values)

In this mapping, the result of this query will be one row, and there is no
convenient way to return an arbitrarily long list of columns in one row.

Table DDL could look like this:

CREATE TABLE hbase_tbl_rows (
  row_id BINARY(MAX_HBASE_ROWID_LEN),
  columns BLOB,
  PRIMARY KEY (row_id)
) ENGINE=hbase_row;

  1. TODO: Does HBase have a MAX_HBASE_ROWID_LEN limit? What is it?
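A small Python sketch of what the per-row mapping implies (illustrative only:
`pack_columns`/`unpack_columns` are hypothetical stand-ins, with JSON as the
serialization): all of a row's columns collapse into the single `columns`
blob, so "list the columns of a row" means unpacking one SQL value.

```python
import json

# Hypothetical per-row mapping: each HBase row becomes one SQL row
# (row_id, columns_blob).  JSON is a stand-in serialization.

def pack_columns(columns):
    """Serialize an HBase row's column map into one blob."""
    return json.dumps(columns).encode()

def unpack_columns(blob):
    return json.loads(blob.decode())

# An HBase row with an arbitrary number of columns ...
hbase_row = {"colfam1:q1": "value-1", "colfam1:q2": "value-2"}

# ... maps to a single SQL row:
sql_row = ("myrow-1", pack_columns(hbase_row))

# "Given a row-id, list the columns" now means unpacking the blob on
# the SQL side -- the result is still just one row.
assert sorted(unpack_columns(sql_row[1])) == ["colfam1:q1", "colfam1:q2"]
```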

=== Per-cell mapping ===

The HBase shell has a 'scan' command; here's an example of its output:

hbase(main):007:0> scan 'testtable'
ROW COLUMN+CELL
myrow-1 column=colfam1:q1, timestamp=1297345476469, value=value-1
myrow-2 column=colfam1:q2, timestamp=1297345495663, value=value-2
myrow-2 column=colfam1:q3, timestamp=1297345508999, value=value-3

Here, one HBase row produces multiple rows in the output. Each output row
represents one (row_id, column) combination, so rows with multiple columns
(and multiple revisions of column data) are easily represented.

CREATE TABLE hbase_tbl_cells (
  row_id BINARY(MAX_HBASE_ROWID_LEN),
  column_name BINARY(MAX_HBASE_NAME_LEN),
  timestamp TIMESTAMP,
  value BLOB,
  PRIMARY KEY (row_id, column_name)
) ENGINE=hbase_cell;

  1. TODO: Does HBase have a MAX_HBASE_ROWID_LEN limit? What is it?
  • NOTE: It is nice to have the SQL table DDL independent of the content of
    the backend HBase table. This saves us from the need to synchronize
    table DDLs between HBase and MySQL (NDB Cluster had to do this, and they
    ended up implementing a rather complex subsystem for it).

== Consistency, transactions, etc ==

  • HBase has single-record transactions. Does this mean that the HBase storage
    engine will have MyISAM-like characteristics? E.g. if we fail in the
    middle of a multi-row UPDATE, there is no way to go back.
  • How should writes be made? Should UPDATEs/INSERTs use the checkAndPut
    operation so that they don't overwrite data that's already there?
  • Are writes important at all? (E.g. if the first version provided only
    read-only access, would that be useful?)
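As a reference point for the checkAndPut question, a pure-Python sketch of
its single-row compare-and-set semantics (an illustration of the semantics
only, not the HBase client API; `check_and_put` is a hypothetical name):

```python
# Sketch of checkAndPut semantics: write a cell only if another cell in
# the same row currently holds the expected value (a single-row
# compare-and-set).  Pure-Python stand-in, not the real HBase client.

def check_and_put(table, row_id, check_col, expected, put_col, value):
    row = table.setdefault(row_id, {})
    if row.get(check_col) != expected:
        return False          # row changed under us; caller must retry
    row[put_col] = value
    return True

table = {"row1": {"colfam1:q1": "v1"}}

assert check_and_put(table, "row1", "colfam1:q1", "v1", "colfam1:q2", "v2")
assert not check_and_put(table, "row1", "colfam1:q1", "stale", "colfam1:q2", "x")
assert table["row1"]["colfam1:q2"] == "v2"
```

This is why checkAndPut can prevent lost updates within one row, but it
offers no atomicity across rows, hence the MyISAM-like concern above.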
Comment by Sergei Petrunia [ 2012-04-12 ]

== Results of discussion with Monty ==

As a first milestone, implement only the HBase row <--> MySQL row mapping (see
above for the DDL of hbase_tbl_rows). The `columns` blob will contain all of
the columns.

=== Partial blob writes/reads ===
Currently, MySQL uses this approach to work with blobs: if any fragment
of the query accesses a blob field, the storage engine retrieves
the entire blob and provides it to the SQL layer as a malloc'ed char*. It is
not possible to read/write only a part of the blob.

We'll need to extend the storage engine API somehow to accommodate working on
parts of BLOBs.

TODO: how exactly will the optimizer/HBase SE recognize that we need to read
only some particular column? Hook into table condition pushdown?

=== Need column names for Dynamic columns ===

Dynamic columns currently identify columns by number. HBase identifies them by
string names.

Two possible approaches:

  • Genuinely support arbitrary column names
  • Store names in the .frm file << TODO: Monty mentioned this has some
    advantage, but I don't have access to where I wrote this down.
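A sketch of the second option's mechanics in Python (the `NameMap` class is
purely illustrative, not MariaDB code): keep a per-table name-to-number
dictionary, as a .frm-stored map might, so dynamic columns can keep using
numbers internally while HBase keeps its string names.

```python
# Sketch of the "store names in .frm" idea: a per-table dictionary
# mapping HBase string column names to the numeric ids that dynamic
# columns use.  Hypothetical illustration, not MariaDB code.

class NameMap:
    def __init__(self):
        self.name_to_num = {}

    def num_for(self, name):
        # Assign numbers on first use, as a .frm-stored map might.
        return self.name_to_num.setdefault(name, len(self.name_to_num) + 1)

m = NameMap()
assert m.num_for("colfam1:q1") == 1
assert m.num_for("colfam1:q2") == 2
assert m.num_for("colfam1:q1") == 1   # stable mapping for repeated lookups
```

The cost of this approach is that the map must be persisted and kept in sync
with the data, which is exactly the kind of metadata the .frm file holds.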
Comment by Timour Katchaounov (Inactive) [ 2012-06-13 ]

Per Monty's request, I investigated whether Cassandra provides any C/C++ API, and whether such an API would be easier to program against compared to HBase.

Cassandra provides the following three levels of APIs, ordered by ease of use:

  • High-level clients - the API most recommended by Cassandra. Clients are provided for many different environments/languages.
    The one for C++ is: http://snapwebsites.org/project/libqtcassandra. AFAIU it is used in only one project, therefore I would not rely on it much.
  • Thrift API - the lower-level RPC interface that the high-level clients are built on.
  • Internal API "StorageProxy". This is a native Java API; as such it would have to be accessed via JNI. Cassandra documentation advises strongly against it because it is unstable and low-level.
    (source: http://wiki.apache.org/cassandra/ClientOptions)

In summary, I see no advantage to using Cassandra with respect to its API. The most reasonable choice seems to be Thrift; however, HBase provides a Thrift API as well.
It may be that Cassandra has other advantages, such as a richer set of data-access primitives, but I did not compare Cassandra and HBase in this respect.

Comment by Lars George [ 2012-11-03 ]

Please note that HBase trunk (termed "singularity") is changing its RPC to ProtoBufs, just like Hadoop Common has done. That way it will be really easy to talk straight to the RPC natively. Obviously, this is not yet released, but it seems like a good place to start given that work here still seems pending. Maybe a storage driver for ThriftHBase now, and one later on for ProtoBufHBase, if you don't want to wait?

Comment by Sergei Petrunia [ 2013-03-21 ]

Lars, thanks for the note.

Alas, it seems that for now, MDEV-431 "Cassandra storage engine" satisfies all the demand there was for MySQL -> cloud NoSQL connectivity.

Comment by Sergei Petrunia [ 2013-03-21 ]

Right now, nobody has this work in their plans. Feel free to reopen if/when that changes.

Generated at Thu Feb 08 06:26:26 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.