MDEV-122: Data mapping between HBase and SQL

Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Won't Fix

Activity

          == Data mapping from HBase to SQL ==

          An HBase table consists of rows, which are identified by a row key. Each row has an
          arbitrary (potentially very large) number of columns. Looking at the HBase
          documentation, one can conclude that HBase specifically targets two use
          cases:

          • a column family has few columns
          • a column family has so many columns that the "get column by name" operation is
            specifically optimized for this case (TODO: exact link!)

          One can see two ways to map that to SQL tables:

          === Per-row mapping ===

          Let each row in the HBase table be mapped to one row from the SQL point of view:

          SELECT * FROM hbase_table;

          row-id  column1  column2
          ------  -------  -------
          row1    data     data
          row2    data     data
          row3    data     data

          This is the most straightforward mapping. However, accessing some of the HBase
          data will be difficult. For example, in HBase it is natural to ask:

          given a row-id, get a list of columns in the row (maybe with their values)

          In this mapping, the result of that query would be a single row, and there is no
          convenient way to return an arbitrarily long list of columns in one row.

          Table DDL could look like this:

          create table hbase_tbl_rows (
            row_id binary(MAX_HBASE_ROWID_LEN),
            columns blob,
            key (row_id)
          ) engine=hbase_row;

          TODO: Does HBase have a MAX_HBASE_ROWID_LEN limit? What is it?

          === Per-cell mapping ===

          The HBase shell has a 'scan' command; here is an example of its output:

          hbase(main):007:0> scan 'testtable'
          ROW        COLUMN+CELL
          myrow-1    column=colfam1:q1, timestamp=1297345476469, value=value-1
          myrow-2    column=colfam1:q2, timestamp=1297345495663, value=value-2
          myrow-2    column=colfam1:q3, timestamp=1297345508999, value=value-3

          Here, one HBase row produces multiple rows in the output. Each output row
          represents one (row_id, column) combination, so rows with multiple columns
          (and multiple revisions of column data) are easily represented.

          create table hbase_tbl_cells (
            row_id binary(MAX_HBASE_ROWID_LEN),
            column_name binary(N),
            timestamp TIMESTAMP,
            value binary(N)
          ) engine=hbase_cell;

          TODO: Does HBase have a MAX_HBASE_ROWID_LEN limit? What is it?
          • NOTE: It is nice to have the table DDL independent of the HBase table. This
            saves us from the need to synchronize table DDLs between HBase and MySQL (which
            NDB cluster had to do, and they ended up implementing a rather complex
            subsystem for it).

          == Consistency and transactionality ==

          • HBase has single-record transactions. Does this mean that the HBase engine
            will have MyISAM-like characteristics, i.e. if we fail in the middle of
            an UPDATE, there is no way to go back?
          psergei Sergei Petrunia added a comment

          Finalize the description and any questions we might have. Let's then send it for review.

          ratzpo Rasmus Johansson (Inactive) added a comment (edited)

          == Data mapping from HBase to SQL ==

          An HBase table consists of rows, which are identified by a row key. Each row has an
          arbitrary (potentially very large) number of columns. Looking at the HBase
          documentation, one can conclude that HBase specifically targets two use
          cases:

          • a column family has few columns
          • a column family has so many columns that the "get column by name" operation is
            specifically optimized for this case (TODO: exact link!)

          One can see two ways to map that to SQL tables:

          === Per-row mapping ===

          Let each row in the HBase table be mapped to one row from the SQL point of view:

          SELECT * FROM hbase_table;

          row-id  column1  column2
          ------  -------  -------
          row1    data     data
          row2    data     data
          row3    data     data

          This is the most straightforward mapping. However, accessing some of the HBase
          data will be difficult. For example, in HBase it is natural to ask:

          given a row-id, get a list of columns in the row (maybe with their values)

          In this mapping, the result of that query would be a single row, and there is no
          convenient way to return an arbitrarily long list of columns in one row.

          Table DDL could look like this:

          CREATE TABLE hbase_tbl_rows (
            row_id BINARY(MAX_HBASE_ROWID_LEN),
            columns BLOB,
            PRIMARY KEY (row_id)
          ) ENGINE=hbase_row;
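
          For illustration, a minimal sketch of what the "list the columns of a row" request
          from above would look like against this mapping (hbase_tbl_rows and its columns
          blob are the hypothetical definitions from this task; 'myrow-1' is just the example
          row key used in the scan output later in this comment):

          -- Returns a single row: all columns of the HBase row come back packed into one
          -- blob, so the client still has to unpack the blob itself to get the column list.
          SELECT columns
          FROM hbase_tbl_rows
          WHERE row_id = 'myrow-1';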

          TODO: Does HBase have a MAX_HBASE_ROWID_LEN limit? What is it?

          === Per-cell mapping ===

          The HBase shell has a 'scan' command; here is an example of its output:

          hbase(main):007:0> scan 'testtable'
          ROW        COLUMN+CELL
          myrow-1    column=colfam1:q1, timestamp=1297345476469, value=value-1
          myrow-2    column=colfam1:q2, timestamp=1297345495663, value=value-2
          myrow-2    column=colfam1:q3, timestamp=1297345508999, value=value-3

          Here, one HBase row produces multiple rows in the output. Each output row
          represents one (row_id, column) combination, so rows with multiple columns
          (and multiple revisions of column data) are easily represented.

          CREATE TABLE hbase_tbl_cells (
            row_id BINARY(MAX_HBASE_ROWID_LEN),
            column_name BINARY(MAX_HBASE_NAME_LEN),
            timestamp TIMESTAMP,
            value BLOB,
            PRIMARY KEY (row_id, column_name)
          ) ENGINE=hbase_cell;
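
          With this mapping, the "list the columns of a row" request becomes a plain SQL
          query. A sketch against the hypothetical hbase_tbl_cells table above, using the
          row key from the scan example:

          -- Returns one row per (row_id, column) combination, so an arbitrarily long
          -- list of columns comes back as an ordinary result set.
          SELECT column_name, `timestamp`, `value`
          FROM hbase_tbl_cells
          WHERE row_id = 'myrow-2';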

          TODO: Does HBase have a MAX_HBASE_ROWID_LEN limit? What is it?
          • NOTE: It is nice to have the SQL table DDLs independent of the content of
            the backend HBase table. This saves us from the need to synchronize
            table DDLs between HBase and MySQL (NDB cluster had to do this, and they
            ended up implementing a rather complex system for it).

          == Consistency, transactions, etc ==

          • HBase has single-record transactions. Does this mean that the HBase storage
            engine will have MyISAM-like characteristics, i.e. if we fail in the
            middle of a multi-row UPDATE, there is no way to go back?
          • How should writes be made? Should UPDATEs/INSERTs use the checkAndPut operation
            so that they don't overwrite data that is already there?
          • Are writes important at all? (e.g. if the first version provided only
            read-only access, would that be useful?)
          psergei Sergei Petrunia added a comment

          == Results of discussion with Monty ==

          As the first milestone, implement only the HBase row <--> MySQL row mapping (see above
          for the DDL of hbase_tbl_rows). The `columns` blob will contain all of the columns.
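
          The exact format of the `columns` blob is still an open question (see the dynamic
          columns section below). As a minimal sketch, assuming MariaDB's dynamic columns
          functions are used for the blob, packing and reading it could look like this
          (table, row keys and values are the hypothetical examples from this task):

          -- Writing a row: pack two HBase columns into the single `columns` blob.
          -- Dynamic columns are currently addressed by number, so 1 and 2 stand in
          -- for the HBase column qualifiers here.
          INSERT INTO hbase_tbl_rows (row_id, columns)
          VALUES ('myrow-1', COLUMN_CREATE(1, 'value-1', 2, 'value-2'));

          -- Reading a single HBase column back out of the blob.
          SELECT COLUMN_GET(columns, 1 AS CHAR)
          FROM hbase_tbl_rows
          WHERE row_id = 'myrow-1';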

          === Partial blob writes/reads ===
          Currently, MySQL uses the following approach to work with blobs: if any part
          of the query accesses a blob field, the storage engine retrieves
          the entire blob and provides it to the SQL layer as a malloc'ed char*. It is not
          possible to read/write only a part of the blob.

          We'll need to extend the storage engine API somehow to accommodate working on
          parts of BLOBs.

          TODO: how exactly will the optimizer/hbase-se recognize that we need to read
          only some particular column? Hook into table condition pushdown?
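
          As an illustration of the kind of query that would need to be recognized (again a
          hypothetical sketch, assuming dynamic columns are used for the `columns` blob):

          -- Only one HBase column (mapped to dynamic column 1 here) is actually needed,
          -- so ideally the engine would fetch just that column from HBase instead of the
          -- whole row. Condition pushdown would have to hand the engine the predicate
          -- below so it can see which column is referenced.
          SELECT row_id
          FROM hbase_tbl_rows
          WHERE COLUMN_GET(columns, 1 AS CHAR) = 'value-1';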

          === Need column names for Dynamic columns ===

          Dynamic columns currently identify columns by numbers. HBase identifies them by
          string names.

          Two possible approaches (the sketch after this list illustrates the difference):

          • Genuinely support arbitrary column names
          • Store the names in the .frm file << TODO: Monty mentioned this has some
            advantage, but I don't have access to where I've written this down.
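
          A hedged sketch of the difference: COLUMN_GET with a numeric key is how dynamic
          columns work today; the by-name variant is hypothetical and only shows what genuine
          name support would have to allow.

          -- Today: columns are addressed by number.
          SELECT COLUMN_GET(columns, 1 AS CHAR) FROM hbase_tbl_rows;

          -- What HBase needs: columns addressed by string name, e.g. 'colfam1:q1'.
          -- (Hypothetical syntax; not supported by dynamic columns as they are now.)
          SELECT COLUMN_GET(columns, 'colfam1:q1' AS CHAR) FROM hbase_tbl_rows;
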
          psergei Sergei Petrunia added a comment

          Per Monty's request, I investigated whether Cassandra provides any C/C++ API, and whether such an API would be easier to program against than HBase's.

          Cassandra provides the following three levels of APIs, ordered by ease of use:

          • High-level clients - the API most recommended by Cassandra. Clients are provided for many different environments/languages.
            The one for C++ is: http://snapwebsites.org/project/libqtcassandra. It is used in only one project AFAIU, therefore I would not rely on this one much.
          • Thrift API (http://wiki.apache.org/cassandra/API). Thrift can generate a C++ client API (same as HBase).
          • Internal API "StorageProxy". This is a native Java API; as such it would have to be accessed via JNI. The Cassandra documentation advises strongly against it because it is unstable and low-level.
            (source: http://wiki.apache.org/cassandra/ClientOptions)

          In summary, I see no advantage in using Cassandra with respect to its API. The most reasonable choice seems to be Thrift; however, HBase provides a Thrift API as well.
          It may be that Cassandra has other advantages, such as a richer set of data access primitives, but I did not compare Cassandra and HBase in this respect.

          timour Timour Katchaounov (Inactive) added a comment
          larsgeorge Lars George added a comment -

          Please note that HBase trunk (termed "singularity") is changing the RPC to ProtoBufs, just like Hadoop Common has done. That way it will be really easy to talk straight to the RPC natively. Obviously, this is not yet released, but it seems like a good place to start given that work here still seems pending. Maybe one storage driver for ThriftHBase now, and one later on for ProtoBufHBase, if you do not want to wait?


          Lars, thanks for the note.

          Alas, it seems that for now, MDEV-431 "Cassandra storage engine" satisfies all the demand there was for MySQL -> cloud NoSQL connectivity.

          psergei Sergei Petrunia added a comment

          Right now, nobody has this work in their plans. Feel free to reopen if/when that changes.

          psergei Sergei Petrunia added a comment

          People

            psergei Sergei Petrunia
            ratzpo Rasmus Johansson (Inactive)