[MDEV-3841] LevelDB storage engine Created: 2012-11-08  Updated: 2019-11-26  Resolved: 2013-10-05

Status: Closed
Project: MariaDB Server
Component/s: None
Fix Version/s: None

Type: Task Priority: Major
Reporter: Sergei Petrunia Assignee: Sergei Petrunia
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
PartOf
includes MDEV-4110 LevelDB: Support AUTO_INCREMENT Closed
includes MDEV-4122 LevelDB: store key/index numbers, not... Closed
includes MDEV-4201 LevelDB Storage Engine MS2 Closed
Sub-Tasks:
Key Summary Type Status Assignee
MDEV-3957 Server crashes on creating a table wi... Technical task Closed Sergei Petrunia  
MDEV-3958 Assertion `kdef' fails in Primary_key... Technical task Closed Sergei Petrunia  
MDEV-3959 Assertion `slice->size() == table->s-... Technical task Closed Sergei Petrunia  
MDEV-3960 Server crashes on running DISCARD TAB... Technical task Closed Sergei Petrunia  
MDEV-3961 Assertion `tablename[0] == '.' && tab... Technical task Closed Sergei Petrunia  
MDEV-3962 SELECT with ORDER BY causes "ERROR 10... Technical task Closed Sergei Petrunia  
MDEV-3963 JOIN or WHERE conditions involving ke... Technical task Closed Sergei Petrunia  
MDEV-3964 Assertion `!pk_descr' fails in ha_lev... Technical task Closed Sergei Petrunia  
MDEV-3965 [recreate] Server hangs or assertion ... Technical task Closed Sergei Petrunia  
MDEV-3966 [recreate?] Assorted server crashes o... Technical task Closed Sergei Petrunia  
MDEV-3967 [recreate] Server hangs on creating s... Technical task Closed Sergei Petrunia  
MDEV-3968 UPDATE produces a wrong result while ... Technical task Closed Sergei Petrunia  
MDEV-3969 Queries where execution plans on Leve... Technical task Closed Sergei Petrunia  
MDEV-3970 A set of assorted crashes on insertin... Technical task Closed Sergei Petrunia  
MDEV-4032 Session value of leveldb_lock_wait_ti... Technical task Closed Sergei Petrunia  
MDEV-4035 LevelDB: SELECT produces different re... Technical task Closed Sergei Petrunia  
MDEV-4036 LevelDB: INSERT .. ON DUPLICATE KEY U... Technical task Closed Sergei Petrunia  
MDEV-4037 LevelDB: REPLACE doesn't work, produc... Technical task Closed Sergei Petrunia  
MDEV-4038 LevelDB: SELECT ... FOR UPDATE does n... Technical task Closed Sergei Petrunia  
MDEV-4039 LevelDB: SELECT .. FOR UPDATE locks a... Technical task Closed Sergei Petrunia  
MDEV-4041 Server crashes in Primary_key_compara... Technical task Closed Sergei Petrunia  
MDEV-4043 LevelDB (Feature request): Produce ER... Technical task Closed Sergei Petrunia  
MDEV-4044 LevelDB: UPDATE or DELETE with ORDER ... Technical task Closed Sergei Petrunia  
MDEV-4046 LevelDB: Multi-table DELETE locks its... Technical task Closed Sergei Petrunia  
MDEV-4047 LevelDB: Assertion `0' fails in Proto... Technical task Closed Sergei Petrunia  
MDEV-4052 LevelDB (Feature request): Add --leve... Technical task Closed Sergei Petrunia  
MDEV-4053 LevelDB: DELETE hangs in state System... Technical task Closed Sergei Petrunia  
MDEV-4054 LevelDB: Reading by PK prefix does no... Technical task Closed Sergei Petrunia  
MDEV-4055 LevelDB: UPDATE/DELETE by a multi-par... Technical task Closed Sergei Petrunia  
MDEV-4059 LevelDB: query waiting for a lock can... Technical task Closed Sergei Petrunia  
MDEV-4060 LevelDB: Assertion `! trx->batch' fai... Technical task Closed Sergei Petrunia  
MDEV-4061 LevelDB: Changes from an interrupted ... Technical task Closed Sergei Petrunia  
MDEV-4062 LevelDB works incorrectly with query ... Technical task Closed Sergei Petrunia  
MDEV-4064 LevelDB: ER_KEY_NOT_FOUND (Can't find... Technical task Closed Sergei Petrunia  
MDEV-4076 LevelDB: Assertion `0' fails in ha_le... Technical task Closed Sergei Petrunia  
MDEV-4077 LevelDB: Wrong result (duplicate row)... Technical task Closed Sergei Petrunia  
MDEV-4078 Wrong result (missing rows) on select... Technical task Closed Sergei Petrunia  
MDEV-4081 LevelDB throws error 122 on an attemp... Technical task Closed Sergei Petrunia  
MDEV-4084 LevelDB: Wrong result on IN subquery ... Technical task Closed Sergei Petrunia  
MDEV-4085 LevelDB: Wrong result on range condit... Technical task Closed Sergei Petrunia  
MDEV-4086 LevelDB does not allow a query with m... Technical task Closed Sergei Petrunia  
MDEV-4089 LevelDB: Extensive memory usage on re... Technical task Closed Sergei Petrunia  
MDEV-4090 LevelDB: Wrong result (duplicate rows... Technical task Closed Sergei Petrunia  
MDEV-4092 LevelDB: Assertion `in_table(pa, a_le... Technical task Closed Sergei Petrunia  
MDEV-4093 LevelDB: IN subquery by secondary key... Technical task Closed Sergei Petrunia  
MDEV-4094 LevelDB: Wrong result on SELECT and E... Technical task Closed Sergei Petrunia  
MDEV-4097 LevelDB: indexes on text/blob fields ... Technical task Closed Sergei Petrunia  
MDEV-4099 LevelDB: Wrong results with index and... Technical task Closed Sergei Petrunia  
MDEV-4196 LevelDB: Autoincrement is not increas... Technical task Closed Sergei Petrunia  
MDEV-4197 LevelDB (Feature request): Improve ou... Technical task Closed Sergei Petrunia  
MDEV-4204 LevelDB: Assertion `pb_end - pb >= LD... Technical task Closed Sergei Petrunia  

 Description   

Implement a LevelDB storage engine. Basic feature list:

  • single-statement transactions
  • secondary indexes
  • HANDLER implementation with extensions to support atomic multi-put (kind of like multi-statement transactions)
  • binlog XA on the master to be crash safe
  • crash-proof slave replication state
  • (almost) non blocking schema change
  • full test coverage via mysql-test-run
  • hot backup
  • possible options to have LevelDB instance per mysqld, per schema or per table

The spec is being worked on here: https://kb.askmonty.org/en/leveldb-storage-engine



 Comments   
Comment by Sergei Petrunia [ 2012-11-08 ]

Put more content into the spec (not finished still).

Comment by Sergei Petrunia [ 2012-11-08 ]

tbleveldb storage engine compiles successfully, but crashes when one tries to create a table.

It gets a SIGSEGV in leveldb::DBImpl::NewDB, the line

new_db.SetComparatorName(user_comparator()->Name());

has user_comparator() == NULL.

Comment by Sergei Petrunia [ 2012-11-09 ]

More updates to https://kb.askmonty.org/en/leveldb-storage-engine/ ; it's shaping up.

Comment by Sergei Petrunia [ 2012-11-09 ]

More updates to the spec. Will be ready for another call early next week.

Comment by Sergei Petrunia [ 2012-11-12 ]

Discussion with Sergei about changes that may be needed in the SE API. He has pointed out that TokuDB has similar properties (e.g. INSERT works faster if it does not have to check whether it is inserting a duplicate).

Comment by Sergei Petrunia [ 2012-12-11 ]

Got the basic do-nothing storage engine to work. The tree is on launchpad, at lp:~maria-captains/maria/mysql-5.6-leveldb

Comment by Sergei Petrunia [ 2012-12-12 ]

I've hit an interesting problem while writing the key comparator function.

According to the design, the key is "dbname.table_name.$primary_key_value", and $primary_key_value is encoded in MySQL's KeyTupleFormat. It is easy to compare values in KeyTupleFormat when one has the TABLE object for the table in question. But we do not necessarily have it. It is possible that LevelDB invokes the comparator when the table has not been opened yet (for example, the compaction process may do so).

To be exact, we don't need the TABLE object; we need the Field_xxx objects that describe the indexed columns. Field objects have an int Field::key_cmp(const uchar *a, const uchar *b) function that performs comparisons for all types.
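
The dependency on table metadata can be illustrated with a small self-contained sketch. It is not the engine's code: the registry g_pk_comparators, the helper names, the assumption that database and table names contain no '.', and the 4-byte little-endian integer key are all hypothetical. It only shows that, until a type-aware comparator is known for a table, the best a comparator can do is compare bytes, which orders integer keys incorrectly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// Type-aware comparison of two encoded primary-key values (hypothetical type).
using PkCompare = std::function<int(const std::string&, const std::string&)>;

// Hypothetical registry: "dbname.table_name." prefix -> comparator. In the
// server, this knowledge would come from the TABLE / Field objects.
static std::map<std::string, PkCompare> g_pk_comparators;

static int sign(int v) { return (v > 0) - (v < 0); }

// Length of the "dbname.table_name." prefix, or npos if the key is malformed.
// Assumes that neither name contains '.'.
static std::size_t prefix_length(const std::string& key) {
  std::size_t first = key.find('.');
  if (first == std::string::npos) return std::string::npos;
  std::size_t second = key.find('.', first + 1);
  return second == std::string::npos ? std::string::npos : second + 1;
}

// Illustrative key encoding: a signed 32-bit integer stored little-endian.
std::string encode_int32_le(int32_t v) {
  uint32_t u = static_cast<uint32_t>(v);
  std::string out(4, '\0');
  for (int i = 0; i < 4; ++i) out[i] = static_cast<char>((u >> (8 * i)) & 0xff);
  return out;
}

// Type-aware comparator for keys produced by encode_int32_le().
int compare_int32_le(const std::string& a, const std::string& b) {
  assert(a.size() >= 4 && b.size() >= 4);
  auto decode = [](const std::string& s) {
    uint32_t u = 0;
    for (int i = 3; i >= 0; --i) u = (u << 8) | static_cast<unsigned char>(s[i]);
    return static_cast<int32_t>(u);
  };
  int32_t x = decode(a), y = decode(b);
  return (x > y) - (x < y);
}

// Comparator over full "dbname.table_name.$primary_key_value" keys.
int compare_row_keys(const std::string& a, const std::string& b) {
  std::size_t pa = prefix_length(a), pb = prefix_length(b);
  if (pa == std::string::npos || pb == std::string::npos)
    return sign(a.compare(b));
  int by_table = a.compare(0, pa, b, 0, pb);
  if (by_table != 0) return sign(by_table);
  auto it = g_pk_comparators.find(a.substr(0, pa));
  if (it == g_pk_comparators.end())
    // Table not "open": only a bytewise comparison of the key tail is possible.
    return sign(a.compare(pa, std::string::npos, b, pb, std::string::npos));
  return sign(it->second(a.substr(pa), b.substr(pb)));
}
```

With no comparator registered for "db.t.", compare_row_keys() sorts key 256 before key 1; once compare_int32_le is registered under that prefix, the order becomes numeric.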

Possible solutions are:
S1. Re-implement key comparison functions for all types. We will need to save descriptions of all primary/secondary indexes (charsets, etc.). It will take a lot of code to handle all of MySQL's types.
S2. Force the server to open each LevelDB table on startup, so that we have a TABLE_SHARE object for each table.
S3. Store just enough information so that we're able to create the necessary Field objects ourselves.

InnoDB implements S1, for the most part. It stores internally key column attributes like field length, null-ability, whether it is unsigned, etc. It still relies on MySQL to handle some of the types.

I got an idea about S3 when looking at make_field() in sql/field.cc, the one with signature like this:

Field *make_field(TABLE_SHARE *share, uchar *ptr, uint32 field_length,
uchar *null_pos, uchar null_bit,
uint pack_flag, ...

If the LevelDB SE stored all of the arguments of that function, it would be able to create Field* objects of its own.

Comment by Sergei Petrunia [ 2012-12-12 ]

Discussed the problem with SergeiG. Results:

There is also S4 (not listed above, but implied): convert keys into something that can be compared with memcmp(). The problems: index-only scans will not be possible, and while strxfrm() is available for strings, there is no equivalent for int/double/decimal/etc.
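
For illustration only, here is what a memcmp()-comparable transform could look like for one numeric type. This is a minimal sketch of a well-known encoding trick, not something taken from the server or from this task; the function names are hypothetical. A signed 32-bit integer is stored big-endian with its sign bit flipped, so that byte order matches numeric order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>

// Encode a signed 32-bit integer so that memcmp() order matches numeric order:
// flip the sign bit (negatives sort first), then store the bytes big-endian.
std::string encode_int32_memcmp(int32_t v) {
  uint32_t u = static_cast<uint32_t>(v) ^ 0x80000000u;
  std::string out(4, '\0');
  for (int i = 0; i < 4; ++i)
    out[i] = static_cast<char>((u >> (8 * (3 - i))) & 0xff);
  return out;
}

// Inverse of encode_int32_memcmp().
int32_t decode_int32_memcmp(const std::string& s) {
  assert(s.size() == 4);
  uint32_t u = 0;
  for (int i = 0; i < 4; ++i) u = (u << 8) | static_cast<unsigned char>(s[i]);
  u ^= 0x80000000u;
  int32_t v;
  std::memcpy(&v, &u, sizeof v);  // reinterpret the two's-complement bits
  return v;
}

// Comparison needs no type information, only the bytes.
int compare_encoded(const std::string& a, const std::string& b) {
  assert(a.size() == 4 && b.size() == 4);
  return std::memcmp(a.data(), b.data(), 4);
}
```

This particular encoding can be inverted, so the original value is recoverable from the key. The index-only scan concern applies to transforms that lose information, which is generally the case for strxfrm() output under case-insensitive collations.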

S3 can be improved: MyISAM/Aria use ha_key_cmp() to compare key values. That function either compares key tuples directly, or it uses the result of _mi_pack_key(). Using that is better than messing with TABLE_SHARE and Field objects.

Comment by Sergei Petrunia [ 2012-12-14 ]

Have been implementing S3. Finally arrived at the following changeset (just pushed to Launchpad):

MDEV-3841: LevelDB storage engine

  • Correct comparison of rowkeys (aka PRIMARY KEYS) depending on their types:
    = use MyISAM's ha_key_cmp() function for comparison
    = borrow MyISAM's code that constructs HA_KEYDEF descriptors
    = Use a separate LevelDB instance to store the data dictionary (it turned out
    that we can't store data dictionary in the same instance: we get chicken-and-egg
    problem when we need to run recovery on startup)
  • Make INSERT work (for now, it will blindly overwrite data that's already there)
  • Make full table scan work
  • Make primary key lookups work for ref access.

Blob columns are not supported yet.

Comment by Sergei Petrunia [ 2012-12-15 ]

Investigated how a storage engine should handle blob columns.

  • InnoDB has some useful comments in row_mysql_store_blob_ref(), row_mysql_read_blob_ref().
  • calc_pack_length() in field.cc shows how many bytes each blob type is using for length.

When blob data is passed down into handler::write_row(), the blob is allocated and owned by the SQL layer. The callee copies the data but doesn't free it.
When the data is returned from handler::rnd_next() (or some other function that reads a record), blob memory is allocated and owned by the storage engine. In InnoDB, it remains valid until another record read call is made (check out prebuilt->blob_heap).
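
A rough sketch of the blob reference format follows. It assumes that a blob field in the record buffer consists of a length prefix of 1 to 4 bytes (TINYBLOB through LONGBLOB), stored little-endian, followed by a native pointer to the payload, and that the build is 64-bit. The names BlobRef, write_blob_ref, read_blob_ref and copy_blob_payload are hypothetical; the server's own helpers should be consulted for the authoritative layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// A blob field as it appears in a record buffer (assumed layout):
// pack_length bytes of length, little-endian, followed by a native pointer.
struct BlobRef {
  uint32_t length;
  const unsigned char* data;
};

// Store a length prefix and a pointer to the payload into the field.
void write_blob_ref(unsigned char* field, unsigned pack_length,
                    uint32_t length, const unsigned char* data) {
  assert(pack_length >= 1 && pack_length <= 4);
  // The length must fit into pack_length bytes.
  assert(pack_length == 4 || (length >> (8 * pack_length)) == 0);
  for (unsigned i = 0; i < pack_length; ++i)
    field[i] = static_cast<unsigned char>((length >> (8 * i)) & 0xff);
  std::memcpy(field + pack_length, &data, sizeof data);
}

// Read the length prefix and the payload pointer back from the field.
BlobRef read_blob_ref(const unsigned char* field, unsigned pack_length) {
  assert(pack_length >= 1 && pack_length <= 4);
  BlobRef ref{0, nullptr};
  for (unsigned i = 0; i < pack_length; ++i)
    ref.length |= static_cast<uint32_t>(field[i]) << (8 * i);
  std::memcpy(&ref.data, field + pack_length, sizeof ref.data);
  return ref;
}

// write_row() side: the payload belongs to the caller, so the engine takes
// its own copy and never frees the caller's buffer.
std::vector<unsigned char> copy_blob_payload(const BlobRef& ref) {
  return std::vector<unsigned char>(ref.data, ref.data + ref.length);
}
```

On the read path the roles are reversed: the engine would keep the payload in a buffer of its own (comparable to InnoDB's blob heap), write a reference to it into the record, and keep that buffer alive until the next read call.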

Comment by Sergei Petrunia [ 2012-12-18 ]

Got blob handling to work (at least, basic tests pass).

Comment by Sergei Petrunia [ 2012-12-19 ]

Got UPDATE and DELETE commands to work for simple examples. (They don't work for complex examples because right now all changes are applied immediately)
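
The comment does not say which cases fail. One classic way in which immediately applied changes can break a statement is an UPDATE that modifies the key it is scanning on. The sketch below models that with a std::map standing in for the table. Everything in it (the Table type, the function names, the sample statement) is hypothetical and not taken from the engine. LevelDB's own mechanism for grouping changes is leveldb::WriteBatch; it is not used here so that the example stays self-contained.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Table = std::map<int, std::string>;  // primary key -> row payload

// UPDATE t SET pk = pk + 10 WHERE pk < 20, applying each change at once.
// After each change the scan continues from the next key after the old one,
// so it can later meet rows that it has already moved.
int update_immediately(Table& t) {
  int changes = 0;
  auto it = t.begin();
  while (it != t.end() && it->first < 20) {
    int old_key = it->first;
    std::string row = it->second;
    t.erase(it);
    t[old_key + 10] = row;
    ++changes;
    it = t.upper_bound(old_key);
  }
  return changes;
}

// Same statement, but changes are collected first and applied after the scan.
int update_buffered(Table& t) {
  std::vector<std::pair<int, std::string>> moved;
  for (const auto& kv : t)
    if (kv.first < 20) moved.emplace_back(kv.first, kv.second);
  for (const auto& m : moved) t.erase(m.first);
  for (const auto& m : moved) {
    assert(t.count(m.first + 10) == 0);  // duplicate keys are not handled here
    t[m.first + 10] = m.second;
  }
  return static_cast<int>(moved.size());
}
```

For a table with keys 1, 2 and 3, update_immediately() makes six changes and ends with keys 21, 22 and 23, because each row is moved twice. update_buffered() makes three changes and ends with 11, 12 and 13.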

Comment by Zheng Shao [ 2013-01-12 ]

Hi Sergei, how's it going so far?

Comment by Sergei Petrunia [ 2013-01-12 ]

I'm still working on implementing the row-locking system and hooking it up to ha_leveldb. (More details in the email.)

Comment by Aris Setyawan (Inactive) [ 2013-10-31 ]

Hi Sergei,

In which version of MariaDB will the LevelDB storage engine be released?
What is the current status?

Comment by Sergei Petrunia [ 2013-11-06 ]

Hi Aris,

Current status is that a tree (based on mysql-5.6) with the engine is available. The engine has a number of known limitations that make it difficult for it to compete against a more feature-complete engine like InnoDB. Right now, I cannot say which version of MariaDB the engine will be pushed to.

Comment by zhan mu qiao [ 2019-11-24 ]

Hi Sergei
I am a Chinese database developer. I am looking for the version of MySQL that runs the LevelDB storage engine. Can you tell me the specific address? Thank you. Looking forward to your reply.

Comment by Sergei Petrunia [ 2019-11-24 ]

Hi Vessel, are you sure you need the LevelDB storage engine? It was a prototype which didn't reach the production stage. It was the basis for the RocksDB storage engine, which has been part of MariaDB Server since version 10.2.

Comment by zhan mu qiao [ 2019-11-24 ]

Hi, Sergei
Yes, I am sure I need the LevelDB storage engine. My lab developed a new database, flashkv, on the basis of LevelDB, and now I want to load flashkv into MySQL as a storage engine. Since the upper layer of LevelDB and flashkv is the same, flashkv can be introduced directly if there is an interface for LevelDB. So I am sure I need the LevelDB storage engine. Hope you can help me. Thanks, looking forward to your reply.

Comment by Sergei Petrunia [ 2019-11-24 ]

Ok, I hope I'm not mixing anything up. The latest tree that I can find with LevelDB code is this one: https://code.launchpad.net/~maria-captains/maria/mysql-5.6-leveldb

Comment by Elena Stepanova [ 2019-11-24 ]

Vessel,

Please keep in mind that MariaDB doesn't support this engine, branch or code at any level; there will be no bugfixes, development, consultation or guidance in regard to it.

Comment by zhan mu qiao [ 2019-11-26 ]

Hi Sergei
Thank you, you've been a big help.

Comment by zhan mu qiao [ 2019-11-26 ]

Hi Elena
Okay, thanks for your reply.

Generated at Thu Feb 08 06:51:36 UTC 2024 using Jira 8.20.16#820016-sha1:9d11dbea5f4be3d4cc21f03a88dd11d8c8687422.