[MDEV-9992] TokuDB engine LOAD DATA is very slow after a delete operation, if the new data is the same as the deleted data Created: 2016-04-26  Updated: 2016-06-26  Resolved: 2016-06-26

Status: Closed
Project: MariaDB Server
Component/s: Storage Engine - TokuDB
Affects Version/s: 10.0.20, 10.0.24
Fix Version/s: N/A

Type: Bug Priority: Critical
Reporter: dennis Assignee: Sergei Golubchik
Resolution: Won't Fix Votes: 0
Labels: load_data
Environment:

OS: centos 6.5



 Description   

1. create table

CREATE TABLE `tas_term_model_level_gprs_mm_2017` (
  `STATIS_MONTH` varchar(6) NOT NULL,
  `CITY_ID` varchar(20) NOT NULL COMMENT '??????????????',
  `COUNTY_ID` varchar(20) NOT NULL,
  `SECTION_ID` varchar(40) NOT NULL COMMENT '???????????',
  `BRAND_ID` int(11) NOT NULL,
  `TERM_MODEL_ID` varchar(64) NOT NULL COMMENT 'Iphone5S\n            ',
  `MON_FLOW_LEVEL_ID` int(11) NOT NULL COMMENT '??????\n            \n            ',
  `USER_NUM` bigint(20) DEFAULT NULL,
  `LASTM_USER_NUM` bigint(20) DEFAULT NULL,
  `ACCE_FLOW` decimal(20,4) DEFAULT NULL,
  `LASTM_ACCE_FLOW` decimal(20,4) DEFAULT NULL,
  PRIMARY KEY (`STATIS_MONTH`,`CITY_ID`,`COUNTY_ID`,`SECTION_ID`,`BRAND_ID`,`TERM_MODEL_ID`,`MON_FLOW_LEVEL_ID`)
) ENGINE=TokuDB DEFAULT CHARSET=utf8 COMMENT='????-  ?????????_GPRS_MM_YYYY' `compression`='tokudb_zlib'

2. do the first load

load data infile '/home/ly/tas_term_model_level_gprs_yyyymm.txt' into table tas_term_model_level_gprs_mm_2017 fields terminated by '\t';
Query OK, 1663513 rows affected (41.64 sec)
Records: 1663513  Deleted: 0  Skipped: 0  Warnings: 0

3. do delete operation

delete from tas_term_model_level_gprs_mm_2017 where STATIS_MONTH = "201603";
Query OK, 1663513 rows affected (1 min 0.79 sec)

4. do the load again

load data infile '/home/ly/tas_term_model_level_gprs_yyyymm.txt' into table tas_term_model_level_gprs_mm_2017 fields terminated by '\t';
^CCtrl-C -- sending "KILL QUERY 111" to server ...
Ctrl-C -- query aborted.
ERROR 1317 (70100): Query execution was interrupted

The input file is identical to the one used in the first load.
The load is extremely slow (only about 4000 rows inserted after 900 seconds), and the mysqld process uses 100% CPU.

show processlist;
+-----+---------+-----------------+-------+---------+------+--------------------------+------------------------------------------------------------------------------------------------------+----------+
| Id  | User    | Host            | db    | Command | Time | State                    | Info                                                                                                 | Progress |
+-----+---------+-----------------+-------+---------+------+--------------------------+------------------------------------------------------------------------------------------------------+----------+
| 111 | dbscale | 127.0.0.1:44821 | tasdb | Query   |  879 | Inserted about 4000 rows | load data infile '/home/ly/tas_term_model_level_gprs_yyyymm.txt' into table tas_term_model_level_gpr |    0.183 |
| 121 | dbscale | 127.0.0.1:53756 | NULL  | Query   |    0 | init                     | show processlist                                                                                     |    0.000 |
+-----+---------+-----------------+-------+---------+------+--------------------------+------------------------------------------------------------------------------------------------------+----------+
2 rows in set (0.00 sec)

Using pstack, I got the following output:

#0  0x00007faf8b0d29e0 in bn_data::fetch_klpair(unsigned int, leafentry**, unsigned int*, void**) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#1  0x00007faf8b0fb7a8 in ft_search_node(ft_handle*, ftnode*, ft_search*, int, int (*)(unsigned int, void const*, unsigned int, void const*, void*, bool), void*, bool*, ft_cursor*, unlockers*, ancestors*, pivot_bounds const&, bool) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#2  0x00007faf8b0fb43c in ft_search_node(ft_handle*, ftnode*, ft_search*, int, int (*)(unsigned int, void const*, unsigned int, void const*, void*, bool), void*, bool*, ft_cursor*, unlockers*, ancestors*, pivot_bounds const&, bool) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#3  0x00007faf8b0fb43c in ft_search_node(ft_handle*, ftnode*, ft_search*, int, int (*)(unsigned int, void const*, unsigned int, void const*, void*, bool), void*, bool*, ft_cursor*, unlockers*, ancestors*, pivot_bounds const&, bool) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#4  0x00007faf8b0fc41f in toku_ft_search(ft_handle*, ft_search*, int (*)(unsigned int, void const*, unsigned int, void const*, void*, bool), void*, ft_cursor*, bool) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#5  0x00007faf8b1666c1 in toku_ft_cursor_set_range(ft_cursor*, __toku_dbt*, __toku_dbt*, int (*)(unsigned int, void const*, unsigned int, void const*, void*, bool), void*) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#6  0x00007faf8b14dc9f in c_getf_set_range(__toku_dbc*, unsigned int, __toku_dbt*, int (*)(__toku_dbt const*, __toku_dbt const*, void*), void*) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#7  0x00007faf8b08be69 in ha_tokudb::is_val_unique(bool*, unsigned char*, st_key*, unsigned int, __toku_db_txn*) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#8  0x00007faf8b08bf79 in ha_tokudb::do_uniqueness_checks(unsigned char*, __toku_db_txn*, THD*) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#9  0x00007faf8b09d5c9 in ha_tokudb::write_row(unsigned char*) () from /home/wen/mariadb/lib/plugin/ha_tokudb.so
#10 0x00000000007283f3 in handler::ha_write_row(unsigned char*) ()
#11 0x000000000057d751 in write_record(THD*, TABLE*, st_copy_info*) ()
#12 0x000000000086b6a6 in mysql_load(THD*, sql_exchange*, TABLE_LIST*, List<Item>&, List<Item>&, List<Item>&, enum_duplicates, bool, bool) ()
#13 0x000000000059d555 in mysql_execute_command(THD*) ()
#14 0x000000000059fc86 in mysql_parse(THD*, char*, unsigned int, Parser_state*) ()
#15 0x00000000005a1bb7 in dispatch_command(enum_server_command, THD*, char*, unsigned int) ()
#16 0x00000000005a22f9 in do_command(THD*) ()
#17 0x000000000066c7d4 in do_handle_one_connection(THD*) ()
#18 0x000000000066c912 in handle_one_connection ()
#19 0x0000003e84607851 in start_thread () from /lib64/libpthread.so.0
#20 0x0000003366ce890d in clone () from /lib64/libc.so.6

It seems that the TokuDB engine spends most of its time in the uniqueness check.
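The symptom is consistent with TokuDB's delete implementation: DELETE injects tombstone messages into the fractal tree rather than removing rows immediately, so the uniqueness check for each subsequent insert has to wade through all of the buffered tombstones. A possible workaround (a sketch only, and safe only if the input file is known to contain no duplicate keys, since it makes the engine skip the per-row unique check entirely) is to disable uniqueness checks for the session during the reload:

set unique_checks = 0;  -- skip per-row uniqueness validation; caller guarantees no duplicates
load data infile '/home/ly/tas_term_model_level_gprs_yyyymm.txt' into table tas_term_model_level_gprs_mm_2017 fields terminated by '\t';
set unique_checks = 1;  -- restore the default for later statements

Note that with unique_checks=0 the server trusts the data: any duplicate keys actually present in the file would silently corrupt the primary key instead of raising an error.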

5. create a new table, replacing the primary key with a normal (non-unique) key

create table tas_term_model_level_gprs_mm_2018 like tas_term_model_level_gprs_mm_2017;
Query OK, 0 rows affected (0.34 sec)
 
alter table tas_term_model_level_gprs_mm_2018 drop primary key;
Query OK, 0 rows affected (0.43 sec)
Records: 0  Duplicates: 0  Warnings: 0
 
alter table tas_term_model_level_gprs_mm_2018 add key (`STATIS_MONTH`,`CITY_ID`,`COUNTY_ID`,`SECTION_ID`,`BRAND_ID`,`TERM_MODEL_ID`,`MON_FLOW_LEVEL_ID`);
Query OK, 0 rows affected (0.32 sec)
Records: 0  Duplicates: 0  Warnings: 0

6. do the first load

load data infile '/home/ly/tas_term_model_level_gprs_yyyymm.txt' into table tas_term_model_level_gprs_mm_2018 fields terminated by '\t'; 
Query OK, 1663513 rows affected (14.45 sec)
Records: 1663513  Deleted: 0  Skipped: 0  Warnings: 0

7. do delete operation

delete from tas_term_model_level_gprs_mm_2018 where STATIS_MONTH='201603';
Query OK, 1663513 rows affected (1 min 46.00 sec)

8. do the load again

 load data infile '/home/ly/tas_term_model_level_gprs_yyyymm.txt' into table tas_term_model_level_gprs_mm_2018 fields terminated by '\t';
Query OK, 1663513 rows affected (1 min 28.73 sec)
Records: 1663513  Deleted: 0  Skipped: 0  Warnings: 0

We can see that LOAD DATA performs normally after a delete operation when the table has no primary key or unique key.
When a primary key or unique key is present, loading the same data that was just deleted is extremely slow.
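If the primary key cannot be dropped, another possible mitigation (an assumption based on the fractal-tree design, namely that the slowdown comes from delete messages that have not yet been garbage-collected) is to ask the engine to flush its message buffers before reloading:

optimize table tas_term_model_level_gprs_mm_2017;  -- TokuDB runs a "hot optimize" pass that pushes buffered messages, including delete tombstones, down the tree

Whether this fully restores insert performance depends on how aggressively the tombstones are garbage-collected; it is a mitigation to try, not a guaranteed fix.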

It is a big bug!



 Comments   
Comment by Risato Stefano [ 2016-05-04 ]

I confirm the bug is also present in version 10.1.13. A query went from 10 s to hours...

Comment by Elena Stepanova [ 2016-06-02 ]

gao1738, steris, did you report it to Percona as well? Do you happen to know if it is (or was) reproducible with Percona server?

Comment by dennis [ 2016-06-21 ]

Hi Elena Stepanova,

It can be reproduced in Percona Server.
I have reported it to Percona, but they don't think this problem needs to be fixed.
More details can be found at
https://bugs.launchpad.net/percona-server/+bug/1575094

I hope MariaDB can fix or improve this, if possible.

Comment by Elena Stepanova [ 2016-06-24 ]

I highly doubt that MariaDB has manpower to fix architectural flaws in the FT library, and I personally would vote against it – not because the problem itself is unimportant, but because resources can be used more wisely. But I'll pass it to serg for the final decision.

Comment by Sergei Golubchik [ 2016-06-26 ]

We did not fork TokuDB, we only integrated it into MariaDB.

Percona develops TokuDB, not us, and it's their decision in what direction to develop it and what is considered a bug.
