[MDEV-5901] EITS: killing the server leaves statistical tables in "marked as crashed" state Created: 2014-03-19 Updated: 2014-03-19 Resolved: 2014-03-19 |
|
| Status: | Closed |
| Project: | MariaDB Server |
| Component/s: | None |
| Affects Version/s: | 10.0.9 |
| Fix Version/s: | 10.0.10 |
| Type: | Bug | Priority: | Major |
| Reporter: | Sergei Petrunia | Assignee: | Sergei Petrunia |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | eits |
| Description |
|
If one does the following sequence of operations
then any action that attempts to read from EITS tables will no longer be able to open them. Opening the table will fail with a "table marked as crashed" error. This task is about making EITS tables more resilient to this scenario. There are two things to be done: |
| Comments |
| Comment by Sergei Petrunia [ 2014-03-19 ] |
|
Hint from Monty: check out the code in sp.cc:
Note the HA_EXTRA_FLUSH call. We will need to add the same call for EITS tables. |
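To illustrate why an explicit flush after a write could help here, below is a toy model, not MariaDB code. It assumes the storage engine keeps an "in use" marker in the table file that is set on the first change and cleared only by a flush or a clean close; that assumption, and every name in the sketch (ToyTableFile, ToyHandler, crashed_after_kill), is hypothetical. The flush() method merely stands in for the effect this sketch attributes to HA_EXTRA_FLUSH.

```cpp
// Toy model only: none of these types exist in MariaDB.
struct ToyTableFile {
  int in_use_count = 0;  // persisted "someone has unflushed changes" marker
  int rows = 0;          // rows that have reached the file
};

class ToyHandler {
 public:
  explicit ToyHandler(ToyTableFile &file) : file_(file) {}

  // Buffer a row; the first change marks the file as in use.
  void write_row() {
    if (!marked_) {
      ++file_.in_use_count;
      marked_ = true;
    }
    ++buffered_rows_;
  }

  // Stand-in for the effect attributed to HA_EXTRA_FLUSH in this sketch:
  // push buffered rows to the file and clear the in-use marker.
  void flush() {
    file_.rows += buffered_rows_;
    buffered_rows_ = 0;
    if (marked_) {
      --file_.in_use_count;
      marked_ = false;
    }
  }

 private:
  ToyTableFile &file_;
  int buffered_rows_ = 0;
  bool marked_ = false;
};

// What the next open would conclude about the file.
bool looks_crashed(const ToyTableFile &file) { return file.in_use_count != 0; }

// Simulate "write statistics, then kill the server before a clean close".
bool crashed_after_kill(bool flush_after_write) {
  ToyTableFile file;
  {
    ToyHandler handler(file);
    handler.write_row();
    if (flush_after_write) handler.flush();
    // Kill happens here: the handler has no destructor that cleans up,
    // so nothing else touches the file.
  }
  return looks_crashed(file);
}
```

In this model, crashed_after_kill(false) reports a crashed-looking file, while crashed_after_kill(true) does not, which is the behaviour the flush is meant to provide for EITS tables.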
| Comment by Sergei Petrunia [ 2014-03-19 ] |
|
I'm also trying to investigate what is needed for auto-repair.
1. Auto-repair doesn't work for the mysql.proc table.
mysql> create procedure p4() begin select now(); end //
2. Auto-repair does work for regular tables.
Code-wise, auto-repair happens in open_table() and open_tables(). In open_table, there is this code:
and open_tables() has:
|
| Comment by Sergei Petrunia [ 2014-03-19 ] |
|
When I try debugging a failure to open a statistical table, I see a difference in this call:
Open_table_context::request_backoff_action (this=0x7ffff7e9ede0, action_arg=Open_table_context::OT_REPAIR,
Here,
(gdb) print action_arg
and because of that we don't take any action. |
| Comment by Sergei Petrunia [ 2014-03-19 ] |
|
The reason is that open_and_lock_tables() is structured like this:
Statistical tables are opened after the regular tables have been opened and locked (I wonder why we can't open them at the same time?). Because of that, the deadlock prevention logic prevents repair. |
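The ordering problem described above can be sketched with a toy model. It assumes that the open context refuses any backoff request while the connection already holds table locks; this is my reading of the comment, not code quoted from the server. ToyOpenContext, ToyAction and repair_scheduled are hypothetical names, and only the method name request_backoff_action is borrowed from the gdb trace.

```cpp
// Toy model only: a simplified stand-in, not the real Open_table_context.
enum class ToyAction { None, Repair };

class ToyOpenContext {
 public:
  explicit ToyOpenContext(bool has_locks) : has_locks_(has_locks) {}

  // Returns true when the request is refused. While locks are held,
  // backing off is treated as a deadlock risk and nothing is scheduled.
  bool request_backoff_action(ToyAction action) {
    if (has_locks_) return true;
    pending_ = action;
    return false;
  }

  ToyAction pending() const { return pending_; }

 private:
  bool has_locks_;
  ToyAction pending_ = ToyAction::None;
};

// Would a repair be scheduled for a crashed statistical table?
bool repair_scheduled(bool regular_tables_already_locked) {
  ToyOpenContext ctx(regular_tables_already_locked);
  bool refused = ctx.request_backoff_action(ToyAction::Repair);
  return !refused && ctx.pending() == ToyAction::Repair;
}
```

Under these assumptions, repair_scheduled(true) is false: opening the statistical tables after the regular tables are locked means the repair request is always refused, whereas a table opened before any locks are taken could still be repaired.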
| Comment by Sergei Petrunia [ 2014-03-19 ] |
|
If I force execution in Open_table_context::request_backoff_action to allow repair, then I hit the assertion thd->mdl_context.is_lock_owner(MDL_key::TABLE, table->s->db.str, table->s->table_name.str, MDL_SHARED) in close_thread_table() for table test.t10. |
| Comment by Sergei Petrunia [ 2014-03-19 ] |
|
It seems that auto-repair (item #2) is difficult to do. For now, I will only implement flushing (item #1). |