[MDEV-27438] Spider: crash in spider_sys_open_table, directly after crash recovery finished Created: 2021-12-09 Updated: 2023-12-19 |
|
| Status: | Open |
| Project: | MariaDB Server |
| Component/s: | Storage Engine - Spider, XA |
| Affects Version/s: | 10.5, 10.6.14 |
| Fix Version/s: | 10.5, 10.6 |
| Type: | Bug | Priority: | Critical |
| Reporter: | Valerii Kravchuk | Assignee: | Valerii Kravchuk |
| Resolution: | Unresolved | Votes: | 1 |
| Labels: | crash | ||
| Issue Links: |
|
||||||||||||||||
| Description |
|
After crash (probably related to
After that the instance continues crashing like that on every startup. |
| Comments |
| Comment by Julien Fritsch [ 2021-12-09 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
nayuta-yanagisawa this looks super bad, can you please have a look tomorrow? | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2021-12-10 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
No existing bug report mentions the function spider_xa_rollback_by_xid(), so this seems to be a new bug. spider_xa_rollback_by_xid() can be called only via hton->rollback_by_xid, and that hook is set only when spider_param_support_xa() returns true. A workaround would therefore be to set spider_support_xa to OFF (it is ON by default), though this is only an option when the user doesn't use XA.
| ||||||||||||||||||||||||||||||||||||||||||||
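The registration logic described in the comment above can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in (`HandlertonSketch`, `sketch_init`, `sketch_try_rollback`, and the string xid are not the server's real types or signatures); it only shows why turning the XA switch off makes the rollback-by-xid path unreachable.

```cpp
#include <cassert>

// Hypothetical stand-in for the server's handlerton; only the hook discussed above.
struct HandlertonSketch {
    int (*rollback_by_xid)(const char *xid) = nullptr;  // stays null when XA is off
};

// Hypothetical stand-in for spider_xa_rollback_by_xid(); the real function does far more.
inline int sketch_xa_rollback_by_xid(const char *xid) {
    assert(xid != nullptr);  // the sketch expects a valid xid
    return 0;                // 0 means success in this sketch
}

// Assumed shape of the init logic: the hook is installed only when XA support is on.
inline void sketch_init(HandlertonSketch &hton, bool support_xa) {
    if (support_xa) {
        hton.rollback_by_xid = sketch_xa_rollback_by_xid;
    }
}

// Recovery-side caller: an engine that did not register the hook is skipped.
inline bool sketch_try_rollback(const HandlertonSketch &hton, const char *xid) {
    if (hton.rollback_by_xid == nullptr) {
        return false;  // hook absent, so the Spider rollback path is never entered
    }
    return hton.rollback_by_xid(xid) == 0;
}
```

With `support_xa` set to false the hook stays null, so a caller such as `sketch_try_rollback` never enters the function that appears in the crash stack. That is the whole basis of the proposed workaround.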
| Comment by Nayuta Yanagisawa (Inactive) [ 2022-03-16 ] | ||||||||||||||||||||||||||||||||||||||||||||
The stack trace is incomplete, but it seems that Spider crashed due to a NULL pointer dereference on the thread pointer, thd. I guess that this happened because the macro current_thd returned NULL and the function spider_sys_open_table() did not handle the case where the thread pointer is NULL. | ||||||||||||||||||||||||||||||||||||||||||||
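The suspected failure mode can be sketched as follows. This is only an illustration of the guess above, not the actual Spider code or a proposed patch: `ThdSketch`, `OpenResult`, and `sketch_sys_open_table` are hypothetical names, and the real function takes different arguments.

```cpp
#include <cassert>

// Hypothetical stand-in for the server's THD; a single counter is enough here.
struct ThdSketch {
    int open_tables = 0;
};

enum class OpenResult { Ok, NoThread };

// A guarded variant of the open routine: it checks the thread pointer before
// using it, instead of assuming that a current thread always exists.
inline OpenResult sketch_sys_open_table(ThdSketch *thd) {
    if (thd == nullptr) {
        // During recovery there may be no current thread; report an error
        // to the caller rather than dereferencing NULL.
        return OpenResult::NoThread;
    }
    ++thd->open_tables;
    assert(thd->open_tables > 0);
    return OpenResult::Ok;
}
```

Whether returning an error is the right behavior, or whether the caller should create a temporary thread context first, is a design question this sketch does not settle.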
| Comment by Roel Van de Paar [ 2022-03-17 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
c++filt version of the stack
I too could not locate any matching bug. | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2022-03-18 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
If my guess above is correct, the bug is likely to reproduce on 10.2 as well. | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Nayuta Yanagisawa (Inactive) [ 2022-06-14 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
I made several attempts to reproduce the issue, without success. I don't think we can solve this problem without a reproducible test case, so I am changing the status to NEED_FEEDBACK. A reproducible test case would be best, but I'd at least like to see the stack for the spider_support_xa=OFF case to understand the problem better. | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Kathryn Sizemore [ 2022-12-09 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
Attached SQL to reproduce the crash, using the stored procedure call usp_Archiving_AuditRecords_test() | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2022-12-10 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
Preliminary stack from 10.11 reproduction (debug build):
| ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2022-12-10 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
As the stack does not match the issue description, this is a separate bug. I will create a new ticket for the crash. | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2022-12-10 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
As for the (close to startup) crash issue/original description, can anyone who sees this please clarify if
were in use (in combination with --plugin-load-add=ha_spider.so)? Please try without these options first to see if the issue remains. If it does remain, it could still be due to prior corruption, although that is unlikely. | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2022-12-10 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
Created This ticket is reserved for the (close to startup) crash in spider_sys_open_table. | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2022-12-10 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
| ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Roel Van de Paar [ 2023-09-02 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
c++filt resolved stack of last comment:
| ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yuchen Pei [ 2023-09-05 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
julien.fritsch Sure, re-assigned myself | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Michael Widenius [ 2023-09-08 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
ycp Is it not possible to generate a test case where we do an XA commit in Spider, then cause it to crash, and check whether recovery works? Have you tried that? | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Yuchen Pei [ 2023-09-13 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
monty I took a look at the code. The trace looks a bit confusing -
And it looks like the recovery requires replication which I am not | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Andrei Elkin [ 2023-09-13 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
ycp, these functions work in the normal binlogging server case. binlog.binlog_xa_recover could be instructive for you to understand the low-level details. Specifically, ha_recover_complete concludes the binlog-based server recovery by either committing or rolling back the transactions in doubt (those that are present in the binlog but not yet committed in the engine(s)). | ||||||||||||||||||||||||||||||||||||||||||||
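The final recovery step described above can be pictured with a simplified sketch. It assumes the commonly described rule for binlog-based recovery: a transaction left prepared in an engine is committed if its xid was found in the binlog and rolled back otherwise. The server's real logic, especially for user XA transactions, has more cases. All names below (`Action`, `Decision`, `sketch_recover_complete`) are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum class Action { Commit, Rollback };

struct Decision {
    std::string xid;
    Action action;
};

// Decide the fate of every transaction an engine reports as prepared.
// Assumed rule (a simplification): xid found in the binlog means commit,
// anything else means rollback.
inline std::vector<Decision> sketch_recover_complete(
    const std::vector<std::string> &prepared_in_engine,
    const std::set<std::string> &xids_in_binlog) {
    std::vector<Decision> decisions;
    for (const std::string &xid : prepared_in_engine) {
        assert(!xid.empty());  // the sketch expects non-empty identifiers
        const bool logged = xids_in_binlog.count(xid) > 0;
        decisions.push_back({xid, logged ? Action::Commit : Action::Rollback});
    }
    return decisions;
}
```

In terms of this ticket, each `Rollback` decision corresponds to the server invoking an engine's rollback-by-xid hook, which is where the reported stack enters Spider.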
| Comment by Yuchen Pei [ 2023-11-07 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
Hi elenst, I noticed that you changed the status to Open. Has a | ||||||||||||||||||||||||||||||||||||||||||||
| Comment by Elena Stepanova [ 2023-11-07 ] | ||||||||||||||||||||||||||||||||||||||||||||
|
Not that I know of; but it doesn't need to be set to "Needs Feedback" if it's already assigned to Valerii, from whom you expect the action. This status doesn't do what you may expect it to do. That said, if Valerii wants it to be in this status (e.g. because he is waiting for something from somebody to whom the ticket cannot be assigned), he can of course set the status himself; in that case it will make sense. |