Details
Bug
Status: Confirmed
Blocker
Resolution: Unresolved
12.3, 12.3.1
FreeBSD 14, FreeBSD 15 on x86-64
Can result in hang or crash
Q2/2026 Server Maintenance
Description
danblack asked the InnoDB team to check why MariaDB Server 12.3 crashes on FreeBSD. The root cause of the crash is a movaps instruction that performs a store requiring 16-byte alignment to an address that is aligned to only 8 bytes, similar to what we had in MDEV-28091.
(gdb) down
#7 THD::THD (this=this@entry=0x839610d3598, id=id@entry=0,
    is_wsrep_applier=false) at /home/marko/server/sql/sql_class.cc:714
714 THD::THD(my_thread_id id, bool is_wsrep_applier)
(gdb)
#6 0x0000000000f33b20 in Sp_caches::Sp_caches (this=0x839610d3748)
    at /home/marko/server/sql/sql_class.h:2922
2922 :m_sp_cache_version(0), sp_proc_cache(NULL), sp_func_cache(NULL),
(gdb) i reg rbx
rbx 0x839610d3598 9042534413720
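As a quick check of the register value above, the following C++ sketch computes how far the addresses from the gdb session are from a 16-byte boundary. The names misalignment and check_trace_addresses are made up for this illustration; only the two addresses are taken from the trace.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes by which an address exceeds the previous multiple of
// `alignment` (which must be a power of two); 0 means suitably aligned.
constexpr std::uint64_t misalignment(std::uint64_t address,
                                     std::uint64_t alignment)
{
  return address & (alignment - 1);
}

// Addresses copied from the gdb session above.
constexpr std::uint64_t thd_address = 0x839610d3598ULL;       // THD::THD this (rbx)
constexpr std::uint64_t sp_caches_address = 0x839610d3748ULL; // Sp_caches this

// Hypothetical helper that bundles the checks for the two addresses.
inline void check_trace_addresses()
{
  assert(misalignment(thd_address, 8) == 0);        // 8-byte aligned
  assert(misalignment(thd_address, 16) == 8);       // not 16-byte aligned
  assert(misalignment(sp_caches_address, 16) == 8); // same for the member
}
```

Both addresses end in the hexadecimal digit 8, so they sit 8 bytes past a 16-byte boundary, which is exactly the situation in which movaps faults.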
The THD object had been allocated as follows:
#0 my_malloc (key=<optimized out>, size=27536, my_flags=4120)
    at /home/marko/server/mysys/my_malloc.c:71
#1 0x0000000000f3e0f7 in ilink::operator new (size=27536)
    at /home/marko/server/sql/sql_list.h:676
#2 create_background_thd () at /home/marko/server/sql/sql_class.cc:5488
#1 0x00000000012fb35f in innobase_create_background_thd (
    name=0x4936ea "InnoDB FTS optimizer")
    at /home/marko/server/storage/innobase/handler/ha_innodb.cc:1681
#2 0x000000000132fc67 in fts_optimize_init ()
    at /home/marko/server/storage/innobase/fts/fts0opt.cc:2796
#3 0x0000000001490294 in srv_start (create_new_db=true)
    at /home/marko/server/storage/innobase/srv/srv0start.cc:1953
#4 0x0000000001310d01 in innodb_init (p=<optimized out>)
    at /home/marko/server/storage/innobase/handler/ha_innodb.cc:4225
Both a local build in a qemu-based environment and some creative navigation of https://buildbot.mariadb.org showed that the problem was introduced in MDEV-28302, which consists of two changes. These are the first two commits listed below; the third, 8857312503a, is the commit that they were based on:
4c18d337e58 MDEV-28302 configurable defaults for CHANGE MASTER
89bd6b00335 MDEV-37530 Refactor Master & Relay Log info to iterable tuples
8857312503a (origin/bb-12.3-hf) fix printing of per-partition engine options
I could repeat the problem with 4c18d337e58 and not with 8857312503a in my local environment. There were some compilation errors in the change 89bd6b00335, so I did not test that one.
In the grid view of the pull request that introduced this change, we can see that the FreeBSD build consistently failed. Here is the last build. For some reason, this builder does not report its status to GitHub; hence the failure is not visible when viewing https://github.com/MariaDB/server/pull/4430.
For the parent commit the server was able to start up.
As far as I can tell, the problem is that the alignment of LEX_MASTER_INFO and therefore that of THD was increased from 64 to 128 bits. It is not obvious to me how that happened, but I would suspect that the introduction of some std::function could be related to that.
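How a single member can raise the alignment of everything that embeds it can be sketched in C++ as follows. All struct names here are hypothetical stand-ins and not the real MariaDB definitions. The value of alignof(std::function<void()>) is specific to the standard library implementation, so the sketch only reports it and does not assume that it is 16.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical stand-ins; these are not the real MariaDB definitions.
struct PointersOnly { const char *host; const char *user; };

// A type that explicitly requests 16-byte alignment.
struct alignas(16) Wide { char bytes[16]; };

// Embedding one such member forces the enclosing type to be at least as
// strictly aligned, and so on for every class that embeds the encloser.
struct EmbedsWide { const char *host; Wide payload; };

// The same propagation applies to library types such as std::function.
struct EmbedsCallback { const char *host; std::function<void()> on_change; };

static_assert(alignof(Wide) == 16, "alignas(16) was requested");
static_assert(alignof(EmbedsWide) >= alignof(Wide),
              "a class is at least as aligned as each member");
static_assert(alignof(EmbedsCallback) >= alignof(std::function<void()>),
              "a class is at least as aligned as each member");

// Reports the implementation-specific alignment of std::function.
inline std::size_t callback_alignment()
{
  return alignof(std::function<void()>);
}
```

Printing callback_alignment() and alignof(PointersOnly) on the affected platform would show whether a std::function member alone is enough to explain the change from 8 to 16 bytes.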
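On a deeper look, I see that the malloc() call inside my_malloc() is returning a 16-byte aligned address. my_malloc() then places a 24-byte header in front of the user data, so the pointer that it returns to the caller is 24 bytes past the malloc() result and therefore only 8-byte aligned. Here, the allocation size is being adjusted for the header: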
0x00000000014f0800 <+64>: lea 0x18(%rbx),%r14
0x00000000014f0804 <+68>: mov %r14,%rdi
0x00000000014f0807 <+71>: call 0x15bd1b0 <malloc@plt>
The following patch would fix this particular crash, but not others of the same type:
diff --git a/mysys/my_malloc.c b/mysys/my_malloc.c
index 18272b0cb28..9222410ae1f 100644
--- a/mysys/my_malloc.c
+++ b/mysys/my_malloc.c
@@ -26,7 +26,7 @@ struct my_memory_header
 PSI_memory_key m_key;
 };
 typedef struct my_memory_header my_memory_header;
-#define HEADER_SIZE 24
+#define HEADER_SIZE 32
 
 #define USER_TO_HEADER(P) ((my_memory_header*)((char *)(P) - HEADER_SIZE))
 #define HEADER_TO_USER(P) ((char*)(P) + HEADER_SIZE)
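To illustrate why a header of 32 bytes works where 24 does not, here is a minimal C++ sketch of the arithmetic. It assumes that malloc() returns 16-byte aligned blocks, as observed above. The names guaranteed_alignment, assumed_base and check_header_sizes are hypothetical, and the base address does not appear in the trace: it is derived by subtracting the 24-byte header from the THD address 0x839610d3598.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Alignment that (base + header_size) is guaranteed to have when base is
// aligned to base_alignment (a power of two): the largest power of two that
// does not exceed base_alignment and divides header_size.
constexpr std::size_t guaranteed_alignment(std::size_t base_alignment,
                                           std::size_t header_size)
{
  return (base_alignment <= 1 || header_size % base_alignment == 0)
    ? base_alignment
    : guaranteed_alignment(base_alignment / 2, header_size);
}

static_assert(guaranteed_alignment(16, 24) == 8, "24-byte header: 8 bytes");
static_assert(guaranteed_alignment(16, 32) == 16, "32-byte header: 16 bytes");

// Assumption: the malloc() result is the THD address minus the header.
constexpr std::uint64_t assumed_base = 0x839610d3598ULL - 24;

inline void check_header_sizes()
{
  assert(assumed_base % 16 == 0);        // consistent with 16-byte aligned malloc()
  assert((assumed_base + 24) % 16 == 8); // HEADER_SIZE 24: misaligned for movaps
  assert((assumed_base + 32) % 16 == 0); // HEADER_SIZE 32: 16-byte aligned
}
```

The same reasoning applies to any allocator that hands out pointers at a fixed offset from an aligned block: the offset has to be a multiple of the strictest alignment that callers need.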
We would hit another movaps on an improperly aligned pointer a little later during the bootstrap:
12.3 4c18d337e58750f0670e6e7693849fb1a7bed84b with the above patch
#5 <signal handler called>
#6 0x0000000000fb2bb0 in st_select_lex_node::st_select_lex_node (
    this=0xa9b8b619918) at /home/marko/server/sql/sql_lex.h:657
#7 st_select_lex_unit::st_select_lex_unit (this=0xa9b8b619918)
    at /home/marko/server/sql/sql_lex.h:787
#8 LEX::LEX (this=0xa9b8b619838) at /home/marko/server/sql/sql_lex.cc:4148
#9 0x0000000000ed9ccd in st_lex_local::st_lex_local (this=0xa9b8b619838)
    at /home/marko/server/sql/sql_lex.h:5312
#10 sp_lex_local::sp_lex_local (this=0xa9b8b619838, thd=0xa9b795dc2a0,
    oldlex=0xa9b795e03d0) at /home/marko/server/sql/sql_lex.h:5348
#11 sp_head::reset_lex (this=0xa9b8b305038, thd=0xa9b795dc2a0)
    at /home/marko/server/sql/sp_head.cc:2557
#12 0x0000000000fb8493 in LEX::sp_variable_declarations_init (
    this=0xa9b795e03d0, thd=0x0, nvars=0)
    at /home/marko/server/sql/sql_lex.cc:6959
#13 0x0000000000f3b817 in MYSQLparse (thd=thd@entry=0xa9b795dc2a0)
    at /home/marko/server/sql/sql_yacc.yy:3537
#14 0x0000000000fdae8a in parse_sql (thd=0xa9b795dc2a0,
It seems to me that if my_malloc() were replaced with plain malloc(), these problems would go away. This would require compiling with PERFORMANCE_SCHEMA=NO.
I wonder if this is somehow related to the following definition, which is used in some system header files:
/usr/include/sys/cdefs.h
#define __alloc_align(x) __attribute__((__alloc_align__(x)))
Perhaps FreeBSD assumes that any memory returned by a malloc-like function is aligned in the way that the system malloc(3) guarantees? I tried the following patch, but it did not prevent even the original crash:
diff --git a/mysys/my_malloc.c b/mysys/my_malloc.c
index 18272b0cb28..2509f8f63e2 100644
--- a/mysys/my_malloc.c
+++ b/mysys/my_malloc.c
@@ -66,7 +66,6 @@ void set_malloc_size_cb(MALLOC_SIZE_CB func)
 
 @return A pointer to the allocated memory block, or NULL on failure.
 */
-ATTRIBUTE_MALLOC
 void *my_malloc(PSI_memory_key key, size_t size, myf my_flags)
 {
 my_memory_header *mh;
Last, I tested the following:
diff --git a/sql/sql_class.cc b/sql/sql_class.cc
index 48a7e358504..a79b01abfdc 100644
--- a/sql/sql_class.cc
+++ b/sql/sql_class.cc
@@ -790,6 +790,9 @@ THD::THD(my_thread_id id, bool is_wsrep_applier)
 wsrep_wfc()
 #endif /*WITH_WSREP */
 {
+static_assert(alignof(THD) == 16);
+static_assert(alignof(LEX_MASTER_INFO) == 16);
+static_assert(alignof(THD) == 8);
 variables= {};
 
 /*
The first two static_assert held and the last one failed, which confirms that both THD and LEX_MASTER_INFO now require 16-byte alignment.
The simplest fix might be to change LEX_MASTER_INFO so that the following assertion holds:
static_assert(alignof(LEX_MASTER_INFO) == 8 || alignof(LEX_MASTER_INFO) == alignof(void*));
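A more general form of such a guard could look like the C++ sketch below. It is a variation on the assertion above, not the proposed fix itself: it checks alignof(T) <= alignof(void*) instead of equality, and the names fits_pointer_alignment, NarrowInfo, WideInfo and assert_storage_ok are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Compile-time predicate: true if T needs no stricter alignment than a pointer.
template <typename T>
struct fits_pointer_alignment
  : std::integral_constant<bool, (alignof(T) <= alignof(void*))> {};

// Hypothetical example types; not the real MariaDB definitions.
struct NarrowInfo { const char *host; unsigned int port; };
struct alignas(16) WideInfo { char data[16]; };

static_assert(fits_pointer_alignment<NarrowInfo>::value,
              "safe to place at a pointer-aligned address");
static_assert(!fits_pointer_alignment<WideInfo>::value,
              "would need a 16-byte aligned allocation");

// Runtime counterpart: verify raw storage before constructing a T in it.
template <typename T>
void assert_storage_ok(const void *storage)
{
  assert(reinterpret_cast<std::uintptr_t>(storage) % alignof(T) == 0);
}
```

A guard of this kind next to the class definition would turn a future alignment increase into a build failure on every platform, instead of a startup crash on the platforms whose compiler happens to emit aligned stores.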
Issue Links
is caused by MDEV-28302 Configurable defaults for MASTER_SSL_* settings for CHANGE MASTER (Closed)