[MDEV-16478] mysql_real_connect() from libmariadbd.so always crash Created: 2018-06-13  Updated: 2018-06-25  Resolved: 2018-06-25

Status: Closed
Project: MariaDB Server
Component/s: Embedded Server
Affects Version/s: 10.3.1, 10.3.2, 10.3.3, 10.3.4, 10.3.5, 10.3.6, 10.3.7, 10.3
Fix Version/s: 10.3.8

Type: Bug
Priority: Critical
Reporter: Pali
Assignee: Oleksandr Byelkin
Resolution: Fixed
Votes: 0
Labels: None

Attachments: File test-connect.c    

 Description   

The mysql_real_connect() function from the libmariadbd.so library always crashes. Attached is a simple reproducer that tries to connect to a MariaDB server via TCP using the libmariadbd.so library.

Compile steps:

gcc -g test-connect.c -o test-connect `mysql_config --cflags --libmysqld-libs`

Here is the crash backtrace from gdb:

$ gdb --args ./test-connect 127.0.0.1 3306 pali pali
 
Program received signal SIGSEGV, Segmentation fault.
QUERY_PROFILE::new_status (this=0x4, status_arg=0x7ffff6c83527 "Waiting for query cache lock", 
    function_arg=function_arg@entry=0x7ffff6c8da78 <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", 
    file_arg=file_arg@entry=0x7ffff6c8d8f8 "mariadb-10.3.6/sql/sql_cache.cc", line_arg=603) at mariadb-10.3.6/sql/sql_profile.cc:312
312       prof->m_seq= m_seq_counter++;
(gdb) bt
#0  QUERY_PROFILE::new_status (this=0x4, status_arg=0x7ffff6c83527 "Waiting for query cache lock", 
    function_arg=function_arg@entry=0x7ffff6c8da78 <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", 
    file_arg=file_arg@entry=0x7ffff6c8d8f8 "mariadb-10.3.6/sql/sql_cache.cc", line_arg=603) at mariadb-10.3.6/sql/sql_profile.cc:312
#1  0x00007ffff66610cc in PROFILING::status_change (this=<optimized out>, line_arg=<optimized out>, file_arg=0x7ffff6c8d8f8 "mariadb-10.3.6/sql/sql_cache.cc", 
    function_arg=0x7ffff6c8da78 <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", status_arg=<optimized out>) at mariadb-10.3.6/sql/sql_profile.h:312
#2  THD::enter_stage (stage=<optimized out>, stage=<optimized out>, calling_line=<optimized out>, calling_file=0x7ffff6c8d8f8 "mariadb-10.3.6/sql/sql_cache.cc", 
    calling_func=0x7ffff6c8da78 <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", this=<optimized out>) at mariadb-10.3.6/sql/sql_class.h:2365
#3  set_thd_stage_info (thd_arg=thd_arg@entry=0x55555576b140, new_stage=<optimized out>, old_stage=old_stage@entry=0x7fffffffce78, 
    calling_func=calling_func@entry=0x7ffff6c8da78 <Query_cache::try_lock(THD*, Query_cache::Cache_try_lock_mode)::__FUNCTION__> "try_lock", 
    calling_file=calling_file@entry=0x7ffff6c8d8f8 "mariadb-10.3.6/sql/sql_cache.cc", calling_line=calling_line@entry=603) at mariadb-10.3.6/sql/sql_class.cc:408
#4  0x00007ffff665a3b3 in Query_cache_wait_state::Query_cache_wait_state (line=603, file=0x7ffff6c8d8f8 "mariadb-10.3.6/sql/sql_cache.cc", func=<synthetic pointer>, thd=0x55555576b140, 
    this=0x7fffffffce70) at mariadb-10.3.6/sql/sql_cache.cc:432
#5  Query_cache::try_lock (this=0x555555770178, this@entry=0x7ffff75502c0 <query_cache>, thd=0x55555576b140, mode=(unknown: 4149543616), mode@entry=Query_cache::WAIT)
    at mariadb-10.3.6/sql/sql_cache.cc:603
#6  0x00007ffff665d6ac in Query_cache::insert (this=0x7ffff75502c0 <query_cache>, thd=<optimized out>, query_cache_tls=0x55555576b410, packet=0x5555557772a8 "\244", length=168, pkt_nr=2)
    at mariadb-10.3.6/sql/sql_cache.cc:1082
#7  0x00007ffff6616211 in net_real_write (net=net@entry=0x555555770178, packet=0x5555557772a8 "\244", len=<optimized out>) at mariadb-10.3.6/sql/net_serv.cc:620
#8  0x00007ffff661658b in net_flush (net=net@entry=0x555555770178) at mariadb-10.3.6/sql/net_serv.cc:377
#9  0x00007ffff65e4b71 in send_client_reply_packet (mpvio=0x7fffffffd680, data=<optimized out>, data_len=<optimized out>) at mariadb-10.3.6/sql-common/client.c:2679
#10 0x00007ffff65e506d in client_mpvio_write_packet (mpv=0x7fffffffd680, pkt=<optimized out>, pkt_len=<optimized out>) at mariadb-10.3.6/sql-common/client.c:2775
#11 0x00007ffff65e25f2 in native_password_auth_client (vio=0x7fffffffd680, mysql=0x555555770178) at mariadb-10.3.6/sql-common/client.c:4702
#12 0x00007ffff65e5322 in run_plugin_auth (mysql=mysql@entry=0x555555770178, data=0x5555557772df "l\365\211\062\254\332dmysql_native_password", data_len=21, data_plugin=0x5555557772f4 "assword", 
    db=db@entry=0x0) at mariadb-10.3.6/sql-common/client.c:2911
#13 0x00007ffff65e70c3 in cli_mysql_real_connect (mysql=mysql@entry=0x555555770178, host=0x7fffffffe1b7 "127.0.0.1", user=<optimized out>, user@entry=0x7fffffffe1c6 "pali", passwd=0x7fffffffe1cb "pali", 
    db=<optimized out>, db@entry=0x0, port=3306, unix_socket=<optimized out>, client_flag=2147614722) at mariadb-10.3.6/sql-common/client.c:3575
#14 0x00007ffff65f627d in mysql_real_connect (mysql=0x555555770178, host=<optimized out>, user=0x7fffffffe1c6 "pali", passwd=<optimized out>, db=0x0, port=<optimized out>, unix_socket=0x0, 
    client_flag=2147614722) at mariadb-10.3.6/libmysqld/libmysqld.c:108
#15 0x0000555555554c86 in main (argc=5, argv=0x7fffffffddf8) at test-connect.c:46

This problem was discovered while developing the Perl DBI driver DBD::MariaDB: https://github.com/gooddata/DBD-MariaDB. The first MariaDB version in which it was detected is 10.3.1, and it is still present in the latest version, 10.3.7. Version 10.3.0 works fine, as do all versions from the 10.2, 10.1, 10.0 and 5.5 series.

Because of this problem, when DBD::MariaDB is compiled and linked with an affected libmariadbd.so version, libmariadbd.so crashes and segfaults the whole perl process on every connection attempt. This makes it completely unusable.

Please let us know what we should do about these crashes, and whether there is a workaround or some other option.



 Comments   
Comment by Pali [ 2018-06-18 ]

For now we have disabled compilation with the MariaDB 10.3 series of the libmariadbd.so library, see:
https://github.com/gooddata/DBD-MariaDB/commit/79c8949717f93a4bda11e7d0744f7ea4989c318b

I hope that you will fix this problem in the next MariaDB version, because the libmariadbd.so library is unusable in this state.

Comment by Elena Stepanova [ 2018-06-18 ]

Thanks for the report. Reproducible as described.


10.2 fails too, but quite differently:

10.2 352c7e0dfa

mdev16478a: /data/src/10.2/sql/log.cc:1139: void LOGGER::cleanup_base(): Assertion `inited == 1' failed.
 
#0  0x00007f1c68a07fcf in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f1c68a093fa in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f1c68a00e37 in __assert_fail_base () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007f1c68a00ee2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x00007f1c6ac252cd in LOGGER::cleanup_base (this=0x7f1c6c5bb5c0 <logger>) at /data/src/10.2/sql/log.cc:1139
#5  0x00007f1c6a8b49dc in clean_up (print_message=false) at /data/src/10.2/libmysqld/../sql/mysqld.cc:2200
#6  0x00007f1c6a8bd98a in end_embedded_server () at /data/src/10.2/libmysqld/lib_sql.cc:647
#7  0x00007f1c6a89b1d1 in mysql_server_end () at /data/src/10.2/libmysqld/libmysql.c:213
#8  0x0000000000400acf in main (argc=5, argv=0x7ffd7ea6a5c8) at mdev16478.c:58

Comment by Oleksandr Byelkin [ 2018-06-20 ]

Debug binaries with safe mutex revealed the problem: the QC (query cache), and maybe the whole server, was not correctly initialized.

Comment by Oleksandr Byelkin [ 2018-06-20 ]

It looks like MYSQL_OPT_USE_REMOTE_CONNECTION makes the correct cli_mysql_real_connect call, but the version of net_real_write that then gets called is the one built with MYSQL_SERVER, and it calls the QC check, which of course was not initialised.
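
The mechanism described in this comment can be modelled in a few lines of plain C. This is only a toy sketch, not MariaDB source: every TOY_/toy_ name is hypothetical, and the real net_serv.cc is considerably more involved. It shows how one write function either includes or omits a server-only query cache hook depending on whether a build macro is still defined when the file is compiled.

```c
#include <stddef.h>

/* Toy model only: all TOY_ and toy_ names are hypothetical stand-ins,
   not MariaDB identifiers. */

#define TOY_MYSQL_SERVER      /* assumed: set by the build for server sources */
#define TOY_EMBEDDED_LIBRARY  /* assumed: set when building the embedded lib  */

/* Guard of the kind the thread discusses: inside the embedded library the
   network layer acts only as a client, so server-only paths are compiled out. */
#ifdef TOY_EMBEDDED_LIBRARY
#undef TOY_MYSQL_SERVER
#endif

/* Counts how often the server-only query cache hook ran. */
int toy_qc_hook_calls = 0;

void toy_query_cache_insert(const unsigned char *packet, size_t len)
{
    (void) packet;
    (void) len;
    toy_qc_hook_calls++;
}

/* Pretends to write a packet and returns the number of bytes "sent". */
size_t toy_net_real_write(const unsigned char *packet, size_t len)
{
#ifdef TOY_MYSQL_SERVER
    /* Server build only: this would touch query cache state that a pure
       client connection never initialized. */
    toy_query_cache_insert(packet, len);
#endif
    (void) packet;
    return len;
}
```

With the #undef in place, calling toy_net_real_write() leaves toy_qc_hook_calls at zero. Deleting the three guard lines compiles the hook back in, which corresponds to the situation this comment describes.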

Comment by Oleksandr Byelkin [ 2018-06-20 ]

In 10.2 the QC is off in the embedded server; there is also a problem there with structures that are not initialized.

Comment by Oleksandr Byelkin [ 2018-06-21 ]

MYSQL_SERVER is not defined when net_serv.cc is compiled for embedded

Comment by Vladislav Vaintroub [ 2018-06-21 ]

pali, is there a specific reason to use the embedded library in (what seems to be strictly) a client-server environment, i.e. to access a non-embedded server?
Why not use the standard client library?

Comment by Pali [ 2018-06-22 ]

It is for DBD::MariaDB, Perl's DBI driver (as an XS module). Previously it had a nasty hack that produced a copy of all source files, filtered by a sed script that added the suffix "Emb" to the driver name, and then compiled the copy and linked it separately with the libmysqld.so/libmariadbd.so library. This complicated a lot of things in building the XS module, so I cleaned it up, and the build process now produces only one XS module: DBD::MariaDB. There is no copy/sed operation anymore, and at configure time (when running Makefile.PL) it prefers to use libmysqld/libmariadbd if available. The driver code was changed to fully support connecting to a remote database even when compiled with libmysqld.so/libmariadbd.so, so there is no loss of functionality in the DBD::MariaDB driver itself. It also supports the embedded server in DBI, as the "Emb" hack did before.

Comment by Vladislav Vaintroub [ 2018-06-22 ]

Thanks! Are you aware that if you use embedded for a remote connection, it is not the same as the official C driver?
It is different code, based on the older 10.1 version, which has not had much test coverage since 10.2.

This might also be of interest to you: https://mysqlserverteam.com/mysql-8-0-retiring-support-for-libmysqld/ or MDEV-16535 (this one is not scheduled yet, and it is not certain when it will be scheduled, but still). I'm not sure I would choose embedded as the default, given the lack of focus on embedded.

Comment by Pali [ 2018-06-22 ]

For the DBD::MariaDB driver we have a large test suite that covers many different MySQL and MariaDB versions, see: https://travis-ci.org/gooddata/DBD-MariaDB. It uses libmysqlclient.so, libmariadb.so, libmysqld.so, libmariadbd.so... whatever the binary installation provides. Each job runs a lot of driver tests covering all supported functionality, and all tests pass. So I do not see any missing feature (yet), except the one reported in this bug: since MariaDB 10.3.1, libmariadbd.so is unusable. The DBD::MariaDB code had a lot of problems with memory corruption and needed a lot of fixes for proper initialization. Once that was done, using libmariadbd.so was simple: only one call for MYSQL_OPT_USE_REMOTE_CONNECTION in the proper place.

Comment by Oleksandr Byelkin [ 2018-06-23 ]

revision-id: 0882750fb5df5d4cf321ad4f8bc932a3c1248574 (mariadb-10.3.7-58-g0882750fb5d)
parent(s): bcc2100f9d0bd1a2c21acd0de831e9dd1b8a703e
author: Oleksandr Byelkin
committer: Oleksandr Byelkin
timestamp: 2018-06-23 09:47:18 +0200
message:

MDEV-16478: mysql_real_connect() from libmariadbd.so always crash

Returned accidentally removed undefinition of MYSQL_SERVER in net_serv.cc inside embedded server
(embedded server uses real_net_read/write only as a client)

Prevented attempt to clean up embedded server if it was not initialized

Comment by Oleksandr Byelkin [ 2018-06-23 ]

Actually these are two bug fixes. One of the bugs is present in earlier versions (when the library is initialized with an argc parameter of -1, cleanup tries to clean up an uninitialized embedded server), but I am not sure it makes sense to backport the fix if nobody has complained about it.
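
The second fix described here (skip cleanup when the embedded server was never initialized) can be sketched as a simple guard. This is a toy model under stated assumptions, not the libmariadbd implementation: the toy_* functions and variables are hypothetical, and the only behaviour taken from this thread is that an argc of -1 means the embedded server is not started.

```c
#include <stdbool.h>

/* Toy model only: toy_* names are hypothetical, not the libmariadbd API. */

static bool toy_server_inited = false;  /* set once the embedded server starts */
static int  toy_cleanup_runs  = 0;      /* how many times real cleanup ran     */

/* Assumption from the thread: a negative argc means "client use only",
   so the embedded server is not started. */
int toy_library_init(int argc)
{
    if (argc >= 0)
        toy_server_inited = true;
    return 0;
}

/* Cleanup is skipped unless initialization actually happened. Without this
   check, teardown would run on state that was never set up. */
void toy_library_end(void)
{
    if (!toy_server_inited)
        return;
    toy_cleanup_runs++;
    toy_server_inited = false;
}
```

Calling toy_library_init(-1) followed by toy_library_end() leaves toy_cleanup_runs at zero, and a second toy_library_end() after a real start-and-stop cycle is also a no-op.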

Comment by Pali [ 2018-06-23 ]

I also spotted that second bug. I planned to report it, but forgot... DBD::MariaDB has a workaround: it does not call mysql_server_end()/mysql_library_end() when it is linked with libmariadbd.so and the embedded server was not started. See: https://github.com/gooddata/DBD-MariaDB/commit/64f5c7a251b6fd85b939f08b193c00c5c8e911b3#diff-910bff34a0a9cc85c863b926e3fe02b9R2562

Comment by Pali [ 2018-06-23 ]

And maybe the second bug is related to this comment: https://jira.mariadb.org/browse/CONC-336?focusedCommentId=111921&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-111921

Comment by Sergei Golubchik [ 2018-06-24 ]

ok to push

Comment by Oleksandr Byelkin [ 2018-06-25 ]

pali, no, it looks like a different bug

Comment by Pali [ 2018-06-25 ]

OK, so should I also report it to the MDEV project?

Comment by Oleksandr Byelkin [ 2018-06-25 ]

pali, to me it looks like a server bug (even if it is embedded), so it is better to report it to the server project

Comment by Pali [ 2018-06-25 ]

Reported: MDEV-16578
