[MDEV-9195] Segmentation fault when using the embedded library Created: 2015-11-26  Updated: 2016-06-05  Resolved: 2016-01-25

Status: Closed
Project: MariaDB Server
Component/s: Embedded Server
Affects Version/s: 10.1.9
Fix Version/s: N/A

Type: Bug Priority: Minor
Reporter: markus makela Assignee: Unassigned
Resolution: Cannot Reproduce Votes: 0
Labels: None
Environment:

Fedora release 22 (Twenty Two)
MaxScale 1.3.0 built with MariaDB 10.1.9 embedded library


Attachments: CMakeLists.txt (text file), FindMySQL.cmake, FindPCRE.cmake, README.md, data.sql, lib.cc, old.php, test.cc
Issue Links:
Relates
relates to MXS-487 lost connection to backend server Closed

 Description   

While testing MaxScale with the 10.1.9 embedded library, I ran the attached PHP script with data.sql loaded. After a while, MaxScale gets a segmentation fault in mysql_init. I ran it under valgrind, which reports first an invalid read and then an invalid write:

==25870== Thread 8:
==25870== Invalid read of size 8
==25870==    at 0x59519D: my_malloc_size_cb_func (in /home/markusjm/build/bin/maxscale)
==25870==    by 0x58B46E: my_malloc (in /home/markusjm/build/bin/maxscale)
==25870==    by 0x561A36: mysql_init (in /home/markusjm/build/bin/maxscale)
==25870==    by 0x1CE3423E: parsing_info_init (query_classifier.cc:1406)
==25870==    by 0x1CE320DC: parse_query (query_classifier.cc:158)
==25870==    by 0x1CE34D98: query_classifier_get_operation (query_classifier.cc:1608)
==25870==    by 0x1CC1F1A7: route_single_stmt (readwritesplit.c:2192)
==25870==    by 0x1CC1E91B: routeQuery (readwritesplit.c:2039)
==25870==    by 0x1DE75326: route_by_statement (mysql_client.c:1891)
==25870==    by 0x1DE7304D: gw_read_client_event (mysql_client.c:1092)
==25870==    by 0x54799F: process_pollq (poll.c:915)
==25870==    by 0x547029: poll_waitevents (poll.c:669)
==25870==  Address 0x1b738f38 is 4,344 bytes inside a block of size 20,240 free'd
==25870==    at 0x4C29D6A: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==25870==    by 0x599F99: emb_free_embedded_thd (in /home/markusjm/build/bin/maxscale)
==25870==    by 0x1CE34474: parsing_info_done (query_classifier.cc:1470)
==25870==    by 0x52E537: gwbuf_remove_buffer_object (buffer.c:687)
==25870==    by 0x52CFCA: gwbuf_free (buffer.c:256)
==25870==    by 0x1CC2165F: clientReply (readwritesplit.c:2970)
==25870==    by 0x20C8FBD6: gw_read_backend_event (mysql_backend.c:565)
==25870==    by 0x54799F: process_pollq (poll.c:915)
==25870==    by 0x547029: poll_waitevents (poll.c:669)
==25870==    by 0x5BC5554: start_thread (in /usr/lib64/libpthread-2.21.so)
==25870==    by 0x7614B9C: clone (in /usr/lib64/libc-2.21.so)
==25870== Invalid write of size 8
==25870==    at 0x5951A7: my_malloc_size_cb_func (in /home/markusjm/build/bin/maxscale)
==25870==    by 0x58B46E: my_malloc (in /home/markusjm/build/bin/maxscale)
==25870==    by 0x561A36: mysql_init (in /home/markusjm/build/bin/maxscale)
==25870==    by 0x1CE3423E: parsing_info_init (query_classifier.cc:1406)
==25870==    by 0x1CE320DC: parse_query (query_classifier.cc:158)
==25870==    by 0x1CE34D98: query_classifier_get_operation (query_classifier.cc:1608)
==25870==    by 0x1CC1F1A7: route_single_stmt (readwritesplit.c:2192)
==25870==    by 0x1CC1E91B: routeQuery (readwritesplit.c:2039)
==25870==    by 0x1DE75326: route_by_statement (mysql_client.c:1891)
==25870==    by 0x1DE7304D: gw_read_client_event (mysql_client.c:1092)
==25870==    by 0x54799F: process_pollq (poll.c:915)
==25870==    by 0x547029: poll_waitevents (poll.c:669)
==25870==  Address 0x1b738f38 is 4,344 bytes inside a block of size 20,240 free'd
==25870==    at 0x4C29D6A: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==25870==    by 0x599F99: emb_free_embedded_thd (in /home/markusjm/build/bin/maxscale)
==25870==    by 0x1CE34474: parsing_info_done (query_classifier.cc:1470)
==25870==    by 0x52E537: gwbuf_remove_buffer_object (buffer.c:687)
==25870==    by 0x52CFCA: gwbuf_free (buffer.c:256)
==25870==    by 0x1CC2165F: clientReply (readwritesplit.c:2970)
==25870==    by 0x20C8FBD6: gw_read_backend_event (mysql_backend.c:565)
==25870==    by 0x54799F: process_pollq (poll.c:915)
==25870==    by 0x547029: poll_waitevents (poll.c:669)
==25870==    by 0x5BC5554: start_thread (in /usr/lib64/libpthread-2.21.so)
==25870==    by 0x7614B9C: clone (in /usr/lib64/libc-2.21.so)
==25870== 

This does not occur with 10.0.22.

From MaxScale's point of view, we've ruled out concurrent usage and closing of the THD. It always seems to be a different THD that causes the segfault.



 Comments   
Comment by Elena Stepanova [ 2015-12-26 ]

markus makela,
Can you extract the essential part of the logic into a separate single test file, so that whoever investigates it does not have to go through the pain of installing and configuring MaxScale and then digging into its code? I remember doing this before; it was not fun and totally wasn't worth it in the end. Knowing the code and having MaxScale handy, you can do it way faster.

Comment by markus makela [ 2015-12-27 ]

I added a small test which mimics MaxScale's behavior, but I wasn't able to reproduce the problem with it. Because of this, I don't think it's a problem with the embedded server; it seems to relate to how MaxScale uses it.

I'll continue investigating if it's reproducible without MaxScale being involved.

Comment by Elena Stepanova [ 2016-01-25 ]

markus makela, any luck?

Comment by markus makela [ 2016-01-25 ]

So far I haven't been able to reproduce it without MaxScale, so I'd say it's related to how MaxScale uses the library. I'd close this; if we manage to reproduce it without MaxScale, we can reopen it.

Comment by Yuval Hager [ 2016-06-02 ]

@markus makela, have you ever found a solution to this? Is this tracked somewhere within MXS? The issue linked here (MXS-487) doesn't seem to be related.

Comment by markus makela [ 2016-06-05 ]

I've actually made some progress. So far, the fix is to disable the malloc callback function by calling set_malloc_size_cb(NULL) after mysql_library_init.

elenst, I think this can be closed as Not a Bug since it is caused by wrong usage of the library (at least that's my conclusion when I tried to look at the code).
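
For readers trying to picture why disabling the malloc size callback helps, here is a toy model of the pattern the valgrind trace suggests: an allocation hook that touches a THD block after it has been freed. It is a sketch only, not MariaDB or MaxScale code. Every name in it (ToyThd, toy_size_cb, toy_set_size_cb, toy_alloc, toy_free_thd, run_cycle) is hypothetical. It also assumes the hook reaches the THD through a per-thread pointer, which the trace does not confirm. To stay well-defined, the toy marks the object as freed instead of actually freeing it.

```cpp
// Toy model only. All names are hypothetical stand-ins, not library APIs.
#include <cassert>

// Stand-in for the per-connection THD allocated by the embedded library.
struct ToyThd {
    bool alive = true;           // set to false once the THD is "freed"
    long long tracked_bytes = 0; // memory accounted to this THD
};

using SizeCallback = void (*)(long long size);

static thread_local ToyThd* current_thd = nullptr; // assumed per-thread THD pointer
static SizeCallback size_cb = nullptr;             // process-wide accounting hook
static int stale_accesses = 0;                     // accesses valgrind would flag

// Accounting hook: charges each allocation to the thread's current THD.
void toy_size_cb(long long size) {
    if (current_thd == nullptr) return;
    if (!current_thd->alive) {   // the real code would read/write freed memory here
        ++stale_accesses;
        return;
    }
    current_thd->tracked_bytes += size;
}

// Installs or removes the hook (nullptr disables accounting entirely).
void toy_set_size_cb(SizeCallback cb) { size_cb = cb; }

// Stand-in for an allocator that reports sizes to the hook, if one is set.
void toy_alloc(long long size) {
    if (size_cb != nullptr) size_cb(size);
}

// Stand-in for freeing a THD without resetting the per-thread pointer.
void toy_free_thd(ToyThd& thd) { thd.alive = false; }

// Creates a THD, frees it, then allocates again on the same thread.
// Returns how many times the hook touched the freed THD.
int run_cycle(bool disable_callback) {
    stale_accesses = 0;
    toy_set_size_cb(disable_callback ? nullptr : toy_size_cb);
    ToyThd thd;
    current_thd = &thd;
    toy_alloc(128);       // accounted normally while the THD is alive
    toy_free_thd(thd);    // analogous to emb_free_embedded_thd in the trace
    toy_alloc(64);        // analogous to my_malloc inside the next mysql_init
    current_thd = nullptr;
    return stale_accesses;
}
```

Calling run_cycle(false) makes the hook touch the freed THD once, while run_cycle(true) never does because no hook is installed. In the real library the equivalent step would be the set_malloc_size_cb(NULL) call mentioned above. Whether that is the right long-term fix, rather than resetting the stale pointer, is not something this sketch can settle.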
