[MXS-876] MaxScale crash inside qc_sqlite Created: 2016-09-22  Updated: 2016-09-30  Resolved: 2016-09-30

Status: Closed
Project: MariaDB MaxScale
Component/s: qc_sqlite
Affects Version/s: 2.0.0
Fix Version/s: 2.0.1

Type: Bug Priority: Blocker
Reporter: Kurt Pastore (Inactive) Assignee: Johan Wikman
Resolution: Fixed Votes: 0
Labels: None
Environment: VMware

 Description   

2016-09-22 11:09:50 error : Write to dcb 0x7f238c044b60 in state DCB_STATE_POLLING fd 651 failed due errno 104, Connection reset by peer
2016-09-22 12:00:33 error : Fatal: MaxScale beta-2.0.0 received fatal signal 11. Attempting backtrace.
2016-09-22 12:00:33 error : Commit ID: 029e6574da1ace10f367a7848cdfb607f7f7ba56 System name: Linux Release string: NAME="CentOS Linux"
2016-09-22 12:00:33 error : /usr/bin/maxscale() [0x403be3]
2016-09-22 12:00:33 error : /lib64/libpthread.so.0(+0xf100) [0x7f23d8a53100]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x11abd) [0x7f23d2a15abd]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x11d8d) [0x7f23d2a15d8d]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x55b57) [0x7f23d2a59b57]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x95e79) [0x7f23d2a99e79]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x9a1df) [0x7f23d2a9e1df]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x9b414) [0x7f23d2a9f414]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x74b9f) [0x7f23d2a78b9f]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x74f3d) [0x7f23d2a78f3d]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x750a8) [0x7f23d2a790a8]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x8a74) [0x7f23d2a0ca74]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x8e19) [0x7f23d2a0ce19]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x87b3) [0x7f23d2a0c7b3]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0x8827) [0x7f23d2a0c827]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libqc_sqlite.so(+0xc89f) [0x7f23d2a1089f]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(qc_get_type+0x20) [0x7f23d937c263]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libreadwritesplit.so(+0x4e61) [0x7f23d32e2e61]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libreadwritesplit.so(+0x4cd5) [0x7f23d32e2cd5]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libhintfilter.so(+0x131e) [0x7f23d2ed031e]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libMySQLClient.so(+0x3678) [0x7f23d1bf4678]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libMySQLClient.so(+0x2c1c) [0x7f23d1bf3c1c]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libMySQLClient.so(+0x2b85) [0x7f23d1bf3b85]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libMySQLClient.so(+0x252a) [0x7f23d1bf352a]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(+0x4771b) [0x7f23d937e71b]
2016-09-22 12:00:33 error : /usr/lib64/maxscale/libmaxscale-common.so.1.0.0(poll_waitevents+0x5de) [0x7f23d937dfca]
2016-09-22 12:00:33 error : /usr/bin/maxscale(worker_thread_main+0x2a) [0x404a9b]
2016-09-22 12:00:33 error : /lib64/libpthread.so.0(+0x7dc5) [0x7f23d8a4bdc5]
2016-09-22 12:00:33 error : /lib64/libc.so.6(clone+0x6d) [0x7f23d752e28d]



 Comments   
Comment by markus makela [ 2016-09-22 ]

Interpreted stacktrace:

/home/ec2-user/workspace/server/core/gateway.c:374
sigaction.c:?
/home/ec2-user/workspace/_build/sqlite-bld-3110100/sqlite3.c:22673
/home/ec2-user/workspace/_build/sqlite-bld-3110100/sqlite3.c:22776
/home/ec2-user/workspace/_build/sqlite-bld-3110100/sqlite3.c:87836
/home/ec2-user/workspace/_build/sqlite-bld-3110100/sqlite3.c:131923
/home/ec2-user/workspace/_build/sqlite-bld-3110100/sqlite3.c:133749
/home/ec2-user/workspace/_build/sqlite-bld-3110100/sqlite3.c:134937
/home/ec2-user/workspace/_build/sqlite-bld-3110100/sqlite3.c:110207
/home/ec2-user/workspace/_build/sqlite-bld-3110100/sqlite3.c:110305
/home/ec2-user/workspace/_build/sqlite-bld-3110100/sqlite3.c:110369
/home/ec2-user/workspace/query_classifier/qc_sqlite/qc_sqlite.c:392
/home/ec2-user/workspace/query_classifier/qc_sqlite/qc_sqlite.c:504
/home/ec2-user/workspace/query_classifier/qc_sqlite/qc_sqlite.c:294
/home/ec2-user/workspace/query_classifier/qc_sqlite/qc_sqlite.c:320
/home/ec2-user/workspace/query_classifier/qc_sqlite/qc_sqlite.c:2546
??:0
/home/ec2-user/workspace/server/modules/routing/readwritesplit/readwritesplit.c:1909
/home/ec2-user/workspace/server/modules/routing/readwritesplit/readwritesplit.c:1831
/home/ec2-user/workspace/server/modules/filter/hint/hintfilter.c:255
/home/ec2-user/workspace/server/modules/protocol/mysql_client.c:1388
/home/ec2-user/workspace/server/modules/protocol/mysql_client.c:898
/home/ec2-user/workspace/server/modules/protocol/mysql_client.c:849
/home/ec2-user/workspace/server/modules/protocol/mysql_client.c:532
/home/ec2-user/workspace/server/core/poll.c:1004
??:0
/home/ec2-user/workspace/server/core/gateway.c:943
pthread_create.c:?
??:0

Comment by markus makela [ 2016-09-22 ]

The crash happened inside the SQLite3 library in sqlite3DbMallocRawNN.

Comment by Johan Wikman [ 2016-09-23 ]

From #maxscale

The client mentions that these crashes occur every 4 hours. Strangely, it happens at 4pm, then again at 8, then again at midnight. The client definitely sees this pattern.

Comment by Johan Wikman [ 2016-09-23 ]

There is a leak in the query classifier that is exposed by, for example, a query such as:

delete t11.*, t12.* from t11,t12 where t11.a = t12.a;

Comment by Johan Wikman [ 2016-09-29 ]

Another similar crash:

2016-09-29 04:04:30   error  : Fatal: MaxScale 2.0.1 received fatal signal 11. Attempting backtrace.
2016-09-29 04:04:30   error  : Commit ID: 649efb91b5c4a5c8ae83ff3eac4b3752f55f801b System name: Linux Release string: NAME="CentOS Linux"
/home/vagrant/workspace/server/core/gateway.c:374
/lib64/libpthread.so.0(+0xf100) [0x7fd800e68100]
/home/vagrant/workspace/_build/sqlite-bld-3110100/sqlite3.c:22673
/home/vagrant/workspace/_build/sqlite-bld-3110100/sqlite3.c:87085
/home/vagrant/workspace/_build/sqlite-bld-3110100/sqlite3.c:87177
/home/vagrant/workspace/_build/sqlite-bld-3110100/sqlite3.c:132302
/home/vagrant/workspace/_build/sqlite-bld-3110100/sqlite3.c:133749
/home/vagrant/workspace/_build/sqlite-bld-3110100/sqlite3.c:134937
/home/vagrant/workspace/_build/sqlite-bld-3110100/sqlite3.c:110207
/home/vagrant/workspace/_build/sqlite-bld-3110100/sqlite3.c:110305
/home/vagrant/workspace/_build/sqlite-bld-3110100/sqlite3.c:110369
/home/vagrant/workspace/query_classifier/qc_sqlite/qc_sqlite.c:392
/home/vagrant/workspace/query_classifier/qc_sqlite/qc_sqlite.c:504
/home/vagrant/workspace/query_classifier/qc_sqlite/qc_sqlite.c:294
/home/vagrant/workspace/query_classifier/qc_sqlite/qc_sqlite.c:320
/home/vagrant/workspace/query_classifier/qc_sqlite/qc_sqlite.c:2544
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(qc_get_type+0x20) [0x7fd801791213]
/home/vagrant/workspace/server/modules/routing/readwritesplit/readwritesplit.c:1909
/home/vagrant/workspace/server/modules/routing/readwritesplit/readwritesplit.c:1831
/home/vagrant/workspace/server/modules/filter/hint/hintfilter.c:255
/home/vagrant/workspace/server/modules/protocol/mysql_client.c:1388
/home/vagrant/workspace/server/modules/protocol/mysql_client.c:898
/home/vagrant/workspace/server/modules/protocol/mysql_client.c:849
/home/vagrant/workspace/server/modules/protocol/mysql_client.c:532
/home/vagrant/workspace/server/core/poll.c:1004
/usr/lib64/maxscale/libmaxscale-common.so.1.0.0(poll_waitevents+0x5de) [0x7fd801792f7a]
/home/vagrant/workspace/server/core/gateway.c:943
/lib64/libpthread.so.0(+0x7dc5) [0x7fd800e60dc5]
/lib64/libc.so.6(clone+0x6d) [0x7fd7ff94328d]

The crash is in the same location as in the previous case.

Comment by Johan Wikman [ 2016-09-29 ]

The code in question is

SQLITE_PRIVATE void *sqlite3DbMallocRawNN(sqlite3 *db, u64 n){
#ifndef SQLITE_OMIT_LOOKASIDE
  LookasideSlot *pBuf;
  assert( db!=0 );
  assert( sqlite3_mutex_held(db->mutex) );
  assert( db->pnBytesFreed==0 );
  if( db->lookaside.bDisable==0 ){
    assert( db->mallocFailed==0 );
    if( n>db->lookaside.sz ){
      db->lookaside.anStat[1]++;
    }else if( (pBuf = db->lookaside.pFree)==0 ){
      db->lookaside.anStat[2]++;

Information about the lookaside allocator can be found here: https://www.sqlite.org/malloc.html#lookaside

The effect of not using the lookaside allocator did not appear to be dramatic, so we'll try removing it and using the normal allocator instead.

Comment by Johan Wikman [ 2016-09-30 ]

The problem was not in the allocator.

Instead, the underlying reason was: "C-style escapes using the backslash character are not supported because they are not standard SQL" (from https://www.sqlite.org/lang_expr.html).

In most cases that did not cause any problems, but e.g. a statement like

 insert into t1 values ('\'');

would cause a buffer overrun, since the SQLite string dequoter assumed that the two trailing '' were an encoded ' rather than an escaped quote followed by the end of the string. It would therefore keep overwriting memory until it happened to encounter another '.

That broke not only the lookaside allocator but also regular malloc.

The fix was simply to add support for using backslash as an escape character.
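
To illustrate the idea, here is a minimal sketch of a dequoter that treats backslash as an escape character and never scans past the end of the token. It is not the actual MaxScale or SQLite code: the name dequote_sketch, the explicit length parameter, and the choice to keep an escaped character literally (rather than translating sequences such as \n) are all assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT the MaxScale/SQLite implementation.
 *
 * Dequotes the quoted token stored in z[0..n-1] in place and
 * NUL-terminates the result. Returns the dequoted length, or -1 if the
 * token is not a well-formed quoted string. On failure the buffer may
 * have been partially rewritten.
 */
static int dequote_sketch(char *z, size_t n)
{
    if (z == NULL || n < 2) {
        return -1;
    }

    char quote = z[0];
    if (quote != '\'' && quote != '"' && quote != '`') {
        return -1;
    }

    size_t i = 1; /* read position */
    size_t j = 0; /* write position, always less than i */

    while (i < n) {
        char c = z[i];

        if (c == '\\') {
            /* Backslash escape: copy the next character literally. */
            if (i + 1 >= n) {
                return -1; /* dangling backslash at end of token */
            }
            z[j++] = z[i + 1];
            i += 2;
        } else if (c == quote) {
            if (i + 1 < n && z[i + 1] == quote) {
                /* Doubled quote: standard SQL encoding of one quote. */
                z[j++] = quote;
                i += 2;
            } else {
                /* Closing quote: terminate and stop. */
                z[j] = '\0';
                return (int)j;
            }
        } else {
            z[j++] = c;
            i++;
        }
    }

    /* No closing quote inside the token: report it, do not keep scanning. */
    return -1;
}
```

With this sketch, the four-character token '\'' dequotes to a single ', and an unterminated token is rejected instead of being scanned until some later ' happens to turn up in memory.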
