[MDEV-336] oqgraph 5.5 crashes in buildbot Created: 2012-06-13  Updated: 2012-08-25  Resolved: 2012-08-25

Status: Closed
Project: MariaDB Server
Component/s: None
Affects Version/s: 5.5.24, 5.3.8
Fix Version/s: 5.5.27, 5.3.8

Type: Bug Priority: Major
Reporter: Sergei Golubchik Assignee: Sergei Golubchik
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Blocks
blocks MDEV-206 OQGraph is missing from 5.5 bintars (... Closed
Relates

 Description   

Currently we don't build OQGraph for 5.5 trees, because the buildbot builders lack a sufficiently new Boost (MDEV-206).

When we tried to update Boost, OQGraph started failing.
For example, on the debian6-amd64 builder, with boost_1_49_0, OQGraph fails with:

==============================================================================
 
TEST                                      RESULT   TIME (ms) or COMMENT
--------------------------------------------------------------------------
 
oqgraph.binlog                           [ pass ]     28
oqgraph.basic                            [ fail ]
        Test ended at 2012-06-13 16:19:05
 
CURRENT_TEST: oqgraph.basic
 
 
Server [mysqld.1 - pid: 26902, winpid: 26902, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
120613 16:19:03 [Note] Plugin 'InnoDB' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_RSEG' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_TRX' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_LOCK_WAITS' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_CMP' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_CMP_RESET' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_CMPMEM' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_CMPMEM_RESET' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_SYS_TABLES' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_SYS_TABLESTATS' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_SYS_INDEXES' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_SYS_COLUMNS' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_SYS_FIELDS' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_SYS_FOREIGN' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_SYS_FOREIGN_COLS' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_SYS_STATS' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_TABLE_STATS' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_INDEX_STATS' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_BUFFER_POOL_PAGES' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_BUFFER_POOL_PAGES_INDEX' is disabled.
120613 16:19:03 [Note] Plugin 'INNODB_BUFFER_POOL_PAGES_BLOB' is disabled.
120613 16:19:03 [Note] Plugin 'XTRADB_ADMIN_COMMAND' is disabled.
120613 16:19:03 [Note] Plugin 'partition' is disabled.
120613 16:19:03 [Warning] /home/buildbot/mariadb-5.5.24/sql/mysqld: unknown variable 'loose-feedback-user-info=mysql-test'
120613 16:19:03 [Warning] /home/buildbot/mariadb-5.5.24/sql/mysqld: unknown variable 'loose-debug-sync-timeout=300'
120613 16:19:03 [Note] Event Scheduler: Loaded 0 events
120613 16:19:03 [Note] /home/buildbot/mariadb-5.5.24/sql/mysqld: ready for connections.
Version: '5.5.24-MariaDB-log'  socket: '/home/buildbot/mariadb-5.5.24/mysql-test/var/tmp/mysqld.1.sock'  port: 16000  Source distribution
120613 16:19:03 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
 
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
 
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.
 
Server version: 5.5.24-MariaDB-log
key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=1
max_threads=153
thread_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 61887 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
 
Thread pointer: 0x0x2554d80
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f8c9d2cce68 thread_stack 0x48000
mysys/stacktrace.c:247(my_print_stacktrace)[0x9cb909]
sql/signal_handler.cc:170(handle_fatal_signal)[0x69aaaa]
??:0(??)[0x7f8c9cef8ff0]
??:0(??)[0x7f8c90f18929]
??:0(??)[0x7f8c90f17c8c]
??:0(??)[0x7f8c90f17e7d]
??:0(??)[0x7f8c90f162b4]
sql/sql_class.h:4147(handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function))[0x6a07e4]
sql/multi_range_read.cc:297(handler::multi_range_read_next(void**))[0x64bdc4]
sql/opt_range.cc:11008(QUICK_RANGE_SELECT::get_next())[0x75efa6]
sql/records.cc:339(rr_quick)[0x77b7a6]
sql/sql_select.cc:16014(sub_select(JOIN*, st_join_table*, bool))[0x5a4d48]
sql/sql_select.cc:15687(do_select)[0x5ab42b]
sql/sql_select.cc:2808(JOIN::exec())[0x5c17a2]
sql/sql_select.cc:3029(mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*))[0x5c354f]
sql/sql_select.cc:311(handle_select(THD*, LEX*, select_result*, unsigned long))[0x5c40c4]
sql/sql_parse.cc:4617(execute_sqlcom_select)[0x57ac44]
sql/sql_parse.cc:2184(mysql_execute_command(THD*))[0x5804f4]
sql/sql_parse.cc:5731(mysql_parse(THD*, char*, unsigned int, Parser_state*))[0x582e59]
sql/sql_parse.cc:1057(dispatch_command(enum_server_command, THD*, char*, unsigned int))[0x58415b]
sql/sql_connect.cc:1253(do_handle_one_connection(THD*))[0x6263d4]
sql/sql_connect.cc:1170(handle_one_connection)[0x62646a]
perfschema/pfs.cc:1018(pfs_spawn_thread)[0x8900e6]
??:0(??)[0x7f8c9cef08ca]
??:0(??)[0x7f8c9ba5086d]
 
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0x25ec9c8): select * from graph where latch = 2 and origid = 1 and weight = 1
Connection ID (thread ID): 2
Status: NOT_KILLED
 
Optimizer switch: index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,index_merge_sort_intersection=off,engine_condition_pushdown=off,index_condition_pushdown=on,derived_merge=on,derived_with_keys=on,firstmatch=on,loosescan=on,materialization=on,in_to_exists=on,semijoin=on,partial_match_rowid_merge=on,partial_match_table_scan=on,subquery_cache=on,mrr=off,mrr_cost_based=off,mrr_sort_keys=off,outer_join_with_cache=on,semijoin_with_cache=on,join_cache_incremental=on,join_cache_hashed=on,join_cache_bka=on,optimize_join_buffer_size=off,table_elimination=on,extended_keys=off
 
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
Writing a core file
----------SERVER LOG END-------------
mysqltest failed but provided no output
The result from queries just before the failure was:
< snip >
drop table if exists graph;
Warnings:
Note    1051    Unknown table 'graph'
CREATE TABLE graph (
latch   SMALLINT  UNSIGNED NULL,
origid  BIGINT    UNSIGNED NULL,
destid  BIGINT    UNSIGNED NULL,
weight  DOUBLE    NULL,
seq     BIGINT    UNSIGNED NULL,
linkid  BIGINT    UNSIGNED NULL,
KEY (latch, origid, destid) USING HASH,
KEY (latch, destid, origid) USING HASH
) ENGINE=OQGRAPH;
delete from graph;
insert into graph(origid, destid) values (1,2), (2,1);
insert into graph(origid, destid) values (1,3), (3,1);
insert into graph(origid, destid) values (3,4), (4,3);
insert into graph(origid, destid) values (3,5), (5,3);
insert into graph(origid, destid) values (5,6), (6,5);
select * from graph where latch = 2 and origid = 1 and weight = 1;
 
 
 
 - saving '/home/buildbot/mariadb-5.5.24/mysql-test/var/log/oqgraph.basic/' to '/home/buildbot/mariadb-5.5.24/mysql-test/var/log/oqgraph.basic/'
 - found 'core' (0/5)
 
Trying 'dbx' to get a backtrace
 
Trying 'gdb' to get a backtrace
Compressed file /home/buildbot/mariadb-5.5.24/mysql-test/var/log/oqgraph.basic/mysqld.1/data/core
--------------------------------------------------------------------------
The servers were restarted 1 times
Spent 0.028 of 12 seconds executing testcases
 
Failure: Failed 1/2 tests, 50.00% were successful.
 
Failing test(s): oqgraph.basic



 Comments   
Comment by Arjen Lentz [ 2012-06-18 ]

Sergei, thanks for creating this new dedicated issue.

The bug report still has logic issues in relation to the observed problem.
We know OQGraph works in earlier versions of MariaDB, and older distro versions, and that the OQGraph codebase is the same for all of these builds.
Therefore,
a) the assertion that OQGraph requires a higher version of Boost to build, and
b) the implicit assertion that OQGraph "magically" requires a different/higher version of Boost with later MariaDB versions,
are both verifiably untrue. Simple logic.

At this point, my hypothesis is that, rather than a newer version, the OQGraph codebase effectively requires a specific or older version of Boost to build. Could you please check what version of Boost was used for MariaDB 5.2? I believe it would have been 1.40 or 1.42.
As Boost is not used for other purposes on the build environments, it should be possible to have this older version available in the environments, possibly using a single package and general apt/yum methods (no hacks). You can pin to a specific version of a package.

Note that earlier the OQGraph codebase included the Boost and Boost Graph libraries to make building easier, however this was removed on knielsen's request. One possible solution would be to put it back in?

We can possibly tweak the codebase to take newer versions into account, but Boost appears to be a bit of a moving target so essentially we can expect it to keep breaking every time you start building in a newer OS version or the Boost libraries are updated in a distro.

What do you reckon might be the best approach to fix this and minimise hassle in the future?

Comment by Sergei Golubchik [ 2012-06-18 ]

Arjen, I didn't say that "oqgraph magically requires a different
version of boost with later mariadb versions".

I know why it does, there is no magic here. We've changed the
requirements (to be at least 1.45, not 1.42 as before)
to solve some build issues (related to rtti, as far as I remember).

This bug report is about the fact that on some platforms OQGraph does
not work with a newer Boost.

I tried to repeat the failure, and it took me quite a while of
experimenting. It does not fail everywhere, so it mostly works.

We don't need to upgrade Boost regularly either. This change (1.42 to
1.45) was caused by changing the build system to CMake, which has its own
set of peculiarities and required a different set of workarounds
compared to autotools. I can assure you, we do not plan to change the
build system again anytime soon.

Anyway. In a few days we'll install Boost again on all builders, and
then we could see in buildbot what fails where and how.

Regards,
Sergei
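
For reference, the minimum-version requirement discussed above (1.45 instead of 1.42) can be expressed against Boost's version encoding. Boost's <boost/version.hpp> defines BOOST_VERSION as major * 100000 + minor * 100 + patch, so 1.45.0 is 104500. The sketch below only illustrates that arithmetic in a self-contained way; the function and constant names are hypothetical, and this is not the check used in the OQGraph CMakeLists.txt.

```cpp
// Hypothetical helpers for illustration; not part of OQGraph or Boost.
// Mirrors the encoding of BOOST_VERSION in <boost/version.hpp>:
//   major * 100000 + minor * 100 + patch
constexpr int boost_version_number(int major, int minor, int patch) {
    return major * 100000 + minor * 100 + patch;
}

// Minimum mentioned in this thread for the CMake-based 5.5 build.
constexpr int kMinimumBoost = boost_version_number(1, 45, 0);

// True when the encoded version is at least the minimum above.
constexpr bool meets_minimum(int version) {
    return version >= kMinimumBoost;
}

static_assert(boost_version_number(1, 45, 0) == 104500, "encoding check");
static_assert(!meets_minimum(boost_version_number(1, 42, 0)), "1.42 is too old");
static_assert(meets_minimum(boost_version_number(1, 49, 0)), "1.49 is new enough");
```

In real code the same comparison would be made against the BOOST_VERSION macro itself after including <boost/version.hpp>.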

Comment by Arjen Lentz [ 2012-06-19 ]

Serg, thanks for making sense in this issue, as MDEV-206 definitely doesn't and indeed refers to magic requirements, as I described in my analysis of the logic breakdown there.

Antony Curtis replied earlier on the MariaDB dev email thread - that should resolve the RTTI issue.
Then there could conceivably still be an issue tying OQGraph to a specific Boost version, but please test with the build change and let me know, as you have all the different environments there. thanks!

--- compare/5.5/storage/oqgraph/CMakeLists.txt 2012-06-11 22:12:28.000000000 -0700
+++ compare/5.5-wl820/storage/oqgraph/CMakeLists.txt 2012-06-11 22:21:40.000000000 -0700
@@ -12,7 +12,7 @@
 ENDIF()
 
 IF(BOOST_OK)
-  ADD_DEFINITIONS(-DHAVE_OQGRAPH)
+  ADD_DEFINITIONS(-DHAVE_OQGRAPH -DBOOST_NO_TYPEID -DBOOST_NO_RTTI)
   IF(MSVC)
     SET(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /EHsc")
   ELSE(MSVC)

Comment by Sergei Golubchik [ 2012-06-20 ]

Still crashes

Comment by Arjen Lentz [ 2012-07-16 ]

Serg - trying to decode the status changes in JIRA for this issue.
Do I understand correctly that you've successfully resolved the problem?
Does that mean OQGraph will be included in builds again from 5.5.25?
That'd be awesome!
Thanks

Regards,
Arjen.

Comment by Patryk Pomykalski [ 2012-07-26 ]

I ran some tests under Ubuntu 10.04 32-bit with 2 Boost versions:
boost 1.45 - works on debug and release builds
boost 1.49 - works on debug, crashes on release

Comment by Patryk Pomykalski [ 2012-07-27 ]

Adding -fno-strict-aliasing for oqgraph seems to help on the newer boost versions (1.46+)
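
As background on why this flag can matter: GCC enables -fstrict-aliasing at -O2 and above, which lets the optimizer assume that pointers to unrelated types never refer to the same object. Code that breaks this rule can work in a debug build and misbehave in an optimized one, which would be consistent with the debug/release difference reported above. The thread does not identify which code in OQGraph or Boost triggers the problem, so the sketch below is only a generic illustration with a hypothetical helper name, assuming a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// Generic illustration; hypothetical helper, not code from OQGraph or Boost.
//
// The pattern that strict aliasing forbids looks like this:
//
//     float f = 1.0f;
//     std::uint32_t bits = *reinterpret_cast<std::uint32_t*>(&f);
//
// Reading a float object through a uint32_t lvalue is undefined behavior,
// so an optimizing compiler may reorder or drop the accesses.
// -fno-strict-aliasing tells GCC not to make that assumption.

// Well-defined alternative: copy the object representation byte by byte.
inline std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Compiling with -fno-strict-aliasing, as suggested here, trades some optimization for safety when the offending code cannot easily be located or rewritten.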

Comment by Sergei Golubchik [ 2012-07-28 ]

Thanks, Patryk!

I'll try to add this as a workaround, and let Arjen decide whether it'll be a final fix or not (in the new upstream of the OQGraph).

Comment by Arjen Lentz [ 2012-08-01 ]

Serg - just for reference... there is no longer a separate upstream of OQGraph maintained; OQGraph's primary copy lives in the MariaDB source tree.
For 5.0 it was separate and then merged in to the OurDelta builds, but since the introduction to MariaDB (5.2) such a construct is no longer necessary. Because of Oracle's policy of not having community engines and the impossibility of sanely producing a plugin storage engine completely independently of their build system, we're just not bothering to build OQgraph for stock MySQL.

Comment by Sergei Golubchik [ 2012-08-01 ]

Doesn't crash now. Let's reopen the issue if it starts crashing again.

Comment by Patryk Pomykalski [ 2012-08-02 ]

I still get the crash in tests with boost > 1.45

Comment by Sergei Golubchik [ 2012-08-03 ]

okay, I've reopened the issue.
How do I need to compile and test to be able to repeat the crash?

Comment by Patryk Pomykalski [ 2012-08-04 ]

Just tested at home on gentoo 64bit (boost 1.46.1, gcc 4.4.5) - same result:
http://pastebin.com/sy9V3DPz

output from cmake:
http://pastebin.com/WYNuvvur

Comment by Sergei Golubchik [ 2012-08-11 ]

pushed -fno-strict-aliasing workaround.

Comment by Sergei Petrunia [ 2012-08-22 ]

Still crashes in buildbot. Look for failures of oqgraph_basic in 5.3

Comment by Sergei Golubchik [ 2012-08-25 ]

pushed in 5.3 too
