[MCOL-1116] Crash in I_S.CS_FILES when dbroot is offline Created: 2017-12-15  Updated: 2018-01-30  Resolved: 2018-01-30

Status: Closed
Project: MariaDB ColumnStore
Component/s: MDB Plugin
Affects Version/s: 1.0.12, 1.1.2
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Andrew Hutchings (Inactive) Assignee: Andrew Hutchings (Inactive)
Resolution: Fixed Votes: 0
Labels: None

Sprint: 2017-25, 2018-01, 2018-02, 2018-03

 Description   

information_schema.columnstore_files calls getDbrootPmConfig(), which throws an exception if a dbroot is offline/missing. We need to catch that exception and handle it appropriately instead of crashing mysqld.



 Comments   
Comment by Andrew Hutchings (Inactive) [ 2017-12-15 ]

Exception is now caught and that entry is skipped if the dbroot is missing/offline (since the file won't be accessible anyway).

For QA: you need a multi-PM setup and need to remove a PM. Then run a select on information_schema.columnstore_files.
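
As an illustration of the catch-and-skip behaviour described in this comment, here is a minimal Python sketch. It is not the plugin's actual C++ code: DbrootOfflineError, lookup_pm, list_rows_skipping_offline and demo_lookup are hypothetical names, with lookup_pm standing in for getDbrootPmConfig().

```python
from typing import Callable, Iterable, List, Tuple


class DbrootOfflineError(RuntimeError):
    """Hypothetical stand-in for the exception raised for an offline dbroot."""


def list_rows_skipping_offline(
    dbroots: Iterable[int],
    lookup_pm: Callable[[int], int],
) -> List[Tuple[int, int]]:
    """Return (dbroot, pm) pairs, skipping dbroots whose lookup fails."""
    rows: List[Tuple[int, int]] = []
    for dbroot in dbroots:
        try:
            pm = lookup_pm(dbroot)
        except DbrootOfflineError:
            # The files on this dbroot are not accessible anyway,
            # so skip the entry instead of letting the error propagate.
            continue
        rows.append((dbroot, pm))
    return rows


def demo_lookup(dbroot: int) -> int:
    """Hypothetical layout: dbroot 1 lives on PM 1, everything else is offline."""
    mapping = {1: 1}
    if dbroot not in mapping:
        raise DbrootOfflineError(f"dbroot {dbroot} is offline or missing")
    return mapping[dbroot]
```

Calling list_rows_skipping_offline([1, 2], demo_lookup) returns only the row for dbroot 1. The offline dbroot is left out of the result and no exception reaches the caller.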

Comment by Daniel Lee (Inactive) [ 2018-01-25 ]

Build tested: 1.1.3-1 Github source

[dlee@master centos7]$ cat mariadb-columnstore-1.1.3-1-centos7.x86_64.bin.tar.txt
/root/columnstore/mariadb-columnstore-server
commit e0ae0d2fecf9941887478d9aa669c8b2d1092090
Merge: 21ec501 2490ddf
Author: benthompson15 <ben.thompson@mariadb.com>
Date: Fri Jan 19 12:39:05 2018 -0600

Merge pull request #84 from mariadb-corporation/MCOL-1159

MCOL-1159 Merge mariadb-10.2.12

/root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine
commit c74d5de21d6571c0b0e9a12dacaf77856d332e63
Merge: 201813d 63adbd0
Author: benthompson15 <ben.thompson@mariadb.com>
Date: Mon Jan 22 09:42:34 2018 -0600

Merge pull request #375 from mariadb-corporation/dev-1.1-build-fix

Fix missing compiler flag from 1.0 -> 1.1 merge

mysqld is still crashing. The behavior is the same as in 1.1.2-1.

1. started a 1um2pm stack
2. ran query on columnstore_files table. ok
3. suspended pm2 vm
4. checked pid for mysqld
[root@localhost ~]# ps -ef |grep mysqld
root 15707 1 0 19:59 ? 00:00:00 /bin/sh /usr/local/mariadb/columnstore/mysql//bin/mysqld_safe --datadir=/usr/local/mariadb/columnstore/mysql/db --pid-file=/usr/local/mariadb/columnstore/mysql/db/localhost.localdomain.pid --ledir=/usr/local/mariadb/columnstore/mysql//bin
root 15718 15694 0 21:19 pts/0 00:00:00 grep --color=auto mysqld
mysql 15891 15707 0 19:59 ? 00:00:11 /usr/local/mariadb/columnstore/mysql//bin/mysqld --basedir=/usr/local/mariadb/columnstore/mysql/ --datadir=/usr/local/mariadb/columnstore/mysql/db --plugin-dir=/usr/local/mariadb/columnstore/mysql/lib/plugin --user=mysql --log-error=/usr/local/mariadb/columnstore/mysql/db/localhost.localdomain.err --pid-file=/usr/local/mariadb/columnstore/mysql/db/localhost.localdomain.pid --socket=/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock --port=3306
5. repeated step 2.
MariaDB [information_schema]> select * from columnstore_files;

ERROR 1053 (08S01): Server shutdown in progress
MariaDB [information_schema]>
MariaDB [information_schema]> select * from columnstore_files;
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock' (2)
ERROR: Can't connect to the server
unknown [information_schema]> select * from columnstore_files;
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock' (2)
ERROR: Can't connect to the server

6. checked pid again. pid changed
[root@localhost ~]# ps -ef |grep mysqld
root 16086 1 0 21:24 ? 00:00:00 /bin/sh /usr/local/mariadb/columnstore/mysql//bin/mysqld_safe --datadir=/usr/local/mariadb/columnstore/mysql/db --pid-file=/usr/local/mariadb/columnstore/mysql/db/localhost.localdomain.pid --ledir=/usr/local/mariadb/columnstore/mysql//bin
mysql 16272 16086 0 21:24 ? 00:00:00 /usr/local/mariadb/columnstore/mysql//bin/mysqld --basedir=/usr/local/mariadb/columnstore/mysql/ --datadir=/usr/local/mariadb/columnstore/mysql/db --plugin-dir=/usr/local/mariadb/columnstore/mysql/lib/plugin --user=mysql --log-error=/usr/local/mariadb/columnstore/mysql/db/localhost.localdomain.err --pid-file=/usr/local/mariadb/columnstore/mysql/db/localhost.localdomain.pid --socket=/usr/local/mariadb/columnstore/mysql/lib/mysql/mysql.sock --port=3306
root 16346 15694 0 21:25 pts/0 00:00:00 grep --color=auto mysqld
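
The PID comparison in steps 4 and 6 can be sketched as a small Python helper. This is only an illustration: it assumes `ps -ef` output in the eight-column layout shown above, and the names mysqld_pid and compare_mysqld are hypothetical. A changed PID means mysqld was restarted, and a missing PID means it is no longer running.

```python
import os
from typing import Optional


def mysqld_pid(ps_output: str) -> Optional[int]:
    """Return the PID of the mysqld process in `ps -ef` output, if any.

    Only lines whose executable is named exactly "mysqld" match, so the
    mysqld_safe wrapper and the grep process itself are ignored.
    """
    for line in ps_output.splitlines():
        # ps -ef columns: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        command = fields[7].split()
        if command and os.path.basename(command[0]) == "mysqld":
            return int(fields[1])
    return None


def compare_mysqld(before: str, after: str) -> str:
    """Classify what happened to mysqld between two `ps -ef` snapshots."""
    pid_before = mysqld_pid(before)
    pid_after = mysqld_pid(after)
    if pid_after is None:
        return "gone"
    if pid_before != pid_after:
        return "restarted"
    return "unchanged"
```

Applied to the two snapshots in this comment (PID 15891 before, 16272 after), compare_mysqld reports "restarted".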

Comment by Daniel Lee (Inactive) [ 2018-01-25 ]

Build tested: 1.0.13-1 Github source

/root/columnstore/mariadb-columnstore-server
commit e5b122c427e3adc9b73962b88cdd4754a5b11957
Merge: b435403 7e9e5f7
Author: Andrew Hutchings <andrew@linuxjedi.co.uk>
Date: Tue Jan 23 14:25:41 2018 +0000

Merge pull request #85 from mariadb-corporation/MCOL-1114

MCOL-1114: Change cmake minimum versions.

/root/columnstore/mariadb-columnstore-server/mariadb-columnstore-engine
commit 06979c8d1b06d1d8640dba1f53144f9f53de9a58
Merge: cc2bbe8 def46ca
Author: benthompson15 <ben.thompson@mariadb.com>
Date: Mon Jan 22 14:17:38 2018 -0600

Merge pull request #378 from mariadb-corporation/RELEASE

update version

Performed the same test. I also got:

MariaDB [information_schema]> select * from columnstore_files;
ERROR 1053 (08S01): Server shutdown in progress
MariaDB [information_schema]> quit
Bye

mysqld is gone and never recovered

[root@localhost ~]# ps -ef |grep mysqld
root 7070 6570 0 22:23 pts/0 00:00:00 grep --color=auto mysqld

Comment by Andrew Hutchings (Inactive) [ 2018-01-30 ]

That is testing for the wrong thing. ColumnStore restarts mysqld when you suspend one of the nodes that way, and the restart takes a few minutes. I can't figure out another good way to take a dbroot offline, though.

I haven't found a good way to reproduce this and can't remember where it originally came from. The patch does no harm, so I'll just close it and remove the fixversion. If this comes up in searches in the future, the patch went into 1.0.13 and 1.1.3.
