Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Cannot Reproduce
-
5.4.1
-
None
Description
I am encountering a problem where ColumnStore is trying to create a directory at the wrong path. It is trying to create a directory in the system's root directory, rather than in one of the dbroot directories.
This problem was seen on a 3-node ColumnStore 5.4.1 cluster. The cluster uses GlusterFS, but it does not use S3 storage. The cluster runs on CentOS 8.
The problem happens when I try to create a table on the primary node (mcs1). On mcs1, I see the following error:
MariaDB [(none)]> CREATE TABLE inventory.products (
    -> product_name varchar(11) NOT NULL DEFAULT '',
    -> supplier varchar(128) NOT NULL DEFAULT '',
    -> quantity varchar(128) NOT NULL DEFAULT '',
    -> unit_cost varchar(128) NOT NULL DEFAULT ''
    -> ) ENGINE=Columnstore DEFAULT CHARSET=utf8;
ERROR 1815 (HY000): Internal error: CAL0009: (6)Create table failed due to WE: Error creating column file for oid 3010; Error in creating a directory.
The syslog on one of the replica nodes (mcs2) has a few more details:
Nov 3 01:25:37 mcs2 IDBFile[29632]: 37.564020 |0|0|0| E 35 CAL0002: Failed to create directories: "/000.dir", exception: boost::filesystem::create_directory: Permission denied: "/000.dir"
This error message seems to indicate that ColumnStore is trying to create a directory at the path /000.dir, which is in the system's root directory, rather than in one of ColumnStore's dbroot directories.
I wanted to find out more details about this error, so I decided to attach strace to the WriteEngine process on mcs2.
First, I got the WriteEngine's PID on mcs2:
"mcs2": {
  "timestamp": "2020-11-03 01:17:44.584777",
  "uptime": 5857,
  "dbrm_mode": "slave",
  "cluster_mode": "readonly",
  "dbroots": [
    "2"
  ],
  "module_id": 1,
  "services": [
    {
      "name": "workernode",
      "pid": 29576
    },
    {
      "name": "PrimProc",
      "pid": 29587
    },
    {
      "name": "ExeMgr",
      "pid": 29622
    },
    {
      "name": "WriteEngine",
      "pid": 29632
    }
  ]
},
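For repeated debugging runs, the PID lookup can be scripted. The following is a minimal sketch, assuming the status output is available as a JSON object keyed by node name with the same "services" layout as the fragment above. The helper name service_pid is hypothetical and is not part of any ColumnStore tool.

```python
import json


def service_pid(status, node, service):
    """Return the PID of `service` on `node`, or None if it is not listed."""
    for entry in status.get(node, {}).get("services", []):
        if entry.get("name") == service:
            return entry.get("pid")
    return None


# Trimmed copy of the status fragment from this report.
status = json.loads("""
{
  "mcs2": {
    "dbroots": ["2"],
    "services": [
      {"name": "PrimProc", "pid": 29587},
      {"name": "WriteEngine", "pid": 29632}
    ]
  }
}
""")

print(service_pid(status, "mcs2", "WriteEngine"))
```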
And then I attached strace to the process:
$ mkdir writeengine_strace
$ sudo strace -s 256 -o ./writeengine_strace/strace_out -p 29632 -ff &
After reproducing the problem again, I looked at the strace output:
stat("/etc/columnstore/Columnstore.xml", {st_mode=S_IFREG|0644, st_size=19929, ...}) = 0
stat("/000.dir/000.dir/011.dir/194.dir/000.dir/FILE000.cdf", 0x7f466a1967d0) = -1 ENOENT (No such file or directory)
stat("/etc/columnstore/Columnstore.xml", {st_mode=S_IFREG|0644, st_size=19929, ...}) = 0
stat("/000.dir", 0x7f466a1967b0) = -1 ENOENT (No such file or directory)
stat("/000.dir", 0x7f466a1963d0) = -1 ENOENT (No such file or directory)
stat("/", {st_mode=S_IFDIR|0555, st_size=237, ...}) = 0
mkdir("/000.dir", 0777) = -1 EACCES (Permission denied)
stat("/000.dir", 0x7f466a196350) = -1 ENOENT (No such file or directory)
getpid() = 29632
socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 18
connect(18, {sa_family=AF_UNIX, sun_path="/dev/log"}, 110) = 0
sendto(18, "<139>Nov 3 01:25:37 IDBFile[29632]: 37.564020 |0|0|0| E 35 CAL0002: Failed to create directories: \"/000.dir\", exception: boost::filesystem::create_directory: Permission denied: \"/000.dir\"\n ", 190, MSG_NOSIGNAL, NULL, 0) = 190
close(18) = 0
write(10, "7\301\373\24[\0\0\0\4\0\0\0\0\0\0\0%N\0\0\0WE: Error creating column file for oid 3010; Error in creating a directory. \n", 99) = 99
The strace output confirms that ColumnStore is trying to create a directory at the path /000.dir, which is in the system's root directory, rather than in one of ColumnStore's dbroot directories.
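The relative part of the path in the first stat call looks like the usual column file layout, where the 32-bit OID is split into four bytes, each rendered as a three-digit directory, followed by a partition directory and a segment file. The sketch below is a reconstruction of that mapping for illustration only: the function name and the zero partition and segment defaults are assumptions, and this is not ColumnStore's own code. It shows that OID 3010 yields exactly the relative path seen above, and what the full path would look like with a proper dbroot prefix versus an empty one. The empty prefix is shown only because it reproduces the observed path; it is not a confirmed cause.

```python
def oid_to_relative_path(oid, partition=0, segment=0):
    """Map a column OID to a relative file path (assumed layout, see lead-in)."""
    # Split the OID into its four bytes, most significant first.
    parts = [(oid >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    dirs = ["%03d.dir" % b for b in parts]
    dirs.append("%03d.dir" % partition)
    return "/".join(dirs) + "/FILE%03d.cdf" % segment


relative = oid_to_relative_path(3010)
print(relative)

# Compare a configured dbroot prefix with an empty one.
for dbroot in ("/var/lib/columnstore/data2", ""):
    print(repr(dbroot), "->", dbroot + "/" + relative)
```

With the empty prefix, the result is rooted at "/", so the first directory to be created would be /000.dir, which matches the mkdir call in the strace output.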
I also confirmed that the WriteEngine process does not do a chroot by looking at /proc/PID/root:
$ sudo ls -l /proc/29632/root
lrwxrwxrwx. 1 mysql mysql 0 Nov 3 01:41 /proc/29632/root -> /
I also checked Columnstore.xml to confirm that all 3 dbroots are listed properly:
...
<DBRootCount>3</DBRootCount>
<DBRoot1>/var/lib/columnstore/data1</DBRoot1>
...
<DBRoot2>/var/lib/columnstore/data2</DBRoot2>
<DBRoot3>/var/lib/columnstore/data3</DBRoot3>
...
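The same check can be automated so that it can be repeated on every node. This is a minimal sketch, assuming only that the DBRootCount and DBRootN elements appear somewhere in the XML tree. The wrapper elements in the sample and the function names are hypothetical and are not taken from the real file.

```python
import re
import xml.etree.ElementTree as ET


def dbroots_from_config(xml_text):
    """Return (declared_count, {n: path}) found in Columnstore.xml text."""
    root = ET.fromstring(xml_text)
    count = None
    paths = {}
    # Walk every element so the result does not depend on the parent tag.
    for elem in root.iter():
        text = (elem.text or "").strip()
        if elem.tag == "DBRootCount":
            count = int(text)
        else:
            match = re.fullmatch(r"DBRoot(\d+)", elem.tag)
            if match:
                paths[int(match.group(1))] = text
    return count, paths


def missing_dbroots(count, paths):
    """List dbroot numbers up to `count` that have no non-empty path."""
    return [n for n in range(1, (count or 0) + 1) if not paths.get(n)]


# Sample built from the excerpt above; the wrapper tags are assumptions.
sample = """<Columnstore><SystemConfig>
<DBRootCount>3</DBRootCount>
<DBRoot1>/var/lib/columnstore/data1</DBRoot1>
<DBRoot2>/var/lib/columnstore/data2</DBRoot2>
<DBRoot3>/var/lib/columnstore/data3</DBRoot3>
</SystemConfig></Columnstore>"""

count, paths = dbroots_from_config(sample)
print(count, paths, missing_dbroots(count, paths))
```

On a real node, the text would be read from /etc/columnstore/Columnstore.xml, the path that appears in the strace output. An empty list from missing_dbroots means every declared dbroot has a path in that copy of the file.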
Of course, the mysql user can create directories in the dbroot directories without any permissions issues:
$ sudo -u mysql bash
$ whoami
mysql
$ mkdir -p /var/lib/columnstore/data1/000.dir
$ ls -ld /var/lib/columnstore/data1/000.dir
drwx------. 3 mysql mysql 21 Nov 2 23:51 /var/lib/columnstore/data1/000.dir
$ mkdir -p /var/lib/columnstore/data2/000.dir
$ ls -ld /var/lib/columnstore/data2/000.dir
drwxr-xr-x. 2 mysql mysql 6 Nov 3 01:50 /var/lib/columnstore/data2/000.dir
$ mkdir -p /var/lib/columnstore/data3/000.dir
$ ls -ld /var/lib/columnstore/data3/000.dir
drwxr-xr-x. 2 mysql mysql 6 Nov 3 01:50 /var/lib/columnstore/data3/000.dir
Why is ColumnStore trying to create a directory in the system's root directory, rather than in one of the dbroot directories?