[MCOL-1346] Debian9 fails to startup in buildbot Created: 2018-04-17  Updated: 2023-10-26  Resolved: 2019-07-10

Status: Closed
Project: MariaDB ColumnStore
Component/s: ?, PrimProc
Affects Version/s: 1.1.3
Fix Version/s: Icebox

Type: Bug Priority: Major
Reporter: Ben Thompson (Inactive) Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None

Issue Links:
Relates
relates to MCOL-1330 Make ColumnStore work under valgrind Closed

 Description   

Debian 9 fails to start from postConfigure:

MariaDB ColumnStore Database Platform Starting, please wait ........................................ FAILED
IMPORTANT: There was a system startup failed, once issue has been resolved, rerun postConfigure

From columnstore logging:
Apr 17 15:25:43 ip-10-0-0-136 PrimProc[49692]: 43.406432 |0|0|0| C 28 CAL0000: Error setting file limits, please see non-root install documentation
Apr 17 15:25:44 ip-10-0-0-136 ProcessMonitor[49188]: 44.122328 |0|0|0| C 18 CAL0000: *****Calpont Process Restarting: PrimProc, old PID = 49692
Apr 17 15:25:48 ip-10-0-0-136 PrimProc[49726]: 48.412316 |0|0|0| C 28 CAL0000: Error setting file limits, please see non-root install documentation
Apr 17 15:25:49 ip-10-0-0-136 ProcessManager[49252]: 49.413319 |0|0|0| C 17 CAL0000: startMgrProcessThread Exit with a failure, error returned from startSystemThread
Apr 17 15:31:23 ip-10-0-0-136 controllernode[50903]: 23.030175 |0|0|0| C 29 CAL0000: ExtentMap::save(): got request to save an empty BRM
Apr 17 15:31:23 ip-10-0-0-136 ProcessManager[49252]: 23.033213 |0|0|0| E 17 CAL0000: line: 6451 Error running DBRM save_brm
Apr 17 15:31:34 ip-10-0-0-136 PrimProc[51317]: 34.631635 |0|0|0| C 28 CAL0000: Error setting file limits, please see non-root install documentation
Apr 17 15:31:39 ip-10-0-0-136 ProcessMonitor[49188]: 39.513999 |0|0|0| C 18 CAL0000: *****Calpont Process Restarting: PrimProc, old PID = 51317
Apr 17 15:31:41 ip-10-0-0-136 PrimProc[51379]: 41.565442 |0|0|0| C 28 CAL0000: Error setting file limits, please see non-root install documentation
Apr 17 15:31:46 ip-10-0-0-136 ProcessMonitor[49188]: 46.570833 |0|0|0| C 18 CAL0000: *****Calpont Process Restarting: PrimProc, old PID = 51379
Apr 17 15:31:48 ip-10-0-0-136 PrimProc[51442]: 48.621322 |0|0|0| C 28 CAL0000: Error setting file limits, please see non-root install documentation

The limits.conf file (/etc/security/limits.conf) DOES contain the following entries:
buildbot hard nofile 65536
buildbot soft nofile 65536
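
One way to confirm whether these entries actually reach a process is to print the open-file limits from inside a process launched the same way PrimProc is launched (for example, from the buildbot worker rather than from a login shell). The following is a minimal sketch in Python, not the check PrimProc itself performs; the helper name nofile_limits_ok is hypothetical, and the 65536 threshold is simply the value from the entries above.

```python
import resource

EXPECTED_NOFILE = 65536  # value taken from the limits.conf entries above


def nofile_limits_ok(soft, hard, expected=EXPECTED_NOFILE):
    """Return True when both the soft and hard limits are at least `expected`.

    resource.RLIM_INFINITY means "unlimited" and always passes.
    """
    def at_least(value):
        return value == resource.RLIM_INFINITY or value >= expected

    return at_least(soft) and at_least(hard)


if __name__ == "__main__":
    # Limits inherited by this process, i.e. what a child such as PrimProc
    # would start with when launched from the same session.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"nofile soft={soft} hard={hard} ok={nofile_limits_ok(soft, hard)}")
```

If the script reports the expected limits in a login shell but lower ones when started by the buildbot worker, the limits.conf entries are not being applied to that session.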



 Comments   
Comment by Andrew Hutchings (Inactive) [ 2018-04-18 ]

In Debian, limits.conf is ignored by non-interactive shells: https://wiki.debian.org/Limits

I'm not sure of a good fix for this short of either hacking Debian's PAM config or implementing MCOL-1330 and using whatever switch it provides.

Comment by David Hill (Inactive) [ 2018-04-25 ]

PrimProc does run on buildbot when started directly, so the failure is related to what Andrew noted in the previous comment:

sudo -u buildbot ./PrimProc
Locale is : C
Starting PrimitiveServer: st = 1, sq = 10, pw = 128, pq = 10240, nb = 385865, nt = 32, nc = 1, ra = 512, db = 128, mb = 512, rd = 0, tr = 0, ss = 67108864, bp = 32
started 19 high, 10 med, 3 low.
started 5 high, 0 med, 0 low.

Comment by Ben Thompson (Inactive) [ 2018-04-25 ]

The configuration issue is resolved on my end.

For this specific test, the following needs to be done on the worker AMI:
Edit /etc/pam.d/common-session-noninteractive
and add the line:
session required pam_limits.so
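
When preparing a worker image, it can be useful to verify that this line is present before running the test. The following is a rough sketch in Python that scans PAM configuration text for an active session line loading pam_limits.so. The function name pam_limits_enabled is hypothetical, and the sketch does not follow @include directives, so it only inspects the text it is given.

```python
def pam_limits_enabled(pam_config_text):
    """Return True if an uncommented `session` line loads pam_limits.so."""
    for raw_line in pam_config_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        # PAM line layout: <type> <control> <module-path> [module-arguments]
        # PAM allows a leading "-" on the type, so strip it before comparing.
        if fields[0].lstrip("-") != "session":
            continue
        if any(field.endswith("pam_limits.so") for field in fields[1:]):
            return True
    return False


# Illustrative input only; on a worker, pass in the contents of
# /etc/pam.d/common-session-noninteractive instead.
sample = """\
# example PAM fragment
session required pam_limits.so
"""
print(pam_limits_enabled(sample))
```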
