Details
Type: New Feature
Status: Closed
Priority: Major
Resolution: Fixed
Description
A common problem when analyzing one-off events is that there is usually not enough information available to determine the root cause accurately. From experience, log_info output greatly speeds up issue resolution. However, the current logging implementation has some severe limitations:
- The log subsystem is serialized, which makes it a scaling bottleneck. This can be improved: its locking is somewhat pessimistic, and solutions exist that would remove the need for serialization. The attached patch shows one approach to log rotation without locks.
- There is only one log file. Alerts, errors, warnings and notifications are mixed with trace logging, and everything is stored in the same file. This forces an all-or-nothing approach to logging, which usually means that trace logging with log_info is not an option in production setups because of the larger write volume and the performance overhead that comes with it.
- Log rotation relies on logrotate. This works well for normal log files, but not for trace files that need to be rotated very frequently. A built-in rotation/truncation mechanism would be better, as it would not require any additional logrotate configuration.
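To illustrate the second point, here is a minimal sketch of routing trace-level messages to a separate file so they no longer mix with warnings and errors. It uses Python's standard logging module purely as an illustration; the names BelowLevel and make_split_logger are hypothetical and are not taken from the product or the attached patch.

```python
import logging


class BelowLevel(logging.Filter):
    """Pass only records strictly below the given level."""

    def __init__(self, level):
        super().__init__()
        self.level = level

    def filter(self, record):
        return record.levelno < self.level


def make_split_logger(name, main_path, trace_path):
    """Return a logger that writes WARNING and above to main_path
    and everything below WARNING to trace_path (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False

    # Main log: warnings, errors and critical messages only.
    main = logging.FileHandler(main_path)
    main.setLevel(logging.WARNING)

    # Trace log: info and debug messages only.
    trace = logging.FileHandler(trace_path)
    trace.setLevel(logging.DEBUG)
    trace.addFilter(BelowLevel(logging.WARNING))

    for handler in (main, trace):
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

With a split like this, the trace file can be rotated or truncated aggressively without touching the main log.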
I've attached a patch to this issue that implements a fixed-size rotating log intended to be stored on a tmpfs mount. This should provide the benefit of log_info=true without the most common problems that usually come with it. The patch is based on 6.4, so it carries some fundamental logging inefficiencies, but it should be a viable alternative to log_info.
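The general idea of a fixed-size rotating log can be sketched as follows. This is not the attached patch and makes no claim about how it works; FixedSizeLog is a hypothetical class, and the one-generation rename scheme is an assumption chosen for brevity. Unlike the lock-free approach mentioned above, this sketch is single-threaded and does no synchronization.

```python
import os


class FixedSizeLog:
    """Log file that rotates before a write would push it past max_bytes.

    Hypothetical helper for illustration only. One previous generation is
    kept as "<path>.1", so disk use stays near 2 * max_bytes. A single
    record larger than max_bytes is still written whole.
    """

    def __init__(self, path, max_bytes):
        self.path = path
        self.max_bytes = max_bytes
        self._file = open(path, "ab")
        self._size = os.path.getsize(path)

    def write(self, message):
        data = message.encode("utf-8") + b"\n"
        # Rotate first if this record would not fit in the current file.
        if self._size > 0 and self._size + len(data) > self.max_bytes:
            self._rotate()
        self._file.write(data)
        self._file.flush()
        self._size += len(data)

    def _rotate(self):
        # Replace the previous generation with the current file and start over.
        self._file.close()
        os.replace(self.path, self.path + ".1")
        self._file = open(self.path, "ab")
        self._size = 0

    def close(self):
        self._file.close()
```

Pointing path at a tmpfs mount keeps the writes in memory, and the size cap bounds how much memory the trace log can consume.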