tags : Operating Systems, Systems, Linux, Observability
Logs give us context into the long tail that averages and percentiles don’t surface.
Writers/Sources
systemd journal
- systemd needs to be there
- systemd services: when they log to stderr/stdout, journald picks them up and writes them in binary format into its journal.
- Usually stored in /var/log/journal in binary; use journalctl to view
- journald can forward all its messages to syslog if needed
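Since the journal is binary, reading it from a script usually means going through journalctl. A rough sketch, assuming journalctl's JSON output mode (one object per line, string-valued PRIORITY); the function names are mine, and non-UTF-8 messages (which journalctl encodes differently) are not handled:

```python
import json
import subprocess


def parse_journal_json(lines):
    """Turn `journalctl -o json` output (one JSON object per line) into dicts."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        raw = json.loads(line)
        entries.append({
            "unit": raw.get("_SYSTEMD_UNIT"),
            # journald exports PRIORITY as a string such as "3"
            "priority": int(raw.get("PRIORITY", 6)),
            "message": raw.get("MESSAGE"),
        })
    return entries


def read_unit_logs(unit, limit=20):
    """Fetch the last `limit` entries for a unit. Needs a host running journald."""
    out = subprocess.run(
        ["journalctl", "-u", unit, "-n", str(limit), "-o", "json", "--no-pager"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_journal_json(out.splitlines())
```

Something like read_unit_logs("nginx.service") would then give a list of small dicts to filter on priority.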
syslog
- Usually the case when there’s no systemd
- Services can write to /dev/log in the syslog format
- A syslog daemon (e.g. rsyslog, syslog-ng) picks it up and writes it to destinations (files or others)
- By default writes files to /var/log
- Because it’s file based, we’ll probably need to use something like logrotate
- History
  - The syslog project came first. It started in 1980. Birthed the syslog protocol. UDP only.
  - syslog-ng came in 1998. TCP support, encryption etc.
  - rsyslog came in 2004. It started as a fork of sysklogd and competes with syslog-ng on features.
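For the "services can write to /dev/log" part, Python's stdlib already has a handler that speaks the syslog format. A minimal sketch; build_syslog_logger and the "myapp" name are made up for illustration, and it assumes a syslog daemon (or journald's compatibility socket) is listening on /dev/log:

```python
import logging
import logging.handlers


def build_syslog_logger(name="myapp", address="/dev/log"):
    """Return a logger that sends records to a syslog socket.

    `address` is a unix socket path, or a (host, port) tuple for UDP.
    """
    handler = logging.handlers.SysLogHandler(address=address)
    # The handler prepends the <PRI> value itself; we only format the body.
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))

    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

With the default facility (user) an INFO record goes out with priority 14, i.e. facility 1 * 8 + severity 6.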
kernel
- Writes its own logs to a ring buffer.
- journald or syslog daemons can read logs from this buffer then write to files or journal
- dmesg can be used to view these
- Audit Logs: Special case of kernel messages designed for auditing actions such as file access. Probably can be replaced by BPF.
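To make the ring buffer idea concrete: it has a fixed capacity, and once full, new messages overwrite the oldest ones. This is only a toy model of the concept, not how the kernel implements it; RingLog and its sequence counter are illustrative:

```python
from collections import deque


class RingLog:
    """Fixed-size log buffer: when full, the oldest entry is dropped."""

    def __init__(self, capacity):
        self._buf = deque(maxlen=capacity)
        self._seq = 0

    def write(self, message):
        # Tag each message with a sequence number so readers can spot gaps.
        self._buf.append((self._seq, message))
        self._seq += 1

    def read(self):
        return list(self._buf)
```

A reader that comes by too late sees a gap in the sequence numbers, which is why a daemon draining the buffer into files or the journal matters.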
Applications
Logging to stdout / stderr
On systems with systemd…
- If your program writes to stderr/stdout and is run by systemd, its output ends up in the systemd journal. That’s just the way it works.
- More and more, I think the only time you want your program dealing with log files directly is if you need several distinct logs. But even then, adding labels or tags to your log stream seems like the better option.
- Even in non-containerized apps, writing to stdout/stderr seems to be the way to go these days. If it’s something designed to be daemonized, whatever executes the process (systemd for example) can be told where to send the logs, defaulting to a common location like /var/log/syslog. If it’s running inside a Docker container or k8s, those have their own ways to ship logs.
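A small sketch of "just write to stderr and let the supervisor deal with it". The extra bit here is my addition, not something from the notes above: as far as I know journald reads a leading `<N>` on each line as the syslog severity (the sd-daemon convention), so the formatter adds that prefix. build_logger and the level mapping are illustrative:

```python
import logging
import sys

# Python level -> syslog severity (0 = emerg ... 7 = debug)
_SEVERITY = {
    logging.CRITICAL: 2,
    logging.ERROR: 3,
    logging.WARNING: 4,
    logging.INFO: 6,
    logging.DEBUG: 7,
}


class PriorityPrefixFormatter(logging.Formatter):
    """Prefix each record with <N> so the supervisor can recover the level."""

    def format(self, record):
        severity = _SEVERITY.get(record.levelno, 6)
        return f"<{severity}>{super().format(record)}"


def build_logger(name="myapp", stream=None):
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(PriorityPrefixFormatter("%(name)s: %(message)s"))

    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

No file paths, no rotation logic in the app; where the stream ends up is the executor's decision.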
Language specifics
- For Python, the standard logging module goes a long way; additionally I like structlog
- For Golang, there’s Zap, Logrus and a few more. But since Go 1.21 we have a standard one, called slog.
  - It sort of follows the port-adapter pattern using Handlers (See Design Patterns)
  - slog ships with JSON and logfmt-style text handlers. But 3rd party handlers are also supported: Zap, Logrus etc.
How?
- It needs to be async; don’t block the main thread for logging
- Prefer structured logging if not intended for tty
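Both points can be sketched with the stdlib alone: a QueueHandler makes the logging call a cheap enqueue, and a QueueListener thread does the formatting and I/O. The JSON layout and the extra={"fields": ...} convention are my own assumptions, not a standard:

```python
import json
import logging
import logging.handlers
import queue
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        # Hypothetical convention: structured fields arrive via extra={"fields": {...}}
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_async_logger(name="myapp", stream=None):
    log_queue = queue.Queue(-1)  # unbounded; callers only pay for an enqueue

    target = logging.StreamHandler(stream if stream is not None else sys.stderr)
    target.setFormatter(JsonFormatter())

    # The listener thread drains the queue and does the slow formatting + write.
    listener = logging.handlers.QueueListener(log_queue, target)
    listener.start()

    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger, listener
```

Call listener.stop() on shutdown so queued records get flushed. An unbounded queue trades memory for never blocking; a bounded one would need a drop policy.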
Sampling
- Not sampling means more storage
- Sampling means it’ll cost CPU cycles to sample and trim things down.
- We can adjust the sample rate based on traffic etc.
- Types
  - Head / Ingestion: The first service in the request chain can make the decision about whether to trace, which can reduce tracing overhead on services further down.
  - Tail / Index: The tracing backend can make a determination about whether to persist the trace after the trace has been collected. This has tracing overheads, but allows you to make decisions like “always keep errors”.
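The two types can be sketched roughly like this. Assumptions on my side: every event carries a trace_id and a level field, and hashing the trace ID is just one way to get a decision that every service agrees on; real tracers have their own schemes:

```python
import hashlib


def head_sample(trace_id, rate):
    """Head sampling: decide up front, deterministically, from the trace ID.

    `rate` is the fraction of traces to keep, between 0.0 and 1.0.
    """
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big")  # uniform in [0, 2**64)
    return bucket < rate * 2**64


def tail_sample(events, rate):
    """Tail sampling: decide after the whole trace has been collected."""
    if not events:
        return False
    # "Always keep errors", regardless of the sample rate.
    if any(e.get("level") == "error" for e in events):
        return True
    return head_sample(events[0]["trace_id"], rate)
```

Head sampling is cheap because dropped requests never produce data; tail sampling pays for collection first but gets to look at the outcome. The rate argument is where traffic-based adjustment would plug in.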
Redaction
- Remove sensitive info
- This can be done inside the application, but that doesn’t guarantee it’ll never leak: dependencies may log on their own, someone may forget, etc.
- The other way is to redact outside the application with regex or using custom tooling like Ragel etc.
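A regex-based sketch of redaction. The patterns are deliberately simplistic and hypothetical (a real deployment needs its own list), and the same redact function could run inside the app as a logging filter or outside it as a stage over log lines:

```python
import logging
import re

# Illustrative patterns only; they will miss plenty of real-world secrets.
_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
    (re.compile(r"(?i)(password|token|secret)=\S+"), r"\1=[REDACTED]"),
]


def redact(text):
    """Apply every pattern in order and return the scrubbed text."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


class RedactingFilter(logging.Filter):
    """In-application variant: scrub the final message before it is emitted."""

    def filter(self, record):
        # Render args first so values interpolated into the message are covered.
        record.msg = redact(record.getMessage())
        record.args = None
        return True
```

Attaching the filter to a handler (handler.addFilter(RedactingFilter())) covers everything that handler emits, including records from dependencies that propagate to it.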
Libraries and Logging
- There’s some tension around whether libraries should log
- Avoid taking a logger in any libraries you write if you can. Structure them so that the calling code logs, either by only returning errors or providing callbacks.
- If possible, disable logging in the library. If the library will not let you disable logging, consider looking for a new library. If you can’t do that, write a no-op logger adapter.
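Roughly what this looks like in practice. fetch_with_retry is a made-up library function that never logs: it reports via a callback and an exception, and the caller decides what gets logged. silence_library is a no-op adapter that only works for libraries logging through the stdlib logging module:

```python
import logging


class FetchError(Exception):
    pass


def fetch_with_retry(fetch, attempts=3, on_retry=None):
    """Hypothetical library function: no logger taken, no logging done."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception as exc:
            last_error = exc
            # Let the caller decide whether a retry is worth logging.
            if on_retry is not None and attempt < attempts:
                on_retry(attempt, exc)
    raise FetchError(f"gave up after {attempts} attempts") from last_error


def silence_library(logger_name):
    """No-op adapter: swallow everything a stdlib-logging library emits."""
    lib_logger = logging.getLogger(logger_name)
    lib_logger.handlers.clear()
    lib_logger.addHandler(logging.NullHandler())
    lib_logger.propagate = False
```

The calling code can pass something like on_retry=lambda n, exc: log.warning("retry %d: %s", n, exc), or nothing at all.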
Centralized logging
Log collector
- journald
- syslog-ng / rsyslogd (RELP if delivery guarantees are required)
Log routers/ log processors
People call these by different names.
- Examples: systemd-journal-upload, Fluentd, Logstash, Filebeat, Promtail
Central logging/ storage/query
- ELK stack, BigQuery, Graylog, Splunk, Datadog, Loki etc.
- Sending logs from syslog-ng to Grafana Loki
Practices
Canonical Log Line
- Resist emitting random logs throughout the request as this makes analysis difficult.
- A canonical log line is a log emitted at the end of the request with everything that happened during the request. This makes querying the logs bliss.
- See Fast and flexible observability with canonical log lines
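A minimal sketch of the idea: collect fields while the request is handled and emit exactly one line at the end. CanonicalLine, the field names and the JSON layout are my own choices for illustration:

```python
import json
import sys
import time


class CanonicalLine:
    """Accumulate fields during a request; write one JSON line on exit."""

    def __init__(self, stream=None, **fields):
        self._stream = stream if stream is not None else sys.stdout
        self._fields = dict(fields)
        self._start = time.monotonic()

    def set(self, **fields):
        self._fields.update(fields)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            self._fields["error"] = exc_type.__name__
        elapsed_ms = (time.monotonic() - self._start) * 1000
        self._fields["duration_ms"] = round(elapsed_ms, 2)
        self._stream.write(json.dumps(self._fields) + "\n")
        return False  # never swallow the exception
```

Used as `with CanonicalLine(method="GET", path="/users") as line:`, with handlers calling line.set(user_id=..., status=...) along the way. Every request then has one row carrying everything you might want to filter or group by.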