Centralized Docker Logging with Rsyslog

Doug Haber

Sr. Software Engineer

As we grew our Docker infrastructure at Simulmedia, we started finding Docker’s default logging to be inadequate.  We wanted to find a way to harness the features and flexibility of traditional logging systems, and so we began researching alternate paths that could help meet our needs.

Requirements

The primary requirements of  a logging system come from the developers.  The most common use cases involve looking at logs to determine what happened or what is happening on the system.  We want our developers to have easy access to current and historical logs.  They should be able to follow logs, including logs from multiple containers at the same time.  Ideally, logs should be available in standard formats, so that they are easy to search, filter, and follow.  Many developers prefer using standard Unix tools and scripting to access log files, and so a solution that allows that is preferred.

Docker’s default JSON-file style of logging works well for a single container, but there are no great options for watching multiple containers.  If you have 4 instances of a service running under a load balancer, following all their logs requires 4 commands, which may each need to be run on different hosts.

Ideally, logs should be archived for at least a month, if not longer.  This makes it possible to analyze the historical usage of a system and provides a way to get answers to questions about past usage and behavior.  

Another issue with Docker’s logging is that viewing a container’s logs requires access to the Docker socket, which is effectively root access.  In our ideal world, developers should only access production servers via deployment systems. Docker’s default logging makes this type of security standard difficult.

Finding a Solution

In the Docker landscape, each problem seems to have a dizzying number of possible solutions.  Logging is no exception.  A variety of commercial services exist to help these needs be met, as well as open source options, such as FluentD or the ELK stack (which combines Elasticsearch, Logstash, and Kibana.)  

In traditional system administration, powerful and extremely flexible logging systems have existed for decades. Newer tools like the ELK stack provide different feature sets which add some very nice capabilities for log querying and analysis.

We decided at the core that best way of doing logging with Docker would be to use the same proven logging systems that are used elsewhere in our infrastrastructure.  For us, this was rsyslog, a popular implementation of the syslog protocol with support for many extensions, basic scripting, encryption, and a number of reliability features.

While we would use rsyslog to collect and archive the logs, this didn’t mean we were leaving behind the newer solutions and their features.  From rsyslog it is possible to forward the collected logs into other systems like Fluentd or an ELK stack.  Sending logs from rsyslog to an extra endpoint only requires small configuration file changes, either on the central syslog server, or on the clients.  This lets us have a proven and familiar system at the core, while still benefiting from newer log analysis tools.

The Planned Implementation

Docker has various logging drivers that provide mechanisms for forwarding logs.  The recommended way of generating logs is by writing lines of text to STDOUT or STDERR.  Each line is a new logging message.  When using the syslog driver, these messages are converted to the syslog protocol and forwarded to a server. 

For the first step, we opted to have each host that would run Docker containers also running rsyslog locally. Docker containers send their logs to the local rsyslog daemon, which is configured to write them to the host’s local file system.  Each container gets its own logs by name.  We also provide a combined log file that has all containers on a host, so that they can be viewed in-line with one another.

After writing out the logs to local files, the rsyslog server then forwards them to the central logging server.  We forward all logs for the host, rather than just the Docker logs.  The central server then has archived copies of the host’s system logs, which are very useful when researching and diagnosing issues.  On the central server, logs are stored by host, date, and container name.

DockerImage.png

Logging Docker Locally

When starting up, rsyslog reads /etc/rsyslog.conf, which on many systems, such as Debian, includes the directive $IncludeConfig /etc/rsyslog.d/*.conf.  The imported files are processed in order, based on the filename, and so it is a common practice to include a number at the beginning of the filename, as an easy way of defining the order.  

To start, we will create a config file that will handle local Docker logging.

/etc/rsyslog.d/30-docker.conf

$FileCreateMode 0644
template(name="DockerLogFileName" type="list") {
   constant(value="/var/log/docker/")
   property(name="syslogtag" securepath="replace" \
            regex.expression="docker/\\(.*\\)\\[" regex.submatch="1")
   constant(value="/docker.log")
}
if $programname == 'docker' then \
  /var/log/docker/combined.log
if $programname == 'docker' then \
  if $syslogtag contains 'docker/' then \
    ?DockerLogFileName
  else
    /var/log/docker/no_tag/docker.log
$FileCreateMode 0600

When rsyslog creates new log files it uses mode 0600 as the default permissions, meaning that files are readable and writable by root, and nobody else. With this example configuration, we make the log files world readable. In some environments this may be acceptable, but if further controls are needed, group permissions can be set, and logic could be written to set different permission for different services.

To set the permissions for our logs we start with $FileCreateMode 0644, which makes the file readable and writeable by root as well as world and group readable. At the end of the configuration, we set $fileCreateMode 0600, which restores the default for configurations read in after this file.

Templates are used by rsyslog to define formats for filenames, log entries, or writing to other systems. In this case, the DockerLogFileName template is used to define the name of the local Docker log files. Logs are created on the host at /var/log/docker/NAME/docker.log. The NAME is taken from the syslog tag, which internally is docker/TAG. A regex is used to extract only the TAG portion.

When writing out the logs, we chose to write to a file named docker.log within a directory named after the syslog tag. Currently, the docker.log will be the only file in the directory. When writing logs from Docker you can use STDOUT and STDERR. Right now, with the syslog driver, both streams become indistinguishable from one another. Our hope in making this a directory is that at some point in the future it will be possible to tell these apart. When that happens, we can add a docker-error.log into the directory, so that errors can be viewed separately from regular log entries.

After defining the template it is time to write the actual logs. We start by writing logs from any container on the host to /var/log/docker/combined.log, where they can all be viewed in-line. After that we write the container logs. If a name is provided via a syslog tag, we use the template to create the proper file. In cases where the tag isn’t defined, we use no_tag in place of a name, so that the logs aren’t lost.

There may be cases where certain types of events are logged regularly, but serve no value. In these situations, you can filter out log entries to prevent them from creating noise in the logs. Filters can be performed using parameters or regex comparisons. For example, code like this can be placed before lines that write or send logs, to filter out matching entries.

if ($programname startswith "docker"
    and $syslogtag isequal ‘docker/myapp’
    and $msg startswith 'GET /health/ ')  \
then stop

Sending Logs to the Central Server

Next, we will configure the writing of logs to the central server. With this configuration, all system and Docker logs will be sent to the server.

35-remote-logging.conf

template(name="RemoteForwardFormat" type="list") {
    constant(value="<")
    property(name="pri")
    constant(value=">")
    property(name="timestamp" dateFormat="rfc3339")
    constant(value=" ")
    property(name="hostname")
    constant(value=" ")
    property(name="syslogtag")
    property(name="msg" spifno1stsp="on")
    property(name="msg")
}
action(
   type="omfwd"
   protocol="tcp"
   target="syslog.example.com"
   port="514"
   template="RemoteForwardFormat"
   queue.SpoolDirectory="/var/spool/rsyslog"
   queue.FileName="remote"
   queue.MaxDiskSpace="1g"
   queue.SaveOnShutdown="on"
   queue.Type="LinkedList"
   ResendLastMSGOnReconnect="on"
   )


To start, we define a template RemoteForwardFormat. Rsyslog does have a built-in format for syslog forwarding, but it truncates the syslog tag at 32 characters. Our RemoteForwardFormat is identical, except for the removal of that limitation.

An action is used to send the logs to the server. For reliability, a queue is configured. This is helpful if the central server goes down, since messages can remain queued locally until the server comes back. Rsyslog has a lot of options for sending messages. In this example, we are using TCP for reliability and for its improved capability to send large messages. The target syslog.example.com should be updated with the host that will be acting as a centralized syslog server.

There are many more options worth exploring here to further increase reliability. While it is beyond the scope of this article, rsyslog also supports fully encrypted messaging and trusted host verification via the GTLS stream driver, and it could be very worthwhile to enable support for that.

Configuring the Server

With the above configurations, logs will be written locally on the hosts running Docker and sent to the central syslog server. The central syslog server now needs to be configured to receive these logs.

This config file is a bit longer, so we will introduce it one section at a time. The file is named /etc/rsyslog.d/35-incoming-logs.conf.

$ModLoad imtcp
input(type="imtcp" port="514" ruleset="remote")
template(name="PerHostDockerLogFileName" type="list") {
   constant(value="/var/log/remote/")
   property(name="hostname" securepath="replace")
   constant(value="/")
   property(name="$year")
   constant(value="/")
   property(name="$month")
   constant(value="/")
   property(name="$day")
   constant(value="/docker/")
   property(name="syslogtag" securepath="replace" \
   regex.expression="docker/\\(.*\\)\\[" regex.submatch="1")
   constant(value="/docker.log")
}
template(name="PerHostDockerCombinedLogFileName" type="list") {
   constant(value="/var/log/remote/")
   property(name="hostname" securepath="replace")
   constant(value="/")
   property(name="$year")
   constant(value="/")
   property(name="$month")
   constant(value="/")
   property(name="$day")
   constant(value="/docker/combined.log")
}


To start, we load the imtcp module, and define a new input that listens on port 514. This input specifies that it will use a ruleset named remote. This means that when logs come through the input, rather than treating them the same as the local logs, an alternate set of rules will be used.

We create two templates here. The PerHostDockerLogFileName creates files like /var/log/remote/HOST/YYYY/MM/DD/docker/NAME/docker.log. These are used for the logs from individual containers. The PerHostDockerCombinedLogFIleName creates a file that includes logs from all Docker containers on a host, in-lined with each other. This goes into /var/log/remote/HOST/YYYY/MM/DD/docker/combined.log.

At the current time, it is useful for us to organize containers by host. With systems where the host that a container runs on can vary greatly, these rules can be modified to store logs by container name at the base level instead of (or in addition to) doing it by host.

ruleset(name="remote") {
   $FileCreateMode 0644
   $DirCreateMode 0755
   if $programname == 'docker' then \
     ?PerHostDockerCombinedLogFileName
   if $programname == 'docker' then \
     if $syslogtag contains 'docker/' then \
       ?PerHostDockerLogFileName
     else
       /var/log/docker/no_tag/docker.log


Now we begin to define a ruleset for the incoming docker logs. Everything inside the braces of the ruleset will be used when evaluating logs that come in through the TCP socket.

Similar to the client side rules, we change the $FileCreateMode to make the files readable by world. Since we are creating directories as well, we also add in a $DirCreateMode 0755, which allows anyone to access the newly created directories.

Finally, if the program is identifying itself as being Docker, we use the templates defined above to write to the appropriate log files.

The remainder of the configuration is for handling all of the non-docker log files. This allows us to recreate the individual system log files from the remote host.

   $template PerHostAuth,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/auth.log"
   $template PerHostCron,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/cron.log"
   $template PerHostSyslog,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/syslog"
   $template PerHostDaemon,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/daemon.log"
   $template PerHostKern,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/kern.log"
   $template PerHostLpr,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/lpr.log"
   $template PerHostUser,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/user.log"
   $template PerHostMail,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/mail.log"
   $template PerHostMailInfo,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/mail.info"
   $template PerHostMailWarn,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/mail.warn"
   $template 
PerHostMailErr,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/mail.err"
   $template PerHostNewsCrit,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/news.crit"
   $template PerHostNewsErr,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/news.err"
   $template PerHostNewsNotice,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/news.notice"
   $template PerHostDebug,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/debug"
   $template PerHostMessages,"/var/log/remote/%HOSTNAME::securepath-replace%/%$YEAR%/%$MONTH%/%$DAY%/messages"
   auth,authpriv.*         ?PerHostAuth
   *.*;auth,authpriv.none  -?PerHostSyslog
   cron.*                  ?PerHostCron
   daemon.*                -?PerHostDaemon
   kern.*                  -?PerHostKern
   lpr.*                   -?PerHostLpr
   mail.*                  -?PerHostMail
   user.*                  -?PerHostUser
   mail.info               -?PerHostMailInfo
   mail.warn               ?PerHostMailWarn
   mail.err                ?PerHostMailErr
   news.crit               ?PerHostNewsCrit
   news.err                ?PerHostNewsErr
   news.notice             -?PerHostNewsNotice
   *.=debug;\
     auth,authpriv.none;\
     news.none;mail.none   -?PerHostDebug
   *.=info;*.=notice;*.=warn;\
     auth,authpriv.none;\
     cron,daemon.none;\
     mail,news.none        -?PerHostMessages
}


This section creates a template filename for each type of log that we will be writing. In order to conserve space, templates are written in a one line format, rather than the verbose style used earlier.

Once all of the templates are defined, we include a fairly standard breakdown of which types of logs should write to each file. Every line references one of the templates defined above. The end result is that we have the logs broken into the same individual files on both the client and the central server.

Running Docker Containers

When Docker containers are launched, they need to enable logging via the syslog driver. This is done by passing the argument --log-driver=syslog to the docker run command. The syslog tag should also be provided by the argument --log-opt syslog-tag=NAME. In Docker 1.9, the option was changed from syslog-tag to just tag. The NAME will be used as the directory name for a given service. For example, to run the image example_service with the tag set to my_service_name, one of the following commands may be used:

Pre Docker 1.9:
docker run --log-driver=syslog --log-opt syslog-tag=my_service_name example_service

Docker 1.9+:
docker run --log-driver=syslog --log-opt tag=my_service_name example_service

At this point, anything written to STDOUT or STDERR from within the container will result in syslog messages. These messages already include a timestamp from syslog, so it is redundant for an application to print one.

Finishing Touches

Normally, log messages are short. In the past, syslog implementations often had limits around 1k. These days defaults are usually around 2k or 4k, but some systems may still need larger sizes. With rsyslog, the limit is easily raised. For example, placing $MaxMessageSize 16k in config files will allow messages up to 16k. This is very useful for verbose logging, and applications that log JSON. The limit needs to be raised on both the client and server to keep the messages from being truncated.

The configurations presented here will create the log files, but eventually they need to be cleaned to preserve disk space. Logrotate can be used to enable automatic rotation of the log files. For example, here is a configuration that will rotate all docker logs on the client every day:

/etc/logrotate.d/docker-logs

/var/log/docker/*/docker.log {
  rotate 7
  daily
  compress
  missingok
  delaycompress
  copytruncate
}
/var/log/docker/*.log {
  rotate 7
  daily
  compress
  missingok
  delaycompress
  copytruncate
}


The first rule will rotate the container logs for 1 week, compressing the old logs. The second rule similarly rotates the combined.log. After 7 days, local logs are removed. The server side is a little more complicated. Depending on your needs, logrotate may work, but it may be easier to achieve exact behaviors with a little custom scripting.

For simplicity, the examples provided here included directives to make the logs readable by everyone. In many environments, security must be taken further into account. More customizable rules for file and directory permissions can be added, as needed. It may also be wise to use separate servers for services that may log sensitive information, and to have tighter security policies over those servers. Enabling encryption and using certificates to validate host identities is also a good idea.

Conclusion

For several months we have been using centralized Docker logging through rsyslog, and the benefits are clear. When debugging services, time is saved by searching or following the logs on all instances of a load balanced service at once. With this setup it is now very easy to write scripts that parse logs, or to use standard Unix tools.

The rsyslog setup makes the classic way of archiving and accessing logs available in a Docker environment. This approach allows Docker infrastructure to take advantage of the best of both worlds, using a time-tested, scalable, powerful logging system at the core, and sending logs beyond that to newer systems that help with analysis.

Interested in getting the latest from Simulmedia?

News, insights, and events sent straight to your inbox!