Install Graylog using Docker Compose on QNAP Container Station

John Wheeler
10 min read · Nov 21, 2021

When I first began installing Graylog, I tried to use Docker Desktop for Mac on one of my Mac minis, only to learn that there is no bridge network in Docker Desktop. Because Docker Desktop uses NAT, the source IP addresses of the devices generating syslogs would be rewritten in the syslog messages. The point of installing Graylog was to provide visibility into all the equipment attached to my network; the last thing I wanted was more obfuscation making it harder to determine where a message came from. This led me to follow a host-based installation process for Graylog on my Mac mini.

Unfortunately, shortly after I installed Graylog, I realized that my 2012 Mac mini wouldn’t support Big Sur. I didn’t want to maintain these applications on hardware that was no longer supported, so I’d need to find a new home for all of them. I had purchased a new QNAP in 2020 and migrated from my old QNAP to the new one. One of the features I noticed was Container Station, which runs Docker.

I was able to successfully migrate InfluxDB data from my Mac mini to Docker on Container Station. Though I investigated options for migrating Graylog data as well, that seemed like more effort than I was willing to exert, so I abandoned my old installation once I had Graylog working in Docker.

Docker Compose File

The file below is my running config. After writing this post, I realized there are a number of things that I should update. The rest of the article describes why the directives below were chosen.

Services

I’ve broken the Docker Compose file down by services, volumes, and networks.

Service: MongoDB

The first service defined is for MongoDB. There are a few things to note about using the Docker image. From the MongoDB Docker documentation:

As noted above, authentication in MongoDB is fairly complex (although disabled by default).

The key phrase is disabled by default. I’ve tried to find an easy way to initialize the database and enable authentication, but it’s non-trivial. Reading both of these issues will give you more background on the complexities.

I did find some solutions, like this one, that require code and might require modifying the startup.

While I definitely think it’s possible to configure users in the Docker Compose file and enable authentication, I chose to skip it for now. You’ve been warned.
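If you do want authentication later, the official mongo image can create a root user on first initialization via environment variables (this only takes effect when the data volume is empty). A hedged sketch, with placeholder credentials of my own invention:

```yaml
mongo:
  image: mongo:4.2
  environment:
    # Only applied on first initialization, i.e. when /data/db is empty.
    - MONGO_INITDB_ROOT_USERNAME=graylog_admin   # placeholder
    - MONGO_INITDB_ROOT_PASSWORD=change_me       # placeholder
```

Graylog would then need matching credentials in its MongoDB connection string; the Graylog image maps `GRAYLOG_`-prefixed environment variables onto configuration settings, so that would look something like `GRAYLOG_MONGODB_URI=mongodb://graylog_admin:change_me@192.168.1.25:27017/graylog`. I haven’t run this configuration myself, hence the warning above.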

  mongo:
    image: mongo:4.2
    volumes:
      - mongodb_data:/data/db
      - mongodb_configdb:/data/configdb
    networks:
      network_qnet-static:
        ipv4_address: 192.168.1.25
    mac_address: fe:ed:fa:ce:f0:0d
    ports:
      - 27017:27017

The image I chose, 4.2, is a bit dated, but I found lots of reference examples with this version and it’s supported in the system requirements.

For volumes I referred to the Dockerfile and found two volumes defined.

VOLUME /data/db /data/configdb

I’ve chosen to name those volumes mongodb_data and mongodb_configdb. These names are arbitrary, but they can also be used by other containers, so names that describe what the volume is for are helpful.

The networks section follows a design pattern I’ve previously written about when putting InfluxDB in Docker Compose. I’ve chosen to statically configure both the IP address and the MAC address. The Graylog compose file will attach to the existing network (below).

The compose file from that post creates a network, on my existing 192.168.1.X network, that all of my other containers can attach to. network_qnet-static is the name of the network created by that compose file.
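For readers who haven’t seen that post, a hypothetical reconstruction of such a network definition follows; the driver options, interface name, and addressing are assumptions based on QNAP’s qnet driver (visible in the `docker network ls` output below), not the author’s actual file:

```yaml
# Hypothetical sketch of the external network's compose file.
networks:
  qnet-static:
    driver: qnet            # QNAP's macvlan-style network driver
    driver_opts:
      iface: "eth0"         # assumed physical interface name
    ipam:
      driver: qnet
      config:
        - subnet: 192.168.1.0/24
          gateway: 192.168.1.1
```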

# docker network ls
NETWORK ID     NAME                  DRIVER    SCOPE
f63090b78da0   bridge                bridge    local
69eaf39d9cd1   host                  host      local
5ae7b24adc66   network_qnet-static   qnet      local
a31156757812   none                  null      local
9ed4ec3e613c   telegraf_default      bridge    local

The mac_address is set because I wanted to configure a DNS entry for each service that I’m running. To ensure that I could consistently resolve the domain to the IP, I statically configured the MAC address <-> IP address <-> DNS name mapping in my firewall. When containers are restarted, the MAC address can change, so I opted to statically configure the mac_address along with the IP address. Finally, in the ports section, 27017 is the default port for MongoDB.

Service: Elasticsearch

The next section in the YAML file covers the Elasticsearch service, another prerequisite for Graylog. Based on the requirements, I chose version 7.10.2. The quick start section for installation is the source of much of this configuration.

  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2
    volumes:
      - es_data:/usr/share/elasticsearch/data
    networks:
      network_qnet-static:
        ipv4_address: 192.168.1.24
    mac_address: fe:ed:fa:ce:f0:1d
    ports:
      - 9200:9200
    environment:
      - network.host=192.168.1.24
      - http.host=192.168.1.24
      - transport.host=192.168.1.24
      - discovery.type=single-node
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 8g

The image comes from the page cited above, and the volumes section follows the guidance for running Elasticsearch in Docker. The networks section follows a similar strategy as MongoDB, attaching to a static IP on the pre-defined network network_qnet-static. The mac_address also follows the same design pattern: I want to ensure that my instance always has the same MAC address to support the MAC<->IP<->FQDN triple needed in my firewall. The ports are the standard ports, except I’m not clustering Elasticsearch (at least, not now), so I’ve omitted 9300.

The environment section defines a number of variables required by Elasticsearch. The variable network.host is covered here.

network.host

(Static) Sets the address of this node for both HTTP and transport traffic. The node will bind to this address and will also use it as its publish address. Accepts an IP address, a hostname, or a special value.

I over-engineered this and also explicitly set http.host, which, according to this, is completely unnecessary.

http.host

(Static) Sets the address of this node for HTTP traffic. The node will bind to this address and will also use it as its HTTP publish address. Accepts an IP address, a hostname, or a special value. Use this setting only if you require different configurations for the transport and HTTP interfaces.

Defaults to the address given by network.host.

And, just for good measure, I also set transport.host, which also seems to be completely unnecessary.

transport.host

(Static) Sets the address of this node for transport traffic. The node will bind to this address and will also use it as its transport publish address. Accepts an IP address, a hostname, or a special value. Use this setting only if you require different configurations for the transport and HTTP interfaces.

Defaults to the address given by network.host.

In short, it looks like you only need network.host. Notice that I set this to the same value as the ipv4_address declaration above. I’ve set discovery.type to single-node based on the documentation here.

The last environment variable is the Java options for Elasticsearch. I’ve set these quite low given that my QNAP has 36GB of RAM.

# free -g
             total       used       free     shared    buffers     cached
Mem:            35          9         25          0          1          1
-/+ buffers/cache:          6         28
Swap:           24          0         24

The default compose file for Elasticsearch and subsequent documentation describe disabling swap.

Swapping needs to be disabled for performance and node stability. For information about ways to do this, see Disable swapping.

One way of accomplishing this is with ulimits. The ulimits directive in the compose file isn’t well documented on the Docker website; I expected a link to either the user-space equivalent man page or the system-level limits set in limits.conf. Setting the memlock value with ulimits to -1 locks all of the requested memory into RAM.

All items support the values -1, unlimited or infinity indicating
no limit, except for priority and nice.

The final setting, it turns out, isn’t really needed. The install documentation for Graylog doesn’t really explain what it’s for, and the Docker documentation indicates that it only applies when deploying to a swarm.

Specify configuration related to the deployment and running of services. This only takes effect when deploying to a swarm with docker stack deploy, and is ignored by docker-compose up and docker-compose run.

Spies Like Us: “we mock what we don’t understand.”

Service: Graylog

The Docker Compose values for Graylog largely mirror the setup documentation for a v3 compose file. I’ll highlight the differences.

  graylog:
    image: graylog/graylog:4.1
    networks:
      network_qnet-static:
        ipv4_address: 192.168.1.23
    mac_address: fe:ed:fa:ce:f1:1d
    volumes:
      - graylog_data:/usr/share/graylog/data
    environment:
      - TZ=America/Chicago
      # CHANGE ME (must be at least 16 characters)!
      - GRAYLOG_PASSWORD_SECRET=somepasswordpepper
      - GRAYLOG_ROOT_TIMEZONE=America/Chicago
      # Password: admin
      - GRAYLOG_ROOT_PASSWORD_SHA2=8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
      - GRAYLOG_HTTP_EXTERNAL_URI=http://192.168.1.23:9000/
    entrypoint: /usr/bin/tini -- wait-for-it 192.168.1.24:9200 -- /docker-entrypoint.sh
    restart: always
    depends_on:
      - mongo
      - elasticsearch
    ports:
      # Graylog web interface and REST API
      - 9000:9000
      # Syslog TCP
      - 1514:1514
      # Syslog UDP
      - 1514:1514/udp
      # GELF TCP
      - 12201:12201
      # GELF UDP
      - 12201:12201/udp

Aligning with the other two services, I statically configure both the IP address and the MAC address with ipv4_address and mac_address, respectively. The Dockerfile only exposes one volume.

....
ARG GRAYLOG_HOME=/usr/share/graylog
....
VOLUME ${GRAYLOG_HOME}/data

In the volumes section this is named graylog_data, and like the rest of the volumes, I let Docker manage the volume rather than using a bind mount.

A couple of important updates to the environment section. When I first successfully brought up all the services, one of the first things I tested was sending logs to Graylog. Initially I thought I had misconfigured something when I ran my first search in Graylog and couldn’t find the data that was being sent. I soon realized that differences in timezone values were having an impact. Graylog reports three timezone values on the system overview page. Without setting any values, my system panel looks like this.

No timezone modifications

By default, the Docker containers have UTC set as the timezone.

# docker ps --format "{{.ID}}\t{{.Names}}"
ea424d9a116b graylog4_graylog_1
27002052728c graylog4_elasticsearch_1
17754c366ece graylog4_mongo_1
e8ab8b10e5b6 influxdb_influxdb_1
82ea81cb4e1f telegraf
8b16563ee5f1 grafana_grafana_1
8cd08e31f100 network_local_1
# docker container exec -it graylog4_graylog_1 /bin/date
Sun Nov 21 20:06:12 UTC 2021

Setting the environment value TZ to America/Chicago updates the Graylog server time to my current timezone, -06:00.

Setting the host or container timezone
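TZ is the standard Unix mechanism for setting a process’s timezone, so the same effect can be demonstrated on any Linux host with tzdata installed:

```shell
# TZ tells libc (and thus date, and the container's processes) which
# timezone database entry to use when rendering times.
TZ=UTC date +%Z              # prints UTC
TZ=America/Chicago date +%Z  # prints CST or CDT, depending on daylight saving
```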

The user running the Graylog process still has the wrong timezone. Setting the value GRAYLOG_ROOT_TIMEZONE to America/Chicago resolves this.

Setting both the host and running graylog process timezone.

The last environment value that requires modification is GRAYLOG_HTTP_EXTERNAL_URI. This value has been updated to reflect the IP address of the Graylog server defined with ipv4_address.
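One related note on the environment values: GRAYLOG_ROOT_PASSWORD_SHA2 in the compose file above is the SHA-256 hash of the admin password (admin, in the example). A replacement hash for your own password can be generated with:

```shell
# Hash the root password the way Graylog expects for
# GRAYLOG_ROOT_PASSWORD_SHA2. "admin" here is just the example default;
# substitute your own password. -n keeps echo from appending a newline,
# which would change the hash.
echo -n admin | sha256sum | cut -d ' ' -f 1
# → 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
```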

The last change needed is to update the entrypoint so that the wait-for-it script waits for the correct endpoint. Graylog depends on the elasticsearch service, but the depends_on directive only provides start order, not service availability.

There are several things to be aware of when using depends_on:

depends_on does not wait for db and redis to be “ready” before starting web — only until they have been started. If you need to wait for a service to be ready, see Controlling startup order for more on this problem and strategies for solving it.

The container uses wait-for-it to ensure that the service is up and running, and the IP address for Elasticsearch was updated to the value defined previously.
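What wait-for-it does can be sketched in a few lines of bash: poll a TCP endpoint until it accepts a connection or a timeout expires. This is a rough illustration of the idea, not the actual script; it relies on bash’s /dev/tcp feature, so it assumes bash rather than POSIX sh.

```shell
#!/usr/bin/env bash
# Poll host:port once per second until it accepts a TCP connection
# or the timeout (in seconds) elapses. Returns 0 on success, 1 on timeout.
wait_for() {
  local host=$1 port=$2 timeout=${3:-30}
  local i
  for ((i = 0; i < timeout; i++)); do
    # Opening /dev/tcp/<host>/<port> in a subshell attempts a TCP connect;
    # the fd closes automatically when the subshell exits.
    if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# Usage mirroring the entrypoint above: don't proceed until
# Elasticsearch answers on 9200.
# wait_for 192.168.1.24 9200 60 && echo "elasticsearch is up"
```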

The remainder of the service configuration follows the Graylog documentation.

Volumes

The volumes section is fairly unremarkable and simply enumerates the volumes defined in the services section. One thing I’ve found is that if a Dockerfile defines a volume, it’s easier to let Docker manage that volume. I initially tried to use bind mounts in my compose files and found that I ended up with several unused volumes because they were still defined in the Dockerfile.
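Enumerated, the volumes section looks along these lines, using the names introduced in the services above:

```yaml
volumes:
  mongodb_data:
  mongodb_configdb:
  es_data:
  graylog_data:
```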

Networks

The networks section defines the externally configured network described previously. This design pattern seems to work well, as I’ve rebooted the QNAP NAS several times for firmware updates. It’s possible I’m getting lucky in the startup order and the network is created in time for the dependent services. I’ve done a bit of research on waiting for external dependencies, and I suspect I may need to revisit this at some point, but after about a half dozen reboots, I haven’t had to intervene to get services up.
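Declaring a network that was created outside the compose file is done with the external flag, so the networks section here is presumably along these lines:

```yaml
networks:
  network_qnet-static:
    external: true
```

With external: true, docker-compose attaches services to the pre-existing network rather than creating (or tearing down) one of its own.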

Next steps

I’ve moved all my logging sources to this new instance, and it’s been ingesting logs for about 5 months now without issues. I still need to go back and revisit a few things.

  • Update the JVM memory settings. As I was writing this, I realized I’ve really constrained the search capability by limiting the amount of RAM I’m giving Java. I may try to capture some data and see if there are noticeable changes in response.
  • Use FQDNs. I’d like to remove some of the hard-coded values from the config, specifically the IP addresses and MAC addresses. I’m not sure if that’s possible given some of the other constraints.
  • Add plugins. There are a few plugins I want to add, and I’d like to ensure that the compose file can accommodate them.
  • Set up a cluster. I don’t think I really need a cluster config, but I’m curious about the performance difference.


John Wheeler

Security professional, Mac enthusiast, writing code when I have to.