Monitoring your QNAP NAS with Telegraf, InfluxDB, and Grafana

John Wheeler
Feb 14, 2021
QNAP TS-473 minimal telegraf config

In 2020, I migrated from my 10-year-old QNAP TS-419+ to a new QNAP TS-473. After the migration I set out to get better visibility into all the components in my network. I set up the Telegraf, InfluxDB, Grafana (or TIG) stack for metrics and Graylog for centralized logging.

I’ve set up Telegraf on the QNAP NAS, and now I can see the TS-473 host data along with the other hosts I’m monitoring.

Compiling

I had compiled Telegraf on an embedded platform once before, when I built it for the Asus RT-AC68U.

I started by installing Go. This assumes that you have installed Entware and enabled SSH for shell access.

# opkg install go
Installing go (1.15.5-1) to root...
Downloading http://bin.entware.net/x64-k3.2/go_1.15.5-1_x64-3.2.ipk
Configuring go.
Please add /opt/bin/go/bin to your PATH
Please set GOROOT=/opt/bin/go environment variable to use GO compiler

Next, install git and the git helper for HTTPS:

# opkg install git
Installing git (2.27.0-1) to root...
Downloading http://bin.entware.net/x64-k3.2/git_2.27.0-1_x64-3.2.ipk
Installing libopenssl (1.1.1h-1) to root...
Downloading http://bin.entware.net/x64-k3.2/libopenssl_1.1.1h-1_x64-3.2.ipk
Configuring libopenssl.
Configuring git.
# opkg install git-http
Installing git-http (2.27.0-1) to root...
Downloading http://bin.entware.net/x64-k3.2/git-http_2.27.0-1_x64-3.2.ipk
Installing ca-bundle (20200601-1) to root...
Downloading http://bin.entware.net/x64-k3.2/ca-bundle_20200601-1_all.ipk
Installing libcurl (7.72.0-2) to root...
Downloading http://bin.entware.net/x64-k3.2/libcurl_7.72.0-2_x64-3.2.ipk
Configuring ca-bundle.
Configuring libcurl.
Configuring git-http.

I initially just cloned the repo and started building:

# git clone https://github.com/influxdata/telegraf.git
Cloning into 'telegraf'...
remote: Enumerating objects: 54, done.
remote: Counting objects: 100% (54/54), done.
remote: Compressing objects: 100% (43/43), done.
remote: Total 47303 (delta 16), reused 24 (delta 11), pack-reused 47249
Receiving objects: 100% (47303/47303), 31.08 MiB | 28.24 MiB/s, done.
Resolving deltas: 100% (27106/27106), done.
# cd telegraf
# make
-sh: make: command not found

Oops, I still need make, and I should probably heed the advice above about setting GOROOT and adding Go to my PATH.

# opkg install make
Installing make (4.3-1) to root...
Downloading http://bin.entware.net/x64-k3.2/make_4.3-1_x64-3.2.ipk
Configuring make.
# export GOROOT=/opt/bin/go
# export PATH=$PATH:/opt/bin/go/bin
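
A small preflight check could flag a missing tool such as make before the clone and build are attempted. This is a minimal sketch, not part of the original walkthrough; the function name check_tools is hypothetical, and it relies only on the shell built-in command -v.

```shell
#!/bin/sh
# check_tools NAME...: report any command that is not found on PATH.
# Returns 0 only if every tool was found. (Hypothetical helper name.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It could be called as `check_tools go git make || echo "install the missing tools with opkg first"` once GOROOT and PATH are set.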

I remember I had issues when I tried to run make with no target on my Asus, so I specified the target

# make telegraf
make: go: Permission denied
make: go: Permission denied
make: go: Permission denied
go build -ldflags " -X main.commit=358633bc -X main.branch=master -X main.goos= -X main.goarch=" ./cmd/telegraf
go: downloading github.com/benbjohnson/clock v1.0.3
go: downloading github.com/alecthomas/units v0.0.0-20190717042225-c3de453c63f4
go: downloading github.com/docker/docker v17.12.0-ce-rc1.0.20200916142827-bd33bbf0497b+incompatible
go: downloading github.com/Azure/azure-storage-queue-go v0.0.0-20181215014128-6ed74e755687
go: downloading google.golang.org/api v0.20.0
go: downloading github.com/influxdata/wlog v0.0.0-20160411224016-7c63b0a71ef8
.......
go: downloading github.com/nsqio/go-nsq v1.0.8
go: writing stat cache: write /root/go/pkg/mod/cache/download/github.com/stretchr/testify/@v/v1.6.1.info349528915.tmp: no space left on device
go: writing stat cache: write /root/go/pkg/mod/cache/download/github.com/alecthomas/units/@v/v0.0.0-20190717042225-c3de453c63f4.info963849662.tmp: no space left on device

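I ignored the first few errors, but “no space left on device” was the same error I saw when I compiled for my Asus. Go was writing its module cache under /root/go, which clearly did not have enough room, so I needed to clean up the mess and point GOPATH at the data volume.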

# cd /root
# rm -rf go
# cd /share/MD0_DATA
# mkdir gopath
# export GOPATH=/share/MD0_DATA/gopath/
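
To see which filesystem is actually short on space, a check along these lines may help. It is a sketch under stated assumptions: the function name free_kb is hypothetical, and it relies only on POSIX df -Pk and awk.

```shell
#!/bin/sh
# free_kb DIR: print the free space, in KiB, of the filesystem holding DIR.
# (Hypothetical helper name; df -P forces one output line per filesystem.)
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Compare the locations Go writes to by default with the data volume.
for dir in "$HOME" /tmp /share/MD0_DATA; do
    if [ -d "$dir" ]; then
        echo "$dir: $(free_kb "$dir") KiB free"
    fi
done
```

Directories that do not exist on a given system are skipped, so the loop can be run as-is on other hosts.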

Then I could try again…

# make telegraf
make: go: Permission denied
make: go: Permission denied
make: go: Permission denied
go build -ldflags " -X main.commit=358633bc -X main.branch=master -X main.goos= -X main.goarch=" ./cmd/telegraf
go: downloading github.com/benbjohnson/clock v1.0.3
go: downloading github.com/go-logfmt/logfmt v0.4.0
go: downloading go.starlark.net v0.0.0-20200901195727-6e684ef5eeee
go: downloading github.com/influxdata/toml v0.0.0-20190415235208-270119a8ce65
go: downloading github.com/golang/geo v0.0.0-20190916061304-5b978397cfec
go: downloading github.com/gobwas/glob v0.2.3
go: downloading github.com/gosnmp/gosnmp v1.29.0
go: downloading github.com/aws/aws-sdk-go v1.34.34
go: downloading collectd.org v0.3.0
go: downloading github.com/prometheus/client_golang v1.5.1
go: downloading github.com/newrelic/newrelic-telemetry-sdk-go v0.5.1
go: downloading github.com/google/go-cmp v0.5.2
.....
go.opencensus.io/plugin/ochttp/propagation/b3
compile: writing output: write $WORK/b390/_pkg_.a: no space left on device
go build google.golang.org/api/googleapi/transport: write /tmp/go-build211296211/b391/importcfg: no space left on device
go build google.golang.org/api/transport/cert: write /tmp/go-build211296211/b392/importcfg: no space left on device
go build google.golang.org/api/transport/http/internal/propagation: write /tmp/go-build211296211/b393/importcfg: no space left on device
go build github.com/golang/protobuf/ptypes/empty: write /tmp/go-build211296211/b395/importcfg: no space left on device
go build cloud.google.com/go/pubsub/internal/distribution: write /tmp/go-build211296211/b396/importcfg: no space left on device

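Same error, different reason. GOPATH only covers Go’s downloaded modules and package cache; go build also writes its intermediate files to a temporary directory (the /tmp/go-build211296211 paths in the errors above), so TMPDIR needed to move as well. Again, this was the same issue I had on the Asus.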

# mkdir /share/MD0_DATA/gotmp
# export TMPDIR=/share/MD0_DATA/gotmp
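
The environment settings used so far (GOROOT, PATH, GOPATH, and TMPDIR) can be collected into one function so a fresh SSH session can restore them in a single step. This is a sketch only: the name setup_go_env and the idea of passing the base directory as an argument are assumptions; the paths are the ones used in this article.

```shell
#!/bin/sh
# setup_go_env [BASE]: point Go's module cache and build scratch space at a
# volume with free space. (Hypothetical helper; BASE defaults to the data
# volume used in this article.)
setup_go_env() {
    base="${1:-/share/MD0_DATA}"
    mkdir -p "$base/gopath" "$base/gotmp" || return 1
    export GOROOT=/opt/bin/go       # location printed by opkg when installing go
    export GOPATH="$base/gopath"    # module downloads land here
    export TMPDIR="$base/gotmp"     # go build writes its work files here
    case ":$PATH:" in
        *":$GOROOT/bin:"*) ;;       # already on PATH, nothing to do
        *) export PATH="$PATH:$GOROOT/bin" ;;
    esac
}
```

Saved to a file and sourced, it would be used as `setup_go_env` (or `setup_go_env /some/other/volume`) before running make.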

Try again…

# make telegraf
make: go: Permission denied
make: go: Permission denied
make: go: Permission denied
go build -ldflags " -X main.commit=358633bc -X main.branch=master -X main.goos= -X main.goarch=" ./cmd/telegraf
[/share/MD0_DATA/telegraf/telegraf] # ls
accumulator.go appveyor.yml cmd/ docker-compose.yml EXTERNAL_PLUGINS.md go.sum LICENSE metric/ output.go processor.go selfstat/
agent/ build_version.txt config/ docs/ filter/ input.go logger/ metric.go plugin.go README.md telegraf*
aggregator.go CHANGELOG.md CONTRIBUTING.md etc/ go.mod internal/ Makefile models/ plugins/ scripts/ testutil/

Success! I have a telegraf* binary. Let’s confirm the version we compiled.

# ./telegraf
2021-01-24T00:22:06Z I! Starting Telegraf
2021-01-24T00:22:06Z E! [telegraf] Error running agent: No config file specified, and could not find one in $TELEGRAF_CONFIG_PATH, /root/.telegraf/telegraf.conf, or /etc/telegraf/telegraf.conf
[/share/MD0_DATA/telegraf/telegraf] # ./telegraf --version
Telegraf unknown (git: master 358633bc)

That’s not helpful. After reading up a bit on Telegraf, I decided to build from a specific release tag, so I nuked the telegraf directory and started over.

# rm -rf telegraf/
[/share/MD0_DATA/telegraf] # git clone --depth 1 --branch v1.17.0 https://github.com/influxdata/telegraf.git
Cloning into 'telegraf'...
remote: Enumerating objects: 1971, done.
remote: Counting objects: 100% (1971/1971), done.
remote: Compressing objects: 100% (1830/1830), done.
remote: Total 1971 (delta 90), reused 1140 (delta 65), pack-reused 0
Receiving objects: 100% (1971/1971), 2.41 MiB | 13.15 MiB/s, done.
Resolving deltas: 100% (90/90), done.
Note: switching to '3f7a54c9a5ad4939a7c3f2e9357f11016bc6c3b3'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
git switch -c <new-branch-name>
Or undo this operation with:
git switch -
Turn off this advice by setting config variable advice.detachedHead to false

We can ignore the verbose warning as we are only building. Let’s try this one more time.

# cd telegraf/
# make telegraf
make: go: Permission denied
make: go: Permission denied
make: go: Permission denied
go build -ldflags " -X main.commit=3f7a54c9 -X main.branch=HEAD -X main.goos= -X main.goarch= -X main.version=1.17.0" ./cmd/telegraf
go: downloading github.com/newrelic/newrelic-telemetry-sdk-go v0.2.0
go: downloading github.com/Shopify/sarama v1.27.1
go: downloading github.com/nsqio/go-nsq v1.0.7
go: downloading github.com/kardianos/service v1.0.0
# ./telegraf --version
Telegraf 1.17.0 (git: HEAD 3f7a54c9)

Finally! The process didn’t take that long; I’d say the compile itself took a few minutes. Now let’s connect it to our TIG stack.
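
The difference between “unknown” and “1.17.0” appears to come down to whether the checked-out commit carries a tag. That is an inference from the two build command lines above (only the second includes -X main.version=1.17.0), not something verified in the Makefile. Under that assumption, a check like this before building shows what to expect. The helper name head_tag is hypothetical.

```shell
#!/bin/sh
# head_tag [REPO]: print the tag that points exactly at HEAD in REPO
# (default: current directory), or "unknown" if HEAD is not tagged.
# (Hypothetical helper; uses only git describe.)
head_tag() {
    git -C "${1:-.}" describe --exact-match --tags 2>/dev/null || echo unknown
}
```

Run from the directory that contains the clone, `head_tag telegraf` would be expected to print v1.17.0 for the tagged clone above and unknown for a plain clone of master.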

Configure Telegraf

My Telegraf configuration was pretty simple:

[global_tags]
[agent]
interval = "15s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false
[[outputs.influxdb]]
urls = ["http://192.168.1.30:8086"]
database = "rpi_monitoring"
username = "rpi"
password = "IMNOTACATIMAMAN"
[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = false
[[inputs.disk]]
ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]
[[inputs.mem]]
[[inputs.swap]]
[[inputs.net]]
interfaces = ["eth0"]
[[inputs.netstat]]
[[inputs.kernel]]
[[inputs.system]]
[[inputs.processes]]
[[inputs.diskio]]

I put the contents above into the file:

/opt/etc/telegraf.conf

Relocating the binary

I want to look at packaging Telegraf, but that’s another project. For now, I want to keep the binary alongside the other packages from Entware, so I copied it to /opt/bin.

cp /share/MD0_DATA/telegraf/telegraf/telegraf /opt/bin

We should create a proper environment that includes a telegraf user and group. I need to learn how to make these things persistent in QNAP. For now the agent will run as root.

Running the agent

To start the agent:

/opt/bin/telegraf --config /opt/etc/telegraf.conf
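
Before leaving the agent running, a one-shot dry run can confirm that the config parses and the inputs return data. The sketch below uses Telegraf’s --test flag, which gathers metrics once, prints them, and exits without writing to the outputs. The wrapper name telegraf_dry_run and the TELEGRAF_BIN override are assumptions added for illustration.

```shell
#!/bin/sh
# telegraf_dry_run [CONF]: run Telegraf once in --test mode against CONF.
# (Hypothetical wrapper; TELEGRAF_BIN can override the binary location.)
telegraf_dry_run() {
    bin="${TELEGRAF_BIN:-/opt/bin/telegraf}"
    conf="${1:-/opt/etc/telegraf.conf}"
    if [ ! -x "$bin" ]; then
        echo "telegraf binary not found at $bin" >&2
        return 127
    fi
    if [ ! -r "$conf" ]; then
        echo "config not readable: $conf" >&2
        return 2
    fi
    "$bin" --config "$conf" --test
}
```

With the paths used in this article, calling `telegraf_dry_run` with no arguments should be enough.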

I have some ideas on how to make this persistent based on this thread. My NAS is generally up, I don’t reboot it that often, and it’s not a big deal if this process doesn’t start, so I’ll leave this for another project as well.
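
As a stopgap until that is sorted out, one possible approach is a start-if-not-running helper that could be called from cron or a boot hook. This is a generic sketch, untested on QNAP and not taken from the thread mentioned above; the function name ensure_running and the pidfile location are assumptions.

```shell
#!/bin/sh
# ensure_running PIDFILE COMMAND [ARGS...]
# Start COMMAND in the background unless the PID recorded in PIDFILE is
# still alive. (Hypothetical helper; it does not guard against PID reuse.)
ensure_running() {
    pidfile="$1"
    shift
    if [ -s "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        return 0                  # recorded process is still running
    fi
    "$@" >/dev/null 2>&1 &        # launch in the background, output discarded
    echo "$!" > "$pidfile"
}
```

For the agent above it might be invoked as `ensure_running /tmp/telegraf.pid /opt/bin/telegraf --config /opt/etc/telegraf.conf`.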

Rendering in Grafana

I put the metrics from this host into the same database as my Raspberry Pi monitoring and used this dashboard as a baseline. One of the nice things about this dashboard is that it has auto-repeating panels for a few of the metrics. For example, the IOPS panel repeats automatically for each disk. The charts below look very similar because all of my disks are in the same RAID 5 group, so their behavior is very similar.
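
Repeating panels of this kind are typically driven by a dashboard variable that lists tag values. To see which disks such a variable would pick up, the same question can be put to InfluxDB directly. This is a sketch assuming an InfluxDB 1.x server and the name tag that Telegraf’s diskio input attaches to each device; the helper names and the INFLUX_USER / INFLUX_PASS variables are hypothetical, and the dashboard’s actual variable query was not checked.

```shell
#!/bin/sh
# tag_values_query MEASUREMENT KEY: print the InfluxQL statement that lists
# every value of tag KEY in MEASUREMENT. (Hypothetical helper name.)
tag_values_query() {
    printf 'SHOW TAG VALUES FROM "%s" WITH KEY = "%s"\n' "$1" "$2"
}

# influx_query URL DB QUERY: send QUERY to the InfluxDB 1.x /query endpoint.
# Credentials come from INFLUX_USER / INFLUX_PASS (assumed variable names).
influx_query() {
    curl -sG "$1/query" \
        -u "${INFLUX_USER}:${INFLUX_PASS}" \
        --data-urlencode "db=$2" \
        --data-urlencode "q=$3"
}
```

With the settings from the Telegraf config above, a call would look like `influx_query http://192.168.1.30:8086 rpi_monitoring "$(tag_values_query diskio name)"`.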

Next steps

  • I want to figure out how to build this and deploy it, possibly as a complete Entware package that also adds the appropriate user and group.
  • I want to ensure that the agent starts automatically after a reboot. I think I have a good lead on this; I just need to test and verify it.
  • The performance metrics are good, but what I really want is the SMART data. I have some ideas on how I can do this; when I sort it out, I’ll post an update.

John Wheeler

Security professional, Mac enthusiast, writing code when I have to.