# TrueNAS
I use TrueNAS Scale as my primary NAS. While it is capable of hosting some apps (through modified Kubernetes), I think Proxmox and other solutions are better suited to this task. Instead, I use TrueNAS solely for storage, storage access (Samba), and data backups.
The ISO for installation can be downloaded from the TrueNAS website and written to removable media such as a flash drive. Before booting and installing TrueNAS, it is a good idea to physically disconnect any drives that contain data you don't want to lose (such as an existing pool). It should also be noted that a single boot drive can be used, or TrueNAS can mirror the installation across multiple drives for redundancy.
First, set up the NUT server (if you haven't already) per the instructions here. Next, go to `System Settings > Services` in the left-hand menu and click on the configure icon next to `UPS`.

Set the `Identifier`, which is the name that was set when configuring NUT in `ups.conf`. For example, it would be `eaton-5px2200rt` if the command `upsc eaton-5px2200rt@localhost` is what is used to get the UPS status on the NUT server. The `UPS Mode` should be set to `Slave`, the `Remote Host` should be set to the IP address of the Raspberry Pi, the `Remote Port` should be `3493`, and the `Port or Hostname` option should be set to `auto`. Finally, the `Monitor User` and `Monitor Password` should be set to those configured when setting up NUT on the Raspberry Pi.
From here, the shutdown behavior can be configured. I like to set the `Shutdown Mode` to `UPS goes on battery`, the `Shutdown Timer` to `300` (this is measured in seconds), and the `Shutdown Command` to `/sbin/shutdown -h`.
Save this and, from the `Services` page, click the slider next to `UPS` to ensure it is running and tick the `Start Automatically` box to ensure that it starts at boot.
At this point, the UPS integration should be successfully configured. You can test connectivity and communication with the NUT server by going to `System Settings > Shell` and typing `upsc ups-name@nut-server-ip`, where the UPS name and NUT server IP are the values you configured in the NUT settings page. You should see the UPS data printed out if it is successful. As another way to test, you can also unplug the UPS for the configured duration and wait to see if the TrueNAS box successfully starts to shut down.
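For example, with the identifier from above and a NUT server at a placeholder address of 192.168.1.50, the test command would look something like:

```
upsc eaton-5px2200rt@192.168.1.50
```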
Standalone mode in TrueNAS is useful if it is directly connected to the UPS (via USB or other cable) or if using a UPS with its own network card and connecting via SNMP (like the Eaton UPS that I use). If connected via cable, the directions on the NUT page can largely be adapted to TrueNAS. The rest of this section will focus on setting this up with the Eaton UPS I use connected via SNMP.
Just like when setting up in client mode, start by going to `System Settings > Services` in the left-hand menu and clicking on the configure icon next to `UPS` in order to bring up the UPS options. Set the `Identifier` to whatever you'd like to call the UPS locally, the `UPS Mode` to `Master`, the `Driver` to `Eaton ups 5 ConnectUPS X/ BD / E Slot Network port (snmp-ups)` (the `snmp-ups` is the actual driver and the important part), and the `Port or Hostname` to the IP address of the UPS's network card. Also set the `Monitor User` and `Monitor Password` to whatever you'd like (these would just be used for other NUT clients to connect) and uncheck `Remote Monitor` (unless you'd like that feature).
From here, the shutdown behavior can be configured just like in client mode. I like to set the `Shutdown Mode` to `UPS goes on battery`, the `Shutdown Timer` to `120` (this is measured in seconds), and the `Shutdown Command` to `/sbin/shutdown -h`. Leave the `Power Off UPS` box unchecked.
Next, the `Other Options` section must be configured. Set the `No Communication Warning Time` if you are getting spurious alerts (to something less than the shutdown timer, such as 30 seconds) and the `Description` to something useful.

Finally, set the following in the `Auxiliary Parameters (ups.conf)` field:
```
mibs = auto
snmp_version = v3
secLevel = authNoPriv
secName = <snmp_readonly_username>
authPassword = <snmp_password>
```
Note that `<snmp_readonly_username>` should be replaced with a (read-only) username from the SNMP config of your UPS's network card and `<snmp_password>` should be set to the corresponding password.
Save the UPS settings, click the `Running` toggle next to `UPS` on the `Services` page, and check the `Start Automatically` box. This will start the UPS monitor and make sure it starts at boot. You can then once again test connectivity and communication with the UPS by going to `System Settings > Shell` and typing `upsc ups-name@nut-server-ip`.
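In standalone mode the UPS monitor runs locally, so (assuming an identifier of `eaton5px`, a made-up name) the test looks something like:

```
upsc eaton5px@localhost
```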
These are all configured under the `Data Protection` tab. In addition to the cloud sync tasks (described below) I have the following set up:
- Scrub task at 00:00 on Sunday (weekly)
- Periodic snapshot at 00:00 every day (keep for 2 weeks)
- S.M.A.R.T. Tests on all disks
  - Short test at 00:00 on Saturday
  - Long test at 00:00 on day 1 of the month
By default, the Adaptive Replacement Cache (ARC) that ZFS uses is limited to 50% of the installed RAM. While this helps provide some overhead for containers and the system when running in memory-limited environments, it is too low a limit for a dedicated NAS with a large amount of RAM (128GB in my case). In order to get around this, navigate to `System Settings > Advanced` in the TrueNAS UI and click on `Add` under the `Init/Shutdown Scripts` header. In the dialog that opens, type a meaningful description, set the `Type` to `Command`, set `When` to `Post Init`, set the `Timeout` to `10`, make sure it is enabled, and enter the following for the command:
```
echo 107374182400 > /sys/module/zfs/parameters/zfs_arc_max
```
Replace the number specified with the maximum amount of memory (in bytes) that should be usable for ARC (100 GB in this case, although up to 90% of the installed memory is likely fine if you aren't running any containers or VMs).
Then, create another script with the same parameters (except for the description) but this time make the command:
```
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_sys_free
```
This tells ZFS to try to always keep 8 GB of RAM free for the system to use.
Restart TrueNAS and confirm that it worked by checking the output of `cat /sys/module/zfs/parameters/zfs_arc_max` and `cat /sys/module/zfs/parameters/zfs_arc_sys_free`. These should report the byte values that were set.
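For reference, the two values used above correspond to 100 GiB and 8 GiB expressed in bytes; a quick sanity check from any shell (plain arithmetic, nothing TrueNAS-specific) is:

```
echo $((100 * 1024 * 1024 * 1024))  # 107374182400 -> zfs_arc_max
echo $((8 * 1024 * 1024 * 1024))    # 8589934592   -> zfs_arc_sys_free
```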
Important caveat: This may not work perfectly with Linux (which TrueNAS Scale is based on) and it sounds like it can cause some stability issues. That being said, ensuring that the system has plenty of memory available seems to be a reasonable solution. See this thread for more details and the inspiration for this solution.
I use Samba (SMB/CIFS) as the primary method of sharing my datasets across machines. This is because it supports ACLs and is reasonably well integrated/supported on most systems that I use. The official TrueNAS documentation can be found here.
Before setting up a Samba share, the users and groups that govern permissions must first be created. Start by creating groups under `Credentials > Local Groups` and pressing the `Add` button to create a new one. The name can be whatever you would like (as can the GID, generally) but the `Samba Authentication` box should be ticked since we would like to use the group for that purpose.

Next, create the necessary users by going to `Credentials > Local Users` and pressing the `Add` button. The full name is just descriptive and the username/password can be whatever you would like (these will be used for Samba credentials). The UID can, again, generally be whatever you would like that is available (keeping the pre-populated default usually makes sense) and any necessary groups that were created can be selected under `Auxiliary Groups`.
If this user will just exist for Samba authentication, the `Home Directory` can be set to `/nonexistent` and `Create Home Directory` can be unchecked. This will prevent an unused home directory from being created for the user. In this case, the `Shell` can also be set to `nologin` (`/usr/sbin/nologin`) to prevent any local login by the user, which helps increase security in case the account is compromised. Finally, ensure the `Samba Authentication` box is ticked and click `Save` to create the user.
After creating users and groups, the dataset that will be shared needs to be configured to use these for access control. This can be done by going to `Datasets`, selecting the dataset to configure, and clicking `Edit` next to `Permissions`. This will open the ACL editor, which can be used to set permissions for the dataset. One important note about this is that both the `Read` and `Execute` permissions are needed to be able to list dataset/directory contents.
Pre-set helpers exist for setting up home directory permissions and a few other common scenarios; however, another common setup that I had more initial trouble with was my `media` dataset. In this case, I wanted a group that allowed full access (read, write, and execute) and another group that only allowed reading (read and execute). I was able to get this to work with the `media-owners` and `media-readers` groups by configuring it in the following way. I did have to select the `Apply permissions recursively`, `Apply Owner`, and `Apply Group` options when initially setting up the ACL as well. I should also note that, while this provided a working setup, it may not be ideal and may need further tuning in the future.
Finally, creating the shares themselves requires going to the `Shares` tab in TrueNAS and clicking the `Add` button next to `Windows (SMB) Shares`. The `Purpose` can generally be left set to `Default share parameters` unless one of the other options is a better fit. The path should point to the dataset the share is being created for and the name can be whatever you want it to be.
Next, navigate to `System Settings > Services` and click on the toggle next to `SMB` to activate it. The `Start Automatically` box should likely be checked as well so that the SMB service starts at boot.
The SMB share has now been successfully set up and you can attempt to connect via whatever client you would like to try. Instructions vary based on OS, but in Windows you can type `\\<ip_address>\<share_name>` in the address bar of File Explorer for a quick test. This should prompt for the username and password and then show the share contents once accepted. To attach this share permanently (and on boot), right-click on `This PC` in the File Explorer side panel and select `Map Network Drive`. Select the desired drive letter and enter the same address that was used in File Explorer (`\\<ip_address>\<share_name>`) under `Folder`. Finally, tick the `Reconnect at sign-in` and `Connect using different credentials` boxes and click `Finish`. A prompt will then appear where the login credentials can be entered.
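The same mapping can also be done from a Windows command prompt with `net use`; the drive letter, address, and username below are just placeholders:

```
net use Z: \\192.168.1.10\media /persistent:yes /user:shareuser
```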
See the instructions on the Plex page for mounting a SMB share in Linux using fstab.
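For reference, such an fstab entry typically looks something like the following (the address, mount point, and credentials file here are placeholders; see the Plex page for the full walkthrough):

```
//192.168.1.10/media  /mnt/media  cifs  credentials=/etc/smb-credentials,uid=1000,gid=1000  0  0
```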
I have a cloud sync task set up to perform automatic cloud backups of my critical (irreplaceable) data. For me, this is my "home" dataset (and sub-datasets) that contains all of my pictures, home movies, documents, etc. Any media (movies, songs, etc.) I generally worry about less and consider replaceable.
I use Backblaze B2 as my cloud backup solution due to the fact that it is affordable and well integrated into TrueNAS Scale. Below are the instructions for setting up backups and restoring them. See also the TrueNAS documentation for more details.
In order to set up an automated backup, first go to `Credentials > Backup Credentials` in TrueNAS and click `Add` next to `Cloud Credentials`. Select `Backblaze B2` as the provider and create an application key on the Backblaze website. Copy the key here and verify it/save.
Next, in the Backblaze UI, create a new bucket (named whatever you want it to be), set it to private, and disable encryption (we'll encrypt locally).
From there, in the TrueNAS UI go to `Data Protection` and click `Add` next to `Cloud Sync Tasks`. Set the `Direction` to `Push`, the `Transfer Mode` to `SYNC`, and the directory/files to whatever you want to back up (my home directory in my case). Then select the `Credential` we created earlier, select the bucket we just created, and select a folder in the bucket (if desired). Finally, set a schedule (I use daily at 00:00) and scroll down to tick the `Use --fast-list` option.
At this point, a basic (unencrypted) backup has been set up. This is convenient in some ways since you can browse files on Backblaze and download a single file if necessary. That being said, this requires trusting Backblaze and is less secure than encrypting the files. This can be done by ticking the `Remote Encryption` and `Filename Encryption` boxes and specifying an `Encryption Password` and `Encryption Salt`.
The encryption password and salt must be recorded somewhere safe that is backed up separately. The backed-up data can't be recovered without both the password and the salt.
From here, a dry run can be completed and the cloud sync task can be saved. The first time it runs it will likely have to upload large amounts of data to create the initial backup, but subsequent runs should just upload the changes. Note too that a `Bandwidth Limit` can be specified here to reduce the impact that large backup operations have on the network (at the expense of backup times).
First, ensure that TrueNAS is installed and a basic configuration is in place. Also, make sure that there is a pool with a dataset that the backup should be restored into. With this done, the easiest way to create the restore task is to set up the backup task exactly as was done before/described above (except for the fact that a schedule isn't necessary at this time).
Next, click on the `Restore` button next to the cloud sync task on the `Data Protection` page (it looks like a clock with an arrow encircling it). Choose a `Transfer Mode` (`COPY` or `SYNC`) and the dataset the data should be restored into, then click the `Restore` button. This creates a new cloud sync task for the restore operation but doesn't run it yet. To do this, press the `Run Now` button (looks like a play symbol) next to the backup restore cloud sync task that was just created on the `Data Protection` page. This will start the backup restore process.
If it is impractical to recreate the original backup task first, or if additional control is desired, the restore cloud sync task can be created by hand. First, go to `Credentials > Backup Credentials` in TrueNAS and click `Add` next to `Cloud Credentials`. Select `Backblaze B2` as the provider and create an application key on the Backblaze website. Copy the key here and verify it/save.

Next, in the TrueNAS UI go to `Data Protection` and click `Add` next to `Cloud Sync Tasks`. Set the `Direction` to `Pull`, the `Transfer Mode` to `SYNC` or `COPY` (as desired), and the directory/files to the location (dataset) where the backup should be restored. Then select the `Credential` we created earlier, select the bucket containing the backup, and select the folder in the bucket containing the backup (if necessary - the list should auto-populate). A schedule is probably not needed since the restore won't be ongoing, but you should scroll down to tick the `Use --fast-list` option.
If encryption was set up when the backup was created, the `Remote Encryption` and `Filename Encryption` boxes must also be ticked and the `Encryption Password` and `Encryption Salt` must be specified (the same password and salt used to create the encrypted backup in the first place). If the backup is encrypted, this step will be needed before the folders in the remote bucket can be listed.
Now press the `Dry Run` button and, assuming everything checks out, press `Save`. Finally, after saving, press the `Run Now` button (looks like a play symbol) next to the backup restore cloud sync task that was just created on the `Data Protection` page. This will start the backup restore process.
TrueNAS uses `rclone` under the hood for cloud sync tasks. This means that the Backblaze data can be restored even without using TrueNAS if necessary.

In order to do this, first create an `rclone.conf` file with the following contents:
```
[local]
type = alias
remote = /mnt/tank

[b2]
type = b2
account = <backblaze_account_id>
key = <backblaze_account_key>

[b2-crypt]
type = crypt
remote = b2:<bucket_name>
password = <password>
password2 = <salt>
```
In this example, `/mnt/tank` is the location the data will be restored to, and all of the bracketed placeholders (`<>`) should be replaced with the appropriate credentials. Note that rclone expects the crypt `password` and `password2` values stored in the config file to be obscured (run the real password and salt through `rclone obscure`) rather than written as plaintext. Then, the following command can be run to execute the restore:
```
$ rclone --config rclone.conf sync b2-crypt: local: --fast-list --progress
```
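Before kicking off a full restore, it can be reassuring to confirm that the remotes resolve and decrypt correctly by listing the top-level directories of the encrypted remote:

```
$ rclone --config rclone.conf lsd b2-crypt:
```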
In addition to the automated cloud sync backup(s) mentioned above, a local removable hard drive can also be used to create an "offline" backup. This is done using a removable drive that can be plugged in, and manually backed up to, on a somewhat infrequent schedule (such as once every 6 months). This is beneficial since it doesn't require downloading from a cloud provider in order to restore data and since it is less susceptible to malware and ransomware attacks (due to the fact that it is physically disconnected most of the time). This also allows the offline backup to be stored offsite (although nearby is still more convenient) such as in a bank safe deposit vault.
The easiest way to set this up is to use TrueNAS local replication tasks. This allows you to take advantage of ZFS features (snapshots, scrubs, etc.) and also enables incremental backups (so that only new data is copied each time). On top of that, it doesn't require any additional software and doesn't involve another computer or the network.
In order to set this up, first plug an unused HDD into the TrueNAS system. I'm using a portable USB drive and it was recognized immediately. To check that it was, go to the `Storage` tab and you should see it listed as unassigned. On this same page, set up a new pool containing the drive using the `New Pool` option. In the pool creation wizard, give it a name, enable encryption (since physical control of the drive will be reduced when it is stored offsite), set it up as a single-disk stripe (the only option with one disk, although TrueNAS will warn you that this has no redundancy), and skip all of the special VDEV sections. Once done, the new pool should show up on the `Storage` dashboard with the capacity of your portable disk. An important note: TrueNAS will likely prompt you to download the encryption key. Make sure that you do this and back it up. Without the key, you will not be able to access the data on the offline backup in the event that the primary pool is lost.
After the new pool has been created (and the encryption key backed up), go to the `Datasets` tab and create new datasets on the new pool that mirror (have the same name as) the datasets on your primary pool(s) that you want to back up. These should inherit options from the root dataset of the new pool, including encryption.
Next, go to the `Data Protection` tab and click `Add` under `Replication Tasks`. Select `On this System` for the source and destination locations, set the source to a dataset you want to back up on your main pool, and set the destination to the corresponding dataset on the backup pool. Check the `Recursive` box on the source side and the `Encryption` and `Inherit Encryption` boxes on the destination side. This will ensure that all datasets under the one you selected will be backed up and that the destination will inherit its parent's encryption settings.
Give the task a name (or accept the default) and click `Next`. On this page, select `Run Once` (since we will be manually running the task when the drive is plugged in), uncheck the option to make the destination dataset read-only, and set the destination snapshot lifetime (setting this to `Never Delete` is a reasonable option since snapshots can always be cleaned up manually).
Finally, click `Save`, which will save the task and begin running it. Repeat this process for any other datasets you wish to back up and wait for them all to complete.
In order to disconnect the drive after performing a backup (running the replication tasks), go to the `Storage` page in TrueNAS and click on the `Export/Disconnect` button next to the offline backup pool. In the dialog that appears, uncheck the `Destroy all data on this pool?` and `Delete saved configurations from TrueNAS?` options. Confirm that you'd like to export/disconnect and click the `Export/Disconnect` button. At this point, the drive can be physically unplugged.
After plugging the drive back in, go to the `Storage` tab in TrueNAS where you should see an indication that a new disk was detected (an `Unassigned Disks` message). Click on `Import Pool` in the upper right, select your offline backup pool from the dropdown, and click `Import`. Once this is done, the replication tasks can be re-run (along with other maintenance items, such as scrubs) before disconnecting the drive again (the drive may first need to be unlocked using its passphrase or keyfile from the `Datasets` page).
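If you want to double-check from the shell after importing, the pool state can be confirmed with a command like the following (the pool name `offline-backup` is just a placeholder for whatever you named yours):

```
zpool status offline-backup
```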
Another backup technique that I employ as part of my layered backup scheme is using a second local NAS. This server is smaller (fewer bays) than the primary server and is only booted once a week for long enough to perform the backups and any maintenance items (pool scrubs, S.M.A.R.T. tests, etc.).
Since this backup is updated more frequently than the offline backup, restoring from it results in a shorter period during which new data may have been lost. In addition, this additional NAS is set up with redundancy in its ZFS pool and performs frequent scrubs, so it is better able to protect against bitrot.

On the other hand, it doesn't get updated as frequently as the offsite cloud backup (although this is a choice based on power use), but it is easier and quicker to restore from. Not only that, but since the disks are a one-time cost (until they fail), it is more feasible to have large amounts of storage and therefore back up large items (such as VM disk backups and media files) that, while not irreplaceable, would certainly be a pain to recreate.
In order to set up this system, first install TrueNAS on the second server using the instructions above. Install/test your drives and set up a pool.
Next, on your primary NAS (the one you are replicating from), create a new local user under `Credentials > Local Users`. This user will be used by the second (backup) NAS to authenticate itself to the primary NAS. Give it whatever username you'd like (such as `truenas-backup-replication`) and check the `Disable Password` box since only SSH keys will be used. Next, give it a home directory (you may need to create a new dataset in your pool for this), give it a shell (such as `bash`), and select the `Allow all sudo commands` box (alternatively, add `/usr/sbin/zfs` to the `Allowed sudo commands` section). Save this user.
With that done, return to the backup NAS and navigate to `Credentials > Backup Credentials`. Click `Add` next to `SSH Connections` and use the `Semi-automatic` setup method. Fill in the rest of the fields, making sure to check the `Enable passwordless sudo for zfs commands` box. Also note that the `Username` is the username of the user that was just created on the primary NAS. The `Admin Username`/`Admin Password` are the username and password of the normal admin user on the primary NAS that you use to log in via the web interface (these credentials are needed so that the connection can be set up automatically on both sides). Once you click `Save` you should get a success message and see the connection in the UI. If you get an error message about "self signed certificates", choose to try again ignoring the error and it should succeed.
With the connection set up, create a dataset (or datasets) in the pool on the backup NAS with the same name as the one(s) you would like to back up from the primary NAS. Now go to `Data Protection` and click `Add` next to `Replication Tasks`. Select `On a Different System` for the `Source Location` and select the SSH connection that was just made. Select `Use Sudo for ZFS Commands` in the dialog box that opens (or check the box in the configuration pane). Select the source and destination datasets, give the task a name, and set any other relevant options (recursive, encryption, etc.) that you would like. Finally, save and run the task.
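Under the hood this is just ZFS replication over SSH. As a rough illustration only (the snapshot name, dataset names, and host below are all hypothetical), the pull performed by the task is roughly equivalent to running the following on the backup NAS:

```
ssh truenas-backup-replication@<primary_nas_ip> sudo zfs send -R tank/home@auto-2024-01-01 | zfs recv -F backup/home
```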
I also set up a power-on schedule for the backup server so that it can automatically boot, run its automated tasks (backup replication, pool scrubs, S.M.A.R.T. tests, etc.), and then shut down to save power. My configuration uses `ipmitool` to start the server since the 10G NIC I'm using doesn't support WoL, but other setups using WoL would work too. In addition, I'm running an LXC container specifically for sending the startup command because the backup server's iDRAC is on the management network, so a host there is needed to send the command (or rules are needed to allow the packets to cross the firewall from a different source). If setting up firewall rules, or if using WoL on the same network segment, the cron job can instead be configured on the primary TrueNAS server using the GUI.
In order to set up the cron job on a separate host using `ipmitool`, first set up a Debian LXC as you normally would on the management interface. Then, install `ipmitool` using the command `apt install ipmitool`. Also, make sure the timezone and time are correctly set by running `timedatectl set-timezone America/Denver` (or whatever your timezone is) and then verifying that the clock is correct by inspecting the output of the `date` command.
Next, configure iDRAC for IPMI management. Enable it for the server as a whole under `iDRAC Settings > Network > IPMI Settings` and check the `Enable IPMI Over LAN` box. Set the privilege level limit to `Operator` and leave the encryption key set to the default (a string of 0s). Apply those settings, then open the user management page under `iDRAC Settings > User Authentication` and the `Local Users` tab. Click on an unused user ID and select `Configure User` on the following screen. Give it a username and password and then set its `Maximum LAN User Privilege Granted` under `IPMI User Privileges` to `Operator`. You can leave Serial Over LAN and IPMI Over Serial disabled. Set the `iDRAC User Privileges` to `Operator` and allow `Login` and `System Control`. Save the user.
Now, back on the host that will send the power-on command, try the following from the command line:

```
ipmitool -H <iDRAC_IP> -I lanplus -L OPERATOR -U <username> -P <password> sensor list
```
Here, `<iDRAC_IP>` is replaced by the iDRAC IP address, and `<username>`/`<password>` are replaced by the username and password you just created in iDRAC for power on. The privilege level (designated by the `-L` flag) can also be changed, if necessary.
If everything works correctly and communication is properly established, this will print a list of sensors. Assuming that works, create a file (`/root/start_backup_server.sh` in my case) with the following contents (with the placeholders replaced with the same values as above):

```
#!/bin/bash
ipmitool -H <iDRAC_IP> -I lanplus -L OPERATOR -U <username> -P <password> chassis power on
```
Make this script executable (`chmod +x /root/start_backup_server.sh`) and test it, if desired, by shutting off the backup server and running the script to see if it boots up.
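You can also query the current power state without changing it, which is useful while testing (same placeholders as above):

```
ipmitool -H <iDRAC_IP> -I lanplus -L OPERATOR -U <username> -P <password> chassis power status
```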
Once the startup script works correctly, add the following line to the `crontab` file of the server that will send the startup command by running `crontab -e` to edit it:

```
30 0 2,9,16,23 * * /root/start_backup_server.sh
```
This will run the startup script on the 2nd, 9th, 16th, and 23rd of the month at 12:30am. The schedule can be tweaked as necessary for your use case. Make sure that the other jobs (replication tasks, scrubs, etc.) are scheduled a little after this so that the server has a chance to fully boot before they run.
In order to make the backup server hands-off and power saving (only turned on to perform backups and system checks), an automated shutdown mechanism is needed to pair with the auto power on schedule, described above. This can be accomplished by using this script that I wrote.
My script checks to ensure that there are no active S.M.A.R.T. tests, scrubs, or replication tasks and shuts the system down if that is the case. A couple of notes for using this script:
- The list of drives must be updated to include the drives you want to check for S.M.A.R.T. tests.
- The script should be placed somewhere on the backup NAS that is persisted across reboots, such as a home directory.
- The script must be made executable (`chmod +x <path_to_script>`) before it can be used.
- The script must be run as the root user (or via `sudo`). It can be tested by running it with the actual shutdown line commented out.
After saving the script onto the backup server in the appropriate location, making it executable, and testing it, we just need to set it up to run automatically. I have it do this once an hour by creating a cron job on the `System Settings > Advanced` page and clicking `Add` in the `Cron Jobs` section. Make sure to specify the script location as an absolute path (such as `/mnt/tank/local/auto_shutdown.sh`) in the `Command` box and select `root` for the `Run As User`.
You can make the schedule whatever is appropriate for your use case but, since I have my backup server set to auto boot at 12:30am and the last scheduled task kicks off at 4:30am, I have the script scheduled to run hourly between 5am and midnight every day. It is important that it doesn't run during the period when tasks are being started, as it could catch the server in between a replication and a scrub, for example, and shut it down before the scrub can start. The crontab schedule for this is `0 05-23,00 * * *`. You could also set up a separate schedule to run the script between the hours of 1am and 4am on all days except those on which the scheduled jobs start (see the sketch below). This would ensure that the server shuts off within an hour of its tasks finishing, regardless of the time. This extra complexity wasn't worth the savings to me, but it is an option as well.
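As a sketch only, that optional second schedule could be expressed in crontab syntax like this; the day-of-month list simply skips the 2nd, 9th, 16th, and 23rd used by the power-on job, and the script path matches the example above:

```
0 1-4 1,3-8,10-15,17-22,24-31 * * /mnt/tank/local/auto_shutdown.sh
```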
Also note that the script can't be saved anywhere on the boot-pool since it has the `noexec` mount option set. For example, if it were saved in the admin user's home directory, you could run `findmnt --target /home/admin/auto_shutdown.sh` to see that this location is part of the boot-pool and that `noexec` is set. This means that the script must be added to a new dataset in a different pool. For me, this is a `local` dataset that is part of my primary `tank` pool.
The configuration file can be downloaded for backup by visiting `System Settings > General`, clicking on `Manage Configuration` in the upper right-hand corner, and selecting `Download File`. Selecting `Export Password Secret Seed` on the subsequent dialog box will allow the configuration to be used on a new boot device if the current device becomes corrupted. That being said, the configuration then contains secrets and should not be stored publicly.
The configuration can be restored using a similar process. Visit `System Settings > General`, click on `Manage Configuration` in the upper right-hand corner, and select `Upload File`. Choose the file (locally) and click `Upload`.
TrueNAS can be monitored on a Grafana dashboard by using Telegraf as an intermediary to convert the Graphite metrics format exported by TrueNAS into a format that can be consumed by InfluxDB (and therefore used by Grafana).
In order to do this, ensure that there is a Telegraf container configured in Docker as part of the Telegraf stack. Be sure to create a token for this use case in InfluxDB and to set it in the `.env` file used by the Telegraf stack. Also be sure to put the TrueNAS Graphite converter Telegraf config in the same directory (specifying the bucket, organization, etc. as necessary) and to reference it from the `docker-compose.yaml` file. This is what defines the Telegraf container that will receive data in the Graphite format and publish it to InfluxDB. Finally, make sure that the firewall is configured to allow TrueNAS to access the Docker host where the Telegraf container is running on the TCP port exposed for the container in the `docker-compose.yaml` file. You will also need to make sure that the firewall allows the Telegraf container to access InfluxDB on the necessary port.
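The actual converter config lives alongside the Telegraf stack, but as a rough sketch of what it does (assuming an InfluxDB v2 output; the listener port, organization, and bucket names here are placeholders), the relevant pieces of such a Telegraf config look something like:

```
[[inputs.socket_listener]]
  service_address = "tcp://:2003"   # port exposed in docker-compose.yaml (placeholder)
  data_format = "graphite"

[[outputs.influxdb_v2]]
  urls = ["http://<influxdb_ip>:8086"]
  token = "${INFLUX_TOKEN}"
  organization = "homelab"          # placeholder
  bucket = "truenas"                # placeholder
```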
With this configured, go to the `Reporting` tab in TrueNAS and select `Exporters` in the upper right. Click `Add` on the next page and, in the resulting dialog box, enter the name of the exporter (anything works here - this is just so that you can identify it), select `GRAPHITE` as the type, enter the IP of the Docker host for the `Destination Ip`, enter the port of the Telegraf container for the `Destination Port`, and enter anything you'd like as the `Namespace`. Save and ensure that the exporter is enabled. At this point, you should start seeing data populated in InfluxDB.
While the above technique exports most necessary metrics, those exported by the `zpool_influxdb` tool are not included. This tool includes some useful stats, such as pool usage and fragmentation, but Telegraf (running in a container) can be used to export them to the same InfluxDB bucket. Credit for this setup goes to this Reddit post.
In order to do this, first create a new dataset and directory that will be used for monitoring config files. For me, the dataset name is `monitoring` and I created a `telegraf` directory inside of it from the shell. My pool name is `tank`, so the full path is `/mnt/tank/monitoring/telegraf`.
Inside of this directory, place all of the TrueNAS Telegraf files and modify `telegraf.conf` with the correct bucket/organization/IP address/key for the InfluxDB instance that will be used. Also, make the rest of the scripts executable. Run the `setup.sh` script with the command `./setup.sh` in order to create the necessary links and directories.
Next, go to the `Apps` tab in TrueNAS and click the `Launch Docker Image` button. This Docker image will need to be configured with a number of settings in order to publish metrics correctly. These are given below:
Container settings:
Setting | Value |
---|---|
Application Name | telegraf |
Image repository | telegraf |
Image Tag | latest* |
* The image tag may have to be changed to a specific (older) version if `GLIBC version not found` errors prevent the container from launching. These can be found in the logs.
Environment variables:
Name | Value |
---|---|
HOST_ETC | /hostfs/etc |
HOST_PROC | /hostfs/proc |
HOST_SYS | /hostfs/sys |
HOST_VAR | /hostfs/var |
HOST_RUN | /hostfs/run |
HOST_MOUNT_PREFIX | /hostfs |
LD_LIBRARY_PATH | /mnt/zfs_libs |
Port forwarding:
Container Port | Node Port | Protocol |
---|---|---|
8094 | 9094 | TCP |
Storage:
Host path | Mount path | Notes |
---|---|---|
/mnt/tank/monitoring/telegraf/telegraf.conf | /etc/telegraf/telegraf.conf | |
/mnt/tank/monitoring/telegraf/etc | /hostfs/etc | |
/mnt/tank/monitoring/telegraf/proc | /hostfs/proc | |
/mnt/tank/monitoring/telegraf/sys | /hostfs/sys | |
/mnt/tank/monitoring/telegraf/run | /hostfs/run | |
/mnt/tank/monitoring/telegraf/entrypoint.sh | /entrypoint.sh | |
/mnt/tank/monitoring/telegraf/zfs_libs | /mnt/zfs_libs | |
Workload details:
Setting | Value |
---|---|
Privileged Mode | Enabled |
Configure Container User and Group ID | Enabled |
Run Container As User | 0 |
Run Container As Group | 0 |
Ensure that the firewall is configured so that the container running in TrueNAS can access the InfluxDB server on the TCP port it uses (`8086` in my case). Start the container and check the logs for errors. You should start seeing data populated in InfluxDB.
With the Docker images for Graphite conversion and Telegraf metrics exporting configured and launched, data should be flowing from TrueNAS into InfluxDB. Now, the Grafana dashboard config can be imported to start displaying the data. Note that the host, pool, and other fields may need to be adjusted in the dashboard for the specific setup being used.