Basic IRIS Server Troubleshooting Commands

This page describes some basic troubleshooting commands that can be used on an IRIS server to verify the server is running, check device connectivity, etc. This guide assumes you are running IRIS on a Fedora-based Linux distribution (e.g. Fedora, Red Hat Enterprise Linux, Rocky Linux, etc.).

This covers only the most basic details. For more information, see the IRIS Admin Guide. You may also find useful information in the server operating system's official documentation and man pages, or in the many help and tutorial materials available on the internet.

All commands on this page assume you can connect to the IRIS server over a Secure Shell (SSH) connection with a user account that has administrator (sudo) privileges. If you cannot connect or do not have the appropriate permissions, contact your system administrator.

User Permissions

Many commands in this guide require root (sudo) access. It is generally best to run these commands with sudo as shown in this guide, but it can sometimes be useful to enter a root shell so you do not have to enter your password repeatedly. To do this, run:

sudo -i

and enter your password. This will put you in a root shell so you can access all resources without restriction. Be very careful with this, as it can be easy to cause problems with an incorrect command or other error.

To leave the root shell, press Ctrl + D or type exit.

IRIS Service

IRIS (and most or all other programs on your server) uses the systemd software suite to manage and run services on the system. systemd has many features, but basic service management (performed using the systemctl tool) is simple.

To verify that the IRIS service is running, run:

sudo systemctl status iris.service

This will indicate whether or not the service is running and provide some status information. To restart the IRIS service, run:

sudo systemctl restart iris.service

To verify that IRIS started successfully, run the status command above, or check the IRIS error log file with:

sudo tail /var/log/iris/iris.stderr

This will print the last few lines of the file. If IRIS started successfully, you should see something like:

IRIS 5.42.2 restarted @ Fri Dec 29 13:40:09 CST 2023
Assertions are turned off.
IRIS Server active

Note that the IRIS version indicated may differ, and the time should be the time the service was restarted.
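
If you check this often, the test can be scripted. The following is a minimal sketch and is not part of IRIS: the function name iris_started_ok and the 20-line window are assumptions chosen for illustration, and the match string is taken from the sample output above.

```shell
# Hypothetical helper (not shipped with IRIS): succeed if the startup
# message appears in the last 20 lines of the error log.
# $1 = log file (defaults to the path used in this guide)
iris_started_ok() {
    log_file="${1:-/var/log/iris/iris.stderr}"
    # Output is discarded; only grep's exit status is used.
    tail -n 20 "$log_file" | grep "IRIS Server active" >/dev/null
}
```

Defined in a root shell, iris_started_ok && echo "IRIS is up" prints the message only when the startup line was found.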

If there was a problem, this file will typically show information about the errors that are preventing IRIS from starting. These may come quickly and span many lines, so you may find it useful to tail the file continuously with:

sudo tail -f /var/log/iris/iris.stderr

This will print the output from the file to the terminal as it is written so you can see the errors as they are logged. To stop this, type Ctrl + C (this will only stop the log file contents from being output; the server will remain active).

You can also open the file with a text editor such as nano:

sudo nano /var/log/iris/iris.stderr

Once the service has been started, systemd will attempt to restart IRIS every 2 seconds, indefinitely, until it runs. If an error is preventing IRIS from starting, systemd will simply try and fail repeatedly. To stop the server (for instance, to address any errors), run:

sudo systemctl stop iris.service

Once the errors are addressed, you can start the service again with:

sudo systemctl start iris.service

PostgreSQL

IRIS stores most of its configuration data in a PostgreSQL database running on the same server. For IRIS to run, the PostgreSQL service must be running, and IRIS must be able to connect to it (typically over TCP port 5432). To check the PostgreSQL server, run:

sudo systemctl status postgresql-15.service

Note that you may have to change the version in the command (e.g., sudo systemctl status postgresql-16.service) depending on the version installed.

You can also use the systemctl start|stop|restart commands indicated in the previous section.

IRIS will connect to the PostgreSQL database defined in the db.url attribute of /etc/iris/iris-server.properties, using the user and password specified in db.user and db.password. These should all be configured by the IRIS installation script and should not require changing.
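
To confirm which database IRIS is pointed at without opening the file in an editor, you can print a single property. This is a sketch under stated assumptions: get_prop is a hypothetical helper name, and it assumes simple key=value lines with no spaces around the equals sign and no line continuations.

```shell
# Hypothetical helper: print the value of one property from a Java-style
# properties file. Assumes plain "key=value" lines (no spaces around "=",
# no continuation lines).
# $1 = property name, $2 = properties file
get_prop() {
    # Split on "=", match the key exactly, then strip everything up to
    # the first "=" so values containing "=" are printed whole.
    awk -F= -v key="$1" '$1 == key { sub(/^[^=]*=/, ""); print; exit }' "$2"
}
```

For example, get_prop db.url /etc/iris/iris-server.properties prints the connection URL. Use a root shell if your account cannot read the file, and avoid printing db.password on a shared screen.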

On Fedora-based distributions, the PostgreSQL server installation directory will generally be found in /var/lib/pgsql/<version>. For example, for PostgreSQL 15, you would find it in:

/var/lib/pgsql/15/

Configuration files will be found in /var/lib/pgsql/15/data/, and log files in /var/lib/pgsql/15/data/log/. If you suspect something is wrong with your PostgreSQL configuration, consult your IRIS system administrator, the IRIS Admin Guide, or the PostgreSQL documentation.

NGINX

IRIS uses the NGINX web server to (among other things) serve Java Network Launch Protocol (JNLP) files for running the IRIS client. To check the NGINX service, run:

sudo systemctl status nginx.service

The NGINX configuration directory is located in /etc/nginx/. The IRIS installation process should configure NGINX as needed, but if you suspect something needs to be adjusted, consult your IRIS system administrator, the IRIS Admin Guide, or the NGINX documentation.
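
Since IRIS depends on all three services described so far, it can be handy to check them in one pass. The sketch below uses systemctl is-active, which prints a unit's state (active, inactive, failed, and so on). The function name service_summary is an assumption for illustration, and the PostgreSQL unit name depends on the version installed.

```shell
# Hypothetical helper: print one line per unit with its current state.
# Arguments: one or more systemd unit names.
service_summary() {
    for unit in "$@"; do
        # is-active exits non-zero when the unit is not active, so
        # "|| true" keeps the loop going in that case.
        state="$(systemctl is-active "$unit" 2>/dev/null || true)"
        printf '%-28s %s\n' "$unit" "${state:-unknown}"
    done
}
```

For example: service_summary iris.service postgresql-15.service nginx.service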

IRIS Logs

IRIS has extensive logging code that can be enabled by module as needed, with logs placed in the /var/log/iris/ directory. For detailed information on what log files are available, see the Troubleshooting section of the IRIS Admin Guide.

Aside from the iris.stdout and iris.stderr logs, all logging in IRIS is off by default and must first be enabled. Once enabled, a logging module will continue outputting log information until it is explicitly stopped.

To enable a logging module (for instance, the module for DMS scheduled message activities), create an empty log file owned by the tms user:

sudo -u tms touch /var/log/iris/sched

When (and if) IRIS performs activities related to DMS scheduled messages, they will immediately be logged to this file. You may find it useful to watch this file (or other log files) with:

sudo tail -f /var/log/iris/sched

This will print log messages in real-time so you can observe what IRIS is doing as it happens.

To disable logging and remove the log file, run:

sudo rm /var/log/iris/sched

You can also disable logging and keep the log file by renaming it, for example:

sudo mv /var/log/iris/sched /var/log/iris/sched.1

NTCIP Logs

IRIS logs messages for NTCIP-protocol devices (predominantly DMS and ESS) to the /var/log/iris/ntcip file. This file can be useful when diagnosing issues with these devices; however, it is shared by all NTCIP devices and can therefore be somewhat difficult to inspect. The grep command line tool can make this process a little easier by allowing you to filter to a specific device.

To filter to a specific device in real-time, run:

sudo tail -f /var/log/iris/ntcip | grep --line-buffered <device_name>

replacing <device_name> with the name of the device in IRIS (which must match exactly). Note that the --line-buffered option is important for real-time monitoring to ensure the output is printed with each line. You can also add more filters if you like, for example:

sudo tail -f /var/log/iris/ntcip | grep --line-buffered <device_name> | grep --line-buffered dmsMessageMultiString

will show the MULTI string attribute that is sent to the device, which can be useful for debugging message content/appearance. You may wish to consult the relevant NTCIP standard document while using IRIS' NTCIP log for troubleshooting.
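
By default, grep treats the device name as a regular expression and matches it anywhere in a line, so a filter for one device can also pick up other devices whose names begin with the same characters. The following sketch tightens the match. The function name filter_device is an assumption for illustration, and whether whole-word matching suits your deployment depends on how your devices are named.

```shell
# Hypothetical helper: filter log lines read from standard input down to
# one device. -F treats the name as a literal string rather than a regular
# expression, and -w requires a whole-word match so that "dms1" does not
# also match "dms10".
# $1 = device name
filter_device() {
    grep --line-buffered -F -w -e "$1"
}
```

For example: sudo tail -f /var/log/iris/ntcip | filter_device <device_name>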

Historical Logs and Log Rotation

If you leave a log file in place, IRIS will continue outputting messages to that file indefinitely. This allows you to inspect historical log records, which can be useful for diagnosing issues after the fact, but also can consume large amounts of storage space. If you leave logging on, be sure to monitor the storage space used so the partition does not run out of space (which can cause critical issues).
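
One way to keep an eye on storage is to check how full the partition holding the logs is. The sketch below relies on the POSIX output format of df -P; the function name partition_use_pct is an assumption for illustration.

```shell
# Hypothetical helper: print the percentage of space used on the
# partition that holds the given path.
# $1 = any path on the partition of interest
partition_use_pct() {
    # With -P, line 2 holds the data and field 5 is the capacity ("42%").
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, partition_use_pct /var/log/iris prints a number between 0 and 100. To see how much of that space the IRIS logs themselves occupy, run sudo du -sh /var/log/iris.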

It is also possible to configure the server to rotate log files, which puts a cap on the maximum storage space that will be used (note that this is an implicit cap, so you still need to monitor usage). SRF typically configures log rotation on the IRIS deployments we manage using the logrotate tool, with the configuration defined in /etc/logrotate.d/iris. This configuration will rotate log files up to once per day (less frequently if a log does not get used much), and will compress log files after a period of time to save space.

To inspect a previous log file, you can open it with a text editor, for example:

sudo nano /var/log/iris/ntcip-20231229

You can also use the grep tool to filter for a specific device (note that the log file name is the second argument):

sudo grep <device_name> /var/log/iris/ntcip-20231229

Or, alternatively:

sudo cat /var/log/iris/ntcip-20231229 | grep <device_name>

You can also chain additional grep filters onto these commands, as with the real-time monitoring above.

Since these commands can produce a large amount of output, the results may be difficult to inspect in the terminal. You can pipe the output to the less command to scroll and search through it. For example:

sudo cat /var/log/iris/ntcip-20231229 | grep <device_name> | less

This will let you scroll through the filtered file contents using the arrow, Page Up/Down, or Home/End keys. To search in the output, type a forward slash character (/), enter the contents to search for, and hit Enter. To exit, type q. For more information on navigation features, run man less.

If you want to inspect an older file, it may be compressed. You won't be able to view it directly with a text editor, but you can use the zcat tool to decompress it on the fly, then inspect it with less. For example:

sudo zcat /var/log/iris/ntcip-20231225.gz | less

Note that this is zcat and not cat. You can also add device/other filters to this with grep, for example:

sudo zcat /var/log/iris/ntcip-20231225.gz | grep <device_name> | less
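
When you are not sure which day an event occurred, it can help to search several rotated files at once, whether or not they are compressed. The sketch below uses gzip -cdf, which decompresses gzip files to standard output and passes other files through unchanged. The function name search_logs is an assumption for illustration.

```shell
# Hypothetical helper: search a mix of plain and gzip-compressed log
# files for a literal string.
# $1 = text to search for; remaining arguments = log files
search_logs() {
    pattern="$1"
    shift
    # -c writes to stdout, -d decompresses, -f passes through files that
    # are not in gzip format.
    gzip -cdf -- "$@" | grep -F -e "$pattern"
}
```

For example, in a root shell: search_logs <device_name> /var/log/iris/ntcip-* | less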

Device Connectivity Testing

When configuring a new device, or when a device stops working in IRIS, a good first step is to check if the server can communicate with it directly. First, see if you can ping it (replacing 10.1.2.3 with the device's IP address):

ping 10.1.2.3

Wait for a few seconds, then type Ctrl+C to stop pinging. You should see responses coming back from the device. If you don't see any output, or see Destination Host Unreachable, the device is either down or otherwise unreachable over the network. You should verify the device is powered up and connected to the network (performing a field visit if needed). If the device is connected via a cellular modem, verify that the modem is working properly, that any port forwards are configured properly, and that the modem has service. If the device is on a fiber optic or other wired network, verify that the network link is functioning as expected and that the rest of the network path is working.
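
If you need to check several devices, a bounded ping that stops on its own is easier to script than one you interrupt by hand. This is a minimal sketch: the function name check_ping, the count of 3 requests, and the 2-second wait are assumptions chosen for illustration.

```shell
# Hypothetical helper: send 3 echo requests, waiting up to 2 seconds for
# each reply, and report the result on one line.
# $1 = device IP address or host name
check_ping() {
    if ping -c 3 -W 2 "$1" >/dev/null 2>&1; then
        echo "$1 reachable"
    else
        echo "$1 unreachable"
        return 1
    fi
}
```

For example: check_ping 10.1.2.3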

If you can ping the device but are still having issues in IRIS, the next step is to run a port scan. Note that you may need to install the nmap tool with sudo dnf install nmap first. This procedure will also not be particularly helpful with UDP ports, in which case you should skip to the next section.

To check TCP port 161 (SNMP/NTCIP), run:

sudo nmap -p 161 10.1.2.3

If the output includes:

PORT    STATE SERVICE
161/tcp open  snmp

the port is open and IRIS should be able to at least connect to the device. If you see:

PORT    STATE   SERVICE
161/tcp closed  snmp

the device is not listening on the port in question. This may be due to a device behind a cell modem port forward being down, or the port may be in use (for some devices). The device may also be malfunctioning and require a reset or other maintenance. If you see:

PORT    STATE     SERVICE
161/tcp filtered  snmp

there is most likely a firewall preventing the server from connecting on the port in question. You should contact your network administrator to check the firewall configuration.
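
The three outcomes above can be summarized in a small script. This is a sketch only: the function names port_state and explain_state are assumptions for illustration, and the parsing assumes the PORT/STATE/SERVICE table layout shown in this section.

```shell
# Hypothetical helper: read nmap output on standard input and print the
# STATE column for the given port.
# $1 = port and protocol as nmap prints them, e.g. "161/tcp"
port_state() {
    awk -v port="$1" '$1 == port { print $2; exit }'
}

# Hypothetical helper: map a state to the guidance given in this section.
explain_state() {
    case "$1" in
        open)     echo "open: IRIS should be able to connect" ;;
        closed)   echo "closed: device is not listening on this port" ;;
        filtered) echo "filtered: a firewall is probably blocking the port" ;;
        *)        echo "unknown state: ${1:-none found}" ;;
    esac
}
```

For example: explain_state "$(sudo nmap -p 161 10.1.2.3 | port_state 161/tcp)"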

Further Troubleshooting

If the port is open and you are still having issues, you can try connecting with a different program. For NTCIP devices, the snmpwalk command will be useful (you may need to install it first with sudo dnf install net-snmp-utils). For example,

snmpwalk -v1 -c public udp:10.1.2.3:161 1.3.6.1.4.1.1206.4.2.5.2

will traverse the MIB section specified by the OID provided (in this case ESS data), printing the output to the terminal. You can specify any valid OID in this command. If the device is working and can provide the requested attribute, you should see a response.
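
A full walk can take a while on a slow link, so a single quick request is sometimes a better first test. The sketch below asks for sysDescr.0 (OID 1.3.6.1.2.1.1.1.0), a standard MIB-II object that most SNMP agents implement. The function name snmp_alive, the public community default, and the timeout and retry values are assumptions for illustration.

```shell
# Hypothetical helper: request sysDescr.0 with a 2-second timeout (-t)
# and one retry (-r) so an unresponsive device fails quickly.
# $1 = device IP address, $2 = community string (defaults to "public")
snmp_alive() {
    host="$1"
    community="${2:-public}"
    snmpget -v1 -c "$community" -t 2 -r 1 "udp:$host:161" 1.3.6.1.2.1.1.1.0
}
```

For example, snmp_alive 10.1.2.3 prints the device's description string if the device answers, and a timeout message if it does not.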

If you continue having difficulty past this point, you may be able to try a packet capture with the Wireshark command line tool tshark.

Video Proxy Server

SRF IRIS deployments are often paired with a video proxy server, running on a separate server. This is implemented as a systemd service that can be administered like the other services noted above. To check the proxy service, run (on the proxy server):

sudo systemctl status media-mtx.service

If the proxy is not working for some reason, you can try restarting it with:

sudo systemctl restart media-mtx.service

There are also log files that can be viewed in /var/log/mediamtx/.

If restarting the proxy server does not fix the problem, contact your IRIS system administrator.
