Running a test job
Previous installation step: Install spank_iso_netns for 7lbd
Next installation step: Finish the Windows VM
Since we have not installed or configured the optional oodproxy mTLS connector, it should be disabled in before.sh.erb.
# before.sh.erb
export guacd_rdp_enabled=1
export ws_console_enabled=1
export tls_proxy_enabled=0
After.sh.erb
The after.sh.erb script runs several checks to make sure that the job is healthy before exposing the connection buttons to the user. It assumes that the only connector we really care about is the Guacamole connector. The script waits for:
- RDP port 3389 to be up on the VM
- SMB to be up on port 445
- guacd to be up on port 4822
- The Windows VM to be fully booted, to have run the password-changing script, and to have sent a ready signal on port 54321
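When debugging, the same kind of port check can be run by hand. The following is a minimal sketch, not the code from after.sh.erb: the `wait_for_port` helper, its arguments, and the use of bash's `/dev/tcp` redirection are assumptions made for illustration, and the real script may test the ports differently. It has to be run from somewhere the service is actually reachable.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from after.sh.erb: poll a TCP port until
# it accepts a connection or the timeout (in seconds) expires.
wait_for_port() {
    local host=$1 port=$2 timeout_s=${3:-30}
    local deadline=$(( SECONDS + timeout_s ))
    while (( SECONDS < deadline )); do
        # bash opens a TCP connection when redirecting to /dev/tcp/HOST/PORT.
        # coreutils "timeout" caps each attempt at 2 seconds so that a
        # filtered port cannot stall the loop.
        if timeout 2 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null; then
            return 0
        fi
        sleep 1
    done
    return 1
}
```

For example, `wait_for_port 127.0.0.1 4822 10 && echo "guacd is up"` would wait up to ten seconds for guacd. The address is a placeholder for wherever the service listens in your setup.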
At this point, let's completely disable the after.sh.erb script by temporarily renaming it (to something like after.sh.erb.disabled). Disabling the script has two effects. First, the connection buttons on the job card come up immediately, so we can attempt to connect even if not everything is running correctly. Second, the script's timeout no longer applies: normally, if the job does not pass all of the checks in after.sh.erb within a certain amount of time, the script kills the job. With after.sh.erb disabled, we can look at the job processes and logs without a timer ticking. We will re-enable after.sh.erb later.
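The rename is a plain `mv`. As a sketch, the two hypothetical helpers below wrap it so that the script can be disabled now and restored later. The function names and the `app_dir` argument are illustrative assumptions; `app_dir` must be the directory that contains after.sh.erb (in an Open OnDemand batch connect app this is normally the app's template directory).

```shell
#!/usr/bin/env bash
# Hypothetical helpers; app_dir is the directory that holds after.sh.erb
# and must be adjusted to your installation.
disable_after_script() {
    local app_dir=$1
    mv -- "${app_dir}/after.sh.erb" "${app_dir}/after.sh.erb.disabled"
}

enable_after_script() {
    local app_dir=$1
    mv -- "${app_dir}/after.sh.erb.disabled" "${app_dir}/after.sh.erb"
}
```

Jobs launched after `disable_after_script` has run should start without the health checks. Call `enable_after_script` with the same directory to bring them back.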
Run a test job
It's time to run a test job. Watch the logs as described below.
Job Logs
All of the scripts in the Open OnDemand application write both stdout and stderr to output.log in the job folder. Each script should prefix its log output with its own name, as seen below. The following log shows a successful job launch with all three connectors and a Samba server running. Your job will not have all three connectors or a Samba server.
# output.log
BEFORE.SH Before script running...
BEFORE.SH Host: computenode1
BEFORE.SH: Slurm Job ID: 2514118
BEFORE.SH: rdp_credentials created at /home/joestudent/ondemand/data/sys/dashboard/batch_connect/dev/7lbd_ood/output/477435d5-9d12-486c-9cc2-7dc27bfd9f2a/rdp_credentials
BEFORE.SH: Guacd RDP Port: 58765
Script starting...
AFTER.SH: After script running:
SCRIPT.SH: script running
SCRIPT.SH: Starting VM...
Formatting '/tmp/ood_iso_netns_2514118/overlay_image.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off compression_type=zlib size=68719476736 backing_file=/apps/.vd/latest.qcow2 backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
SAMBA.SH: Tue Mar 11 10:52:58 AM MDT 2025: Starting smbd...
SAMBA.SH: Preload launching smbd...
GUACD_RDP.SH: Loading modules for guac...
2025.03.11 10:52:58 LOG5[ui]: stunnel 5.71 on x86_64-redhat-linux-gnu platform
2025.03.11 10:52:58 LOG5[ui]: Compiled/running with OpenSSL 3.0.7 1 Nov 2022
2025.03.11 10:52:58 LOG5[ui]: Threading:PTHREAD Sockets:POLL,IPv6,SYSTEMD TLS:ENGINE,FIPS,OCSP,PSK,SNI
2025.03.11 10:52:58 LOG5[ui]: Reading configuration from file /tmp/.oodproxy-IS3Xcl/stunnel.conf
2025.03.11 10:52:58 LOG5[ui]: UTF-8 byte order mark not detected
2025.03.11 10:52:58 LOG5[ui]: FIPS mode enabled
2025.03.11 10:52:58 LOG5[ui]: Configuration successful
2025.03.11 10:52:58 LOG5[per-day]: Updating DH parameters
smbd version 4.19.4 started.
Copyright Andrew Tridgell and the Samba Team 1992-2023
AFTER.SH: Waiting for RDP port 3389...
AFTER.SH: Waiting for SMB port 445...
AFTER.SH: Waiting for guacd port 4822...
AFTER.SH: Waiting for VM ready signal on port 54321...
GUACD_RDP.SH: launching guacd:...
GUACD_RDP.SH: Starting Guacd RDP Connector...
Successfully created /home/joestudent/ondemand/data/sys/dashboard/batch_connect/dev/7lbd_ood/output/477435d5-9d12-486c-9cc2-7dc27bfd9f2a/smbios_data.bin with proper permissions
SCRIPT.SH: Launching VM
INFO: gocryptfs not found, will not be able to use gocryptfs
WS_CONSOLE.SH: Loading required modules
guacd[4158270]: INFO: Guacamole proxy daemon (guacd) version 1.5.5 started
guacd[4158270]: INFO: Listening on host 0.0.0.0, port 4822
AFTER.SH: Waiting for RDP port 3389...
AFTER.SH: Waiting for guacd port 4822...
AFTER.SH: Waiting for VM ready signal on port 54321...
AFTER.SH: Waiting for VM ready signal on port 54321...
GUACD_CONNECTOR: (guacd_rdp.json) SPANK_ISO_NETNS_LISTENING_FD_0: 13
GUACD_CONNECTOR: (guacd_rdp.json) SPANK_ISO_NETNS_LISTENING_PORT_0: 58765
GUACD_CONNECTOR: (guacd_rdp.json) script_path: /home/joestudent/ondemand/data/sys/dashboard/batch_connect/dev/7lbd_ood/output/477435d5-9d12-486c-9cc2-7dc27bfd9f2a
GUACD_CONNECTOR: (guacd_rdp.json) Attempting to load credentials from: /home/joestudent/ondemand/data/sys/dashboard/batch_connect/dev/7lbd_ood/output/477435d5-9d12-486c-9cc2-7dc27bfd9f2a/rdp_credentials
GUACD_CONNECTOR: (guacd_rdp.json) Credentials loaded successfully
GUACD_CONNECTOR: (guacd_rdp.json) Attempting to load configuration from: /home/joestudent/ondemand/data/sys/dashboard/batch_connect/dev/7lbd_ood/output/477435d5-9d12-486c-9cc2-7dc27bfd9f2a/guacd_rdp.json
GUACD_CONNECTOR: (guacd_rdp.json) Connection configuration loaded successfully
GUACD_CONNECTOR: (guacd_rdp.json) Server is running on the passed-in socket.
AFTER.SH: Waiting for VM ready signal on port 54321...
WS_CONSOLE: WebSockify FD: 14
WS_CONSOLE: WebSockify PORT: 45709
AFTER.SH: Waiting for VM ready signal on port 54321...
WebSocket server settings:
- Listen for inetd connections
- No SSL/TLS support (no cert file)
- proxying from inetd to targets generated by TokenFile
AFTER.SH: Waiting for VM ready signal on port 54321...
AFTER.SH: Waiting for VM ready signal on port 54321...
AFTER.SH: Waiting for VM ready signal on port 54321...
AFTER.SH: Waiting for VM ready signal on port 54321...
AFTER.SH: All required services are up and VM signaled ready!
Generating connection YAML file...
If there is a problem with any of the code that launches your job, it should be logged in output.log.
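Because each script tags its lines with its own name, the log can be narrowed down to a single script. The helper below is a small sketch that relies only on that prefix convention; the name `log_lines_for` and its arguments are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the lines of a job log that contain the
# given script prefix (for example "AFTER.SH" or "GUACD_RDP.SH").
log_lines_for() {
    local prefix=$1 log_file=${2:-output.log}
    # -F treats the prefix as a literal string, so the dot in "AFTER.SH"
    # is not interpreted as a regex wildcard.
    grep -F -- "${prefix}" "${log_file}"
}
```

For example, `log_lines_for "AFTER.SH" output.log` shows only the health-check messages. To follow the whole log while the job starts, `tail -f output.log` in the job folder is usually enough.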
Check the VM
Attempt to connect using the Web connector. If this works, then the Windows PowerShell startup script and the OOD code that delivers the password to the browser are working. You can see the temporary password in the rdp_credentials file in the job folder. If the Guacamole web connector is not working, see the password troubleshooting information below.
Next, test to make sure that SMB is working. The shares created in the previous steps should be there.
Troubleshooting password issues
There are a few steps in delivering a new password to the VM.
- before.sh.erb creates a random password and saves it to rdp_credentials in the job folder
- smbios.sh reads the rdp_credentials file and creates a smbios_data.bin file in the job folder
- script.sh.erb runs qemu with the -smbios parameter, inserting the username and password into the VM's SMBIOS
- the win_userconfig.ps1 startup script runs in Windows, reading the information from SMBIOS and running the password change command
First, check the rdp_credentials and smbios_data.bin files in the job folder. Then look for output from the smbios.sh script in the log.
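A quick way to run this first check is sketched below. The helper name `check_password_files` and its `job_dir` argument are hypothetical; the function only tests that both files exist and are non-empty and does not validate their contents.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether the two password-delivery files
# exist and are non-empty in the given job folder.
check_password_files() {
    local job_dir=$1 status=0 f
    for f in rdp_credentials smbios_data.bin; do
        if [[ -s "${job_dir}/${f}" ]]; then
            echo "OK: ${f}"
        else
            echo "MISSING OR EMPTY: ${f}"
            status=1
        fi
    done
    return "${status}"
}
```

Run it with the job folder as its argument. If a file is reported missing, searching the log with `grep -i smbios output.log` may show where the chain broke.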
If the problem seems to be in the Windows VM, we need to determine whether it lies with SMBIOS or with the PowerShell startup script. To check whether the data is in SMBIOS, download dmidecode for Windows on a machine with Internet access: https://gnuwin32.sourceforge.net/packages/dmidecode.htm. Then access dmidecode.exe from a Samba share.
Run dmidecode and see if the SMBIOS information is making it into your VM. The output should contain something like this:
OEM Strings
String 1: user=joestudent
String 2: password="X8asd3afsdASd3VX"
If the OEM Strings are not there, make sure that
-smbios file=${script_path}/smbios_data.bin \
is in script.sh.erb. If the -smbios option is in script.sh.erb and the OEM Strings still do not show up, check output.log for errors from smbios.sh.
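Checking for the option can be scripted. The sketch below uses a hypothetical helper, `has_smbios_option`, which succeeds when the file it is given contains a `-smbios file=` option that points at smbios_data.bin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given script passes a -smbios
# option that points at smbios_data.bin.
has_smbios_option() {
    local script_file=$1
    # "--" stops grep from parsing the pattern as one of its own options.
    grep -q -- '-smbios file=.*smbios_data\.bin' "${script_file}"
}
```

For example, `has_smbios_option script.sh.erb || echo "-smbios option not found"` prints a warning when the line is missing.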