03.04 preset projects - advantech-EdgeAI/edge_agent GitHub Wiki
The Smoke and Fire Detection project uses a Vision Language Model (VLM) to analyze video footage in real time and determine whether smoke or fire is present. When either hazard is detected, the system triggers an immediate alarm, including visual notifications and voice alerts. This project enhances safety in a variety of environments by providing early warnings of potential fire-related incidents.
To run this project, you can load one of the following presets:

- `SMOKE_ALERT_WEBRTC_ADVAN` (for smoke detection)
- `FIRE_ALERT_WEBRTC_ADVAN` (for fire detection)
Note:

- Ensure the Edge Agent is running and accessible via your web browser.
- The demonstration video files must be located in the `/ssd/jetson-containers/data/videos/` directory on your Jetson device:
  - For smoke detection: `Smoke_advan.mp4`
  - For fire detection: `Fire_advan.mp4`
*Figure 3.2 — Pipeline overview: smoke detection*

*Figure 3.3 — Pipeline overview: fire detection*
This project utilizes the following key nodes connected in a pipeline:

- `VideoSource`: Provides the video input (e.g., `Smoke_advan.mp4` or `Fire_advan.mp4`).
- `RateLimit`: Controls the frame processing rate (e.g., 15 fps) for the VLM.
- `AutoPrompt_ICL`: Formats the input for the VLM by combining the video frame with a specific question about smoke or fire.
- `Llama-3-VILA-1.5-8B` (loaded via the `NanoLLM_ICL` node): The Vision Language Model that analyzes the image and prompt to detect smoke or fire.
- `VideoOverlay`: Displays the VLM's response or alert status directly on the video feed.
- `VideoOutput`: Shows the final video stream with overlays.
- `One_Step_Alert`: Processes the VLM's output to trigger an alarm if smoke/fire is confirmed ("yes").
- `PiperTTS` module (preset): A pre-configured set of nodes for text-to-speech voice alerts.
- `TextStream`: Displays the raw text output from the VLM.
Data Flow: The `VideoSource` sends frames to `RateLimit`. The rate-limited frames go to `AutoPrompt_ICL`, which combines them with a question and sends the result to the `Llama-3-VILA-1.5-8B` VLM. The VLM's partial text output is sent to `VideoOverlay` to be shown on the `VideoOutput`. The VLM's final text output is sent to `One_Step_Alert` (to trigger alarms) and `TextStream` (for display). If `One_Step_Alert` triggers, it sends a warning message to the `PiperTTS` module for an audible alarm.
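The data flow above can be sketched in plain Python. All the functions below are hypothetical stand-ins for the Edge Agent nodes; the real node classes and their wiring API differ.

```python
# Sketch of the pipeline's data flow with hypothetical stand-ins for the
# Edge Agent nodes (the actual node classes and connection API differ).

def rate_limit(frames, fps=15, source_fps=30):
    """RateLimit stand-in: keep roughly `fps` of every `source_fps` frames."""
    keep_every = max(1, round(source_fps / fps))
    return frames[::keep_every]

def auto_prompt(frame, question="Does the image show any smoke?"):
    """AutoPrompt_ICL stand-in: pair a frame with the detection question."""
    return {"image": frame, "prompt": question}

def vlm(query):
    """Stand-in for the Llama-3-VILA-1.5-8B node: answers "Yes" or "No"."""
    return "Yes" if "smoke" in query["image"] else "No"

frames = ["clear-1", "smoke-2", "clear-3", "smoke-4"]
for frame in rate_limit(frames, fps=15, source_fps=30):
    answer = vlm(auto_prompt(frame))
    # VideoOverlay / TextStream would display `answer`; One_Step_Alert
    # consumes it to decide whether to raise an alarm.
    print(frame, "->", answer)
```

The point of the sketch is the ordering: rate limiting happens before prompting, and the same VLM answer fans out to both the overlay and the alert logic.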
Primary customization for this project involves the `AutoPrompt_ICL` node for phrasing the detection question, the `Llama-3-VILA-1.5-8B` (`NanoLLM_ICL`) node for the VLM settings, and the `One_Step_Alert` node for alarm conditions.
- `AutoPrompt_ICL` Node Settings:
  - Template (for Smoke Detection): `<reset><image><image><image>Does the image show any smoke ? Return "Yes" if you see any, or "No" if there is no visible smoke.` (The multiple `<image>` tags are used to provide context frames to the VLM.)
  - Template (for Fire Detection): `<reset><image><image><image>Does the image show any fire ? Return "Yes" if you see any, or "No" if there is no visible fire.`
  - `seq_replace_mode`: Set to `true`.
  - `Roi` & `Roi Coordinates`: Typically set to `false`, as the whole image is analyzed.
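One way to picture the template mechanism: each `<image>` placeholder is filled, left to right, with one context frame before the prompt reaches the VLM. The snippet below is an illustration of that idea only, not the `AutoPrompt_ICL` node's actual implementation.

```python
# Illustration of how <image> placeholders in the template could be filled
# with successive context frames (hypothetical; not the node's real code).
TEMPLATE = ('<reset><image><image><image>Does the image show any smoke ? '
            'Return "Yes" if you see any, or "No" if there is no visible smoke.')

def fill_template(template, frames):
    """Replace each <image> tag, left to right, with one context frame."""
    out = template
    for frame in frames:
        out = out.replace("<image>", f"[{frame}]", 1)  # one tag per frame
    return out

print(fill_template(TEMPLATE, ["frame_t-2", "frame_t-1", "frame_t"]))
```

With three `<image>` tags, the VLM sees the two previous frames alongside the current one, which gives it temporal context for judging whether smoke is actually present.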
- `Llama-3-VILA-1.5-8B` (`NanoLLM_ICL` Node) Settings:
  - Model Selection: `Efficient-Large-Model/Llama-3-VILA-1.5-8B`.
  - API Selection: `MLC` (for enhanced inference speed).
  - Quantization Setting: `q4f16_ft` (default).
  - Chat Template: `llama-3`.
  - System Prompt: "You are a helpful and friendly AI assistant."
  - `Drop inputs`: Set to `True`.
- `One_Step_Alert` Node Settings:
  - `Check Time`: Default is 5 seconds (the timeframe used to determine status based on the VLM outputs).
  - `Alert`: Set to `true` to enable alert functionality.
  - `Alert Keyword`: Set to `"yes"` (the VLM's expected affirmative response if smoke/fire is detected).
  - `Normal Keyword`: Set to `"no"`.
  - `Warning Message Text` (for smoke): `"Warning: The smoke is rising."` (Ensure the period "." is at the end.)
  - `Warning Message Text` (for fire): `"Warning: The fire is rising."` (Ensure the period "." is at the end.)
  - `Drop inputs`: Set to `True`.
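Putting those settings together, the alert decision can be approximated as: collect VLM answers over the `Check Time` window and fire the warning when the alert keyword outweighs the normal keyword. This is a sketch under those assumptions, not the `One_Step_Alert` node's source code.

```python
import time

class OneStepAlertSketch:
    """Approximation of the One_Step_Alert idea: watch VLM answers within a
    check-time window and emit the warning when the alert keyword dominates.
    Illustrative only; the real node's logic may differ."""

    def __init__(self, check_time=5.0, alert_keyword="yes",
                 normal_keyword="no",
                 warning="Warning: The smoke is rising."):
        self.check_time = check_time
        self.alert_keyword = alert_keyword
        self.normal_keyword = normal_keyword
        self.warning = warning
        self.window = []  # (timestamp, normalized answer) pairs

    def on_vlm_output(self, text, now=None):
        now = time.monotonic() if now is None else now
        self.window.append((now, text.strip().lower()))
        # Keep only answers that fall inside the check-time window.
        self.window = [(t, a) for t, a in self.window
                       if now - t <= self.check_time]
        hits = sum(a == self.alert_keyword for _, a in self.window)
        norms = sum(a == self.normal_keyword for _, a in self.window)
        if hits > norms and hits > 0:
            return self.warning  # would be forwarded to PiperTTS
        return None

alert = OneStepAlertSketch()
alert.on_vlm_output("No", now=0.0)           # no alarm
alert.on_vlm_output("Yes", now=1.0)          # still tied, no alarm
print(alert.on_vlm_output("Yes", now=2.0))   # "yes" now dominates
```

This also shows why the keyword settings matter: if the VLM answers anything other than the configured `Alert Keyword` / `Normal Keyword`, the window counts never move and no alarm fires.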
- Launch the Edge Agent UI in your browser.
- Load the appropriate preset (`SMOKE_ALERT_WEBRTC_ADVAN` or `FIRE_ALERT_WEBRTC_ADVAN`):
  - Click the "Agent" menu in the top-right corner.
  - Select "Load."
  - Choose the desired preset `.json` file from the list.
  - The pipeline will appear in the Node Editor.
- Ensure the `VideoSource` node's "Input" parameter is correctly set to the path of the smoke or fire video (e.g., `/data/videos/Smoke_advan.mp4`).
- The project should start running automatically.
- Observe the `VideoOutput` panel:
  - You will see the video playing.
  - The `VideoOverlay` will display the VLM's response (e.g., "Yes") or the warning message from `One_Step_Alert` if smoke/fire is detected.
- Listen for audio alerts: if smoke or fire is confirmed by the VLM and `One_Step_Alert`, the `PiperTTS` module will announce the warning message.
- Check the `TextStream` node's display (if you open its grid widget) to see the direct textual output from the VLM.
- Visual Output: The `VideoOutput` panel will show the video. If smoke or fire is detected, text such as "Yes" (from the VLM via `VideoOverlay`) or the full warning message (e.g., "Warning: The smoke is rising.") will be overlaid on the video.
- Audio Output: If the `One_Step_Alert` node triggers an alarm state (based on the VLM outputting "yes" consistently for the `Check Time`), the `PiperTTS` module will voice the configured `Warning Message Text`.
- Textual Output: The `TextStream` node will display the VLM's direct responses (e.g., "Yes" or "No"). The `One_Step_Alert` node also outputs a JSON message containing the state and check time.
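The exact schema of that JSON message is not documented on this page, so the field names below (`state`, `check_time`) are hypothetical. A downstream consumer might parse it roughly like this:

```python
import json

# Hypothetical example of the kind of JSON One_Step_Alert emits; the real
# field names are not documented here and may differ.
message = '{"state": "alert", "check_time": 5}'

data = json.loads(message)
if data["state"] == "alert":
    print(f"Alarm active (evaluated over {data['check_time']} s)")
```

Check the actual node output in the `TextStream` widget to confirm the real key names before wiring anything against them.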
- No Detection / Incorrect VLM Response:
  - Verify the `AutoPrompt_ICL` template exactly matches the required format and question. Typos can significantly affect VLM performance.
  - Ensure the correct `Llama-3-VILA-1.5-8B` model is selected and its parameters (API, Quantization, Chat Template, System Prompt) are configured as specified.
  - Check the `TextStream` output to see the raw VLM response. If it is not "yes" or "no," the `One_Step_Alert` node might not trigger correctly.
- Alerts Not Triggering:
  - Confirm the `Alert Keyword` in `One_Step_Alert` is set to `"yes"` (or whatever the VLM's affirmative response is).
  - Check the `Check Time` in `One_Step_Alert`. If it is too long, or the VLM responses are intermittent, an alert might not trigger.
  - Ensure the `Warning Message Text` in `One_Step_Alert` is correctly formatted (ending with a period).
- No Audio Alerts:
  - Check that the `PiperTTS` module is correctly connected to the output of the `One_Step_Alert` node.
  - Verify that system audio is working and unmuted.
- Performance Issues:
  - VLMs can be resource-intensive. Monitor system resources (CPU/GPU) in the Edge Agent UI.
  - Adjust the `RateLimit` node if necessary; the 15 fps setting is given only as an example.