03.05 preset projects - advantech-EdgeAI/edge_agent GitHub Wiki
The Clearance Space Detection project is designed to monitor designated areas, such as safety zones or operational clearance spaces, to ensure they remain unobstructed. Using a Vision Language Model (VLM), the system continuously checks if any items are placed within these marked zones for a specified duration. If an obstruction is detected, an alarm is triggered to alert personnel. This project is particularly useful for maintaining safety standards and operational efficiency in environments where clear spaces are critical.
To run this project, load the preset named `FORBIDDEN_ZONE_ALERT_WEBRTC_ADVAN`.
Note:
- Ensure the Edge Agent is running and accessible via your web browser.
- The demonstration video file (`Forbidden_zone_advan.mp4`) must be located in the `/ssd/jetson-containers/data/videos/` directory on your Jetson device.
- The reference image for In-Context Learning (ICL) (e.g., `forbidden_zone2.png`) and its containing folder (`forbidden_zone`) must be located in the `/ssd/jetson-containers/data/images/` directory. The `AutoPrompt_ICL` node will need the correct path to this image (e.g., `/data/images/forbidden_zone/forbidden_zone2.png`).
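The note above pairs host paths under `/ssd/jetson-containers/data/` with node paths under `/data/`. The following is a minimal sketch for checking the required files before loading the preset. It assumes that the host data directory is mounted at `/data` inside the container, which the paths above suggest but do not state outright. The helper names `to_container_path` and `missing_files` are hypothetical and are not part of Edge Agent.

```python
from pathlib import Path, PurePosixPath
from typing import List

# Assumption: the host data directory is mounted at /data inside the container.
HOST_DATA_ROOT = PurePosixPath("/ssd/jetson-containers/data")
CONTAINER_DATA_ROOT = PurePosixPath("/data")

# Files named in the note, relative to the data root.
REQUIRED_FILES = [
    "videos/Forbidden_zone_advan.mp4",
    "images/forbidden_zone/forbidden_zone2.png",
]


def to_container_path(host_path: str) -> str:
    """Translate a host path under HOST_DATA_ROOT to the path a node would use.

    Raises ValueError if host_path is not under HOST_DATA_ROOT.
    """
    relative = PurePosixPath(host_path).relative_to(HOST_DATA_ROOT)
    return str(CONTAINER_DATA_ROOT / relative)


def missing_files(host_root: str = str(HOST_DATA_ROOT)) -> List[str]:
    """Return the required files (relative paths) that are absent under host_root."""
    root = Path(host_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
```

Calling `missing_files()` on the Jetson device returns an empty list when both the demo video and the ICL reference image are in place.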
Figure 3.4: Pipeline overview
This project utilizes the following key nodes connected in a pipeline:
- `VideoSource`: Provides the video input (e.g., `Forbidden_zone_advan.mp4`).
- `RateLimit`: Controls the frame processing rate (e.g., 10 fps) for the VLM.
- `AutoPrompt_ICL`: Formats the input for the VLM. It uses In-Context Learning (ICL) with a reference image of the clear zone and focuses on a specific Region of Interest (ROI) for analysis. It then prompts the VLM about obstructions in the current frame's ROI.
- `VILA-1.5-13B` (loaded via the `NanoLLM_ICL` node): The Vision Language Model that analyzes the ROI of the image based on the prompt and ICL reference.
- `VideoOverlay`: Displays the VLM's response or alert status directly on the video feed.
- `VideoOutput`: Shows the final video stream, often focused on the defined ROI, with overlays.
- `One_Step_Alert`: Processes the VLM's output to trigger an alarm if the clearance space is obstructed.
- `PiperTTS` module (preset): A pre-configured set of nodes for text-to-speech voice alerts.
Data Flow: The `VideoSource` sends frames to `RateLimit`. The rate-limited frames and the ICL reference image path go to `AutoPrompt_ICL`, which defines the ROI and poses the question to the `VILA-1.5-13B` VLM. The VLM's partial text output is sent to `VideoOverlay` to be shown on the `VideoOutput` (which also applies the ROI). The VLM's final text output is sent to `One_Step_Alert`. If `One_Step_Alert` triggers due to an obstruction, it sends a warning message to the `PiperTTS` module for an audible alarm.
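The data flow described above can be pictured as a small directed graph. The sketch below is purely illustrative: the `PIPELINE` dictionary and the `downstream` helper are hypothetical, contain only the connections stated in the text, and do not reflect the format of the preset JSON file.

```python
from collections import deque
from typing import Dict, List

# Edges taken from the data-flow description; not the preset's actual schema.
PIPELINE: Dict[str, List[str]] = {
    "VideoSource": ["RateLimit"],
    "RateLimit": ["AutoPrompt_ICL"],
    "AutoPrompt_ICL": ["VILA-1.5-13B"],
    # Partial text goes to the overlay, final text goes to the alert node.
    "VILA-1.5-13B": ["VideoOverlay", "One_Step_Alert"],
    "VideoOverlay": ["VideoOutput"],
    "One_Step_Alert": ["PiperTTS"],
    "VideoOutput": [],
    "PiperTTS": [],
}


def downstream(node: str) -> List[str]:
    """Return every node reachable from `node`, in breadth-first order."""
    seen = {node}
    order: List[str] = []
    queue = deque([node])
    while queue:
        for nxt in PIPELINE[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order
```

For example, `downstream("One_Step_Alert")` returns `["PiperTTS"]`, which matches the statement that only the alert node feeds the audible alarm.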
Customization primarily involves the `AutoPrompt_ICL` node for defining the monitored zone (via ROI and ICL image) and the VLM prompt, the `VILA-1.5-13B` settings, and the `One_Step_Alert` node for alarm conditions.
- `AutoPrompt_ICL` Node Settings:
  - Template: `<reset>'"' /data/images/forbidden_zone/forbidden_zone2.png "" In the above image, there is a red X-shaped area marked with tape on the ground. In the following image, check if any part of the red X shape is obstructed by an object, even partially. In below image: <image> Can you see the entire X shape pattern?` (Ensure the path to your ICL image is correct).
  - `seq_replace_mode`: Set to `true`.
  - `Roi`: Set to `true`.
  - `Roi Coordinates`: Set to `0.75, 0.25, 1, 0.73` for the demo, or adjust to match your specific clearance zone within the camera's view. These are normalized coordinates `[x_min, y_min, x_max, y_max]`.
- `VILA-1.5-13B` (`NanoLLM_ICL` Node) Settings:
  - Model Selection: `Efficient-Large-Model/VILA-1.5-13B`.
  - API Selection: `MLC`.
  - Quantization Setting: `q4f16_ft` (default).
  - Chat Template: `llava-v1`.
  - System Prompt: "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions."
  - `Drop inputs`: Set to `True`.
- `VideoOutput` Node Settings:
  - `ROI`: Set to `true`.
  - `ROI Coordinates`: Set to the same values as in `AutoPrompt_ICL` (e.g., `0.75, 0.25, 1, 0.73`) to focus the output display on the monitored area.
- `One_Step_Alert` Node Settings:
  - `Check Time`: Default is 5 seconds (timeframe to determine status based on VLM outputs).
  - `Alert`: Set to `true`.
  - `Alert Keyword`: Set to `"no"` (since the VLM is asked "Can you see the entire X shape pattern?", a "no" response indicates an obstruction, triggering the alert).
  - `Normal Keyword`: Set to `"yes"`.
  - `Warning Message Text`: `"Warning: Stacking things in forbidden zone."` (Ensure the period "." is at the end).
  - `Drop inputs`: Set to `True`.
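When adjusting the normalized `Roi Coordinates` used by `AutoPrompt_ICL` and `VideoOutput`, it can help to see which pixel rectangle a given setting covers. The sketch below assumes the `[x_min, y_min, x_max, y_max]` order given above and a 1920x1080 frame for the example, which is an assumption since the demo video's resolution is not stated here. `roi_to_pixels` is a hypothetical helper, not an Edge Agent function.

```python
from typing import Sequence, Tuple


def roi_to_pixels(
    roi: Sequence[float], width: int, height: int
) -> Tuple[int, int, int, int]:
    """Convert a normalized [x_min, y_min, x_max, y_max] ROI to pixel coordinates."""
    x_min, y_min, x_max, y_max = roi
    # Reject ROIs that fall outside the frame or have zero or negative area.
    if not (0.0 <= x_min < x_max <= 1.0 and 0.0 <= y_min < y_max <= 1.0):
        raise ValueError(f"invalid normalized ROI: {list(roi)}")
    return (
        round(x_min * width),
        round(y_min * height),
        round(x_max * width),
        round(y_max * height),
    )


# Demo ROI on an assumed 1920x1080 frame.
print(roi_to_pixels([0.75, 0.25, 1, 0.73], 1920, 1080))  # (1440, 270, 1920, 788)
```

With these assumptions, the demo ROI covers the right-hand quarter of the frame, from about a quarter to about three quarters of its height.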
- Launch the Edge Agent UI in your browser.
- Load the `FORBIDDEN_ZONE_ALERT_WEBRTC_ADVAN` preset:
  - Click the "Agent" menu in the top-right corner.
  - Select "Load."
  - Choose `FORBIDDEN_ZONE_ALERT_WEBRTC_ADVAN.json` from the list.
- The pipeline will appear in the Node Editor.
- Verify the `VideoSource` input path (`/data/videos/Forbidden_zone_advan.mp4`).
- Verify the ICL image path in the `AutoPrompt_ICL` node's template (e.g., `'"'/data/images/forbidden_zone/forbidden_zone2.png'"'`).
- Confirm the `Roi Coordinates` in both `AutoPrompt_ICL` and `VideoOutput` nodes are correctly set for your target area.
- The project should start running automatically.
- Observe the `VideoOutput` panel:
  - You will see the video playing, likely focused on the ROI.
  - The `VideoOverlay` will display the VLM's response regarding the visibility of the clearance space pattern. If an obstruction is detected, the warning message from `One_Step_Alert` may also be displayed.
- Listen for audio alerts: If the VLM indicates an obstruction (answers "no" to the prompt) consistently for the `Check Time` in `One_Step_Alert`, the `PiperTTS` module will announce the warning: "Warning: Stacking things in forbidden zone."
- Visual Output: The `VideoOutput` panel will show the video, focused on the defined ROI. Text overlays will indicate the VLM's assessment of the clearance space. For example, if the prompt asks "Can you see the entire X shape pattern?" and it's obstructed, the VLM might respond "No, the red X-shaped area is partially obstructed by a shelf," and the warning message will be displayed.
- Audio Output: If the `One_Step_Alert` node confirms an obstruction based on the VLM's responses (e.g., consistently "no"), the `PiperTTS` module will voice the configured warning.
- Alert Logic: The system triggers an alarm if items are placed in the monitored zone for a specified duration, based on the VLM's interpretation of the scene guided by the ICL image and prompt.
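The alert logic above can be approximated as keyword matching combined with a hold time. The sketch below is one possible interpretation, not the actual `One_Step_Alert` implementation: it assumes the first word of each VLM response is compared against the alert and normal keywords, and that the warning fires once the alert keyword has been seen continuously for `check_time` seconds. The class name `KeywordAlert` and its methods are hypothetical.

```python
from typing import Optional


class KeywordAlert:
    """Hypothetical stand-in for the keyword and check-time behavior described above."""

    def __init__(
        self,
        alert_keyword: str = "no",
        normal_keyword: str = "yes",
        check_time: float = 5.0,
        message: str = "Warning: Stacking things in forbidden zone.",
    ) -> None:
        self.alert_keyword = alert_keyword.lower()
        self.normal_keyword = normal_keyword.lower()
        self.check_time = check_time
        self.message = message
        self._alert_since: Optional[float] = None  # start of the current alert streak

    def _classify(self, text: str) -> Optional[bool]:
        """True for an alert response, False for a normal one, None if neither."""
        words = text.lower().replace(",", " ").replace(".", " ").split()
        first = words[0] if words else ""
        if first == self.alert_keyword:
            return True
        if first == self.normal_keyword:
            return False
        return None

    def update(self, text: str, now: float) -> Optional[str]:
        """Feed one VLM response with its timestamp; return the warning when it fires."""
        status = self._classify(text)
        if status is None:
            return None  # unrecognized responses leave the current streak untouched
        if not status:
            self._alert_since = None  # a normal response resets the streak
            return None
        if self._alert_since is None:
            self._alert_since = now
        if now - self._alert_since >= self.check_time:
            self._alert_since = now  # re-arm so the warning is not repeated every frame
            return self.message
        return None
```

Under these assumptions, a stream of "No, ..." responses spanning at least five seconds yields the warning text once, and any "Yes" response in between restarts the timer.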
- No Detection / Incorrect VLM Response:
  - ICL Image: Ensure the path to the reference image in the `AutoPrompt_ICL` template is correct and the image accurately represents the clear state of the zone.
  - Prompt: Verify the prompt in `AutoPrompt_ICL` is precise and clearly asks about obstructions in the defined area.
  - ROI Coordinates: Double-check that the `Roi Coordinates` in `AutoPrompt_ICL` and `VideoOutput` accurately define the clearance space you want to monitor. Misaligned ROIs can lead to incorrect analysis.
  - VLM Settings: Ensure the `VILA-1.5-13B` model parameters are correctly set.
- Alerts Not Triggering or False Alerts:
  - `Alert Keyword`: Confirm the `Alert Keyword` in `One_Step_Alert` (e.g., `"no"`) correctly corresponds to the VLM's response when an obstruction is present.
  - `Check Time`: Adjust the `Check Time` in `One_Step_Alert`. Too short might cause false alarms; too long might delay necessary alerts.
  - Lighting/View Changes: Significant changes in lighting or camera angle might affect VLM performance if not accounted for in the ICL image or prompt.
- No Audio Alerts:
  - Check the connection from `One_Step_Alert` to the `PiperTTS` module.
  - Verify system audio.