# Running inference
Inference Functions:

- RGB|d|t human detection (`process/human_detector.py`)
- YOLO object detection with thermal interaction confirmation (`process/interaction_detector.py`)
- Tri-stream (RGB|d|t) activity detection CNN (initial stages)
## human_detector
To run the human detector on a frame, call `human_detector(rgb, depth, ir)` with `rgb`, `depth`, and `ir` being the raw numpy frames without homography applied (see the usage sketch at the end of this section). It returns six values:

- bounding boxes for a strong match (HOG positive from both RGB and thermal)
- bounding boxes for a medium match (depth plus either the RGB or thermal HOG)
- bounding boxes from the RGB HOG detector
- bounding boxes from the thermal HOG detector
- bounding boxes from the depth & IR human detection
- the RGB image with homography applied and the bounding boxes drawn on

This detector has three detection methods:
- RGB: HOG person detection
- t: HOG person detection
- d|t: depth map segmentation and thermal human confirmation (checks whether a depth segment contains a significant heat signature that is likely human; see the sketch below)
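The exact thresholds for the d|t check aren't documented on this page; the following is a minimal sketch of the idea, where the function name, `hot_threshold`, and `min_hot_fraction` are all illustrative assumptions rather than names from the repo:

```python
import numpy as np

def thermal_confirms_human(ir, segment_mask, hot_threshold=30000,
                           min_hot_fraction=0.10):
    """Sketch of the d|t confirmation step: given a boolean mask for one
    depth segment (assumed already registered to the thermal frame),
    decide whether enough of its thermal pixels are hot enough to
    plausibly be a person. Both thresholds are hypothetical values."""
    ir_pixels = ir[segment_mask]
    if ir_pixels.size == 0:
        return False
    hot_fraction = np.mean(ir_pixels > hot_threshold)
    return hot_fraction >= min_hot_fraction
```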
This detector is designed to improve on plain RGB detection under dark conditions by also using the depth and thermal streams.
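A minimal usage sketch, assuming the function is importable from `process/human_detector.py` and that the six return values come back in the order listed above (the variable names and frame-loading step are illustrative):

```python
import numpy as np
import cv2

from process.human_detector import human_detector

# Hypothetical frame loading; in practice these are raw frames from the
# camera rig, with no homography applied
rgb = np.load("rgb_frame.npy")
depth = np.load("depth_frame.npy")
ir = np.load("ir_frame.npy")

# Return order assumed to match the list above
(strong_boxes, medium_boxes, rgb_hog_boxes,
 ir_hog_boxes, depth_ir_boxes, annotated_rgb) = human_detector(rgb, depth, ir)

print("strong matches:", len(strong_boxes))
cv2.imshow("human detections", annotated_rgb)
cv2.waitKey(0)
```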
## interaction_detector
To run the interaction detector on a frame, call `interaction_detector(rgb, depth, ir)` with `rgb`, `depth`, and `ir` being the raw numpy frames without homography applied. It returns the RGB image with homography applied and the YOLO bounding boxes drawn on. This detector runs YOLO on the RGB image to find humans and objects. If a human and an object are close together, it checks the depth data to see whether they have similar average depths, and then checks the thermal data for a significant thermal footprint left on the object by the interaction. The detector prints whether any interactions with objects have occurred.
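A minimal usage sketch, assuming the function is importable from `process/interaction_detector.py` (the frame-loading step is illustrative):

```python
import numpy as np
import cv2

from process.interaction_detector import interaction_detector

# Hypothetical frame loading; same raw, unwarped frames as above
rgb = np.load("rgb_frame.npy")
depth = np.load("depth_frame.npy")
ir = np.load("ir_frame.npy")

# Interaction results are printed to stdout; the return value is the
# homography-warped RGB frame with YOLO boxes drawn on
annotated_rgb = interaction_detector(rgb, depth, ir)
cv2.imshow("interactions", annotated_rgb)
cv2.waitKey(0)
```

The depth and thermal checks themselves aren't spelled out on this page; the sketch below shows the general idea, with all function names and thresholds being illustrative assumptions rather than code from the repo:

```python
import numpy as np

def depths_similar(depth, human_box, object_box, max_diff=150):
    """Hypothetical proximity check: compare average depths inside the
    two boxes. `max_diff` is an example threshold in the depth map's
    native units."""
    hx, hy, hw, hh = human_box
    ox, oy, ow, oh = object_box
    human_depth = float(np.mean(depth[hy:hy + hh, hx:hx + hw]))
    object_depth = float(np.mean(depth[oy:oy + oh, ox:ox + ow]))
    return abs(human_depth - object_depth) < max_diff

def thermal_footprint(ir, object_box, ambient, min_delta=500):
    """Hypothetical interaction check: did handling the object leave a
    heat signature noticeably above ambient? `min_delta` is an example
    threshold in the thermal frame's native units."""
    ox, oy, ow, oh = object_box
    return float(np.mean(ir[oy:oy + oh, ox:ox + ow])) - ambient > min_delta
```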
## activity_detector
CNN with temporal memory for activity recognition. Not yet implemented.