Get started with image data
This tutorial shows various ways to manipulate image data with snark, which provides a range of command-line utilities for low-level image manipulation.
This tutorial covers:
- live camera data (visualisation, logging, requires a webcam or similar)
- conversion to image files
- conversion to video files (e.g. avi/mp4, requires avconv, from the Libav fork of ffmpeg)
- image manipulation (e.g. rotation, resizing, adding printed timestamps, bayer to rgb conversion)
- selecting temporal windows of data
- undistortion
- stereo processing (output disparity images, output coloured point clouds)
The examples run natively on Linux. On Windows, they have mostly been tested in Cygwin, and many things also work in the native command prompt.
Accompanying this tutorial is a single dataset where a sensor vehicle is driven in a circular trajectory. The Shrimp perception research ground vehicle has a variety of sensors covering many different modalities, and it was used to gather data at the Australian Centre for Field Robotics' (ACFR) Marulan test facility, in New South Wales, Australia. For this tutorial we provide the first ten seconds of video from each camera on Shrimp (just to keep the file sizes down). This includes monocular colour (GigE), thermal LWIR, stereo Bumblebee and panospheric Ladybug. Localisation and LiDAR data are also available for this dataset as part of the get started with data streams tutorial.
The data are split into separate downloads:
- http://perception.acfr.usyd.edu.au/data/samples/circle/config.tar.gz: 80MB
- http://perception.acfr.usyd.edu.au/data/samples/circle/bumblebee.bin.gz: 484MB
- http://perception.acfr.usyd.edu.au/data/samples/circle/gige.bin.gz: 258MB
- http://perception.acfr.usyd.edu.au/data/samples/circle/ir.bin.gz: 16MB
- http://perception.acfr.usyd.edu.au/data/samples/circle/ladybug.bin.gz: 327MB
Once downloaded, unzip the data:
tar -xvf config.tar.gz
gunzip gige.bin.gz
gunzip ir.bin.gz
gunzip ladybug.bin.gz
gunzip bumblebee.bin.gz
...or replace all instances of 'cat' below with 'zcat' directly on the .gz files.
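If you keep a mix of compressed and uncompressed logs, a small wrapper saves swapping 'cat' and 'zcat' by hand. This is only a sketch: 'stream_log' is a made-up helper name, not part of snark, and it assumes compressed logs end in '.gz'.

```shell
# stream_log FILE: write FILE to stdout, decompressing it if the name ends in .gz
# (hypothetical helper, not a snark utility)
stream_log() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;   # same effect as zcat on Linux
        *)    cat "$1" ;;
    esac
}

# Example use with the commands below:
#   stream_log ir.bin.gz | cv-cat "view;null"
```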
With the multi-sensor dataset, we have included a config file to describe it.
The main utility we will use is 'cv-cat', which has many options. Have a quick look at the help:
cv-cat -h
cv-cat -h -v
cv-cat is able to process any video feed that is supported by opencv, including usb, firewire and other cameras. Plug in a web cam.
View the live feed from the camera:
cv-cat --id=0 "view;null"
cv-cat --id=0 "view" > /dev/null
...both commands are equivalent: access the opencv camera with id=0, view it, and discard the raw output (via the 'null' filter or /dev/null).
Note: interfaces other than opencv are supported for cameras. See 'fire-cat' for dc1394 firewire support and 'gige-cat' for Prosilica GigE cameras.
cv-cat uses the concept of filters to process the data. We've already seen two: "view;null".
View a live flipped image:
cv-cat --id=0 "flip;view;null"
Make the image twice as large, view the result, then add a timestamp overlay and view that too:
cv-cat --id=0 "resize=2;view;timestamp;view;null"
Multiple views are handy when debugging a filter chain. However, if you have a very high res camera, the images won't fit on your screen. You could resize, but that resizes the data through the remaining filters. Instead, use 'thumb' to display a small thumbnail without altering the data:
cv-cat --id=0 "thumb;flip;thumb;flop;thumb;flip;thumb;null"
Without the trailing 'null' filter, the data is written to stdout in a simple binary format that cv-cat and other utilities can read:
cv-cat --id=0 "view;timestamp" | cv-cat "view;null"
Let's analyse the data:
cv-cat --id=0 --output="header-only;fields=t,cols,rows,type" | csv-from-bin t,3ui
For my old webcam I see the following stream:
20131023T074525.587721,160,120,16
20131023T074525.598234,160,120,16
20131023T074525.629360,160,120,16
20131023T074525.673855,160,120,16
The resolution is 160x120, leading to an (rgb) image size of 160x120x3 = 57600 bytes. So let's manipulate the full raw data to double check what's in the stream:
cv-cat --id=0 | csv-from-bin t,3ui,57600ub | cut -d, -f1
...in the line above we treat the data as a raw binary stream, convert it to ascii, and cut the first column, which is time:
20131023T074655.614252
20131023T074655.636586
20131023T074655.659381
20131023T074655.704033
So there's nothing different about how we treat image data. It's a regular binary stream of data.
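The 57600 above can be worked out from the header for any camera. The sketch below assumes the 'type' field is a standard OpenCV type code (16 is CV_8UC3, i.e. 8-bit, 3 channels), and that 't' occupies 8 bytes and each 'ui' 4 bytes in the binary stream. The function names are made up for illustration.

```shell
# Bytes per element for each OpenCV depth code:
# 0=8U 1=8S 2=16U 3=16S 4=32S 5=32F 6=64F
depth_bytes() {
    case "$1" in
        0|1) echo 1 ;;
        2|3) echo 2 ;;
        4|5) echo 4 ;;
        6)   echo 8 ;;
        *)   echo "unknown depth: $1" >&2; return 1 ;;
    esac
}

# image_bytes ROWS COLS TYPE: size of one image payload in bytes
image_bytes() {
    depth=$(( $3 & 7 ))              # low 3 bits hold the element depth
    channels=$(( ($3 >> 3) + 1 ))    # remaining bits hold (channels - 1)
    bytes=$(depth_bytes "$depth") || return 1
    echo $(( $1 * $2 * channels * bytes ))
}

# binary_format ROWS COLS TYPE: format string for csv-from-bin or csv-play
binary_format() {
    echo "t,3ui,$(image_bytes "$1" "$2" "$3")ub"
}

# record_bytes ROWS COLS TYPE: header plus payload,
# assuming an 8-byte t and three 4-byte ui fields
record_bytes() {
    echo $(( 8 + 3 * 4 + $(image_bytes "$1" "$2" "$3") ))
}

binary_format 120 160 16    # the webcam above: prints t,3ui,57600ub
```

Under the same assumptions, dividing the size of a log file by record_bytes gives the number of frames it holds.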
At this point you know the exact format, so you could write your own apps or loaders for matlab/python etc. However, the next section will show you how to create more useful outputs, like png, mp4, etc.
First, let's log some data from our camera:
cv-cat --id=0 > webcam.bin
...or we can view the output and log at the same time:
cv-cat --id=0 "view" > webcam.bin
...and just for fun, let's throw in the standard linux utility 'pv' to track the file size:
cv-cat --id=0 "view" | pv > webcam.bin
Note: on linux, "sudo apt-get install pv", or use cygwin package manager if on windows. This is one of many examples that shows off the benefit of streams and pipes. We can leverage third party tools, or tools from our other libraries, programming languages, scripts, without having to recompile a thing.
Check the file by playing it back:
cat webcam.bin | cv-cat "view;null"
...it will play as fast as possible. To play the data back in pseudo real-time, use csv-play, as seen in get started with data streams:
cat webcam.bin | csv-play --binary="t,3ui,57600ub" | cv-cat "view;null"
This section and those that follow assume you've downloaded one or more of the image datasets above, or you've followed the previous step to log data from your own webcam.
Convert a raw log file to a set of timestamped png files:
cat ir.bin | cv-cat "file=png;null"
...dumps all the images as iso-string-time.png in the current directory.
cat ir.bin | cv-cat "flip;timestamp;view;file=jpg;null"
...flips the images, adds the timestamp overlay, visualises, and writes them as iso-string-time.jpg files.
You may want an accompanying ascii file with the timestamps for processing:
cat ir.bin | cv-cat --output="header-only;fields=t,cols,rows,type" | csv-from-bin t,3ui > ir-images.csv
...but then once you have the png files, you could achieve a similar thing with native commands:
ls -1 *.png | cut -d. -f1,2 > ir-images.csv
Create an mp4 (requires avconv, from the Libav fork of ffmpeg):
cat ir.bin | cv-cat "timestamp;encode=jpg" --output=no-header | avconv -y -f image2pipe -r 25 -c:v mjpeg -i pipe: -c:v libx264 -threads 0 -b 2000k -r 25 -pre slow -crf 22 ir.mp4
... where -r 25 (both instances) refers to the frame rate.
We've already seen how cv-cat filters work, including flip,flop,resize,timestamp and also thumb,view and null. Here we'll cover a few more that allow us to deal with more complicated cameras, such as PointGrey Bumblebee and Ladybug which encode data from multiple cameras into one stream.
View the raw ladybug stream (6 cameras in one. Press 'esc' to quit):
cat ladybug.bin | cv-cat "view;null"
...that probably occupied more than your whole screen. Let's resize to see what we are dealing with:
cat ladybug.bin | cv-cat "resize=0.1;view;null"
We could follow prior instructions to find the image size, but it's in the config file too. Play the data in pseudo real-time:
LB_BIN=$(cat config/shrimp.json | name-value-get ladybug/binary)
cat ladybug.bin | csv-play --binary=$LB_BIN | cv-cat "resize=0.1;view;null"
Convert from bayer, shape it correctly, and inspect every step:
cat ladybug.bin | csv-play --binary=$LB_BIN | cv-cat "thumb;bayer=1;thumb;transpose;thumb;flop;thumb;timestamp;thumb;null"
The default thumbs are a bit small for this long thin image. Let's check the final stage with view:
cat ladybug.bin | csv-play --binary=$LB_BIN | cv-cat "bayer=1;transpose;flop;resize=0.2;timestamp;view;null"
...note: thumb doesn't change the data stream, resize does. The stream above is good to debug, but the output will be reduced in size.
For such large data, a resized output is great for smaller mp4s:
cat ladybug.bin | cv-cat "bayer=1;resize=0.17;transpose;flop;timestamp;encode=jpg" --output=no-header | avconv -y -f image2pipe -r 5 -c:v mjpeg -i pipe: -c:v libx264 -threads 0 -b 2000k -r 5 -pre slow -crf 22 ladybug.mp4
...note: resize=0.2 causes avconv to complain due to an odd image size. 0.17 works better:
cat ladybug.bin | cv-cat "bayer=1;resize=0.2;transpose;flop" --output="header-only;fields=rows,cols" | csv-from-bin 2ui | head -n1
323,1478
csv-from-bin: interrupted by signal
cat ladybug.bin | cv-cat "bayer=1;resize=0.17;transpose;flop" --output="header-only;fields=rows,cols" | csv-from-bin 2ui | head -n1
274,1256
csv-from-bin: interrupted by signal
I prefer to guarantee the aspect ratio is preserved as above, but we could be explicit about the resolution:
cat ladybug.bin | cv-cat "bayer=1;resize=274,1256;transpose;flop;timestamp;encode=jpg" --output=no-header | avconv -y -f image2pipe -r 5 -c:v mjpeg -i pipe: -c:v libx264 -threads 0 -b 2000k -r 5 -pre slow -crf 22 ladybug.mp4
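The odd-size problem noted above is easy to check before starting a long encode. A minimal sketch, assuming the encoder simply needs both dimensions to be even (commonly the case for libx264 with its default 4:2:0 pixel format); 'even_size' is a made-up helper name.

```shell
# even_size ROWS COLS: succeed only if both dimensions are even
even_size() {
    [ $(( $1 % 2 )) -eq 0 ] && [ $(( $2 % 2 )) -eq 0 ]
}

# The two sizes reported by cv-cat above
for size in 323,1478 274,1256; do
    rows=${size%,*}
    cols=${size#*,}
    if even_size "$rows" "$cols"; then
        echo "$size ok"
    else
        echo "$size has an odd dimension"
    fi
done
```

The rows,cols pair can be taken from the header-only output of cv-cat, as in the commands above.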
I frequently follow steps similar to those above to 'work out' a new camera stream if I'm not familiar with it.
Pull out one single camera (camera 0 is facing the front) from the six and save png files:
pv ladybug.bin | cv-cat "bayer=1;crop-tile=0,0,1,6;transpose;flop;timestamp;file=camera-0.png;null"
... optionally use 'pv' instead of 'cat' to see progress, filenames include 'camera-0' so we don't get mixed up.
Selecting a temporal window of data is not image specific, as it is based purely on streams. However, it's a useful example of an image workflow. For example, I typically create a lower res mp4 with the timestamp overlay to get a simple view of all of my data. Then if I find something interesting I use the timestamps to select the corresponding raw data. Below we'll see the syntax for image data, yet the same syntax works for all our other (laser, navigation, etc) data too. This is how I chopped the first 10 seconds of the full image dataset to keep the downloads small.
View images from 00:41:24 to 00:41:28 (a 4 second window):
LB_BIN=$(cat config/shrimp.json | name-value-get ladybug/binary)
cat ladybug.bin | cv-cat | csv-select --binary=$LB_BIN --fields=t, "t;from=20120502T004124;to=20120502T004128" | cv-cat "view;null"
Save png files in the same window:
IR_BIN=$(cat config/shrimp.json | name-value-get ir/binary)
cat ir.bin | cv-cat | csv-select --binary=$IR_BIN --fields=t, "t;from=20120502T004124;to=20120502T004128" | cv-cat "timestamp;file=ir.png;null"
...look at the first and last png, to check they are within these bounds.
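That check can also be scripted. ISO timestamps in this fixed layout sort as plain strings in time order, so a string comparison is enough. The sketch below uses made-up timestamps and a hypothetical helper 'in_window'. It treats the window as from <= t < to, which may not match how csv-select handles the upper bound.

```shell
# in_window FROM TO: read one timestamp per line on stdin,
# print those with FROM <= t < TO (string comparison, forced by appending "")
in_window() {
    LC_ALL=C awk -v from="$1" -v to="$2" \
        '($0 "") >= (from "") && ($0 "") < (to "")'
}

# Made-up timestamps: only the middle two fall inside the window
printf '%s\n' \
    20120502T004123.936617 \
    20120502T004124.003284 \
    20120502T004127.936650 \
    20120502T004128.003301 \
    | in_window 20120502T004124 20120502T004128
```

To test real files, feed it the timestamps cut from the file names, e.g. ls -1 *.png | cut -d. -f1,2 | in_window 20120502T004124 20120502T004128, and compare the line count with the number of png files.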
We have included undistortion map files as used in opencv:
cat ladybug.bin | cv-cat "bayer=1;crop-tile=0,0,1,6;thumb;undistort=config/shrimp.ladybug.camera-0.bin;thumb;transpose;flop;thumb;null"
For this tutorial we will use the bumblebee three camera stereo log file 'bumblebee.bin'.
View the data:
cat bumblebee.bin | cv-cat "thumb;null"
...in this log we see three separated grey images, similar to the ladybug example above.
Some logs are encoded as a colour image with the three bayer images stored separately in the 'r', 'g' and 'b' channels. In this case, use the filter 'split' to separate the channels:
cat bumblebee.bin | cv-cat "split;thumb;null"
...but it is not meaningful for this particular log.
Convert to colour, separate the left and right images, and write png files, both with and without undistortion:
BB_BAYER=$(cat config/shrimp.json | name-value-get bumblebee/bayer)
BB_CROP_RIGHT=$(cat config/shrimp.json | name-value-get bumblebee/camera-right/crop-tile)
BB_CROP_LEFT=$(cat config/shrimp.json | name-value-get bumblebee/camera-left/crop-tile)
cat bumblebee.bin | cv-cat "bayer=$BB_BAYER;crop-tile=$BB_CROP_RIGHT;view;file=right.png;null"
cat bumblebee.bin | cv-cat "bayer=$BB_BAYER;crop-tile=$BB_CROP_LEFT;view;file=left.png;null"
cat bumblebee.bin | cv-cat "bayer=$BB_BAYER;crop-tile=$BB_CROP_RIGHT;undistort=config/shrimp.bumblebee-right.bin;view;file=right-undistored.png;null"
cat bumblebee.bin | cv-cat "bayer=$BB_BAYER;crop-tile=$BB_CROP_LEFT;undistort=config/shrimp.bumblebee-left.bin;view;file=left-undistored.png;null"
Keep the first left and right image (and undistorted pair) and discard the rest:
mkdir images
mv $(ls -1 | head -n4) images
rm *.png
ls images
...check you have two left and right images with identical timestamps.
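The timestamp check can be done with the same 'cut' trick used earlier for the png files. A sketch, assuming every file name in the directory starts with a timestamp of the form seen above (date-time, a dot, then the fractional seconds); 'same_timestamp' is a made-up helper name.

```shell
# same_timestamp DIR: succeed if all files in DIR share one timestamp prefix
same_timestamp() {
    n=$(ls -1 "$1" | cut -d. -f1,2 | sort -u | wc -l)
    [ $(( n )) -eq 1 ]    # arithmetic expansion tolerates padding from wc
}

# Example use after the step above:
#   same_timestamp images && echo "all images share one timestamp"
```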
Produce a disparity image from the first two png files (rectifies images internally):
stereo-to-points --left images/20120502T004122.936617.left.png --right images/20120502T004122.936617.right.png --config config/shrimp.json --left-path bumblebee/camera-left --right-path bumblebee/camera-right --disparity | cv-cat "file=disp.png;null"
...Note: it's getting a lot of information out of the config file here.
Produce a disparity image from the first two png files that were already rectified:
stereo-to-points --left images/20120502T004122.936617.left-undistored.png --right images/20120502T004122.936617.right-undistored.png --config config/shrimp.json --left-path bumblebee/camera-left --right-path bumblebee/camera-right --disparity --input-rectified | cv-cat "file=disp.png;null"
Produce a disparity image stream from the raw data:
cat bumblebee.bin | cv-cat "bayer=4" | stereo-to-points --config config/shrimp.json --left-path bumblebee/camera-left --right-path bumblebee/camera-right --roi 0,1920,0,0,1280,960 --disparity | cv-cat "resize=0.5;view;null"
...this can be quite slow to process.
Produce a stream of rectified image pairs:
cat bumblebee.bin | cv-cat "bayer=4" | stereo-to-points --config config/shrimp.json --left-path bumblebee/camera-left --right-path bumblebee/camera-right --roi 0,1920,0,0,1280,960 --output-rectified | cv-cat "resize=0.5;view;null"
...you could use that to write rectified image pairs as png files for processing elsewhere.
Produce a coloured point cloud stream:
cat bumblebee.bin | cv-cat "bayer=4" | stereo-to-points --config config/shrimp.json --left-path bumblebee/camera-left --right-path bumblebee/camera-right --roi 0,1920,0,0,1280,960 --binary t,3d,3ub,ui --min-disparity=10 | view-points --fields t,x,y,z,r,g,b,block --binary t,3d,3ub,ui
...also quite slow.
Finally, let's condense the contents from shrimp.json that are required for stereo-to-points.
Put the following in a file called 'bumblebee.json':
{
    "camera-right":
    {
        "image-size": "1280,960",
        "focal-length": "1604.556763,1604.556763",
        "centre": "645.448181,469.367188",
        "distortion": "-0.44416,0.23526,0.00127,-0.00017,0.00000",
        "translation": "-0.239928,0,0",
        "rotation": "0,0,0",
        "map": "config/shrimp.bumblebee-right.bin"
    },
    "camera-left":
    {
        "image-size": "1280,960",
        "focal-length": "1604.556763,1604.556763",
        "centre": "645.448181,469.367188",
        "distortion": "-0.43938,0.21826,-0.00001,0.00076,0.00000",
        "translation": "0,0,0",
        "rotation": "0,0,0",
        "map": "config/shrimp.bumblebee-left.bin"
    }
}
Now view the coloured points again:
cat bumblebee.bin | cv-cat "bayer=4" | stereo-to-points --config bumblebee.json --left-path camera-left --right-path camera-right --roi 0,1920,0,0,1280,960 --binary t,3d,3ub,ui --min-disparity=10 | view-points --fields t,x,y,z,r,g,b,block --binary t,3d,3ub,ui
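To get a feel for what --min-disparity=10 means in metres, the textbook pinhole stereo relation depth = focal length x baseline / disparity gives a rough figure. This is a sketch only: it assumes the focal length in the config is in pixels and the translation in metres, and stereo-to-points may compute depth differently. 'max_range' is a made-up helper name.

```shell
# max_range FOCAL_PX BASELINE_M MIN_DISPARITY_PX:
# furthest depth in metres that still reaches the minimum disparity
max_range() {
    awk -v f="$1" -v b="$2" -v d="$3" 'BEGIN { printf "%.1f\n", f * b / d }'
}

# focal-length and |translation| from bumblebee.json, disparity from the command above
max_range 1604.556763 0.239928 10
```

Under these assumptions, points further away than roughly 38.5 m fall below the disparity threshold. Raising --min-disparity shortens that range.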