# Example: Reconstructing a Wave in a Tank Using HDFS and FFFS
To create these three movies, we simulated a wave propagating in a fish tank and generated 100 10x10 image streams, as if each 2x2 block of cells in the mesh were monitored by a distinct camera. Each image is associated with an instant in simulation time. To emulate sensors, we copied this simulated data to 10 nodes in a cloud-hosted data center. We used 100 threads running on those 10 nodes to replay the data as follows: each data source counts down to the time of its next image, using its local clock (synchronized with NTP), then sends that image over TCP to a cloud-hosted data-collection program running on some other machine. The 100 data collectors write this data into files. Finally, we created a movie: for each frame, we extract the data, combine the cells, and output the image, repeating this process frame by frame.
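The replay logic can be pictured as a small loop per data source. Below is a minimal Python sketch of one replay thread, assuming each thread holds a list of `(timestamp, image_bytes)` pairs produced by the simulation; the function name, wire format, and parameters are hypothetical, for illustration only.

```python
# Illustrative sketch of one replay thread. All names and the wire format
# are assumptions; the actual replay code is not shown on this page.
import socket
import struct
import time

def replay_stream(images, collector_host, collector_port):
    """Replay one camera's image stream in real time.

    `images` is a list of (timestamp_sec, image_bytes) pairs recorded by
    the simulation; timestamps are in simulation time.
    """
    sock = socket.create_connection((collector_host, collector_port))
    start_wall = time.time()      # local clock, NTP-synchronized
    start_sim = images[0][0]
    for ts, img in images:
        # Count down to the instant this image is due, using the local clock.
        due = start_wall + (ts - start_sim)
        delay = due - time.time()
        if delay > 0:
            time.sleep(delay)
        # Send the sensor timestamp and image over TCP; the collector
        # writes both into the file system.
        sock.sendall(struct.pack("!dI", ts, len(img)) + img)
    sock.close()
```

Carrying the sensor timestamp alongside the image bytes is what later lets FFFS index each update by the time the sensor observed it, rather than by the time the bytes arrived.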
| HDFS | FFFS (server time) | FFFS (user time) |
| --- | --- | --- |
| *(movie)* | *(movie)* | *(movie)* |
The movie on the left used HDFS to store the files. HDFS has a time-based snapshot feature; we used it to snapshot our files once every 100 milliseconds, yielding the 10 frames/second needed for the movie. In the middle, we see our Freeze-Frame File System (FFFS), configured to ignore sensor time; instead, FFFS assumed that each update occurred at the time the data reached our file-system storage node. On the right, FFFS extracts the time of each update from the image that was written (in effect, it trusts the sensor time). As is obvious from the figures, small temporal errors can hugely distort time-sensitive computations. Notice that the left and middle movies are not only distorted frame by frame; the errors are also not systematic. The only way to correct them is to build applications that understand the time signals in the data.
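The frame-extraction step can be sketched in a few lines: for each snapshot time, fetch the 100 cell images and tile them into one frame. The grid dimensions, pixel sizes, and the `read_cell` helper below are assumptions for illustration; in the real pipeline, `read_cell` would read camera (row, col)'s file as of snapshot time `t_ms` through the file system's snapshot mechanism.

```python
# Hypothetical sketch of the frame-extraction step, under assumed
# dimensions (a 10x10 camera grid, 10x10-pixel images, 100 ms snapshots).
import numpy as np

GRID = 10        # assume a 10x10 grid of cameras
CELL = 10        # assume each camera writes a 10x10-pixel image
PERIOD_MS = 100  # one snapshot every 100 ms -> 10 frames/second

def read_cell(row, col, t_ms):
    # Stub: a real implementation would open the snapshot at t_ms and
    # return the pixels camera (row, col) had written by that time.
    return np.zeros((CELL, CELL), dtype=np.uint8)

def assemble_frame(t_ms):
    """Tile the 100 cell images from snapshot time t_ms into one frame."""
    frame = np.zeros((GRID * CELL, GRID * CELL), dtype=np.uint8)
    for r in range(GRID):
        for c in range(GRID):
            frame[r*CELL:(r+1)*CELL, c*CELL:(c+1)*CELL] = read_cell(r, c, t_ms)
    return frame

# One frame per snapshot; encoding the frames into a movie is left to
# standard tooling (e.g., ffmpeg).
frames = [assemble_frame(t) for t in range(0, 1000, PERIOD_MS)]
```

The same assembly loop produces all three movies; only the meaning of `t_ms` differs, which is exactly where the HDFS, server-time, and user-time results diverge.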