best practices - hassony2/inria-research-wiki GitHub Wiki
Working with a new dataset
Keep the original dataset untouched. Have a separate subfolder that holds all the scripts and modified outputs.
sequoia/data2/dataset/: original datasets!
Then, in yhasson/datasets:
name_of_dataset
-- scripts
-- soft link_to_dataset
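The layout above could be set up with a few lines of Python. This is only a minimal sketch: the function name `setup_dataset_workspace` and the link name `link_to_dataset` are illustrative assumptions, not part of any existing tooling.

```python
import os
from pathlib import Path


def setup_dataset_workspace(original: Path, workspace_root: Path, name: str) -> Path:
    """Create <workspace_root>/<name>/ with a scripts/ folder and a soft link to the data.

    `original` is the untouched dataset folder; it is only linked, never written to.
    All names here are hypothetical and chosen for illustration.
    """
    workspace = workspace_root / name
    (workspace / "scripts").mkdir(parents=True, exist_ok=True)

    link = workspace / "link_to_dataset"
    if not link.is_symlink():
        # The soft link points at the original data; nothing is copied or modified.
        os.symlink(original.resolve(), link, target_is_directory=True)
    return workspace
```

Calling it twice with the same arguments leaves the existing link in place, so it is safe to keep in a setup script.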
Downscale the original dataset to the maximum resolution you will use.
I find something like 320x240 sufficient for videos.
This speeds up all following steps, such as data loading during training and extraction of other representations (flows, ...).
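As a rough sketch of the downscaling step, the helper below computes a target size that fits within 320x240 while keeping the aspect ratio, and builds an ffmpeg command that writes to a separate output file. The function names are hypothetical, and the use of ffmpeg with its `scale` video filter is an assumption: the original notes do not say which tool was used.

```python
from typing import List, Tuple


def fit_within(width: int, height: int, max_w: int = 320, max_h: int = 240) -> Tuple[int, int]:
    """Return a size that fits in max_w x max_h, keeps the aspect ratio, never upscales."""
    if width <= max_w and height <= max_h:
        return width, height
    if width * max_h >= height * max_w:
        # Width is the limiting side.
        new_w, new_h = max_w, height * max_w // width
    else:
        # Height is the limiting side.
        new_w, new_h = width * max_h // height, max_h
    # Round down to even values, since many video codecs require even dimensions.
    return max(2, new_w - new_w % 2), max(2, new_h - new_h % 2)


def ffmpeg_downscale_cmd(src: str, dst: str, width: int, height: int) -> List[str]:
    """Build (but do not run) an ffmpeg command writing a downscaled copy to dst."""
    w, h = fit_within(width, height)
    return ["ffmpeg", "-i", src, "-vf", f"scale={w}:{h}", dst]
```

The command list can be passed to `subprocess.run(cmd, check=True)`. Since `dst` is a different file, the original video stays untouched.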
Listing files in a folder with a large number of files
By default, ls sorts the file names and waits until the sort is done before displaying anything.
--> use ls -f
to disable sorting (note that -f also lists hidden files).
ls -f path/to/folder | less
to page through the unsorted listing without blocking on a folder with many small files.
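The same idea carries over to scripts: `os.listdir` builds the full list before returning, whereas `os.scandir` yields entries lazily and in arbitrary (unsorted) order. The sketch below, with the hypothetical helper name `iter_file_names`, lets you peek at the first few entries of a huge folder without listing it all.

```python
import os
from itertools import islice
from typing import Iterator, Optional


def iter_file_names(folder: str, limit: Optional[int] = None) -> Iterator[str]:
    """Yield entry names from folder lazily, unsorted; stop after `limit` if given."""
    with os.scandir(folder) as entries:
        # islice stops pulling from the directory iterator once `limit` is reached.
        for entry in islice(entries, limit):
            yield entry.name
```

For example, `list(iter_file_names("path/to/folder", limit=10))` returns at most ten names, in whatever order the filesystem provides them.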
Experimenting
Be lean in your experiments: do not scale up before you know whether you will actually need to.
Do not anticipate too much, although it is tempting!
First construct a minimum viable experiment that validates your claim.
This will save a LOT of time.