best practices - hassony2/inria-research-wiki GitHub Wiki

Working with a new dataset

Keep the original dataset untouched. Have a subfolder with all the scripts and modified outputs.

sequoia/data2/dataset/ holds the original datasets! Then, in yhasson/datasets:

name_of_dataset
-- scripts
-- soft link_to_dataset
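A minimal sketch of how such a layout could be created. The variables `ORIGINALS`, `WORKSPACE` and `NAME` are hypothetical stand-ins (they default to throwaway temporary folders so the snippet is safe to run as is); point them at your own shared dataset root and personal datasets folder.

```shell
#!/bin/sh
set -eu

# Hypothetical locations: override these for your own setup.
ORIGINALS="${ORIGINALS:-$(mktemp -d)}"   # stands in for the shared dataset root
WORKSPACE="${WORKSPACE:-$(mktemp -d)}"   # stands in for your personal datasets folder
NAME="${NAME:-name_of_dataset}"

# Simulate an existing original dataset so the sketch runs end to end.
mkdir -p "$ORIGINALS/$NAME"

# Personal working folder: scripts and modified outputs live here.
mkdir -p "$WORKSPACE/$NAME/scripts"

# Soft link to the untouched originals.
# -f replaces an existing link, -n avoids following it if it already exists.
ln -sfn "$ORIGINALS/$NAME" "$WORKSPACE/$NAME/link_to_dataset"

ls -l "$WORKSPACE/$NAME"
```

Scripts then read from `link_to_dataset` and write next to it, so nothing is ever written inside the original folder.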

Downscale a copy of the original dataset to the maximum resolution you will use

I find something like 320x240 sufficient for videos.

This will speed up all subsequent steps, such as data loading during training and extraction of other representations (flows, ...).
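One possible way to do the downscaling is with ffmpeg's `scale` filter. The sketch below assumes ffmpeg is installed; the helper name `downscale`, the `DRY_RUN` switch and the file names are hypothetical. By default it only prints the command, so you can check it before touching the whole dataset.

```shell
#!/bin/sh
set -eu

# downscale SRC DST [WIDTH] [HEIGHT]
# Prints the ffmpeg command by default; set DRY_RUN=0 to actually run it.
downscale() {
  src=$1; dst=$2; width=${3:-320}; height=${4:-240}
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "ffmpeg -n -i $src -vf scale=${width}:${height} $dst"
  else
    mkdir -p "$(dirname "$dst")"
    # -n: never overwrite an existing output file.
    # scale=W:H forces the exact size; scale=W:-2 would keep the aspect ratio instead.
    ffmpeg -n -i "$src" -vf "scale=${width}:${height}" "$dst"
  fi
}

# Hypothetical file names: read through the soft link, write to a separate folder.
downscale link_to_dataset/video_0001.avi downscaled/video_0001.avi
```

Writing the output to a separate folder keeps the originals untouched.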

Listing files in folders with a large number of files

By default, ls sorts the files and waits until sorting is done before displaying anything.

--> use ls -f to disable sorting (note that -f also lists hidden files, including . and ..)

ls path/to/folder | less lets you page through a folder with many small files; add -f so that entries show up without waiting for the sort.
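A small sketch of the commands above on a throwaway folder. The folder and the `frame_*.jpg` names are made up for the demo; on a real dataset the difference only becomes noticeable with far more files.

```shell
#!/bin/sh
set -eu

# Throwaway folder standing in for a dataset directory (hypothetical contents).
demo=$(mktemp -d)
i=0
while [ "$i" -lt 1000 ]; do
  : > "$demo/frame_$i.jpg"
  i=$((i + 1))
done

# Peek at a few entries in directory order, without sorting.
ls -f "$demo" | head -n 5

# Count entries without sorting; subtract 2 because -f also lists . and ..
count=$(( $(ls -f "$demo" | wc -l) - 2 ))
echo "$count files"

rm -r "$demo"
```

Piping into `head` or `wc -l` has the same benefit as piping into `less`: the listing goes to a pipe, one name per line, instead of being laid out in columns for the terminal.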

Experimenting

Be lean in your experiments: do not scale up before you know whether you actually need to.

Do not anticipate too much, tempting as it is!

First construct a minimum viable experiment that validates your claim.

This will save a LOT of time.