Useful code snippets - Konnsy/REAML2022-hackathon GitHub Wiki

Useful code snippets (pytorch)

Preprocessing

To preprocess a time window of size w in a very simple way, you can take the mean of the first half of the window and subtract it from the mean of the second half. For an input tensor of shape (b, w, x, y) with batch size b, window size w, and image sizes x and y, you can calculate these difference values using:

x1 = torch.mean(x[:,:w//2, :, :], 1)
x2 = torch.mean(x[:,w//2:, :, :], 1)
x = (x2-x1).unsqueeze(1)

In this way, you obtain a result of shape (b, 1, x, y).
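A quick shape check of the snippet above with random data (the batch, window, and image sizes are example values):

```python
import torch

b, w = 4, 8                     # batch size and window size (example values)
x = torch.randn(b, w, 32, 32)   # (b, w, x, y) with image size 32x32

x1 = torch.mean(x[:, :w // 2, :, :], 1)  # mean over the first half of the window
x2 = torch.mean(x[:, w // 2:, :, :], 1)  # mean over the second half
x = (x2 - x1).unsqueeze(1)

print(x.shape)  # torch.Size([4, 1, 32, 32])
```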

Visualization

Write a preprocessed file as an image to check the data characteristics

from helper import scale
import cv2
import numpy as np
img_out = (scale(img, (0,1))*255).view(img.shape[-2:]).cpu().numpy().astype(np.uint8)
cv2.imwrite("filename.png", img_out)
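scale comes from the hackathon's helper module. If it is unavailable, a min-max rescaling to a target range can be sketched as follows; the signature is an assumption inferred from the call above, and helper.scale may differ in detail:

```python
import torch

def scale(img, out_range=(0, 1)):
    # hypothetical stand-in for helper.scale: min-max normalization to out_range
    lo, hi = out_range
    img_min, img_max = img.min(), img.max()
    return (img - img_min) / (img_max - img_min + 1e-8) * (hi - lo) + lo

img = torch.rand(1, 1, 16, 16) * 100
scaled = scale(img, (0, 1))
print(scaled.min().item(), scaled.max().item())  # close to 0.0 and 1.0
```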

Write detected boxes to a preprocessed image file

Boxes must be a list of (x1, y1, x2, y2) tuples:
from helper import scale, writeBoxesToImage
import cv2
img_out = ... (like above)
img_out = writeBoxesToImage(img_out, boxes)
cv2.imwrite("filename.png", img_out)
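writeBoxesToImage also comes from the helper module. If you only need a rough visual check, a dependency-free sketch of what such a function might do is shown below; this is a hypothetical stand-in, and the repository's version may draw differently (e.g., colored boxes via cv2):

```python
import numpy as np

def write_boxes_to_image(img, boxes, value=255):
    # hypothetical stand-in for helper.writeBoxesToImage:
    # draws one-pixel box outlines onto a (H, W) uint8 image
    for x1, y1, x2, y2 in boxes:
        img[y1, x1:x2 + 1] = value  # top edge
        img[y2, x1:x2 + 1] = value  # bottom edge
        img[y1:y2 + 1, x1] = value  # left edge
        img[y1:y2 + 1, x2] = value  # right edge
    return img

img = np.zeros((32, 32), dtype=np.uint8)
img = write_boxes_to_image(img, [(4, 4, 12, 10)])
```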

Global max pooling (get from arbitrary shapes to fixed ones)

If you want the end of your model to be agnostic of image sizes, you can use, e.g., global max pooling to transform a tensor of shape (b, c, x, y) with c features per data point into a tensor of shape (b, c):

x = torch.max(x.view(*x.shape[:2], -1), dim=2)[0]
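For instance, two inputs with different spatial sizes both map to the same fixed shape (the channel count and spatial sizes are example values):

```python
import torch

def global_max_pool(x):
    # flatten the spatial dimensions, then take the max over them
    return torch.max(x.view(*x.shape[:2], -1), dim=2)[0]

x_a = torch.randn(2, 16, 13, 27)  # arbitrary spatial size
x_b = torch.randn(2, 16, 64, 64)  # different spatial size

print(global_max_pool(x_a).shape)  # torch.Size([2, 16])
print(global_max_pool(x_b).shape)  # torch.Size([2, 16])
```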

Sampling a fixed number of entries from a one-dimensional tensor (e.g., for time analyses)

Generate a fixed number (here: 100) of indices in the available range of a tensor x:
idcs = torch.linspace(start=0, end=x.shape[0]-1, steps=100).to(torch.long)
Then sample the values:
x[idcs]
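Put together on an example signal (length 1000 is an assumed value), the first and last entries of the tensor are always included:

```python
import torch

x = torch.arange(1000, dtype=torch.float32)  # example 1-D signal

# 100 evenly spaced indices over the valid range [0, len(x) - 1]
idcs = torch.linspace(start=0, end=x.shape[0] - 1, steps=100).to(torch.long)
sampled = x[idcs]

print(sampled.shape)                          # torch.Size([100])
print(sampled[0].item(), sampled[-1].item())  # 0.0 999.0
```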

Save and load a pytorch model

This can be helpful to save an intermediate state of training or to submit the weights of a trained model.

torch.save(model.cpu().state_dict(), PATH)
model.load_state_dict(torch.load(PATH))
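A small round-trip sketch using a toy model (the model and path are stand-ins for your own): the restored model produces the same outputs as the saved one.

```python
import os
import tempfile
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # toy model standing in for your trained model
path = os.path.join(tempfile.mkdtemp(), "model.pt")

# save the (CPU) weights, then restore them into a fresh model instance
torch.save(model.cpu().state_dict(), path)
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))

x = torch.randn(1, 4)
print(torch.allclose(model(x), restored(x)))  # True
```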

Measuring Resource Consumption

Below you can find examples that measure the time torch computations take and the memory they use.

Example to measure execution times

import torch
import torch.nn as nn
from torch.profiler import profile, record_function, ProfilerActivity

model = nn.Sequential(nn.Conv2d(10, 128, 3), nn.Conv2d(128, 1, 3))
inputs = torch.randn(1, 10, 256, 256)

with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with record_function("model_inference"):
        result = model(inputs)

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))

Example to measure the used memory

import torch
import torch.nn as nn
from torch.profiler import profile, record_function, ProfilerActivity

model = nn.Sequential(nn.Conv2d(10, 128, 3), nn.Conv2d(128, 1, 3))
inputs = torch.randn(1, 10, 256, 256)

with profile(activities=[ProfilerActivity.CPU],
             profile_memory=True, record_shapes=True) as prof:
    result = model(inputs)

print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10))

Caching files in order to avoid unnecessary calculations

Remember to delete old cache files when you change your input data. An example of writing preprocessed data to a cache and extracting a random example can be seen in cache_example.py
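The basic pattern could be sketched as below; this is my own minimal version and the repository's cache_example.py may differ (the file names and the preprocess callback are illustrative assumptions):

```python
import os
import tempfile
import torch

def load_preprocessed(raw_path, cache_path, preprocess):
    # recompute only when no cache file exists yet; otherwise load the cache
    if os.path.exists(cache_path):
        return torch.load(cache_path)
    data = preprocess(torch.load(raw_path))
    torch.save(data, cache_path)
    return data

tmp = tempfile.mkdtemp()
raw, cache = os.path.join(tmp, "raw.pt"), os.path.join(tmp, "cache.pt")
torch.save(torch.ones(3), raw)

a = load_preprocessed(raw, cache, lambda t: t * 2)  # computed and cached
b = load_preprocessed(raw, cache, lambda t: t * 2)  # served from the cache
print(torch.equal(a, b))  # True
```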