Resources - mariodiasbatista/object-detection GitHub Wiki

What to choose

Resources are a very important factor if we want to achieve optimal results in training or inference, as I mention here.

But GPU costs, depending on your pricing plan or your hosted GPU server time, can be very high.

So I created this section to present some options to choose from, depending on your project objectives:

  • For training and inference with GPU cloud options

    • Google Colab
      • Free tier, plus a reasonably priced paid version (Colab Pro).
    • Runpod
      • Several options for different project objectives at different prices
    • DigitalOcean
      • Several droplets for different project objectives at different prices
    • New options are always appearing on the market
  • For inference without GPU

    • Use a CPU, but results are slower
  • Local Device

    • As prices ramp up if you keep a cloud GPU server active for too long, you may want to consider buying a physical edge device instead. These are the best-known options to date:
      • Jetson Orin Nano 8GB - Ideal for edge AI/robotics in training and inference
      • Jetson Nano B01 - Ideal for low-power, small AI models in training and inference
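Whichever of these options you pick, it is worth confirming at runtime which hardware the session actually has. A minimal sketch, assuming PyTorch is installed (as it is on a default Colab runtime); the same pattern applies on a cloud droplet or a Jetson device:

```python
import torch

# Prefer the GPU when one is available; fall back to CPU-only inference otherwise.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Running on: {device}")

# Models and tensors are then moved to the chosen device before inference.
x = torch.rand(1, 3, 640, 640).to(device)
```

On a CPU-only machine this simply prints `Running on: cpu` and inference proceeds, just more slowly, as noted above.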

TOPS (Tera Operations Per Second)

It is a unit of measurement that indicates how many trillion operations a processing unit (like a CPU, GPU, or AI accelerator) can perform in one second.

It is commonly used to measure the performance of hardware designed for machine learning tasks.

Higher TOPS generally means faster and more efficient processing of AI models.

For example, an AI chip with 100 TOPS can perform 100 trillion operations every second, which is crucial for real-time applications like autonomous driving, robotics, or edge AI.

In short, TOPS refers to inference speed, while FLOPS (floating-point operations per second) refers to training power.
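The relationship between a chip's TOPS rating and real-time performance can be sketched with a simple back-of-the-envelope calculation. This is only a theoretical upper bound (the model's per-inference operation count below is an illustrative assumption, and real throughput is lower due to memory bandwidth, batching, and precision effects):

```python
def peak_inferences_per_second(accelerator_tops: float, model_gops: float) -> float:
    """Theoretical peak throughput: operations/second divided by operations/inference.

    accelerator_tops: rated throughput in tera (1e12) operations per second.
    model_gops: cost of a single inference in giga (1e9) operations.
    """
    return (accelerator_tops * 1e12) / (model_gops * 1e9)

# Illustrative: a 100-TOPS chip running a model costing 10 GOPs per inference.
print(peak_inferences_per_second(100, 10))  # 10000.0 inferences/s (theoretical peak)
```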

Here is a comparison of some accelerators:

| Accelerator | INT8 TOPS (Inference) | FP16 TOPS (Training) | Notes |
|---|---|---|---|
| A100 GPU (40GB) | 624 | ~19.5 | High-performance for AI training |
| v5e-1 TPU | 300 | ~90 | More efficient TPU for training & inference |
| v2-8 TPU | 180 | ~45 | Older TPU, TensorFlow optimized |
| L4 GPU | 99 | ~16 | Optimized for AI inference |
| T4 GPU | 65 | ~8.1 | Efficient for inference, lower power |
| Jetson Orin Nano 8GB | 40 | ~20 | Best Jetson Nano for edge AI |
| Jetson Orin Nano 4GB | 20 | ~10 | Lower RAM version of Orin Nano |
| Jetson Nano B01 (4GB) | 0.5 | 0.25 | Very limited, suitable for small AI models |