Resources - mariodiasbatista/object-detection GitHub Wiki
## What to choose
Resources are a very important factor if we want to achieve optimal results in training or inference, as I mention here.
But GPU cost can be very high, depending on your pricing plan or on how much hosted GPU server time you use.
So I created this section to present some options to choose from, depending on your project objectives:
### For training and inference with GPU cloud options
- Google Colab
  - Free and paid versions; Colab Pro is not very expensive.
- Runpod
  - Several options for different project objectives and prices.
- DigitalOcean
  - Several droplets for different project objectives and prices.
- Other options keep appearing on the market.
### For inference without GPU
- Use the CPU, but inference will be slower.
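Whichever option you pick, it helps if your code falls back to the CPU automatically when no GPU is available. A minimal sketch (the `pick_device` helper is hypothetical; it only uses the presence of the `nvidia-smi` CLI as a rough GPU check — in a real project you would ask your framework directly, e.g. `torch.cuda.is_available()` in PyTorch):

```python
import shutil

def pick_device() -> str:
    """Return "cuda" if an NVIDIA GPU appears to be present, else "cpu".

    Rough proxy: checks whether the `nvidia-smi` CLI is on PATH.
    """
    return "cuda" if shutil.which("nvidia-smi") else "cpu"

device = pick_device()
print(f"Running inference on: {device}")
```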
### Local Device
- As prices ramp up when you keep a cloud GPU server active for too long, you may want to consider buying a physical edge device.

These are the best-known options to date:
- Jetson Orin Nano 8GB - ideal for edge AI/robotics, in training and inference
- Jetson Nano B01 - ideal for low-power, small AI models, in training and inference
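To decide between renting cloud GPUs and buying an edge device, a quick break-even estimate helps. A sketch with illustrative numbers only (the $0.50/hour cloud rate and $499 device price are made-up examples, not real quotes):

```python
def break_even_hours(device_price_usd: float, cloud_rate_usd_per_hour: float) -> float:
    """Hours of cloud GPU time after which buying the device is cheaper."""
    return device_price_usd / cloud_rate_usd_per_hour

# Illustrative numbers: a ~$499 Jetson-class board vs a $0.50/h cloud GPU.
hours = break_even_hours(499.0, 0.50)
print(f"Break-even after ~{hours:.0f} GPU hours")  # ~998 hours
```

If your project needs the GPU for longer than the break-even figure, the physical device usually wins on cost.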
## TOPS (Tera Operations Per Second)
TOPS is a unit of measurement that indicates how many trillion operations a processing unit (such as a CPU, GPU, or AI accelerator) can perform in one second.
It is commonly used to measure the performance of hardware designed for machine learning tasks.
Higher TOPS generally means faster and more efficient processing of AI models.
For example, an AI chip rated at 100 TOPS can perform 100 trillion operations every second, which is crucial for real-time applications like autonomous driving, robotics, or edge AI.
In short, TOPS (usually measured on integer operations such as INT8) indicates inference speed, while FLOPS (floating-point operations per second) indicates training power.
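As a rough illustration of what a TOPS rating means for inference: if a model needs a known number of operations per frame, the accelerator's TOPS gives an upper bound on frames per second. A sketch (the 5 GOPs-per-frame detector is a made-up workload, and real throughput is always well below this theoretical ceiling):

```python
def max_fps(tops: float, gops_per_frame: float) -> float:
    """Theoretical frames-per-second ceiling for a model on an accelerator.

    tops: accelerator rating in tera-operations per second (1 TOPS = 1000 GOPS).
    gops_per_frame: operations one inference needs, in giga-operations.
    """
    return tops * 1000.0 / gops_per_frame

# Hypothetical detector costing 5 GOPs per frame:
print(max_fps(100, 5))  # 100 TOPS chip -> 20000.0 fps ceiling
print(max_fps(0.5, 5))  # 0.5 TOPS Jetson Nano-class -> 100.0 fps ceiling
```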
Here is an example comparing some GPUs and TPUs:

| Accelerator | INT8 TOPS (inference) | FP16 TFLOPS (training) | Notes |
|---|---|---|---|
| A100 GPU (40GB) | 624 | ~19.5 | High-performance GPU for AI training |
| v5e-1 TPU | 300 | ~90 | More efficient TPU for training & inference |
| v2-8 TPU | 180 | ~45 | Older TPU, TensorFlow optimized |
| L4 GPU | 99 | ~16 | Optimized for AI inference |
| T4 GPU | 65 | ~8.1 | Efficient for inference, lower power |
| Jetson Orin Nano 8GB | 40 | ~20 | Best Jetson Nano for edge AI |
| Jetson Orin Nano 4GB | 20 | ~10 | Lower-RAM version of Orin Nano |
| Jetson Nano B01 (4GB) | 0.5 | 0.25 | Very limited, suitable for small AI models |
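The table above can also be queried programmatically, for example to rank the accelerators by inference performance (INT8 TOPS values copied straight from the table):

```python
# INT8 TOPS figures from the comparison table above.
INT8_TOPS = {
    "A100 GPU (40GB)": 624,
    "v5e-1 TPU": 300,
    "v2-8 TPU": 180,
    "L4 GPU": 99,
    "T4 GPU": 65,
    "Jetson Orin Nano 8GB": 40,
    "Jetson Orin Nano 4GB": 20,
    "Jetson Nano B01 (4GB)": 0.5,
}

def fastest(accelerators: dict[str, float]) -> str:
    """Name of the accelerator with the highest INT8 TOPS."""
    return max(accelerators, key=accelerators.get)

print(fastest(INT8_TOPS))  # A100 GPU (40GB)
```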