YOLOv8 - baqwas/yolov8 GitHub Wiki
Introduction
You Only Look Once (YOLO) is a major advance in object detection with computer vision; YOLOv8 is the version developed by Ultralytics. As the name implies, YOLO examines an image only once, in a single pass through the network, which makes it fast enough for real-time use. Its adoption in real-time applications is having a profound impact in many disciplines. The analogy to a fast camera that can spot and label objects is not far off the mark.
Algorithm
YOLOv8 processes the entire image in a single forward pass, predicting bounding boxes and class probabilities for the whole scene simultaneously rather than scanning candidate regions one at a time.
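As a rough illustration of the single-pass idea, the sketch below decodes a mock head-output tensor with NumPy. The grid size, class count, and random values are invented for demonstration and are not YOLOv8's actual dimensions.

```python
import numpy as np

# Illustrative sketch only: grid size and class count are hypothetical.
GRID, NUM_CLASSES = 7, 3

rng = np.random.default_rng(0)
# Pretend this tensor came out of one forward pass of the network:
# for every grid cell, 4 box values + NUM_CLASSES class scores.
head_output = rng.random((GRID, GRID, 4 + NUM_CLASSES))

boxes = head_output[..., :4]          # (7, 7, 4) box predictions per cell
class_scores = head_output[..., 4:]   # (7, 7, 3) per-class scores per cell
best_class = class_scores.argmax(-1)  # most likely class per cell
best_score = class_scores.max(-1)     # its score

# Every cell is scored in the same pass: no region proposals, no second stage.
print(boxes.shape, best_class.shape, best_score.shape)
```

The point of the sketch is that all locations are predicted at once; a two-stage detector would instead propose regions first and classify each one separately.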
Key Features
- Backbone Network
- Neck Architecture
- YOLO Head
- Training Techniques
- Model Variants
- Performance
Backbone Network
The backbone network is the foundation of YOLOv8. It extracts hierarchical features from the input image to build a comprehensive representation of its content. The backbone is based on CSPDarknet53, a cross-stage partial (CSP) variant of the Darknet architecture, which improves learning capacity and computational efficiency.
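The cross-stage partial idea can be sketched in a few lines: split the feature channels, run only one part through the heavy computation, then re-merge. The shapes and the stand-in "transform" below are invented for illustration and are not the actual CSPDarknet53 layers.

```python
import numpy as np

# Toy sketch of the cross-stage partial (CSP) idea behind CSPDarknet53.
rng = np.random.default_rng(0)
features = rng.random((8, 16, 16))              # (channels, height, width)

part_a, part_b = np.split(features, 2, axis=0)  # split along channels
part_b = np.maximum(part_b * 1.5 - 0.2, 0.0)    # stand-in for conv blocks + ReLU
merged = np.concatenate([part_a, part_b], axis=0)

# Half the channels bypass the heavy computation entirely, which is
# where the efficiency gain of the CSP design comes from.
print(merged.shape)  # same channel count as the input
```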
Neck Architecture
The neck combines multiscale feature maps from different stages of the backbone, which improves the model's ability to detect objects of varying sizes.
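A minimal sketch of this kind of fusion, assuming invented feature-map sizes: upsample the coarse (deep) map to the resolution of the finer (shallow) one and concatenate them along the channel axis.

```python
import numpy as np

# Sketch of multiscale fusion in the neck; sizes and channel counts
# here are invented for illustration.
rng = np.random.default_rng(0)
fine = rng.random((4, 16, 16))    # shallow features: good for small objects
coarse = rng.random((8, 8, 8))    # deep features: large objects and context

# Nearest-neighbour upsampling by repeating rows and columns.
upsampled = coarse.repeat(2, axis=1).repeat(2, axis=2)   # (8, 16, 16)
fused = np.concatenate([fine, upsampled], axis=0)        # (12, 16, 16)

print(fused.shape)
```

The fused map carries both fine spatial detail and deep semantic context, which is why detection heads attached to it handle small and large objects better than either map alone.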
YOLO Head
The head takes the features produced by the backbone and neck and converts them into the final predictions: bounding-box coordinates, confidence scores, and class probabilities for each location in the feature maps. Unlike earlier YOLO versions, YOLOv8 uses an anchor-free head, predicting box locations directly at each grid cell rather than as offsets from predefined anchor boxes.
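The dense per-cell predictions are typically filtered with non-maximum suppression (NMS) before being reported. Below is a minimal NumPy sketch of NMS; the boxes, scores, and IoU threshold are invented for illustration.

```python
import numpy as np

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thr=0.5):
    """Keep the highest-scoring box, drop overlapping lower-scoring ones."""
    order = np.argsort(scores)[::-1]          # highest score first
    keep = []
    while order.size:
        best, order = order[0], order[1:]
        keep.append(int(best))
        order = np.array([i for i in order
                          if iou(boxes[best], boxes[i]) < iou_thr])
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```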
Training Techniques
Models
The supported model variants, in increasing order of size and accuracy, are:
- YOLOv8n (nano)
- YOLOv8s (small)
- YOLOv8m (medium)
- YOLOv8l (large)
- YOLOv8x (extra large)
Results
The methods available on the Results object are:
| Method | Return Type | Description |
|---|---|---|
| `update()` | `None` | Update the `boxes`, `masks`, and `probs` attributes of the `Results` object. |
| `cpu()` | `Results` | Return a copy of the `Results` object with all tensors in CPU memory. |
| `numpy()` | `Results` | Return a copy of the `Results` object with all tensors as NumPy arrays. |
| `cuda()` | `Results` | Return a copy of the `Results` object with all tensors in GPU memory. |
| `to()` | `Results` | Return a copy of the `Results` object with tensors on the specified device and dtype. |
| `new()` | `Results` | Return a new `Results` object with the same image, path, and names. |
| `plot()` | `numpy.ndarray` | Plot the detection results; returns a NumPy array of the annotated image. |
| `show()` | `None` | Show the annotated results on screen. |
| `save()` | `None` | Save the annotated results to file. |
| `verbose()` | `str` | Return a log string for each task. |
| `save_txt()` | `None` | Save predictions to a txt file. |
| `save_crop()` | `None` | Save cropped predictions to `save_dir/cls/file_name.jpg`. |
| `tojson()` | `str` | Convert the object to JSON format. |
Here is an example of the results object:
```
[ultralytics.engine.results.Results object with attributes:
boxes: ultralytics.engine.results.Boxes object
keypoints: None
masks: None
names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
obb: None
orig_img: array([[[132, 138, 137],
        [133, 139, 138],
        [134, 139, 140],
        ...,
        [ 17,  10,  13]]], dtype=uint8)
orig_shape: (481, 640)
path: '/root/projects/yolov8/basic/images/clocks2.jpg'
probs: None
save_dir: 'runs/detect/predict6'
speed: {'preprocess': 95.9782600402832, 'inference': 12243.300914764404, 'postprocess': 241.5597438812256}]
```
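The speed attribute shown in the example is a plain dict of per-stage times in milliseconds, so the end-to-end latency is just the sum of its values. The values below are copied from the example output above.

```python
# Per-stage timings in milliseconds, copied from the example Results object.
speed = {
    "preprocess": 95.9782600402832,
    "inference": 12243.300914764404,
    "postprocess": 241.5597438812256,
}

total_ms = sum(speed.values())
print(f"total: {total_ms:.1f} ms  (~{1000.0 / total_ms:.2f} frames/s)")
```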