YOLOv8 - baqwas/yolov8 GitHub Wiki

Introduction

You Only Look Once (YOLO) is a family of real-time object detection models for computer vision; YOLOv8 is the version developed by Ultralytics. As the name implies, YOLO examines an image only once, in a single pass through the network, and reports the objects it finds quickly. That speed makes it practical for real-time applications in many disciplines. A useful analogy is a fast camera that can spot and label objects as it sees them.

Algorithm

YOLOv8 processes the entire image in a single forward pass, predicting object locations and classes together rather than examining candidate regions one at a time.

Key Features

  • Backbone Network
  • Neck Architecture
  • YOLO Head
  • Training Techniques
  • Model Variants
  • Performance

Backbone Network

The backbone network is the foundation of YOLOv8. It extracts hierarchical features from the input image, from fine low-level detail to coarse high-level context, so that later stages have a comprehensive representation of the data. The backbone is based on CSPDarknet53, a modified version of the Darknet architecture, which improves learning capacity and efficiency.

Neck Architecture

The neck sits between the backbone and the head. It combines feature maps from multiple scales, which improves the model's ability to detect objects of varying sizes.
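As a rough illustration of what "multiscale" means in practice, the sketch below computes the grid size of each feature map for a square input. The 640-pixel input and the strides of 8, 16, and 32 are assumptions (values commonly associated with YOLOv8), and the function names are hypothetical; none of them come from this page.

```python
def feature_grid_sizes(image_size=640, strides=(8, 16, 32)):
    """Return {stride: (rows, cols)} for a square input image.

    Assumption: each feature map downsamples the input by its stride.
    """
    grids = {}
    for stride in strides:
        cells = image_size // stride
        grids[stride] = (cells, cells)
    return grids


def total_locations(grids):
    """Count the grid cells across all scales."""
    return sum(rows * cols for rows, cols in grids.values())


grids = feature_grid_sizes()
for stride, (rows, cols) in grids.items():
    # Small strides give fine grids (small objects); large strides give coarse grids.
    print(f"stride {stride:>2}: {rows}x{cols} grid")
print("total prediction locations:", total_locations(grids))
```

Under these assumptions the fine grid handles small objects and the coarse grid handles large ones, which is why combining the scales helps with objects of varying sizes.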

YOLO Head

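The head takes the features produced by the backbone and neck and outputs the predictions: bounding boxes and class scores for each grid cell. Unlike earlier YOLO versions, YOLOv8 is anchor-free, so it predicts boxes directly at each grid location rather than as offsets from predefined anchor boxes.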
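To make the idea of a per-cell prediction concrete, here is a simplified sketch that turns one grid cell's output into a pixel-space box and a class label. It assumes the anchor-free formulation used by YOLOv8, in which a cell predicts distances from its centre to the four box edges. The real head is more involved, and the function names, the sample distances, and the sample scores are hypothetical.

```python
def decode_box(row, col, distances, stride):
    """Convert edge distances for one grid cell into (x1, y1, x2, y2) pixels.

    distances = (left, top, right, bottom), in grid units from the cell centre.
    """
    left, top, right, bottom = distances
    cx, cy = col + 0.5, row + 0.5  # centre of the cell, in grid units
    return (
        (cx - left) * stride,
        (cy - top) * stride,
        (cx + right) * stride,
        (cy + bottom) * stride,
    )


def best_class(class_scores, names):
    """Return (name, score) for the highest-scoring class."""
    idx = max(range(len(class_scores)), key=class_scores.__getitem__)
    return names[idx], class_scores[idx]


# Hypothetical values for a single cell on a stride-16 feature map.
names = {0: "person", 1: "bicycle", 2: "car"}
box = decode_box(row=10, col=20, distances=(1.5, 2.0, 1.5, 2.0), stride=16)
label, score = best_class([0.1, 0.2, 0.7], names)
print(box, label, score)
```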

Training Techniques

Models

The supported models are:

Results

The methods of the Results object are:

Method        Return Type     Description
update()      None            Update the boxes, masks, and probs attributes of the Results object.
cpu()         Results         Return a copy of the Results object with all tensors on CPU memory.
numpy()       Results         Return a copy of the Results object with all tensors as numpy arrays.
cuda()        Results         Return a copy of the Results object with all tensors on GPU memory.
to()          Results         Return a copy of the Results object with tensors on the specified device and dtype.
new()         Results         Return a new Results object with the same image, path, and names.
plot()        numpy.ndarray   Plot the detection results. Returns a numpy array of the annotated image.
show()        None            Show annotated results on screen.
save()        None            Save annotated results to file.
verbose()     str             Return a log string for each task.
save_txt()    None            Save predictions to a txt file.
save_crop()   None            Save cropped predictions to save_dir/cls/file_name.jpg.
tojson()      str             Convert the object to JSON format.
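Because tojson() returns a plain string, its output can be processed without any tensor library. The sketch below counts detections per class from such a string. The JSON layout (a list of objects with "name", "class", "confidence", and "box" keys) is an assumption about the output format, the sample data is made up for illustration, and count_by_class is a hypothetical helper, not part of the library.

```python
import json
from collections import Counter


def count_by_class(results_json, min_confidence=0.3):
    """Count detections per class name, skipping low-confidence entries.

    Assumption: results_json is a JSON list of detections, each with
    "name" and "confidence" keys.
    """
    detections = json.loads(results_json)
    return Counter(
        d["name"] for d in detections if d["confidence"] >= min_confidence
    )


# Made-up sample standing in for the string returned by tojson().
sample_json = json.dumps([
    {"name": "clock", "class": 74, "confidence": 0.91,
     "box": {"x1": 40.0, "y1": 52.0, "x2": 180.0, "y2": 190.0}},
    {"name": "clock", "class": 74, "confidence": 0.78,
     "box": {"x1": 300.0, "y1": 60.0, "x2": 420.0, "y2": 185.0}},
    {"name": "vase", "class": 75, "confidence": 0.55,
     "box": {"x1": 500.0, "y1": 200.0, "x2": 560.0, "y2": 330.0}},
    {"name": "clock", "class": 74, "confidence": 0.10,
     "box": {"x1": 10.0, "y1": 10.0, "x2": 30.0, "y2": 30.0}},
])

for name, count in count_by_class(sample_json).items():
    print(f"{name}: {count}")
```

With a real prediction, the input would be the string returned by calling tojson() on one Results object from the returned list.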

Here is an example of the results object:

[ultralytics.engine.results.Results object with attributes:

boxes: ultralytics.engine.results.Boxes object
keypoints: None
masks: None
names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
obb: None
orig_img: array([[[132, 138, 137],
        [133, 139, 138],
        [134, 139, 140],
        ...,   
        [ 17,  10,  13]]], dtype=uint8)
orig_shape: (481, 640)
path: '/root/projects/yolov8/basic/images/clocks2.jpg'
probs: None
save_dir: 'runs/detect/predict6'
speed: {'preprocess': 95.9782600402832, 'inference': 12243.300914764404, 'postprocess': 241.5597438812256}]
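The speed entry in the output above can be summarized to see where the time goes. This sketch assumes the values are milliseconds per image; summarize_speed is a hypothetical helper written for this page.

```python
# Values copied from the speed attribute in the example output.
speed = {
    "preprocess": 95.9782600402832,
    "inference": 12243.300914764404,
    "postprocess": 241.5597438812256,
}


def summarize_speed(speed_ms):
    """Return (total_ms, {stage: fraction_of_total}) for a speed dict."""
    total_ms = sum(speed_ms.values())
    shares = {stage: ms / total_ms for stage, ms in speed_ms.items()}
    return total_ms, shares


total_ms, shares = summarize_speed(speed)
print(f"total: {total_ms / 1000:.2f} s per image")
for stage, share in shares.items():
    print(f"{stage:>11}: {share:.1%}")
```

For this run, inference accounts for nearly all of the roughly 12.6 seconds, so inference is the stage to optimize first.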
