Training on the COCO DIY dataset
This section describes how to extract some of the labels and data from the COCO dataset for training using the scripting tool provided by Petoi.
COCO (Common Objects in Context) dataset is a widely used computer vision dataset for tasks such as object detection, segmentation, and keypoint detection. Developed by Microsoft Research, the goal of the COCO dataset is to provide a rich dataset with high-quality annotations to advance computer vision technology.
Key features of COCO dataset:
Data content:
Number of images: contains over 330,000 images.
Classes: Covering 80 object classes, e.g., people, cars, animals, furniture, etc.
Annotation: each image is annotated with objects, supporting tasks such as object detection, segmentation (instance segmentation) and key point detection (e.g., human pose estimation).
Data type:
Object detection: the bounding box (bounding box) of the object is labelled.
Instance segmentation: pixel-level segmentation mask (mask) for each object instance.
Key point detection: the position of the key points of the body is labelled.
Scene description: each image also contains textual information describing the content of the image.
Data Distribution:
Training set: contains about 118,000 images.
Validation set: contains about 5,000 images.
Test set: contains about 20,000 images (for competition and evaluation).
Data set structure:
Image data: Stored in the image folder.
Annotation data: stored in JSON format files, including information such as object bounding boxes, segmentation masks, keypoints, and so on.
Usage Scenario:
Object detection: identify objects in the image and determine their locations.
Example segmentation: segment each object in the image at pixel level.
Human body pose estimation: detects the position of key points of the human body in the image.
Scene understanding: analyses the image content and generates a description.
The COCO dataset is widely used in the field of computer vision because it provides a large and diverse set of annotated data that helps in training and evaluating different vision models.COCO Dataset Download




https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml