CelebA - xyfJASON/image-datasets GitHub Wiki
Links
Official website | Papers with Code | Google drive | Baidu drive (password: rp0s)
Brief introduction
Copied from the official website.
CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including
- 10,177 number of identities,
- 202,599 number of face images, and
- 5 landmark locations, 40 binary attributes annotations per image.
The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face recognition, face detection, landmark (or facial part) localization, and face editing & synthesis.
Statistics
Number of images: 202,599
Splits: 162,770 / 19,867 / 19,962 (train / valid / test)
Resolution:
- Aligned: 178×218
- In-the-wild: varies from 200+ to 2000+ pixels
Annotations:
- 10,177 identities
- 5 landmark locations per image
- 40 binary attribute annotations per image
Files
- Anno
  - identity_CelebA.txt
  - list_attr_celeba.txt
  - list_bbox_celeba.txt
  - list_landmarks_align_celeba.txt
  - list_landmarks_celeba.txt
- Eval
  - list_eval_partition.txt
- Img
  - img_celeba.7z (In-The-Wild Images)
  - img_align_celeba_png.7z (Align&Cropped Images, PNG Format)
  - img_align_celeba.zip (Align&Cropped Images, JPG Format)
- README.txt
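The annotation files are plain text, so the split sizes can be checked without torchvision. The following is a minimal sketch, assuming each line of `list_eval_partition.txt` has the form `<filename> <split id>` with 0, 1 and 2 meaning train, valid and test (please verify this against your copy). The name `count_splits` and the sample lines are hypothetical, for illustration only.

```python
from collections import Counter

# Assumed meaning of the integer codes in list_eval_partition.txt.
SPLIT_NAMES = {0: "train", 1: "valid", 2: "test"}


def count_splits(lines):
    """Count images per split from lines such as '000001.jpg 0'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _filename, split_id = line.split()
        counts[SPLIT_NAMES[int(split_id)]] += 1
    return dict(counts)


# Illustrative lines only; pass an open file object to count the real file.
sample = ["000001.jpg 0", "000002.jpg 0", "162771.jpg 1", "182638.jpg 2"]
print(count_splits(sample))  # {'train': 2, 'valid': 1, 'test': 1}
```

Run on the full file, the counts should match the split sizes listed under Statistics.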
Usage
Note: The authors provide two versions of the dataset: aligned and in-the-wild. torchvision only supports loading the aligned version.
File structure
Please organize the downloaded dataset in the following file structure:
root
└── celeba
    ├── identity_CelebA.txt
    ├── list_attr_celeba.txt
    ├── list_bbox_celeba.txt
    ├── list_eval_partition.txt
    ├── list_landmarks_align_celeba.txt
    ├── list_landmarks_celeba.txt
    └── img_align_celeba
        ├── 000001.jpg
        ├── ...
        └── 202599.jpg
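Before loading, it can help to confirm that the layout above is in place. The following is a small sketch using only the standard library. The helper name `missing_entries` is hypothetical, and it only checks that each entry exists, not that the image folder is complete.

```python
from pathlib import Path

# Entries from the file structure above, relative to <root>/celeba.
EXPECTED = [
    "identity_CelebA.txt",
    "list_attr_celeba.txt",
    "list_bbox_celeba.txt",
    "list_eval_partition.txt",
    "list_landmarks_align_celeba.txt",
    "list_landmarks_celeba.txt",
    "img_align_celeba",
]


def missing_entries(root):
    """Return the expected files/folders that are absent under <root>/celeba."""
    base = Path(root).expanduser() / "celeba"
    return [name for name in EXPECTED if not (base / name).exists()]
```

For example, `missing_entries('~/data/CelebA')` returns an empty list when everything is in place.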
Example
from torchvision.datasets import CelebA

# root is the directory that contains the `celeba` folder shown above
train_set = CelebA(root='~/data/CelebA', split='train')
valid_set = CelebA(root='~/data/CelebA', split='valid')
test_set = CelebA(root='~/data/CelebA', split='test')
all_set = CelebA(root='~/data/CelebA', split='all')

# Split sizes should match the numbers in Statistics
print(len(train_set))  # 162770
print(len(valid_set))  # 19867
print(len(test_set))   # 19962
print(len(all_set))    # 202599
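By default the dataset returns the attribute vector as the target. The sketch below shows one way to request several annotation types at once through the `target_type` argument and to convert the images to tensors. The wrapper `load_celeba` and the constant `VALID_TARGET_TYPES` are hypothetical helpers, not part of torchvision, and the center crop to 178×178 is just one possible preprocessing choice.

```python
# Target types accepted by torchvision's CelebA.
VALID_TARGET_TYPES = ("attr", "identity", "bbox", "landmarks")


def load_celeba(root, split="train", target_type=("attr", "landmarks")):
    """Build a CelebA dataset that returns several annotation types per image."""
    unknown = [t for t in target_type if t not in VALID_TARGET_TYPES]
    if unknown:
        raise ValueError(f"unsupported target_type: {unknown}")

    # Imported lazily so the helper can be defined without torchvision installed.
    from torchvision import transforms
    from torchvision.datasets import CelebA

    transform = transforms.Compose([
        transforms.CenterCrop(178),  # aligned images are 178x218; crop to a square
        transforms.ToTensor(),       # PIL image -> float tensor in [0, 1]
    ])
    return CelebA(root=root, split=split,
                  target_type=list(target_type), transform=transform)


# Usage, assuming the dataset is organized as described in File structure:
# train_set = load_celeba('~/data/CelebA')
# image, (attr, landmarks) = train_set[0]
```

With more than one target type, each item's target is a tuple in the order requested. With the defaults above, `attr` should hold the 40 attributes and `landmarks` the 10 coordinates of the 5 landmarks.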