CelebA‐HQ - xyfJASON/image-datasets GitHub Wiki

Links

Official website | Papers with Code

Brief introduction

Copied from Papers with Code.

The CelebA-HQ dataset is a high-quality version of CelebA that consists of 30,000 images at 1024×1024 resolution.

Statistics

Number of images: 30,000 (a subset of CelebA)

Splits (following CelebA's original splits): 24,183 / 2,993 / 2,824 (train / valid / test)
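Because the files keep their CelebA IDs as filenames, the split of an image can be derived from its name alone. The following is a minimal sketch that assumes the standard CelebA partition (images 1 to 162,770 for train, 162,771 to 182,637 for valid, 182,638 to 202,599 for test). The helper `celeba_split` is hypothetical and not part of this repository.

```python
def celeba_split(filename: str) -> str:
    """Return the CelebA partition for a file named like '000004.jpg'.

    Assumes the standard CelebA partition boundaries; this helper is
    illustrative and not part of image_datasets.
    """
    index = int(filename.split('.')[0])  # '000004.jpg' -> 4
    if not 1 <= index <= 202599:
        raise ValueError(f'index out of CelebA range: {index}')
    if index <= 162770:
        return 'train'
    if index <= 182637:
        return 'valid'
    return 'test'
```

Counting the files of each partition under CelebA-HQ-img should then reproduce the train / valid / test numbers listed above, provided the assumed boundaries match the ones the dataset class uses.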

Resolution: 1024×1024

Usage

Generate the dataset (official)

Download the CelebA dataset and the delta files, then generate the images with dataset_tool.py. See the official repo for more information.

Generate the dataset (recommended)

Download the CelebAMask-HQ dataset, then map the filenames back to their original CelebA IDs using CelebA-HQ-to-CelebA-mapping.txt. The mapping script is provided at scripts/celebahq_map_filenames.py.

python celebahq_map_filenames.py --root ROOT
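To illustrate the idea behind the filename mapping, here is a rough sketch of how the mapping file could be parsed into a rename table. It assumes the file has a header line followed by whitespace-separated rows of `idx`, `orig_idx` and `orig_file`, and that CelebAMask-HQ images are named `<idx>.jpg`. The function `parse_mapping` and the sample rows are illustrative only; the repository's script may work differently.

```python
def parse_mapping(text: str) -> dict:
    """Map CelebAMask-HQ filenames ('0.jpg') to CelebA filenames ('119614.jpg').

    Assumes a header line followed by whitespace-separated rows of
    idx, orig_idx, orig_file. Not the repository's actual implementation.
    """
    mapping = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) != 3:
            continue  # ignore blank or malformed rows
        idx, _orig_idx, orig_file = fields
        mapping[f'{idx}.jpg'] = orig_file
    return mapping


# Illustrative rows in the assumed layout, not a verbatim copy of the file.
sample = """idx    orig_idx    orig_file
0      119613      119614.jpg
1      99094       099095.jpg
"""
print(parse_mapping(sample))
```

Each key is the name of a downloaded file and each value is the name it would be given after mapping, so the renaming itself reduces to one rename per entry.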

File structure

Please organize the dataset in the following file structure:

root
├── CelebA-HQ-img
│   ├── 000004.jpg
│   ├── ...
│   └── 202591.jpg
└── CelebA-HQ-to-CelebA-mapping.txt
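Before loading the dataset, the layout above can be checked with a few lines of standard-library code. This is a sketch only; `check_layout` is a hypothetical helper and not part of the repository.

```python
from pathlib import Path


def check_layout(root: str) -> list:
    """Return a list of problems found under root; an empty list means the layout looks right."""
    base = Path(root).expanduser()
    problems = []
    img_dir = base / 'CelebA-HQ-img'
    if not img_dir.is_dir():
        problems.append(f'missing directory: {img_dir}')
    elif not any(img_dir.glob('*.jpg')):
        problems.append(f'no .jpg files in: {img_dir}')
    mapping = base / 'CelebA-HQ-to-CelebA-mapping.txt'
    if not mapping.is_file():
        problems.append(f'missing file: {mapping}')
    return problems
```

For example, `check_layout('~/data/CelebA-HQ/')` returns an empty list when both the image directory and the mapping file are in place.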

API Reference

CelebAHQ(root: str, split: str = 'train', transforms: Optional[Callable] = None)
  • root: Root directory of dataset.
  • split: One of {'train', 'valid', 'test', 'all'}.
  • transforms: A function/transform that takes in a PIL image and returns a transformed version.
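Since `transforms` is just a callable that receives a PIL image, a plain Python function is enough. The sketch below builds one that converts to RGB and downsamples; `make_transform` is a hypothetical helper, and it relies only on the standard `convert` and `resize` methods of PIL images.

```python
def make_transform(size: int):
    """Build a callable that takes a PIL image and returns a resized RGB copy."""
    def transform(img):
        # convert() and resize() are standard PIL.Image.Image methods.
        return img.convert('RGB').resize((size, size))
    return transform
```

It could then be passed as `CelebAHQ(root=root, split='train', transforms=make_transform(256))`, in which case each item would presumably be returned as a 256×256 image instead of the full 1024×1024 one.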

Example

from image_datasets import CelebAHQ

root = '~/data/CelebA-HQ/'  # path to the dataset
train_set = CelebAHQ(root=root, split='train')
valid_set = CelebAHQ(root=root, split='valid')
test_set = CelebAHQ(root=root, split='test')
all_set = CelebAHQ(root=root, split='all')
print(len(train_set))  # 24183
print(len(valid_set))  # 2993
print(len(test_set))   # 2824
print(len(all_set))    # 30000
print(train_set[0])    # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=1024x1024 at 0x7F6AE3628A90>