Pedestrian Attribute Recognition - person-in-hangang/HanRiver GitHub Wiki
reference: https://github.com/valencebond/Strong_Baseline_of_Pedestrian_Attribute_Recognition
PAR - Pedestrian Attribute Recognition
Pedestrian attribute recognition predicts multiple attributes of pedestrian images as semantic descriptions in video surveillance, such as age, gender, and clothing.
Dataset Info
PA-100K is a recently proposed large pedestrian attribute dataset with 100,000 images in total, collected from outdoor surveillance cameras. It is split into 80,000 images for the training set, 10,000 for the validation set, and 10,000 for the test set. Each image is labeled with 26 binary attributes. A common characteristic of the dataset is that the images are blurry due to the relatively low resolution, and the positive ratio of each binary attribute is low.
You only need to download the annotation file:
./dataset/pa100k/annotation.mat
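Each image in PA-100K is labeled with a 26-dimensional multi-hot vector, one bit per binary attribute. A minimal sketch of how such a label maps to attribute names (the attribute names below are illustrative, not the actual PA-100K list):

```python
import numpy as np

# hypothetical subset of attribute names, for illustration only
attributes = ["Female", "AgeOver60", "Hat", "Glasses", "ShortSleeve"]

# multi-hot label vector: 1 means the attribute is present
label = np.array([1, 0, 0, 1, 1], dtype=np.uint8)

positive = [name for name, v in zip(attributes, label) if v == 1]
print(positive)  # ['Female', 'Glasses', 'ShortSleeve']
```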
Pretrained Models
Pretrained models are available on Google Drive.
./dataset/pa100k_ckpt_max.pth
How is it implemented?
1. Using the bounding-box information, crop out only the person from the picture.
```python
import numpy as np
import cv2

# received bbox: [x_min, y_min, width, height]
x_min, y_min, width, height = int(temp[1]), int(temp[2]), int(temp[3]), int(temp[4])
bbox_split = [x_min, y_min, width, height]

# decode the raw byte stream into a BGR image
encoded_img = np.frombuffer(bytes(image), dtype=np.uint8)
img = cv2.imdecode(encoded_img, cv2.IMREAD_COLOR)

# crop the person region using the bounding box
img_trim = img[y_min:y_min + height, x_min:x_min + width]
```
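The crop itself is plain NumPy slicing in (row, column) order, i.e. `[y, x]`. A self-contained sketch on a synthetic frame (the `cv2` decoding step is omitted here):

```python
import numpy as np

img = np.zeros((480, 640, 3), dtype=np.uint8)   # synthetic H x W x 3 frame
x_min, y_min, width, height = 100, 50, 60, 160  # example bounding box

# rows are indexed by y, columns by x
img_trim = img[y_min:y_min + height, x_min:x_min + width]
print(img_trim.shape)  # (160, 60, 3)
```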
2. Load the pretrained model and the label list.
```python
import torch
import scipy.io

def get_reload_weight(model_path, model):
    # the checkpoint stores the weights under the key 'state_dicts'
    checkpoint = torch.load(model_path)
    model.load_state_dict(checkpoint['state_dicts'])
    return model

def get_labels(dataset='PA100k'):
    if dataset == 'PA100k':
        mat = scipy.io.loadmat('./annotation.mat')
        # flatten the MATLAB cell array into a list of attribute names
        labels = [attr[0][0] for attr in mat['attributes']]
        return labels
```
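`get_reload_weight` assumes the checkpoint dictionary stores the weights under the key `'state_dicts'`. A minimal save/load roundtrip sketch with a toy model, showing that same key convention (the real checkpoint is the PA-100K file above):

```python
import io
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # toy stand-in for the real network

# save the weights under the same key the wiki's checkpoint uses
buffer = io.BytesIO()
torch.save({'state_dicts': model.state_dict()}, buffer)
buffer.seek(0)

# restore into a freshly constructed model of the same architecture
restored = nn.Linear(4, 2)
checkpoint = torch.load(buffer)
restored.load_state_dict(checkpoint['state_dicts'])

print(torch.equal(model.weight, restored.weight))  # True
```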
3. Predict attributes from the cropped images.
```python
def PAR(img):
    backbone = resnet50()
    classifier = BaseClassifier(nattr=26)
    model = FeatClassifier(backbone, classifier)

    if torch.cuda.is_available():
        model = torch.nn.DataParallel(model).cuda()

    model_path = './pa100k_ckpt_max.pth'
    model = get_reload_weight(model_path, model)
    labels = get_labels()

    model.eval()
    with torch.no_grad():
        # add a batch dimension, then get one logit per attribute
        output = model(img[None, ...]).float()
        predict = torch.sigmoid(output)

    # attributes with probability above 0.5 are reported as present
    return [labels[i] for i, p in enumerate(predict[0]) if p > 0.5]
```
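Because each attribute is an independent binary prediction, the model output goes through a sigmoid (not a softmax), giving one probability per attribute. A self-contained sketch of turning such an output into attribute names, with fake logits and an illustrative label subset (the 0.5 threshold is an assumption):

```python
import torch

labels = ["Female", "Hat", "Glasses", "Backpack"]  # illustrative subset
logits = torch.tensor([[2.0, -1.5, 0.8, -3.0]])    # fake model output
probs = torch.sigmoid(logits)[0]                   # independent probabilities

# keep attributes whose probability exceeds the threshold
predicted = [labels[i] for i, p in enumerate(probs) if p > 0.5]
print(predicted)  # ['Female', 'Glasses']
```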