Project DFKI Augmented Vision

Category-Agnostic Pose Estimation

Topic

In this project, we will develop a text-promptable pose estimation model for skeleton-agnostic human pose estimation. The goal is to change the set and number of predicted keypoints at inference time in a zero-shot manner.

Tasks

Prepare a text-image dataset suitable for training a zero-shot category-agnostic pose estimation (CAPE) model restricted to humans
Model implementation and training
Detailed performance comparisons

Expected Skills

Strong programming skills and PyTorch experience (required)
Experience with human pose estimation (highly preferred) and MMPose (preferred)

Related Literature

[1] X-Pose
[2] PoseAnything
[3] CapeFormer
