DanceBits segments dance videos through a multi-stage pipeline to make choreography learning more accessible and efficient. Developed during the Data Science Retreat bootcamp in 2024, it helps dancers learn by automatically segmenting choreographies and providing real-time feedback.
This repository contains the data preprocessing pipeline, which covers the following stages:
- Video preprocessing to extract pose keypoints
- Audio feature extraction
- Label processing and ground truth generation
- Feature fusion and dataset creation
# Clone repository
git clone https://github.com/your-username/dancebits-preprocessing.git
cd dancebits
# Create and activate conda environment
conda create --name dancebits-preprocessing python=3.8
conda activate dancebits-preprocessing
# Install dependencies
pip install -r requirements.txt
# Run preprocessing pipeline
python main.py preprocess \
--labels-path data/raw/video/dataset_X/labels.csv \
--files-dir data/raw/video/dataset_X \
--dataset-id your_dataset_name

dancebits-preprocessing/
├── config/ # Configuration files
│ ├── main.yaml # Main configuration
│ └── data/ # Data processing configs
├── data/
│ ├── raw/ # Original videos and labels
│ ├── interim/ # Extracted features
│ └── processed/ # Final training datasets
├── src/
│ ├── data/ # Data processing pipelines
│ └── features/ # Feature extraction code
├── tests/ # Unit and integration tests
└── main.py # CLI entry point
- **Video Processing** (see the sketch below)
  - Extracts pose keypoints using MediaPipe
  - Generates bone vectors for movement analysis
  - Outputs: keypoint CSVs and bone vector CSVs
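A minimal sketch of this step, assuming the MediaPipe Pose solution and a hypothetical output schema; the repository's actual column names, joint connections, and file layout may differ:

```python
# Illustrative sketch (not the repository's exact code): extract per-frame pose
# keypoints from a video with MediaPipe and derive simple bone vectors as
# differences between connected joints. Column names are hypothetical.
import cv2
import mediapipe as mp
import pandas as pd

# A few example joint connections (MediaPipe Pose landmark indices):
# shoulders -> elbows -> wrists.
BONES = [(11, 13), (13, 15), (12, 14), (14, 16)]

def extract_pose_features(video_path: str) -> pd.DataFrame:
    rows = []
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.pose.Pose(static_image_mode=False) as pose:
        frame_idx = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if result.pose_landmarks:
                lm = result.pose_landmarks.landmark
                row = {"frame": frame_idx}
                # Raw keypoints in normalized image coordinates.
                for i, p in enumerate(lm):
                    row[f"x_{i}"], row[f"y_{i}"] = p.x, p.y
                # Bone vectors: joint-to-joint coordinate differences.
                for a, b in BONES:
                    row[f"bone_{a}_{b}_dx"] = lm[b].x - lm[a].x
                    row[f"bone_{a}_{b}_dy"] = lm[b].y - lm[a].y
                rows.append(row)
            frame_idx += 1
    cap.release()
    return pd.DataFrame(rows)

# extract_pose_features("data/raw/video/dataset_X/clip_001.mp4").to_csv("keypoints.csv", index=False)
```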
- **Audio Processing** (see the sketch below)
  - Extracts mel spectrograms from video audio using Librosa
  - Detects tempo and beat information
  - Outputs: .npy files with audio features
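A minimal sketch of the audio step, assuming the audio track has already been extracted from the video (e.g. with ffmpeg); file names and the saved structure are illustrative:

```python
# Illustrative sketch using standard Librosa calls; the repository's exact
# parameters and output layout may differ.
import librosa
import numpy as np

def extract_audio_features(audio_path: str, out_path: str, n_mels: int = 128):
    y, sr = librosa.load(audio_path, sr=None)               # load waveform
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)            # log-mel spectrogram
    tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr) # tempo + beat positions
    # Saving a dict as an object array; load later with np.load(..., allow_pickle=True).
    np.save(out_path, {"mel": mel_db, "tempo": tempo, "beats": beat_frames})

# extract_audio_features("data/interim/clip_001.wav", "data/interim/clip_001_audio.npy")
```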
- **Label Processing** (see the sketch below)
  - Processes manual annotations of movement boundaries
  - Generates probability distributions for segment transitions
  - Outputs: frame-level segmentation labels
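One common way to turn boundary annotations into frame-level probabilities is to place a Gaussian around each annotated boundary; this sketch illustrates the idea and is not necessarily the exact scheme used here:

```python
# Illustrative sketch: convert annotated boundary frames into a per-frame
# probability curve by placing a Gaussian around each boundary.
import numpy as np

def boundary_probabilities(boundary_frames, num_frames, sigma=3.0):
    t = np.arange(num_frames)
    prob = np.zeros(num_frames)
    for b in boundary_frames:
        prob = np.maximum(prob, np.exp(-0.5 * ((t - b) / sigma) ** 2))
    return prob  # values in [0, 1], peaking at annotated boundaries

# Example: a 300-frame clip with boundaries annotated at frames 75, 150, and 225.
labels = boundary_probabilities([75, 150, 225], num_frames=300)
```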
Tutorials on the individual steps of the pipeline are provided in notebooks.
The DanceBits system leverages the following technologies:
- Python for core development
- FastAPI for API services
- PyTorch for deep learning models
- MediaPipe for pose estimation
- Librosa for audio processing
Our implementation of the model, based on the research in [1], has shown strong performance in segmenting both basic and advanced choreographies. In our tests it worked particularly well on structured dance routines, making practice sessions noticeably more efficient than learning from unsegmented videos.
For more on the project's motivation, details, and outcomes, see the team members' blog posts:
The deployment-ready app repository can be found here.
# Run preprocessing on a sample dataset
python main.py preprocess
# Run preprocessing tests
python main.py test_preprocess
# Run preprocessing on a new dataset
python main.py preprocess \
--labels-path data/raw/video/dataset_X/labels.csv \
--files-dir data/raw/video/dataset_X \
--dataset-id your_dataset_name

The pipeline is configured through YAML files in the config/ directory:
- main.yaml: Global settings and paths
- data/: Data processing parameters
- model/: Model architecture and training settings
Example configuration:
data_config: dataset_config.yaml
model_config: model_v1.yaml
paths:
  raw_data: data/raw/video
  features: data/interim
  processed: data/processed
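A minimal sketch of how such a layered configuration could be loaded with plain PyYAML; the project may use a dedicated config framework instead, and the keys simply mirror the example above:

```python
# Minimal sketch, assuming plain PyYAML; keys and paths mirror the example
# main.yaml above and are not guaranteed to match the repository exactly.
from pathlib import Path
import yaml

def load_config(config_dir: str = "config") -> dict:
    config_dir = Path(config_dir)
    main_cfg = yaml.safe_load((config_dir / "main.yaml").read_text())
    # Resolve the referenced data config relative to config/data/.
    data_cfg = yaml.safe_load((config_dir / "data" / main_cfg["data_config"]).read_text())
    main_cfg["data"] = data_cfg
    return main_cfg

# cfg = load_config(); print(cfg["paths"]["processed"])
```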
Developed at Data Science Retreat - Cohort 39
Authors: Paras Mehta, Cristina Melnic, and Arpad Dusa
[1] Endo et al. 2024, "Automatic Dance Video Segmentation for Understanding Choreography"
[2] Tsuchida et al. 2019, "AIST DANCE VIDEO DATABASE: Multi-genre, Multi-dancer, and Multi-camera Database for Dance Information Processing"
[3] Lugaresi et al. 2019, "MediaPipe: A Framework for Building Perception Pipelines"
[4] McFee et al. 2015, "librosa: Audio and Music Signal Analysis in Python"
This project was made possible thanks to:
- Training data and segmentation labels from Endo et al. (2024), based on the AIST Dance Video Database
- Project supervision by Antonio Rueda-Toicen at Data Science Retreat