This repository contains the pipeline to process, structure, and analyze satellite image time series (SITS) from the MULTISENGE dataset. It handles the extraction of pixel-level time series, organizes them into machine learning-ready cross-validation folds, calculates global statistics, and performs temporal correlation analyses.
The code relies on the MULTISENGE dataset. You must download the necessary raw data (Sentinel-2 images, Ground Reference, and Labels) from Zenodo:
To run this pipeline, ensure you have the following components extracted locally:
- Sentinel-2 Images (s2_images)
- Ground Reference (GeoTIFF masks)
- Labels (JSON files describing the patches)
To simplify execution, this project uses a pyproject.toml configuration.
- Clone the repository:

      git clone https://github.com/your-username/multisenge-processing.git
      cd multisenge-processing

- Install in editable mode. Run the following command to install dependencies and register the multisenge executable on your system:

      pip install -e .
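To confirm that the editable install registered the executable, a quick check from Python can look it up on your PATH. This is a minimal sketch, assuming the command is named `multisenge` as described above; the helper name `find_executable` is hypothetical and not part of this repository.

```python
import shutil
from typing import Optional


def find_executable(name: str = "multisenge") -> Optional[str]:
    """Return the full path of `name` if it is found on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_executable()
    if location is None:
        print("multisenge was not found on PATH; re-run `pip install -e .` "
              "in the environment you intend to use.")
    else:
        print(f"multisenge is installed at: {location}")
```

If the command is not found, the most likely cause is that the install ran in a different virtual environment than the one currently active.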
The project uses Hydra for configuration management. All settings are defined in config/hydra_config.yaml.
Before running any command, open config/hydra_config.yaml and update the dataset.paths section with the actual locations of your downloaded data.
You must provide valid paths for these 4 variables:
- folds_csv: Directory containing your split CSVs (fold_0.csv, etc.).
- labels: Directory containing the JSON label files.
- ground_reference: Directory containing the GeoTIFF masks.
- s2_images: Directory containing the Sentinel-2 images.
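Before launching the pipeline, it can help to verify that all four entries point to existing directories. The sketch below uses only the standard library and takes the paths as a plain dictionary. It does not read config/hydra_config.yaml itself, and the function name `find_invalid_paths` and the example locations are hypothetical placeholders, not part of this repository.

```python
from pathlib import Path

# The four keys expected under dataset.paths, as listed above.
REQUIRED_KEYS = ("folds_csv", "labels", "ground_reference", "s2_images")


def find_invalid_paths(paths: dict) -> dict:
    """Map each problematic key to a short reason.

    An empty result means every required key is set and points to an
    existing directory.
    """
    problems = {}
    for key in REQUIRED_KEYS:
        value = paths.get(key)
        if not value:
            problems[key] = "not set"
        elif not Path(value).expanduser().is_dir():
            problems[key] = f"not a directory: {value}"
    return problems


if __name__ == "__main__":
    # Placeholder locations for illustration only; replace with your own.
    example_paths = {
        "folds_csv": "/data/multisenge/folds",
        "labels": "/data/multisenge/labels",
        "ground_reference": "/data/multisenge/ground_reference",
        "s2_images": "/data/multisenge/s2_images",
    }
    for key, reason in find_invalid_paths(example_paths).items():
        print(f"{key}: {reason}")
```

The same dictionary could be copied by hand from the dataset.paths section of your configuration file; any key reported here should be corrected before running a command.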