System Info:
OS: Linux (Ubuntu/Debian based)
GPU: NVIDIA GeForce RTX 3090 (24GB VRAM)
CPU: 14 Cores
Environment: CUDA 11.8, Python 3.9, PyTorch 2.0.1
Description: I am following the "Getting Started" guide to reconstruct hair from a monocular video (raw.mp4). When I run the run.sh script, the process has been stuck for over 14 hours at the preprocessing stage, specifically in the patch splitting logic (likely within calc_orientation_maps.py).
Symptoms:
Logs: The console continuously prints "Trying to split into [ID]" followed by a vector of zeros, [0 0 0 ... 0 0 0].
Resource Usage: The Python process consumes 100% CPU, but GPU utilization remains at 0%.
Stagnation: After 14 hours and more than 1,000,000 processed patches, the script still has not moved on to the reconstruction (training) stage.
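To quantify the zero-vector symptom, a small helper like the one below could count how many of the logged vectors are entirely zero. This is only a sketch: it assumes the printed vectors can be collected as sequences of numbers, and count_all_zero is a hypothetical name, not something from the repository.

```python
def count_all_zero(vectors):
    """Return (number of all-zero vectors, total number of vectors).

    `vectors` is assumed to be an iterable of numeric sequences, e.g. the
    vectors printed after each "Trying to split into [ID]" log line.
    """
    vectors = list(vectors)
    # A vector counts as all-zero when every element compares equal to 0.
    zero = sum(1 for v in vectors if all(x == 0 for x in v))
    return zero, len(vectors)
```

If nearly every vector is all-zero, that would point at the input to the splitting step rather than at the speed of the loop itself.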
Questions for the Authors:
Is the calc_orientation_maps.py script designed to run only on CPU?
Why are the confidence/orientation vectors all zeros for so many patches?
Is there a way to optimize this preprocessing step, or to parallelize the patch splitting logic so that it uses all available CPU cores?
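For reference, the kind of parallelization I have in mind looks roughly like the sketch below. It assumes the work can be split into independent per-frame tasks, which I have not verified against the actual script. process_frame and run_parallel are hypothetical placeholders, not functions from the repository.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def process_frame(frame_path):
    """Hypothetical stand-in for the per-frame orientation/patch work."""
    # The real computation would go here; this placeholder only returns
    # the path together with its length so the sketch stays runnable.
    return frame_path, len(frame_path)


def run_parallel(frame_paths, workers=None, executor_cls=ProcessPoolExecutor):
    """Map process_frame over frame_paths using a pool of workers.

    Results are returned in the same order as the input paths.
    """
    workers = workers or os.cpu_count() or 1
    with executor_cls(max_workers=workers) as pool:
        # chunksize batches tasks per worker process to reduce IPC overhead;
        # it has no effect when a ThreadPoolExecutor is passed in.
        return list(pool.map(process_frame, frame_paths, chunksize=4))
```

With the default ProcessPoolExecutor, the call should be made under an `if __name__ == "__main__":` guard. Processes rather than threads seem like the better fit here because the step appears to be CPU-bound Python code, but that is a guess from the 100% CPU usage, not something I have profiled.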