Article: Zijian Zhao*, Dian Jin, Zijing Zhou, "Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach", ACM ICMR 2026
Dataset: amaai-lab/MidiCaps (Hugging Face Datasets)
Please rename train.json to meta.txt.
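The rename step can also be scripted. The following is a minimal sketch, assuming train.json sits in a directory you pass in; the dataset location is not specified here, so the `data_dir` parameter and the `rename_metadata` helper are hypothetical names used only for illustration.

```python
from pathlib import Path


def rename_metadata(data_dir: str = ".") -> Path:
    """Rename train.json to meta.txt inside data_dir.

    data_dir is an assumed location; point it at wherever the
    MidiCaps train.json was downloaded.
    """
    src = Path(data_dir) / "train.json"
    dst = Path(data_dir) / "meta.txt"

    # If meta.txt is already there, treat the step as done.
    if dst.exists():
        return dst

    if not src.exists():
        raise FileNotFoundError(f"{src} not found; download the dataset first")

    src.rename(dst)
    return dst
```

Calling `rename_metadata("path/to/dataset")` returns the path of the resulting meta.txt, and calling it a second time is harmless because an existing meta.txt is left untouched.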
The data processing code is based on jwdj/EasyABC: EasyABC (github.com).
python main.py

@article{zhao2025zero,
title={Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach},
author={Zhao, Zijian and Jin, Dian and Zhou, Zijing},
journal={arXiv preprint arXiv:2509.22378},
year={2025}
}
Some websites provide abc2midi and midi2abc conversion services:
