- Release project page
- Upload arXiv paper
- Release evaluation protocol
- Release code of Color-Turbo
- Release online demo of Color-Turbo
Outline:
These are the most straightforward methods, but they are limited by poor controllability.
| Title | Abbr. | Feature | Code | Pub. & Date |
|---|---|---|---|---|
| Deep Colorization | - | the first CNN-based colorization model | - | ICCV, 2015 |
| Colorful Image Colorization | CIC | classification loss for more saturated colors | code | ECCV, 2016 |
| Instance-aware Image Colorization | InstColor | Involving object bounding boxes as auxiliary priors | code | CVPR, 2020 |
| Disentangled Image Colorization via Global Anchors | DISCO | spatial coarse-to-fine model | code | SIGGRAPH Asia, 2022 |
| Title | Abbr. | Feature | Code | Pub. & Date |
|---|---|---|---|---|
| Colorization Transformer | ColTran | spatial coarse-to-fine model | code | ICLR, 2021 |
| ColorFormer: Image Colorization via Color Memory assisted Hybrid-attention Transformer | ColorFormer | enhanced model design | code | ECCV, 2022 |
| CT2: Colorization Transformer via Color Tokens | CT2 | learnable color tokens | code | ECCV, 2022 |
| DDColor: Towards Photo-Realistic and Semantic-Aware Image Colorization via Dual Decoders | DDColor | query-based transformer | code | ICCV, 2023 |
| MultiColor: Image Colorization by Learning from Multiple Color Spaces | MultiColor | extends DDColor with multiple color spaces | (No Code/project page) | ACM MM, 2024 |
| Title | Abbr. | Feature | Code | Pub. & Date |
|---|---|---|---|---|
| DeOldify | DeOldify | asynchronous GAN training | code | Github project, 2018 |
| ChromaGAN: Adversarial Picture Colorization with Semantic Class Distribution | ChromaGAN | classification-based GAN loss | code | WACV, 2020 |
| Focusing on Persons: Colorizing Old Images Learning from Modern Historical Movies | HistoryNet | dataset and recoloring of old films | code | ACM MM, 2021 |
| Towards Vivid and Diverse Image Colorization with Generative Color Prior | ToVivid (GCP-Colorization) | GAN inversion | code | ICCV, 2021 |
| BigColor: Colorization using a Generative Color Prior for Natural Images | BigColor | pretrained BigGAN w/o inversion | code | ECCV, 2022 |
Most of the following methods can build an intuitive bridge to text prompts by virtue of diffusion models.
| Title | Abbr. | Feature | Code | Pub. & Date |
|---|---|---|---|---|
| Palette: Image-to-image diffusion models | Palette | retrained diffusion for linear degradation | code (No Code) | SIGGRAPH, 2022 |
| Multimodal semantic-aware automatic colorization with diffusion prior | ColorDiff | segmentation maps as additional prior | code | arXiv, 24.04 |
| Automatic Controllable Colorization via Imagination | Imagine-Colorization | Ensembled pretrained ControlNet | code (No Code) | CVPR, 2024 |
The core challenge is to construct an accurate cross-modal correspondence between grayscale images and color texts.
| Title | Abbr. | Feature | Code | Pub. & Date |
|---|---|---|---|---|
| L-CoDe: Language-based Colorization using Color-object Decoupled Conditions | L-CoDe | Dense human annotation & two-stage alignment | code | AAAI, 2022 |
| L-CoDer: Language-based Colorization with Color-object Decoupling Transformer | L-CoDer | Dense human annotation & two-stage alignment & BERT for text embedding | code | ECCV, 2022 |
| L-CoIns: Language-based Colorization with Instance Awareness | L-CoIns | adaptive alignment & luminance augmentation | code | CVPR, 2023 |
| Title | Abbr. | Feature | Code | Pub. & Date |
|---|---|---|---|---|
| UniColor: A Unified Framework for Multi-Modal Colorization with Transformer | UniColor | sliding window alignment based on CLIP | code | SIGGRAPH Asia, 2022 |
| Title | Abbr. | Feature | Code | Pub. & Date |
|---|---|---|---|---|
| Improved Diffusion-based Image Colorization via Piggybacked Models | Piggybacked-Color | trainable copy of full U-Net for fine-tune | project (No Code) | arXiv, 23.04 |
| DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models | DiffColor | contrastive CLIP guidance | (No Code/project page) | arXiv, 23.08 |
| Control Color: Multimodal Diffusion-based Interactive Image Colorization | CtrlColor | ControlNet-based multi-modal condition | code | arXiv, 24.02 |
| Diffusing Colors: Image Colorization with Text Guided Diffusion | Diffusing Colors | rescheduled diffusion model with varied saturation | project (No Code) | SIGGRAPH Asia, 2023 |
| L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors | L-CAD | finetune SD & bypassed VAE | code | NeurIPS, 2023 |
| COCO-LC: Colorfulness Controllable Language-based Colorization | COCO-LC | coarse-to-fine ControlNet-based & colorfulness control | code | ACM MM, 2024 |
TL;DR: A highly efficient LoRA-finetuned SD, trained with a GAN loss, takes gray images as input and recolors them in one step. Colorfulness is controlled by a flexible scaling factor. A skip connection is established to preserve details more precisely.
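
To give a feel for what a colorfulness scaling factor does, here is a minimal NumPy sketch. It is an illustration only and makes several assumptions: the helper name `scale_colorfulness`, the BT.601 luma weights, and the choice to scale chroma directly in RGB space are all hypothetical and are not taken from the Color-Turbo code, which may apply the factor elsewhere (for example inside the network).

```python
import numpy as np

# BT.601 luma weights (an assumption for this sketch). They sum to 1,
# so scaling colors around the luma leaves the luma itself unchanged.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def scale_colorfulness(rgb: np.ndarray, scale: float) -> np.ndarray:
    """Scale the chroma of a float RGB image in [0, 1] by `scale`.

    scale = 0 yields a grayscale image, scale = 1 returns the input,
    and scale > 1 boosts saturation (values are clipped to [0, 1]).
    Hypothetical helper for illustration, not the Color-Turbo method.
    """
    luma = (rgb @ LUMA_WEIGHTS)[..., None]  # shape (H, W, 1)
    chroma = rgb - luma                     # per-pixel offset from gray
    return np.clip(luma + scale * chroma, 0.0, 1.0)


# Small usage example on a random "colorized" image.
rng = np.random.default_rng(0)
image = rng.uniform(0.2, 0.8, size=(4, 4, 3))
muted = scale_colorfulness(image, 0.5)   # less colorful
vivid = scale_colorfulness(image, 1.5)   # more colorful, may clip
```

Because the scaling is applied around the luma, the grayscale structure of the input is kept as long as no value is clipped, and only the saturation changes.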


