Language-based Image Colorization: A Benchmark and Beyond

TODO:

Outline:

Review and Benchmark
- Automatic Image Colorization
- Language-based Image Colorization
Color-Turbo: Single-step Diffusion for Colorfulness Controllable Colorization

📖 Review and Benchmark

Automatic Image Colorization

Automatic methods are the most straightforward approach, but they offer little control over the resulting colors.

CNN-based methods:

Title	Abbr.	Feature	Code	Pub. & Date
Deep Colorization	-	the first CNN-based colorization model	-	ICCV, 2015
Colorful Image Colorization	CIC	classification loss for more saturated colors	code	ECCV, 2016
Instance-aware Image Colorization	InstColor	Involving object bounding boxes as auxiliary priors	code	CVPR, 2020
Disentangled Image Colorization via Global Anchors	DISCO	spatial coarse-to-fine model	code	SIGGRAPH Asia, 2022

Transformer-based methods:

Title	Abbr.	Feature	Code	Pub. & Date
Colorization Transformer	ColTran	spatial coarse-to-fine model	code	ICLR, 2021
ColorFormer: Image Colorization via Color Memory assisted Hybrid-attention Transformer	ColorFormer	enhanced model design	code	ECCV, 2022
CT2: Colorization Transformer via Color Tokens	CT2	learnable color tokens	code	ECCV, 2022
DDColor: Towards Photo-Realistic and Semantic-Aware Image Colorization via Dual Decoders	DDColor	query-based transformer	code	ICCV, 2023
MultiColor: Image Colorization by Learning from Multiple Color Spaces	MultiColor	extends DDColor with multiple color spaces	(No Code/project page)	ACM MM, 2024

GAN-based methods:

Title	Abbr.	Feature	Code	Pub. & Date
DeOldify	DeOldify	asynchronous GAN training	code	GitHub project, 2018
ChromaGAN: Adversarial Picture Colorization with Semantic Class Distribution	ChromaGAN	classification-based GAN loss	code	WACV, 2020
Focusing on Persons: Colorizing Old Images Learning from Modern Historical Movies	HistoryNet	dataset built from historical movies for recoloring old images	code	ACM MM, 2021
Towards Vivid and Diverse Image Colorization with Generative Color Prior	ToVivid (GCP-Colorization)	GAN inversion	code	ICCV, 2021
BigColor: Colorization using a Generative Color Prior for Natural Images	BigColor	pretrained BigGAN w/o inversion	code	ECCV, 2022

Diffusion-based methods:

Most of the following methods can be linked naturally to text prompts, because the diffusion models they build on already support text conditioning.

Title	Abbr.	Feature	Code	Pub. & Date
Palette: Image-to-image diffusion models	Palette	retrained diffusion for linear degradation	code (No Code)	SIGGRAPH, 2022
Multimodal semantic-aware automatic colorization with diffusion prior	ColorDiff	segmentation maps as additional prior	code	arXiv, 24.04
Automatic Controllable Colorization via Imagination	Imagine-Colorization	Ensembled pretrained ControlNet	code (No Code)	CVPR, 2024

Language-based Image Colorization

The core challenge is to construct an accurate cross-modal correspondence between grayscale images and textual color descriptions.
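As a rough illustration of what such a correspondence can look like, the sketch below uses plain scaled dot-product cross-attention, where each image location attends to the tokens of a prompt. It is not taken from any of the methods listed here; the function name `cross_attention`, the feature shapes, and the random inputs are all assumptions made for the example.

```python
import numpy as np

def cross_attention(image_feats, text_feats):
    """Scaled dot-product attention from image locations to text tokens.

    image_feats: (N, d) array, one row per grayscale-image location (hypothetical features).
    text_feats:  (T, d) array, one row per prompt token (hypothetical embeddings).
    Returns the attended features (N, d) and the attention map (N, T).
    """
    d = image_feats.shape[1]
    scores = image_feats @ text_feats.T / np.sqrt(d)
    scores -= scores.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over the text tokens
    return weights @ text_feats, weights

# Toy inputs: 16 image locations, 3 prompt tokens, 8-dimensional features.
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(16, 8))
text_feats = rng.normal(size=(3, 8))
attended, attn = cross_attention(image_feats, text_feats)
```

Each row of `attn` is a distribution over prompt tokens, so it can be read as a soft assignment of one image location to the color words in the text.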

From-scratch cross-modality:

Title	Abbr.	Feature	Code	Pub. & Date
L-CoDe: Language-based Colorization using Color-object Decoupled Conditions	L-CoDe	Dense human annotation & two-stage alignment	code	AAAI, 2022
L-CoDer: Language-based Colorization with Color-object Decoupling Transformer	L-CoDer	Dense human annotation & two-stage alignment & BERT for text embedding	code	ECCV, 2022
L-CoIns: Language-based Colorization with Instance Awareness	L-CoIns	adaptive alignment & luminance augmentation	code	CVPR, 2023

CLIP-based cross-modality:

Title	Abbr.	Feature	Code	Pub. & Date
UniColor: A Unified Framework for Multi-Modal Colorization with Transformer	UniColor	sliding window alignment based on CLIP	code	SIGGRAPH Asia, 2022

Stable-Diffusion based cross-modality:

Four Representative Condition Insertion Paradigms

Title	Abbr.	Feature	Code	Pub. & Date
Improved Diffusion-based Image Colorization via Piggybacked Models	Piggybacked-Color	trainable copy of full U-Net for fine-tuning	project (No Code)	arXiv, 23.04
DiffColor: Toward High Fidelity Text-Guided Image Colorization with Diffusion Models	DiffColor	contrastive CLIP guidance	(No Code/project page)	arXiv, 23.08
Control Color: Multimodal Diffusion-based Interactive Image Colorization	CtrlColor	ControlNet-based multi-modal condition	code	arXiv, 24.02
Diffusing Colors: Image Colorization with Text Guided Diffusion	Diffusing Colors	rescheduled diffusion model with varied saturation	project (No Code)	SIGGRAPH Asia, 2023
L-CAD: Language-based Colorization with Any-level Descriptions using Diffusion Priors	L-CAD	finetune SD & bypassed VAE	code	NeurIPS, 2023
COCO-LC: Colorfulness Controllable Language-based Colorization	COCO-LC	coarse-to-fine ControlNet-based & colorfulness control	code	ACM MM, 2024

🚀 Color-Turbo: Single-step Diffusion for Colorfulness Controllable Colorization

TL;DR: A highly efficient LoRA-finetuned SD model trained with a GAN loss, which takes a grayscale image as input and colorizes it in a single step. Colorfulness is controlled by a flexible scaling factor, and a skip connection is added to preserve fine details more precisely.
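Since the implementation is not released, the following is only a minimal sketch of what a colorfulness scaling factor can mean at the pixel level. It is not the Color-Turbo mechanism. It assumes RGB values in [0, 1], scales chroma around each pixel's Rec. 601 luma, and measures the effect with the Hasler-Süsstrunk colorfulness score. The names `scale_colorfulness` and `colorfulness` are hypothetical.

```python
import numpy as np

# Rec. 601 luma weights; they sum to 1, so a gray pixel maps to itself.
LUMA = np.array([0.299, 0.587, 0.114])

def scale_colorfulness(rgb, factor):
    """Scale chroma around each pixel's luma. rgb: float array (H, W, 3) in [0, 1]."""
    gray = (rgb @ LUMA)[..., None]
    # factor = 0 gives grayscale, factor = 1 the input, factor > 1 more vivid colors.
    return np.clip(gray + factor * (rgb - gray), 0.0, 1.0)

def colorfulness(rgb):
    """Hasler-Suesstrunk colorfulness score, computed here on [0, 1] values."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg = r - g
    yb = 0.5 * (r + g) - b
    std = np.hypot(rg.std(), yb.std())
    mean = np.hypot(rg.mean(), yb.mean())
    return std + 0.3 * mean

# Toy image with values kept away from 0 and 1 so that no clipping occurs.
rng = np.random.default_rng(0)
image = rng.uniform(0.3, 0.7, size=(32, 32, 3))
gray = scale_colorfulness(image, 0.0)
muted = scale_colorfulness(image, 0.5)
vivid = scale_colorfulness(image, 1.5)
```

As long as nothing is clipped, the score grows linearly with the factor while the luma of every pixel stays fixed, which is the behavior one would expect from a colorfulness control that leaves image structure untouched.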

⏱️ The implementation will be released upon publication.
