CCA-Zoo

Multiview Canonical Correlation Analysis in Python

License: MIT

CCA-Zoo is a Python library implementing a wide range of Canonical Correlation Analysis (CCA) and related multiview learning methods. It follows the scikit-learn estimator API: every model exposes fit, transform, fit_transform, and score.


Installation

```shell
pip install cca-zoo
```

Install optional extras as needed:

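```shell
# Quoting the requirement keeps shells such as zsh from treating [...] as a glob
pip install "cca-zoo[deep]"          # DCCA variants (requires PyTorch + Lightning)
pip install "cca-zoo[probabilistic]" # Probabilistic CCA (requires NumPyro + JAX)
pip install "cca-zoo[all]"           # Everything above
```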
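
To see which extras look usable in the current environment, a quick check along these lines may help. It is only a sketch: the import names (`torch`, `lightning`, `numpyro`, `jax`) are assumptions inferred from the dependencies listed above, and `available_extras` is a hypothetical helper, not part of CCA-Zoo.

```python
from importlib.util import find_spec

# Assumed top-level import names for each extra. These are guesses based on
# the dependency names in this README; adjust them if the project relies on
# differently named packages (for example `pytorch_lightning`).
EXTRA_MODULES = {
    "deep": ("torch", "lightning"),
    "probabilistic": ("numpyro", "jax"),
}


def available_extras(extra_modules=EXTRA_MODULES):
    """Map each extra name to True if every assumed module can be located.

    find_spec only looks the module up; it does not import it.
    """
    return {
        extra: all(find_spec(name) is not None for name in modules)
        for extra, modules in extra_modules.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra}: {'available' if ok else 'missing'}")
```

An extra reported as missing can then be installed with the matching `pip install` command above.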

Quick start

```python
import numpy as np
from cca_zoo.datasets import JointData
from cca_zoo.linear import CCA

# Generate correlated two-view data from a linear latent variable model
data = JointData(
    n_views=2,
    n_samples=200,
    n_features=[50, 50],
    latent_dimensions=2,
    signal_to_noise=2.0,
    random_state=0,
)
train_views = data.sample()
test_views  = data.sample()

# Fit CCA and evaluate
model = CCA(latent_dimensions=2).fit(train_views)
print(model.score(test_views))     # canonical correlations, shape (2,)

# Project views into the shared latent space
z1, z2 = model.transform(test_views)  # each shape (200, 2)
```
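
For readers who want to see what the canonical correlations in the example above measure, the following NumPy-only sketch computes classical two-view CCA by whitening each view and taking an SVD of the whitened cross-covariance. It is an illustration under stated assumptions, not CCA-Zoo's implementation: `cca_numpy` and its small ridge term `eps` are hypothetical and exist only to keep the example self-contained and numerically stable.

```python
import numpy as np


def cca_numpy(X, Y, k, eps=1e-8):
    """Classical two-view CCA via whitening + SVD (illustrative sketch).

    Returns weight matrices Wx (p x k), Wy (q x k) and the top-k
    canonical correlations in descending order.
    """
    # Centre each view so covariances are simple matrix products
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariances; eps * I is a tiny ridge for numerical stability
    Sxx = X.T @ X / (n - 1) + eps * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + eps * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):
        # Inverse matrix square root of a symmetric positive definite matrix
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)

    # Singular values of the whitened cross-covariance are the
    # canonical correlations
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky)
    Wx = Kx @ U[:, :k]
    Wy = Ky @ Vt.T[:, :k]
    return Wx, Wy, s[:k]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.standard_normal((500, 2))  # shared latent variables
    X = z @ rng.standard_normal((2, 6)) + 0.5 * rng.standard_normal((500, 6))
    Y = z @ rng.standard_normal((2, 4)) + 0.5 * rng.standard_normal((500, 4))
    Wx, Wy, rho = cca_numpy(X, Y, k=2)
    print(rho)  # in-sample canonical correlations
```

Projecting the centred views with `Wx` and `Wy` gives pairs of latent variables whose in-sample correlations equal `rho`, up to the effect of `eps`.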

Available methods

cca_zoo.linear

| Class | Description | Views |
|---|---|---|
| CCA | Standard CCA (Hotelling 1936) | 2 |
| rCCA | Regularised CCA / canonical ridge | 2 |
| PLS | Partial Least Squares | 2 |
| MCCA | Multiset CCA (pairwise sum objective) | ≥2 |
| GCCA | Generalised CCA (shared latent projection) | ≥2 |
| TCCA | Tensor CCA (higher-order cross-moment) | ≥2 |
| CCA_EY | Stochastic Eckart-Young CCA (Riemannian GD) | 2 |
| PLS_EY | Stochastic Eckart-Young PLS (Riemannian GD) | 2 |
| MCCA_EY | Multiview Eckart-Young CCA (Riemannian GD) | ≥2 |
| SCCA_PMD | Sparse CCA via PMD (Witten 2009) | ≥2 |
| SCCA_ADMM | Sparse CCA via ADMM (Suo 2017) | ≥2 |
| SCCA_IPLS | Sparse CCA via iterative PLS (Mai & Zhang 2019) | ≥2 |
| SCCA_Span | SpanCCA (Asteris 2016) | ≥2 |
| ElasticCCA | Elastic net regularised CCA (Waaijenborg 2008) | ≥2 |
| ParkhomenkoCCA | Soft-threshold sparse CCA (Parkhomenko 2009) | ≥2 |
| PLS_ALS | ALS variant of PLS (power iteration) | ≥2 |
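
Several of the sparse methods listed above build on soft-thresholding inside an alternating update. The sketch below shows that idea for a single pair of weight vectors, using NumPy only. It is a simplified illustration, not the code behind `SCCA_PMD` or `ParkhomenkoCCA`: the function names are hypothetical, and the threshold is set as a fraction `lam` of the largest absolute entry, an assumption made here to keep the example short (PMD proper chooses the threshold to satisfy an L1-norm constraint).

```python
import numpy as np


def soft_threshold(a, delta):
    """Shrink every entry of `a` towards zero by `delta`, clipping at zero."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)


def sparse_cca_rank1(X, Y, lam=0.5, n_iter=50):
    """Rank-1 sparse weights via soft-thresholded power iteration (sketch).

    lam must lie in [0, 1): each update zeroes entries whose magnitude is
    below lam times the largest magnitude, so the largest entry always
    survives and the normalisation is well defined.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    C = X.T @ Y / (X.shape[0] - 1)  # sample cross-covariance, shape (p, q)

    v = np.ones(Y.shape[1]) / np.sqrt(Y.shape[1])  # dense unit-norm start
    for _ in range(n_iter):
        # Update u with v fixed: power step, threshold, renormalise
        a = C @ v
        u = soft_threshold(a, lam * np.abs(a).max())
        u /= np.linalg.norm(u)
        # Update v with u fixed, symmetrically
        b = C.T @ u
        v = soft_threshold(b, lam * np.abs(b).max())
        v /= np.linalg.norm(v)
    return u, v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.standard_normal(500)       # one shared latent variable
    X = rng.standard_normal((500, 10))
    Y = rng.standard_normal((500, 8))
    X[:, :3] += 3 * z[:, None]         # only the first 3 features carry signal
    Y[:, :3] += 3 * z[:, None]
    u, v = sparse_cca_rank1(X, Y)
    print(np.nonzero(u)[0], np.nonzero(v)[0])  # indices of selected features
```

With `lam=0`, no thresholding takes place and the loop reduces to plain power iteration on the cross-covariance, which is the kind of update the `PLS_ALS` row describes.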

cca_zoo.nonparametric

| Class | Description |
|---|---|
| KCCA | Kernel CCA |
| KGCCA | Kernel Generalised CCA |
| KTCCA | Kernel Tensor CCA |

cca_zoo.deep (requires [deep])

| Class | Reference |
|---|---|
| DCCA | Andrew et al. 2013 (pluggable objective) |
| DCCA_EY | Eigengame / Eckart-Young objective |
| DCCA_NOI | Wang et al. 2015 (non-linear orthogonal iterations) |
| DCCA_SDL | Chang et al. 2018 (stochastic decorrelation loss) |
| DCCAE | Wang et al. 2015 (with autoencoder reconstruction) |
| DVCCA | Wang et al. 2016 (variational) |
| DTCCA | Wong et al. 2021 (deep tensor CCA) |
| SplitAE | Split autoencoder baseline |
| BarlowTwins | Zbontar et al. 2021 |
| VICReg | Bardes et al. 2022 |

cca_zoo.probabilistic (requires [probabilistic])

| Class | Reference |
|---|---|
| ProbabilisticCCA | Bach & Jordan 2005; Wang 2007 (MCMC via NumPyro) |

Documentation

Full documentation, user guides, and the API reference are available at https://jameschapman19.github.io/cca_zoo/


Citing

If CCA-Zoo is useful in your research, please cite:

```bibtex
@article{Chapman2021,
  title   = {{CCA-Zoo}: A collection of Regularized, Deep Learning based, Kernel,
             and Probabilistic {CCA} methods in a scikit-learn style framework},
  author  = {Chapman, James and Wang, Hao-Ting and Wells, Lennie and Wiesner, Johannes},
  journal = {Journal of Open Source Software},
  volume  = {6},
  number  = {68},
  pages   = {3823},
  year    = {2021},
  doi     = {10.21105/joss.03823},
}
```

Contributing

Contributions are welcome. See docs/contributing.md for development setup, coding standards, and pull request guidelines.
