Access to ground-truth pseudobulk gene expression matrix #50

@tjchen020524

Hi Decima team,

Thank you for making Decima and the accompanying documentation publicly available.

I’m working on a research project that extends sequence-based gene expression models by incorporating spatial microenvironment information, and I’m building on Decima while keeping pseudobulk gene expression as one of the model outputs.

I have a question regarding the ground-truth pseudobulk gene expression matrix used for training and evaluation.

I understand that multiple single-cell datasets were aggregated into pseudobulk samples and used as training targets for gene expression prediction. The API also exposes DecimaResult.ground_truth, which suggests that the true expression matrix can be accessed when available.

However, in the released W&B metadata and the supplementary data on Zenodo, I only see predicted expression values, and the AnnData .X matrix appears empty. I was unable to find a publicly available version of the ground-truth pseudobulk expression matrix in the documentation, tutorials, or the 'decima-applications' repository.
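A minimal sketch of the kind of check described above, assuming the released file is loaded into an AnnData-like object that exposes `.X` and `.layers`. The helper names `nonzero_count` and `summarize_expression_slots` and the layer name `preds` are hypothetical and used only for illustration; the toy object stands in for a real file, which could be loaded with `anndata.read_h5ad` instead.

```python
from types import SimpleNamespace

import numpy as np


def nonzero_count(matrix):
    """Count non-zero entries in a dense array or a SciPy sparse matrix."""
    if matrix is None:
        return 0
    if hasattr(matrix, "count_nonzero"):
        # SciPy sparse matrices expose count_nonzero() as a method.
        return int(matrix.count_nonzero())
    return int(np.count_nonzero(np.asarray(matrix)))


def summarize_expression_slots(adata):
    """Report how many non-zero values sit in .X and in each layer."""
    summary = {"X": nonzero_count(adata.X)}
    for name, layer in adata.layers.items():
        summary[f"layers[{name!r}]"] = nonzero_count(layer)
    return summary


# Toy stand-in for a loaded AnnData object (hypothetical layer name "preds"):
# .X is all zeros, while one layer holds values.
toy = SimpleNamespace(
    X=np.zeros((2, 3)),
    layers={"preds": np.array([[0.5, 0.0, 1.2], [0.0, 0.3, 0.0]])},
)

report = summarize_expression_slots(toy)
print(report)
```

A zero count for `X` alongside non-zero counts in a layer would match the situation described here: predictions are present, but no measured expression values are stored in `.X`.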

Could you clarify whether the ground-truth pseudobulk expression matrix is publicly available, and if not, what the recommended way is to obtain it?

Thanks very much for your time.
