Changeo clone clustering ignores BCR cells with >1 VDJ chain #517

@timslittle

Description of the bug

Thank you for this tool!

I noticed that when I ran the Changeo clonotype analysis as described in sc-best-practices, some BCR cells were being ignored (they were left without a clone ID). These all turned out to have more than one VDJ chain. In some cases only one of the chains was productive, and the cell was still ignored. Removing the extra chains allowed these cells to be included in the clonotype analysis. I could understand if only the dominant chain were used for clonotype analysis, but I was not expecting the cells to be ignored completely.

This is probably an issue with the Changeo algorithm itself, but it may at least be worth a warning from Dandelion's side? Unless, of course, this is intended.
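To illustrate the kind of warning I have in mind, here is a minimal sketch. It is not Dandelion's implementation: `multi_vdj_cells` is a hypothetical helper, the barcodes are made up, and it assumes that each `productive_VDJ` entry holds one `|`-separated flag per VDJ contig, with `'None'` meaning no VDJ contig at all.

```python
import warnings
from typing import Dict, List


def multi_vdj_cells(productive_vdj: Dict[str, str]) -> List[str]:
    """Return barcodes whose productive_VDJ entry lists more than one contig.

    Hypothetical helper, not part of Dandelion.
    """
    flagged = []
    for barcode, status in productive_vdj.items():
        # Assumption: entries look like 'T', 'T|F', 'T|T|F' or 'None',
        # i.e. one flag per VDJ contig in the cell.
        n_contigs = 0 if status in ("None", "") else len(status.split("|"))
        if n_contigs > 1:
            flagged.append(barcode)
    if flagged:
        warnings.warn(
            f"{len(flagged)} cell(s) carry more than one VDJ contig and may "
            "be left without a clone ID.",
            UserWarning,
        )
    return flagged


# Toy stand-in for vdj.metadata.productive_VDJ (made-up barcodes).
toy = {"cell1": "T", "cell2": "T|F", "cell3": "None", "cell4": "T|T|F"}
flagged = multi_vdj_cells(toy)
print(flagged)  # ['cell2', 'cell4']
```

On real data the same function could presumably be fed `vdj.metadata.productive_VDJ.to_dict()` before calling `ddl.tl.define_clones`.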

Minimal reproducible example

# From tutorial - import data
import dandelion as ddl
import scanpy as sc
import warnings
import os

if not os.path.exists("demo-vdj.h5ddl"):
    os.system("wget ftp://ftp.sanger.ac.uk/pub/users/kp9/demo-vdj.h5ddl")
vdj = ddl.read_h5ddl("demo-vdj.h5ddl")

# Default Changeo clonotype analysis
ddl.pp.calculate_threshold(vdj, model="hh_s5f")
ddl.tl.define_clones(vdj, key_added="changeo_clone_id", model="hh_s5f")
ddl.tl.generate_network(
    vdj, key="sequence_alignment", layout_method="sfdp", clone_key="changeo_clone_id"
)
# Are any cells left with a blank clone ID?
any(vdj.metadata.changeo_clone_id == "")  # True
# What are the VDJ chain productivity flags of these blank clones?
set(vdj.metadata.productive_VDJ[vdj.metadata.changeo_clone_id == ""])
# {'F|F', 'None', 'T', 'T|F', 'T|F|F', 'T|T', 'T|T|F', 'T|T|F|F', 'T|T|T', 'T|T|T|F|F'}

# Repeating the analysis after filtering out the extra chains: all cells with productive VDJ/VJ chains are now included
vdj_checked = ddl.pp.check_contigs(vdj)
ddl.pp.calculate_threshold(vdj_checked, model="hh_s5f")
ddl.tl.define_clones(vdj_checked, key_added="changeo_clone_id", model="hh_s5f")
ddl.tl.generate_network(
    vdj_checked, key="sequence_alignment", layout_method="sfdp", clone_key="changeo_clone_id"
)
# Now all the blank clones are missing either a VDJ or a VJ chain.
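The last claim can be checked programmatically. The sketch below is only an illustration on toy data: `unexplained_blank_clones` is a hypothetical helper, the clone ID and barcodes are made up, and it assumes the metadata also has a `productive_VJ` column in which, as in `productive_VDJ`, `'None'` marks a missing chain.

```python
from typing import Dict, List

# Assumption: these values mark a missing chain in the metadata columns.
MISSING = ("None", "")


def unexplained_blank_clones(metadata: Dict[str, Dict[str, str]]) -> List[str]:
    """Return barcodes with a blank clone ID despite having VDJ and VJ chains.

    Hypothetical helper, not part of Dandelion.
    """
    unexplained = []
    for barcode, row in metadata.items():
        if row["changeo_clone_id"] != "":
            continue  # the cell was assigned to a clone
        has_vdj = row["productive_VDJ"] not in MISSING
        has_vj = row["productive_VJ"] not in MISSING
        if has_vdj and has_vj:
            unexplained.append(barcode)
    return unexplained


# Toy stand-in for the metadata table (made-up barcodes and clone ID).
toy_metadata = {
    "cell1": {"changeo_clone_id": "12_1", "productive_VDJ": "T", "productive_VJ": "T"},
    "cell2": {"changeo_clone_id": "", "productive_VDJ": "None", "productive_VJ": "T"},
    "cell3": {"changeo_clone_id": "", "productive_VDJ": "T", "productive_VJ": "None"},
    "cell4": {"changeo_clone_id": "", "productive_VDJ": "T|F", "productive_VJ": "T"},
}
print(unexplained_blank_clones(toy_metadata))  # ['cell4']
```

With the real objects, `vdj.metadata.to_dict(orient="index")` should give a mapping of this shape. I would expect a non-empty result before `check_contigs` and an empty one afterwards.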

The error message produced by the code above

OS information

MacOS 15.4.1

Version information

No response

Additional context

dandelion==0.5.4 pandas==2.2.3 numpy==2.1.3 matplotlib==3.10.1 networkx==3.4.2 scipy==1.15.2

Labels: bug (Something isn't working)
