Description of the bug
Thank you for this tool!
I noticed that when I ran the Changeo clonotype analysis as per sc-best-practices, some BCRs were being ignored. These all turned out to have more than one VDJ chain. In some cases only one of the chains was productive, and the cells were still ignored. Removing the extra chains allowed the cells to be included in the clonotype analysis. I could understand the clonotype analysis using only the dominant chain, but I was not expecting these cells to be ignored completely.
This is probably an issue with the Changeo algorithm itself, but it may at least be worth a warning from Dandelion's side? Unless, of course, this is intended.
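As a rough illustration of the kind of warning suggested here, the following is a minimal sketch in plain Python. It is not Dandelion's or Changeo's actual behaviour or API: the helper names `count_chains` and `warn_on_dropped_multichain_cells`, the `cell_id` field, and the example records are all hypothetical stand-ins for rows of `vdj.metadata`. It assumes that `productive_VDJ` holds `|`-separated flags (for example `T|F`) and that an unassigned clone is an empty string, as in the reproducible example in this report.

```python
import warnings


def count_chains(productive_vdj: str) -> int:
    """Count VDJ contigs in a '|'-separated status string such as 'T|F'."""
    if productive_vdj in ("", "None"):
        return 0
    return len(productive_vdj.split("|"))


def warn_on_dropped_multichain_cells(records, clone_key="changeo_clone_id"):
    """Warn about cells that have several VDJ contigs but no clone ID.

    `records` is an iterable of dicts with the (hypothetical) keys
    'cell_id', 'productive_VDJ' and `clone_key`.
    Returns the list of affected cell IDs.
    """
    dropped = [
        r["cell_id"]
        for r in records
        if r[clone_key] == "" and count_chains(r["productive_VDJ"]) > 1
    ]
    if dropped:
        warnings.warn(
            f"{len(dropped)} cell(s) with more than one VDJ contig have no "
            f"'{clone_key}' assigned; consider filtering extra contigs first.",
            UserWarning,
            stacklevel=2,
        )
    return dropped


# Hypothetical stand-in for rows of vdj.metadata (not real data).
example = [
    {"cell_id": "cell_1", "productive_VDJ": "T", "changeo_clone_id": "12"},
    {"cell_id": "cell_2", "productive_VDJ": "T|F", "changeo_clone_id": ""},
    {"cell_id": "cell_3", "productive_VDJ": "None", "changeo_clone_id": ""},
]

dropped = warn_on_dropped_multichain_cells(example)
```

With the mock records, only `cell_2` is flagged, since it is the only cell with more than one VDJ contig and a blank clone ID. Cells with no VDJ chain at all are deliberately left out, because a blank clone ID is expected for those.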
Minimal reproducible example
# From tutorial - import data
import dandelion as ddl
import scanpy as sc
import warnings
import os
if not os.path.exists("demo-vdj.h5ddl"):
    os.system("wget ftp://ftp.sanger.ac.uk/pub/users/kp9/demo-vdj.h5ddl")
vdj = ddl.read_h5ddl("demo-vdj.h5ddl")
# Default Changeo clonotype analysis
ddl.pp.calculate_threshold(vdj, model="hh_s5f")
ddl.tl.define_clones(vdj, key_added="changeo_clone_id", model="hh_s5f")
ddl.tl.generate_network(
    vdj, key="sequence_alignment", layout_method="sfdp", clone_key="changeo_clone_id"
)
# Any blank clones?
any(vdj.metadata.changeo_clone_id == '')  # True
# What are the VDJ chains of these blank clones?
set(vdj.metadata.productive_VDJ[vdj.metadata.changeo_clone_id == ''])
# {'F|F', 'None', 'T', 'T|F', 'T|F|F', 'T|T', 'T|T|F', 'T|T|F|F', 'T|T|T', 'T|T|T|F|F'}
# Repeat after filtering out the extra chains: all cells with productive VDJ/VJ chains are now included
vdj_checked = ddl.pp.check_contigs(vdj)
ddl.pp.calculate_threshold(vdj_checked, model="hh_s5f")
ddl.tl.define_clones(vdj_checked, key_added="changeo_clone_id", model="hh_s5f")
ddl.tl.generate_network(
    vdj_checked, key="sequence_alignment", layout_method="sfdp", clone_key="changeo_clone_id"
)
# Now all the blank clones are missing either a VDJ or a VJ chain.
The error message produced by the code above
OS information
MacOS 15.4.1
Version information
No response
Additional context
dandelion==0.5.4 pandas==2.2.3 numpy==2.1.3 matplotlib==3.10.1 networkx==3.4.2 scipy==1.15.2