Mouse reference sequences #39

@samkleeman1

Hi,

Thanks so much for making this package available. It is a brilliant resource, especially for neoantigen prediction in mice. We are trying to call neoantigens in a tumor derived from a BALB/c background, and this creates certain issues around reference sequences. I note that you recommend aligning to the BALB/c-specific reference genome from the Sanger Institute. I believe that assembly has different coordinates from the Mouse Genome Project SNP file that you recommend (ftp://ftp-mouse.sanger.ac.uk/current_snps/strain_specific_vcfs/BALB_cJ.mgp.v5.snps.dbSNP142.vcf.gz): that file describes BALB/c-specific variants called from reads aligned to the GRCm38 genome, so the two are incompatible (unless I am mistaken). To complicate matters further, in our experience the BALB/c reference has significant gaps even in coding regions, and indeed the Sanger paper in which these strain-specific assemblies were published alludes to a substantially higher error rate versus GRCm38 (https://www.nature.com/articles/s41588-018-0223-8#Sec2). We have come to the conclusion that we should use GRCm38 to align our BALB/c reads, especially as GRCm38 (cf. GRCm39) includes patches that correspond to strain-specific haplotypes. We use the pan-strain SNPs and indels from the Sanger Mouse Genome Project for base quality score recalibration and then call mutations using Strelka2.
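A quick way to sanity-check the coordinate concern above is to compare the contigs declared in a VCF header against the `.fai` index of the reference the reads were aligned to. The following is only a minimal sketch: the function names are hypothetical, it assumes the VCF header carries `##contig=<ID=...,length=...>` lines (not every VCF does), and the contig lengths in the example are made-up toy values rather than real GRCm38 or BALB/c figures.

```python
import re


def vcf_contigs(header_lines):
    """Collect {contig_id: length} from '##contig=<ID=...,length=...>' header lines."""
    contigs = {}
    for line in header_lines:
        if not line.startswith("##contig="):
            continue
        cid = re.search(r"ID=([^,>]+)", line)
        length = re.search(r"length=(\d+)", line)
        if cid:
            # Length is optional in the VCF spec, so store None when it is absent.
            contigs[cid.group(1)] = int(length.group(1)) if length else None
    return contigs


def fai_contigs(fai_lines):
    """Collect {contig_name: length} from the first two columns of a .fai index."""
    contigs = {}
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            contigs[fields[0]] = int(fields[1])
    return contigs


def compare_contigs(vcf, fai):
    """Return (contigs missing from the reference, contigs whose lengths differ)."""
    missing = sorted(c for c in vcf if c not in fai)
    mismatched = sorted(
        c for c, n in vcf.items()
        if c in fai and n is not None and n != fai[c]
    )
    return missing, mismatched


# Toy example with hypothetical lengths: contig "2" differs between the two files.
toy_vcf = vcf_contigs([
    "##fileformat=VCFv4.2",
    "##contig=<ID=1,length=1000>",
    "##contig=<ID=2,length=2000>",
])
toy_fai = fai_contigs(["1\t1000\t3\t60\t61\n", "2\t1990\t1100\t60\t61\n"])
missing, mismatched = compare_contigs(toy_vcf, toy_fai)
```

Any missing or length-mismatched contig suggests the VCF and the reference use different coordinate systems. Matching names and lengths is a necessary check rather than a sufficient one, so it does not by itself prove the two files are compatible.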

I was wondering if you had any advice about neoantigen calling for BALB/c data using the approach we are planning. My feeling is that the best universal approach is to align everything to GRCm38 and then use the cDNA and peptide sequences derived from this reference (i.e. those available here: http://ftp.ensembl.org/pub/release-89/fasta/mus_musculus/), as this is designed to capture the majority of variation across most strains. I would really appreciate your thoughts on this question.

Kind regards,

Dr Sam Kleeman MD
PhD Student
Cold Spring Harbor Laboratory, NY
