wperlichek/RSV_Long_F_DMS

Pseudovirus deep mutational scanning of RSV Long F protein

Study by Cassandra Simonich and Teagan McMahon in the Bloom lab. See Simonich et al. for the paper describing this study.

See https://dms-vep.org/RSV_Long_F_DMS/ for visualization of the results and links to various interactive plots and data files.

All the computer code and raw and processed data (from sequencing counts to final mutation effects) are in this repository. For a single summary file of the QC-ed mutation effects, see this CSV file.

Organization of this repo

dms-vep-pipeline-3 submodule

Most of the analysis is done by dms-vep-pipeline-3, which was added to this repo as a git submodule via:

git submodule add https://github.com/dms-vep/dms-vep-pipeline-3

This added the file .gitmodules and the submodule dms-vep-pipeline-3, both of which were then committed to the repo. To pin dms-vep-pipeline-3 to a specific commit or tag, or to update it to a new commit, follow the steps here. In brief:

cd dms-vep-pipeline-3
git checkout <commit>

and then cd ../ back to the top-level directory, and add and commit the updated dms-vep-pipeline-3 submodule. You can also make changes to the dms-vep-pipeline-3 that you commit back to that repo.
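The update steps above can be sketched as a small shell function. This is only a sketch under stated assumptions: the name pin_submodule is hypothetical, and the fetch step is an addition so that recently published commits and tags are available locally before the checkout.

```shell
# Hypothetical helper: pin the dms-vep-pipeline-3 submodule to a commit or tag.
# Run it from the top-level directory of this repo.
pin_submodule() {
    ref=${1:-}
    if [ -z "$ref" ]; then
        echo "usage: pin_submodule <commit-or-tag>" >&2
        return 2
    fi
    # Assumption: fetch first so the requested ref exists locally.
    # Then check out the ref inside the submodule, and record the new
    # submodule commit in the top-level repo.
    git -C dms-vep-pipeline-3 fetch --tags origin &&
        git -C dms-vep-pipeline-3 checkout "$ref" &&
        git add dms-vep-pipeline-3 &&
        git commit -m "Update dms-vep-pipeline-3 to $ref"
}
```

For example, pin_submodule <commit> performs the checkout and commits the updated submodule pointer in one step.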

Code and configuration

The snakemake pipeline itself is run by dms-vep-pipeline-3/Snakefile which reads its configuration from config.yaml. The conda environment used by the pipeline is that specified in the environment.yml file in dms-vep-pipeline-3.

Data

Input data utilized by the pipeline are located in ./data/.

Results and documentation

The results of running the pipeline are placed in ./results/. To save space, only some results are tracked in the repo. The .gitignore file lists the ones that are not.
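To see which result files on disk are excluded from tracking, git itself can report them. The function below is a minimal sketch: the name list_ignored_results is hypothetical, and it assumes you are in the top-level directory of a clone where the pipeline has already produced output.

```shell
# Hypothetical helper: list files under ./results/ that exist on disk
# but are excluded from tracking by the .gitignore rules.
list_ignored_results() {
    git ls-files --others --ignored --exclude-standard -- results/
}
```

To find the rule responsible for one file, git check-ignore -v followed by the file's path prints the matching .gitignore line.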

The pipeline builds HTML documentation for the pipeline in ./results/docs. To visualize these docs via GitHub Pages, run:

dms-vep-pipeline-3/publish_docs_gh-pages.sh

This pushes the docs to the gh-pages branch, which can be viewed on GitHub Pages at https://dms-vep.org/RSV_Long_F_DMS/.

Running the pipeline (dry-run)

To do a dry run, which lists the jobs snakemake would execute without actually running them, use the following command:

snakemake -n -s dms-vep-pipeline-3/Snakefile --rerun-incomplete

Running the pipeline

To run the pipeline, build the dms-vep-pipeline-3 conda environment from the environment.yml file in dms-vep-pipeline-3, activate it, and run snakemake, for example:

conda activate dms-vep-pipeline-3
snakemake -j 16 --software-deployment-method conda -s dms-vep-pipeline-3/Snakefile
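Before building the environment or running snakemake, it can help to confirm that the files the commands above rely on are present. A fresh clone made without initializing submodules leaves dms-vep-pipeline-3 empty. The check below is a hedged sketch: the function name check_pipeline_inputs and its messages are hypothetical, and it assumes config.yaml sits in the top-level directory.

```shell
# Hypothetical pre-flight check; run from the top-level directory of the repo.
check_pipeline_inputs() {
    rc=0
    for f in dms-vep-pipeline-3/Snakefile \
             dms-vep-pipeline-3/environment.yml \
             config.yaml; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            rc=1
        fi
    done
    if [ "$rc" -ne 0 ]; then
        # An empty submodule directory is a common cause on a fresh clone.
        echo "hint: try 'git submodule update --init'" >&2
    fi
    return "$rc"
}

# Typical first-time setup, assuming conda is installed:
#   check_pipeline_inputs &&
#       conda env create -f dms-vep-pipeline-3/environment.yml
```

The function returns a non-zero status if any file is missing, so it can gate the environment build or the snakemake call with &&.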

To run on the Hutch cluster via Slurm, submit the script run_Hutch_cluster.bash:

sbatch run_Hutch_cluster.bash

Additional analyses outside of main pipeline

The ./non-pipeline_analyses/ directory contains additional analyses that are not part of the main pipeline. See the README within that subdirectory for more details.

About

Pseudovirus deep mutational scanning of the F protein from RSV (subtype A Long strain)
