CysTeam

System Overview

As the CysTeam, we're submitting a suite of specialized web-based bioinformatics tools for analyzing cysteine modifications in cancer research.

The web application supports the following workflow:

1. Upload custom cysteine datasets or enter them directly as text
2. Select appropriate background datasets from various cancer tissue types
3. Choose annotation types for enrichment analysis
4. Run the CSEA algorithm to identify statistically significant enriched features
5. Visualize and download results, including enriched feature plots, detailed statistics, and comparison plots

CSEA Algorithm

The CSEA (Cysteine Set Enrichment Analysis) algorithm is implemented in csea500b.py and represents the core analytical component of the system.

Algorithm Overview

1. Input Processing
   - User-provided cysteine modification list
   - Selected background dataset(s)
   - Annotation datasets matching the selected type
2. Enrichment Analysis
   - Statistical permutation testing (default: 500 permutations)
   - P-value calculation using Gaussian kernel density estimation
   - Multiple testing correction using Benjamini-Hochberg FDR
3. Result Generation
   - Enrichment scores and statistical significance values
   - Visualization outputs (barplots, heatmaps)
   - Detailed CSV result files

Statistical Approach

The algorithm uses a permutation-based enrichment analysis approach:

1. The input cysteine list is compared against the annotation sets
2. Random permutations of background cysteines are generated
3. The number of intersections with each annotation set is calculated
4. Gaussian kernel density estimation is applied to the permutation results to calculate p-values
5. Benjamini-Hochberg multiple testing correction is applied to control the false discovery rate
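
The steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the code in csea500b.py: the function names, the choice of Scott's rule for the kernel bandwidth, the one-sided upper-tail p-value, and the sampling of background sets the same size as the input list are all assumptions made for the example.

```python
import math
import random
from statistics import pstdev


def kde_upper_tail(null_counts, observed):
    """Estimate P(X >= observed) under a Gaussian KDE fitted to null_counts."""
    n = len(null_counts)
    sigma = pstdev(null_counts)
    if sigma == 0:
        # Degenerate null distribution: fall back to the empirical tail.
        return sum(c >= observed for c in null_counts) / n
    h = sigma * n ** (-1 / 5)  # Scott's rule bandwidth (assumption)
    # The KDE tail is the average of each kernel's Gaussian survival function.
    tail = sum(
        0.5 * math.erfc((observed - c) / (h * math.sqrt(2)))
        for c in null_counts
    )
    return tail / n


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def csea_sketch(foreground, background, annotation_sets, n_perm=500, seed=0):
    """Permutation-based enrichment of foreground against each annotation set."""
    rng = random.Random(seed)
    fg = set(foreground)
    bg = sorted(set(background))
    sets = {name: set(members) for name, members in annotation_sets.items()}
    names = list(sets)
    observed = {name: len(fg & sets[name]) for name in names}

    # Null distribution: overlaps of random background draws of the same size.
    k = min(len(fg), len(bg))
    null = {name: [] for name in names}
    for _ in range(n_perm):
        sample = set(rng.sample(bg, k))
        for name in names:
            null[name].append(len(sample & sets[name]))

    pvals = [kde_upper_tail(null[name], observed[name]) for name in names]
    qvals = benjamini_hochberg(pvals)
    return {
        name: {"overlap": observed[name], "p": p, "q": q}
        for name, p, q in zip(names, pvals, qvals)
    }
```

Calling `csea_sketch(foreground, background, {"some_annotation": members})` returns, per annotation set, the observed overlap, the KDE-based p-value, and the FDR-adjusted value. Smoothing the permutation counts with a kernel avoids p-values of exactly zero when the observed overlap exceeds every permuted overlap.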

Data Models

Firestore Collections

analysisJobs

interface AnalysisJob {
  jobId: string;
  status: 'QUEUED' | 'INITIALIZING' | 'RUNNING' | 'COMPLETED' | 'ERROR';
  step: string;
  createdAt: Timestamp;
  lastUpdated: Timestamp;
  foregroundFilePath: string;
  backgroundSelections: string[];
  annotationSelection: string;
  progress: number;
  logs: string[];
  outputFiles: OutputFile[];
  stats: AnalysisStats;
  error?: string;
}

interface OutputFile {
  filename: string;
  url: string;
}

interface AnalysisStats {
  size_permutation: number;
  n_Anno_inSet: number;
  n_cys_input: number;
  n_bg_input: number;
  n_cys_notinAnno: number;
  n_bg_notinAnno: number;
}
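
For backend code that writes these documents, a Python mirror of the `AnalysisJob` shape can help keep field names and status values consistent. The sketch below is an assumption about how that might look, not the project's actual backend code: the class name `AnalysisJobDoc`, the `update` and `to_document` helpers, and the default values are hypothetical, and the scale of `progress` is left unvalidated because the schema does not specify it.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Status values copied from the AnalysisJob interface.
VALID_STATUSES = {"QUEUED", "INITIALIZING", "RUNNING", "COMPLETED", "ERROR"}


def _now():
    return datetime.now(timezone.utc)


@dataclass
class AnalysisJobDoc:
    """Hypothetical Python counterpart of the AnalysisJob document."""

    jobId: str
    foregroundFilePath: str
    backgroundSelections: list
    annotationSelection: str
    status: str = "QUEUED"
    step: str = ""
    progress: float = 0
    logs: list = field(default_factory=list)
    outputFiles: list = field(default_factory=list)  # {"filename", "url"} dicts
    stats: dict = field(default_factory=dict)        # AnalysisStats fields
    error: Optional[str] = None
    createdAt: datetime = field(default_factory=_now)
    lastUpdated: datetime = field(default_factory=_now)

    def update(self, status, step, progress, log_line=None):
        """Move the job to a new state and refresh lastUpdated."""
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {status!r}")
        self.status = status
        self.step = step
        self.progress = progress
        if log_line is not None:
            self.logs.append(log_line)
        self.lastUpdated = _now()

    def to_document(self):
        """Return a plain dict, omitting the optional error field when unset."""
        doc = asdict(self)
        if doc["error"] is None:
            del doc["error"]
        return doc
```

The dict returned by `to_document()` has the same keys as the interface, so it could be passed to whatever Firestore client call the backend uses to write the `analysisJobs` collection.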

Cloud Storage Organization

zaro-lab.firebasestorage.app/
├── aggregated_tissue_cysteines/    # Background reference data
│   ├── Updated_Breast_Cancer_Cysteine_Master_List.csv
│   ├── Updated_Colon_Cancer_Cysteine_Master_List.csv
│   └── ...
├── reference/                      # Annotation reference data
│   ├── df_annotation_sub_molecular_features.csv
│   ├── df_annotation_sub_experimental_data.csv
│   ├── df_annotation_sub_structural.csv
│   ├── bgcys_anno_molecular_features.csv
│   └── ...
├── uploads/                        # User uploaded data
│   ├── {jobId}/
│   │   └── foreground.csv
│   └── ...
└── results/                        # Analysis results
    ├── {jobId}/
    │   ├── csea_barplot.png
    │   ├── result_{filename}_seed{seed}.csv
    │   └── ...
    └── ...
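
The per-job layout above lends itself to small path helpers, so that upload and result locations are built in one place. The following is a sketch under assumptions: the helper names are hypothetical, and whether `{filename}` in the result CSV name includes a file extension is not stated in the layout, so the value is passed through unchanged.

```python
def _check_segment(value):
    """Reject empty values and path separators so a job cannot escape its folder."""
    if not value or "/" in value or value in {".", ".."}:
        raise ValueError(f"invalid path segment: {value!r}")
    return value


def upload_path(job_id):
    """Storage path of the user's foreground file for a job."""
    return f"uploads/{_check_segment(job_id)}/foreground.csv"


def result_path(job_id, filename):
    """Storage path of one result file for a job."""
    return f"results/{_check_segment(job_id)}/{_check_segment(filename)}"


def result_csv_name(filename, seed):
    """Result CSV name following the result_{filename}_seed{seed}.csv pattern."""
    return f"result_{filename}_seed{seed}.csv"
```

For example, `result_path(job_id, "csea_barplot.png")` gives the location of the barplot shown in the tree.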

API Endpoints

Cloud Functions API

`run_analysis`

Method: POST
Content-Type: application/json

Request Body:

{
  "jobId": "string",
  "foregroundFilePath": "string",
  "backgroundSelections": ["string"],
  "annotationSelection": "string"
}

Response:

{
  "message": "Analysis completed",
  "outputFiles": [
    {
      "filename": "string",
      "url": "string"
    }
  ],
  "stats": {
    "size_permutation": "number",
    "n_Anno_inSet": "number",
    "n_cys_input": "number",
    "n_bg_input": "number",
    "n_cys_notinAnno": "number",
    "n_bg_notinAnno": "number"
  }
}
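
A client call to `run_analysis` might be assembled as follows. This is a sketch only: the URL is a placeholder (the deployed function URL is not given here), the helper name is hypothetical, and the example values for `backgroundSelections` and `annotationSelection` are illustrative strings rather than values the function is known to accept.

```python
import json
import urllib.request

# Placeholder: replace with the deployed Cloud Function URL.
RUN_ANALYSIS_URL = "https://example.invalid/run_analysis"


def build_run_analysis_request(job_id, foreground_path, backgrounds,
                               annotation, url=RUN_ANALYSIS_URL):
    """Build (but do not send) the POST request described above."""
    payload = {
        "jobId": job_id,
        "foregroundFilePath": foreground_path,
        "backgroundSelections": list(backgrounds),
        "annotationSelection": annotation,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Illustrative values only; the accepted selection names are assumptions.
request = build_run_analysis_request(
    job_id="job-123",
    foreground_path="uploads/job-123/foreground.csv",
    backgrounds=["example_background"],
    annotation="example_annotation",
)
```

Sending it with `urllib.request.urlopen(request)` and decoding the body with `json.load` would yield the `message`, `outputFiles`, and `stats` fields shown in the response.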

`preview_csv`

Method: GET
Query Parameters:
- jobId: string
- filename: string

Response:

{
  "headers": ["string"],
  "rows": [["string"]],
  "rowCount": "number",
  "totalRows": "number"
}
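
The `preview_csv` query string and response can be handled with a couple of small helpers. As before, this is a sketch with assumptions: the base URL is a placeholder, the helper names are hypothetical, and the reading of `rowCount` as the number of rows returned and `totalRows` as the number of rows in the file is an interpretation of the field names, not something the response format states.

```python
from urllib.parse import urlencode

# Placeholder: replace with the deployed Cloud Function URL.
PREVIEW_CSV_URL = "https://example.invalid/preview_csv"


def preview_csv_url(job_id, filename, base_url=PREVIEW_CSV_URL):
    """Build the GET URL with the jobId and filename query parameters."""
    query = urlencode({"jobId": job_id, "filename": filename})
    return f"{base_url}?{query}"


def rows_as_dicts(payload):
    """Pair each preview row with the headers to get one dict per row."""
    headers = payload["headers"]
    return [dict(zip(headers, row)) for row in payload["rows"]]


def is_truncated(payload):
    """True if the preview holds fewer rows than the file (assumed meaning)."""
    return payload["rowCount"] < payload["totalRows"]
```

Converting rows to dicts keyed by header makes it straightforward to render the preview as a table in a client or to inspect individual columns.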

R Shiny Application

The project includes a complementary R Shiny application (app_cysteins_v4.R) that provides specialized visualization and exploration of NCI60 protein data.

Features

1. Protein Search
   - Search by protein IDs
   - Filter by cell lines or tissue types
   - Configurable minimum mass spec intensity threshold
2. Visualization
   - Heatmap visualization of protein expression
   - Gene analysis text output
   - Minimum cell line coverage calculation
Data Sources

- cystein_protein_master_cys_supplemented.RData: Primary protein expression dataset
- cellline2tissue.csv: Cell line to tissue type mapping
- human_kinases_list.csv: Reference list of human kinases

Implementation

The R Shiny app uses:

- shiny for the reactive web interface
- ggplot2 for data visualization
- pheatmap for heatmap generation
- Custom data processing for protein expression analysis

Deployment

Frontend Deployment

The React frontend is built and deployed to Firebase Hosting:

# Build production assets
cd frontend
npm install
npm run build

# Deploy to Firebase
firebase deploy --only hosting

Backend Deployment

The Python backend functions are deployed to Firebase Cloud Functions:

# Deploy cloud functions
cd backend
firebase deploy --only functions

Local Development

Frontend:
```
cd frontend
npm install
npm start
```

Backend:

cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r functions/requirements.txt

# Start Firebase emulators
firebase emulators:start

R Shiny:

cd rshiny
Rscript -e "shiny::runApp('.', port=3838)"

Acknowledgements

CSEA was developed by the Bar-Peled Lab as part of DrugMap and the original code is publicly available on GitHub. CSEA was adapted as an online tool by the Zaro Lab.

Contributors

José Montaño (University of California - San Francisco)
Vee Xu (University of California - San Francisco Gladstone Institutes)
Vishnu Rajan Tejus (University of California - Berkeley)
