- System Overview
- CSEA Algorithm
- Data Models
- API Endpoints
- R Shiny Application
- Deployment
- Acknowledgements
- Contributors
As the CysTeam, we're submitting a suite of specialized web-based bioinformatics tools for analyzing cysteine modifications in cancer research.
- Upload custom cysteine datasets or input them directly via text
- Select appropriate background datasets from various cancer tissue types
- Choose annotation types for enrichment analysis
- Run the CSEA algorithm to identify statistically significant enriched features
- Visualize and download results including enriched feature plots, detailed statistics, and comparison plots
The CSEA (Cysteine Set Enrichment Analysis) algorithm is implemented in csea500b.py and represents the core analytical component of the system.

- Input Processing
  - User-provided cysteine modifications list
  - Selected background dataset(s)
  - Annotation datasets matching the selected type
- Enrichment Analysis
  - Statistical permutation testing (default: 500 permutations)
  - P-value calculation using Gaussian kernel density estimation
  - Multiple testing correction using Benjamini-Hochberg FDR
- Result Generation
  - Enrichment scores and statistical significance values
  - Visualization outputs (barplots, heatmaps)
  - Detailed CSV result files
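The Benjamini-Hochberg correction named under Enrichment Analysis can be illustrated with a minimal, standard-library sketch. This is the textbook step-up procedure, not code taken from csea500b.py; the function name `benjamini_hochberg` and the example p-values are assumptions made for illustration.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Illustrative p-values for four annotation sets (made-up numbers).
p_values = [0.001, 0.04, 0.03, 0.20]
q_values = benjamini_hochberg(p_values)
```

An annotation set would then be reported as enriched when its adjusted value falls below the chosen FDR threshold; the threshold CSEA uses is not stated here.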
The algorithm uses a permutation-based enrichment analysis approach:
- The input cysteine list is compared against annotation sets
- Random permutations of background cysteines are generated
- The number of intersections with each annotation set is calculated
- Kernel density estimation over the permutation intersection counts gives a null distribution, from which a p-value is calculated
- Multiple testing correction is applied to control for false discoveries
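The steps above can be sketched for a single annotation set as follows. This is a simplified illustration under stated assumptions, not the implementation in csea500b.py: it assumes each permutation draws a random subset of the background with the same size as the input list, that the p-value is the upper tail of a Gaussian KDE fitted to the permutation counts, and that the bandwidth follows Scott's rule. The names `kde_upper_tail` and `enrichment_p_value` and the toy cysteine IDs are hypothetical.

```python
import math
import random


def kde_upper_tail(samples, observed):
    """Estimate P(X >= observed) under a Gaussian KDE fitted to `samples`."""
    n = len(samples)
    mean = sum(samples) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    if sd == 0.0:
        # Degenerate null: every permutation produced the same count.
        return 1.0 if observed <= mean else 0.0
    bandwidth = sd * n ** (-1 / 5)  # Scott's rule for 1-D data (assumption)
    # The tail of a Gaussian mixture is the mean of the per-kernel tails.
    scale = bandwidth * math.sqrt(2)
    return sum(0.5 * math.erfc((observed - s) / scale) for s in samples) / n


def enrichment_p_value(foreground, background, annotation_set,
                       n_permutations=500, seed=0):
    """Return (observed overlap, KDE-based p-value) for one annotation set."""
    rng = random.Random(seed)
    observed = len(set(foreground) & annotation_set)
    pool = sorted(background)  # sorted so a fixed seed gives fixed draws
    null_counts = []
    for _ in range(n_permutations):
        draw = rng.sample(pool, len(foreground))
        null_counts.append(len(annotation_set.intersection(draw)))
    return observed, kde_upper_tail(null_counts, observed)


# Toy data: 200 background cysteines, 40 of them annotated, and an input
# list of 20 cysteines of which 15 fall in the annotation set.
background = [f"CYS{i}" for i in range(200)]
annotation_set = set(background[:40])
foreground = background[:15] + background[100:105]
observed, p_value = enrichment_p_value(foreground, background, annotation_set)
```

Repeating this for every annotation set yields one p-value per set, which is the input to the multiple testing correction in the final step.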
analysisJobs
interface AnalysisJob {
  jobId: string;
  status: 'QUEUED' | 'INITIALIZING' | 'RUNNING' | 'COMPLETED' | 'ERROR';
  step: string;
  createdAt: Timestamp;
  lastUpdated: Timestamp;
  foregroundFilePath: string;
  backgroundSelections: string[];
  annotationSelection: string;
  progress: number;
  logs: string[];
  outputFiles: OutputFile[];
  stats: AnalysisStats;
  error?: string;
}

interface OutputFile {
  filename: string;
  url: string;
}

interface AnalysisStats {
  size_permutation: number;
  n_Anno_inSet: number;
  n_cys_input: number;
  n_bg_input: number;
  n_cys_notinAnno: number;
  n_bg_notinAnno: number;
}

zaro-lab.firebasestorage.app/
├── aggregated_tissue_cysteines/    # Background reference data
│   ├── Updated_Breast_Cancer_Cysteine_Master_List.csv
│   ├── Updated_Colon_Cancer_Cysteine_Master_List.csv
│   └── ...
├── reference/                      # Annotation reference data
│   ├── df_annotation_sub_molecular_features.csv
│   ├── df_annotation_sub_experimental_data.csv
│   ├── df_annotation_sub_structural.csv
│   ├── bgcys_anno_molecular_features.csv
│   └── ...
├── uploads/                        # User uploaded data
│   ├── {jobId}/
│   │   └── foreground.csv
│   └── ...
└── results/                        # Analysis results
    ├── {jobId}/
    │   ├── csea_barplot.png
    │   ├── result_{filename}_seed{seed}.csv
    │   └── ...
    └── ...
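The per-job layout above can be expressed as a few path helpers. This is a minimal sketch that only mirrors the folder and file name patterns shown in the tree; the helper names are hypothetical, and whether `{filename}` includes a file extension is not stated, so the example passes a bare name as an assumption.

```python
def upload_path(job_id):
    """Storage path of the user-uploaded foreground file for a job."""
    return f"uploads/{job_id}/foreground.csv"


def result_csv_path(job_id, filename, seed):
    """Storage path of a result CSV, following result_{filename}_seed{seed}.csv."""
    return f"results/{job_id}/result_{filename}_seed{seed}.csv"


def barplot_path(job_id):
    """Storage path of the enrichment barplot for a job."""
    return f"results/{job_id}/csea_barplot.png"


# Hypothetical job ID, file name, and seed, used only for illustration.
example_upload = upload_path("job123")
example_result = result_csv_path("job123", "foreground", 42)
example_plot = barplot_path("job123")
```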
- Method: POST
- Content-Type: application/json
- Request Body:
{
  "jobId": "string",
  "foregroundFilePath": "string",
  "backgroundSelections": ["string"],
  "annotationSelection": "string"
}
- Response:
{
  "message": "Analysis completed",
  "outputFiles": [
    { "filename": "string", "url": "string" }
  ],
  "stats": {
    "size_permutation": "number",
    "n_Anno_inSet": "number",
    "n_cys_input": "number",
    "n_bg_input": "number",
    "n_cys_notinAnno": "number",
    "n_bg_notinAnno": "number"
  }
}
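A client might assemble the request body and check the response shape as sketched below. The field names come from the schema above; the helper names, the example selection values, and the sample response are hypothetical, and no HTTP call is made because the endpoint URL is not given in this section.

```python
import json

# Keys of the "stats" object, as listed in the response schema.
REQUIRED_STATS = (
    "size_permutation", "n_Anno_inSet", "n_cys_input",
    "n_bg_input", "n_cys_notinAnno", "n_bg_notinAnno",
)


def build_analysis_request(job_id, foreground_file_path,
                           background_selections, annotation_selection):
    """Serialize the POST body for an analysis job."""
    return json.dumps({
        "jobId": job_id,
        "foregroundFilePath": foreground_file_path,
        "backgroundSelections": list(background_selections),
        "annotationSelection": annotation_selection,
    })


def check_analysis_response(body):
    """Parse a response body and verify that the expected stats keys exist."""
    data = json.loads(body)
    missing = [key for key in REQUIRED_STATS if key not in data.get("stats", {})]
    if missing:
        raise ValueError(f"response is missing stats fields: {missing}")
    return data


# Hypothetical values; the accepted selection strings are not documented here.
request_body = build_analysis_request(
    "job123", "uploads/job123/foreground.csv", ["breast"], "molecular_features"
)
sample_response = json.dumps({
    "message": "Analysis completed",
    "outputFiles": [{"filename": "csea_barplot.png", "url": "https://example.invalid/plot"}],
    "stats": {key: 0 for key in REQUIRED_STATS},
})
parsed = check_analysis_response(sample_response)
```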
- Method: GET
- Query Parameters:
  - jobId: string
  - filename: string
- Response:
{ "headers": ["string"], "rows": [["string"]], "rowCount": "number", "totalRows": "number" }
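The tabular response above pairs a list of headers with rows of cell values, so a client can turn it into one record per row. The sketch below assumes `rowCount` is the number of rows returned and `totalRows` the number of rows in the full file, which the schema suggests but does not state; the helper names and the sample column names are hypothetical.

```python
def rows_to_records(preview):
    """Convert a {"headers": [...], "rows": [[...]]} response into dicts."""
    headers = preview["headers"]
    return [dict(zip(headers, row)) for row in preview["rows"]]


def is_truncated(preview):
    """True when the response holds fewer rows than the full file (assumed meaning)."""
    return preview["rowCount"] < preview["totalRows"]


# Made-up response with hypothetical column names.
sample_preview = {
    "headers": ["feature", "p_value"],
    "rows": [["kinase", "0.004"], ["active_site", "0.2"]],
    "rowCount": 2,
    "totalRows": 10,
}
records = rows_to_records(sample_preview)
```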
The project includes a complementary R Shiny application (app_cysteins_v4.R) that provides specialized visualization and exploration of NCI60 protein data.

-
Protein Search
- Search by protein IDs
- Filter by cell lines or tissue types
- Minimum mass spec intensity threshold configuration
-
Visualization
- Heatmap visualization of protein expression
- Gene analysis text output
- Minimum cell line coverage calculation
- cystein_protein_master_cys_supplemented.RData: Primary protein expression dataset
- cellline2tissue.csv: Cell line to tissue type mapping
- human_kinases_list.csv: Reference list of human kinases
The R Shiny app uses:
- shiny for the reactive web interface
- ggplot2 for data visualization
- pheatmap for heatmap generation
- Custom data processing for protein expression analysis
The React frontend is built and deployed to Firebase Hosting:
# Build production assets
cd frontend
npm install
npm run build
# Deploy to Firebase
firebase deploy --only hosting

The Python backend functions are deployed to Firebase Cloud Functions:
# Deploy cloud functions
cd backend
firebase deploy --only functions

- Frontend:
  cd frontend
  npm install
  npm start
- Backend:
  cd backend
  # Create virtual environment
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  # Install dependencies
  pip install -r functions/requirements.txt
  # Start Firebase emulators
  firebase emulators:start
- R Shiny:
  cd rshiny
  Rscript -e "shiny::runApp('.', port=3838)"
CSEA was developed by the Bar-Peled Lab as part of DrugMap and the original code is publicly available on GitHub. CSEA was adapted as an online tool by the Zaro Lab.
- José Montaño (University of California - San Francisco)
- Vee Xu (University of California - San Francisco Gladstone Institutes)
- Vishnu Rajan Tejus (University of California - Berkeley)
