Deterministic compliance check for clinical trial documents.
Gate 3 Architecture
Gate 3 — FULL FAILURE + NEXT-ACTION DESIGN
Upload Documents
- User uploads Protocol, DUA, and IRB Consent.
Ingestion
- Files are read and converted to raw text (DOCX/PDF → text).
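As a rough sketch of the DOCX half of this step, the snippet below pulls paragraph text out of the `word/document.xml` part of a DOCX archive using only the standard library. The name `docx_to_text` is a hypothetical helper, not the project's actual API, and PDF conversion is left out because it needs a third-party parser.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by paragraph (w:p) and text (w:t) elements.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(data: bytes) -> str:
    """Return the body text of a DOCX file, one paragraph per line.

    Hypothetical helper: only word/document.xml is read, so headers,
    footers, and footnotes (stored in other parts) are ignored.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{W_NS}p"):
        # A paragraph's text is split across runs; join the w:t pieces.
        text = "".join(node.text or "" for node in para.iter(f"{W_NS}t"))
        if text.strip():
            paragraphs.append(text)
    return "\n".join(paragraphs)
```

Working on bytes rather than a file path means the same bytes can be reused later for hashing in the audit step.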
Extraction
- Structured fields are extracted from text:
- PHI level
- Genetic data authorization
- Retention period
- No compliance decisions are made here.
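One way this step could look is a set of regular expressions that pull the three fields out of raw text. The patterns, the field names, and the `extract_fields` function are illustrative assumptions; real documents would need patterns tuned to their wording.

```python
import re
from typing import Optional

# Hypothetical patterns, assuming "Label: value" phrasing in the documents.
PATTERNS = {
    "phi_level": re.compile(r"PHI\s+level\s*:\s*([A-Za-z][A-Za-z\- ]*)", re.IGNORECASE),
    "genetic_data": re.compile(r"genetic\s+data\s*:\s*(\w+)", re.IGNORECASE),
    "retention": re.compile(
        r"retention(?:\s+period)?\s*:\s*(\d+\s*(?:years?|months?))", re.IGNORECASE
    ),
}


def extract_fields(text: str) -> dict[str, Optional[str]]:
    """Return the raw field strings found in `text`; missing fields map to None.

    Values are returned as written in the document. Nothing is compared
    or judged here, which keeps extraction separate from the rules.
    """
    fields: dict[str, Optional[str]] = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1).strip() if match else None
    return fields
```

Returning `None` for a missing field, instead of guessing a default, leaves it to later stages to decide how an absent value should be treated.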
Schema Normalization
- Extracted values are mapped into a canonical ComplianceRecord.
- Removes wording differences between documents.
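A minimal sketch of what such a record and mapping might look like follows. Only the name `ComplianceRecord` comes from this document; its fields, the synonym table, and the `normalize` function are assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical synonym table; the project's real vocabulary may differ.
PHI_SYNONYMS = {
    "de-identified": "DEIDENTIFIED",
    "deidentified": "DEIDENTIFIED",
    "limited data set": "LIMITED",
    "limited": "LIMITED",
    "identifiable": "IDENTIFIED",
    "identified": "IDENTIFIED",
}


@dataclass(frozen=True)
class ComplianceRecord:
    phi_level: str         # canonical PHI label
    genetic_data: bool     # True when the document answers yes for genetic data
    retention_months: int  # retention period converted to months


def normalize(raw: dict[str, str]) -> ComplianceRecord:
    """Map raw extracted strings onto canonical values.

    Raises KeyError for an unknown PHI label and ValueError for an
    unparseable retention period, rather than guessing.
    """
    phi = PHI_SYNONYMS[raw["phi_level"].strip().lower()]
    genetic = raw["genetic_data"].strip().lower() in {"yes", "true", "authorized"}
    match = re.fullmatch(r"(\d+)\s*(year|month)s?", raw["retention"].strip().lower())
    if match is None:
        raise ValueError(f"unrecognized retention period: {raw['retention']!r}")
    amount, unit = int(match.group(1)), match.group(2)
    months = amount * 12 if unit == "year" else amount
    return ComplianceRecord(phi, genetic, months)
```

With this mapping, "7 years" in one document and "84 months" in another normalize to the same value, so the rules can compare them directly.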
Gate 3 Rules
- Deterministic checks are applied:
- PHI alignment
- Genetic authorization
- Retention alignment
Decision
- PASS if all rules succeed.
- FAIL if any rule fails.
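The decision logic is small enough to sketch in a few lines. The function name `decide` and the dictionary shape for rule results are assumptions. Rejecting an empty result set is an added safeguard, not something this document specifies.

```python
def decide(rule_results: dict[str, bool]) -> str:
    """Return "PASS" only when every rule passed, otherwise "FAIL".

    An empty result set raises ValueError: all() of nothing is True, and a
    run that evaluated no rules should not count as a PASS.
    """
    if not rule_results:
        raise ValueError("no rule results to decide on")
    return "PASS" if all(rule_results.values()) else "FAIL"
```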
Explainability
- On FAIL, shows the exact rule and clause causing rejection.
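To make this concrete, each rule could return a small result object carrying the clause it was evaluated against, so that failures can be reported verbatim. `RuleResult` and `explain_failures` are hypothetical names used only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleResult:
    rule: str      # rule identifier, e.g. "retention_alignment"
    passed: bool
    document: str  # document that contains the offending clause
    clause: str    # clause text the rule was evaluated against


def explain_failures(results: list[RuleResult]) -> list[str]:
    """Return one message per failed rule, in evaluation order."""
    return [
        f'{r.rule} failed: {r.document} clause "{r.clause}"'
        for r in results
        if not r.passed
    ]
```

Keeping the clause text on the result itself means the explanation needs no second pass over the documents.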
Audit
- Records document hashes, rule results, decision, and timestamp.
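An audit entry along these lines could be built with the standard library alone. The choice of SHA-256, the JSON layout, and the function name `build_audit_record` are assumptions; this document only says that hashes, rule results, the decision, and a timestamp are recorded.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(
    documents: dict[str, bytes],
    rule_results: dict[str, bool],
    decision: str,
) -> str:
    """Return a JSON audit entry for one Gate 3 run.

    Documents are hashed with SHA-256 (an assumed choice) so the entry
    identifies the exact bytes checked without storing their content.
    """
    record = {
        "document_hashes": {
            name: hashlib.sha256(data).hexdigest()
            for name, data in documents.items()
        },
        "rule_results": rule_results,
        "decision": decision,
        # Timezone-aware UTC timestamp in ISO 8601 format.
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record, sort_keys=True)
```

Sorting keys keeps entries byte-for-byte comparable across runs, apart from the timestamp.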
Inputs
- Protocol (DOCX/PDF)
- Data Use Agreement (DUA)
- IRB Consent Form
Outputs
- PASS
- FAIL
Rule Definitions
- PHI level must match between Protocol and DUA
- Genetic data in Consent must be authorized by DUA
- Consent retention period must be ≥ DUA retention period
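The three rules above can be sketched as pure functions over normalized values. The `Doc` class and its field names are hypothetical stand-ins for whatever the normalized record actually holds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """Hypothetical normalized view of one document."""
    phi_level: str
    genetic_data: bool     # Consent: genetic data is collected; DUA: it is authorized
    retention_months: int


def phi_alignment(protocol: Doc, dua: Doc) -> bool:
    # PHI level must match between Protocol and DUA.
    return protocol.phi_level == dua.phi_level


def genetic_authorization(consent: Doc, dua: Doc) -> bool:
    # Genetic data in Consent must be authorized by the DUA;
    # a Consent without genetic data passes trivially.
    return (not consent.genetic_data) or dua.genetic_data


def retention_alignment(consent: Doc, dua: Doc) -> bool:
    # Consent retention period must be >= DUA retention period.
    return consent.retention_months >= dua.retention_months
```

Because each rule depends only on its arguments, the same inputs always give the same result, which is what makes the gate deterministic.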
Documents → Extract → Normalize → Rules → Decision
pip install -r requirements.txt
streamlit run app/main.py