This project is a learning platform for agentic programming with:
- LLM reasoning using Google Gemini with optional Ollama fallback
- LangGraph multi-agent orchestration
- Chest X-ray classifier (Hugging Face/torch) with generic fallback
- BLIP (Hugging Face) vision-language tool
- DuckDuckGo web search tool
- Streamlit UI
It is designed for experimenting with tool calling, shared memory, reflection loops, and dynamic routing.
This is an educational engineering project and not a medical diagnostic system.
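The Gemini-primary / Ollama-fallback behavior listed above can be sketched as a simple try/fallback wrapper. The function and stub names here are illustrative, not the project's actual `llm.py` API:

```python
# Illustrative sketch: call the primary LLM backend, fall back to a
# secondary one only if the primary raises. Names are hypothetical.
def call_with_fallback(prompt, primary, fallback=None):
    """Return primary(prompt); on any exception, try fallback(prompt)."""
    try:
        return primary(prompt)
    except Exception:
        if fallback is None:
            raise
        return fallback(prompt)

# Usage with stubbed backends (no network calls):
def gemini_stub(prompt):
    raise ConnectionError("quota exceeded")  # simulate a Gemini outage

def ollama_stub(prompt):
    return f"[gemma] {prompt}"

print(call_with_fallback("hello", gemini_stub, ollama_stub))
```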
```text
Streamlit UI
      |
      v
LangGraph Orchestration
      |
      +--> Gemini API (primary)
      +--> Ollama Gemma (fallback, optional)
      +--> CNN Tool (medical chest X-ray classifier, in-process)
      +--> VLM Tool (BLIP, in-process)
      +--> Research Tool (DuckDuckGo)
```
```text
PlannerAgent -> ImageDecisionAgent -> (CNNToolNode | VLMToolNode | ResearchAgent) -> CriticAgent -> FinalResponseAgent
```
The graph includes a reflection loop from CriticAgent back to ImageDecisionAgent for retries.
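The retry routing can be sketched framework-free. This is a hypothetical illustration of the CriticAgent's decision, using defaults matching the `CRITIC_CONFIDENCE_THRESHOLD` and `MAX_RETRY_LOOPS` settings; the real logic lives in `graph/workflow.py` and may differ:

```python
# Illustrative sketch of the reflection loop: the critic routes back to
# ImageDecisionAgent until confidence clears the threshold or the retry
# budget is exhausted. Defaults mirror the documented env vars.
CONFIDENCE_THRESHOLD = 0.65
MAX_RETRY_LOOPS = 2

def critic_route(state):
    """Return the name of the next node based on the critic's verdict."""
    if state["confidence"] >= CONFIDENCE_THRESHOLD:
        return "FinalResponseAgent"
    if state["retries"] >= MAX_RETRY_LOOPS:
        return "FinalResponseAgent"   # budget spent: answer anyway
    state["retries"] += 1
    return "ImageDecisionAgent"       # reflection loop: try another tool
```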
```text
MedicalAgent/
  app.py
  requirements.txt
  .env.example
  src/medical_agent/
    config.py
    llm.py
    state.py
    agents/nodes.py
    graph/workflow.py
    tools/cnn_tool.py
    tools/vlm_tool.py
    tools/search_tool.py
```
- Python 3.10+
- A valid Google Gemini API key
- (Optional) Ollama running locally with Gemma for fallback
From the MedicalAgent folder:
```shell
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
```

Optional environment variables:

```shell
set GEMINI_API_KEY=your_api_key_here
set GEMINI_MODEL=gemini-2.0-flash
set GEMINI_BASE_URL=https://generativelanguage.googleapis.com
set OLLAMA_FALLBACK_ENABLED=true
set OLLAMA_BASE_URL=http://localhost:11434
set OLLAMA_MODEL=gemma3:4b
set CRITIC_CONFIDENCE_THRESHOLD=0.65
set MAX_RETRY_LOOPS=2
set MEDICAL_AGENT_LOG_LEVEL=INFO
set CHEST_XRAY_MODEL=dima806/chest_xray_pneumonia_detection
set BLIP_CAPTION_MODEL=Salesforce/blip-image-captioning-base
set BLIP_VQA_MODEL=Salesforce/blip-vqa-base
```

Run the app:

```shell
streamlit run app.py
```

Then open the local Streamlit URL in your browser.
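A config module typically reads these variables with sensible defaults. This is an illustrative sketch, not the exact contents of `src/medical_agent/config.py`:

```python
import os

# Illustrative sketch: parse a few of the documented env vars with
# defaults matching the values shown above. Real names may differ.
GEMINI_MODEL = os.getenv("GEMINI_MODEL", "gemini-2.0-flash")
OLLAMA_FALLBACK_ENABLED = os.getenv("OLLAMA_FALLBACK_ENABLED", "false").lower() == "true"
CRITIC_CONFIDENCE_THRESHOLD = float(os.getenv("CRITIC_CONFIDENCE_THRESHOLD", "0.65"))
MAX_RETRY_LOOPS = int(os.getenv("MAX_RETRY_LOOPS", "2"))
```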
- The chest X-ray classifier and BLIP pretrained weights are downloaded from Hugging Face automatically on first use.
- Subsequent runs use local cache and are faster.
- PlannerAgent interprets user goal and sets strategy.
- ImageDecisionAgent dynamically selects the next tool.
- Tool nodes run CNN / VLM / Research as callable capabilities.
- CriticAgent reflects on confidence and can trigger retries.
- FinalResponseAgent synthesizes a single user-facing explanation.
All agents read and write a shared memory state in LangGraph.
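A shared state like this is commonly modeled as a `TypedDict`. The field names below are hypothetical placeholders; the project's actual schema is defined in `src/medical_agent/state.py`:

```python
from typing import Optional, TypedDict

# Hypothetical shape of the shared memory state that every agent node
# reads and writes; field names are illustrative only.
class AgentState(TypedDict, total=False):
    user_goal: str            # set by the user via the Streamlit UI
    plan: str                 # written by PlannerAgent
    image_path: Optional[str] # input image, if any
    tool_results: dict        # accumulated CNN / VLM / search outputs
    confidence: float         # written by CriticAgent
    retries: int              # reflection-loop counter
    final_answer: str         # written by FinalResponseAgent

# Example values (made up for illustration):
state: AgentState = {"user_goal": "Describe this chest X-ray", "retries": 0}
state["tool_results"] = {"cnn": {"label": "pneumonia", "score": 0.91}}
```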