sAchin-680/xray-sdk

X-Ray SDK

A lightweight library and dashboard for debugging multi-step, non-deterministic decision systems.

What is X-Ray?

X-Ray provides transparency into complex decision processes. Unlike traditional logging that tells you what happened, X-Ray tells you why decisions were made.

Perfect for debugging:

  • LLM-powered search and filtering
  • Multi-stage ranking algorithms
  • Complex business rule pipelines
  • Any non-deterministic workflow where you need to understand "why this output?"

Project Structure

xray-sdk/
├── sdk/python/          # Core X-Ray library
│   └── xray/           # Tracer, Step, Storage, Types
├── dashboard/          # Next.js visualization UI
├── demo/               # Competitor selection demo
│   ├── competitor_selection.py
│   └── mock-data/
├── api/                # FastAPI backend
└── executions/         # Execution traces (JSON)

Quick Start

1. Run the Demo

# Install SDK
cd sdk/python
pip install -e .
cd ../..

# Run competitor selection demo
python3 demo/competitor_selection.py

This creates an execution trace showing a 3-step workflow:

  1. Generate search keywords (simulated LLM)
  2. Search for candidates (mock API)
  3. Apply filters and select best competitor

2. Start the Dashboard

# Terminal 1: Start API
cd api
pip install -r requirements.txt
python main.py

# Terminal 2: Start Dashboard
cd dashboard
npm install
npm run dev

Visit http://localhost:3000 to view your execution traces!

Or use Docker:

docker-compose up
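The repository's compose file isn't reproduced here; if you are assembling your own, a minimal sketch might look like the following (service names, build contexts, and the API port 8000 are assumptions — only the dashboard's port 3000 is stated in the Quick Start):

```yaml
services:
  api:
    build: ./api
    ports:
      - "8000:8000"                      # assumed API port
    volumes:
      - ./executions:/app/executions     # share traces with the SDK
  dashboard:
    build: ./dashboard
    ports:
      - "3000:3000"                      # dashboard port from the Quick Start
    depends_on:
      - api
```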

How It Works

SDK Usage

from xray import XRayTracer

# Create tracer
tracer = XRayTracer(use_case="competitor_selection")

# Step 1: Generate keywords
with tracer.step("keyword_generation", "generation") as step:
    step.set_input({"product_title": "Water Bottle"})
    keywords = your_llm_call(...)
    step.set_output({"keywords": keywords})
    step.set_reasoning("Extracted key product attributes")

# Step 2: Search
with tracer.step("search", "search") as step:
    step.set_input({"keyword": keywords[0]})
    results = your_search_api(...)
    step.set_output({"candidates": results})

# Step 3: Apply filters
with tracer.step("apply_filters", "filter") as step:
    step.set_filters({"price_range": {"min": 10, "max": 50}})

    for candidate in results:
        # Evaluate each candidate
        step.add_evaluation(
            candidate_id=candidate["id"],
            candidate_data=candidate,
            qualified=passes_filters(candidate),
            filter_results=[...]  # Details on each filter
        )

    # best_candidate: picked from the qualified candidates by your selection logic
    step.set_output({"selected": best_candidate})

# Save execution
tracer.save()

Dashboard Features

  • Executions List - See all captured executions with status, timestamps, and step counts
  • Execution Detail - Deep dive into each step's inputs, outputs, and reasoning
  • Filter Visualization - See exactly why candidates passed or failed filters
  • Candidate Evaluations - View filter results for each candidate

Architecture

X-Ray SDK (Python)

  • XRayTracer - Main entry point, manages execution lifecycle
  • StepContext - Context manager for capturing step data (timing, errors)
  • Storage - Simple JSON file storage (easily extensible to DB)
  • Types - Strongly-typed data structures (ExecutionTrace, StepData, FilterResult)
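The README names the core data structures (ExecutionTrace, StepData, FilterResult) but not their fields. A minimal Python sketch of what they might look like — every field name here is an assumption for illustration, not the SDK's actual schema:

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any

@dataclass
class FilterResult:
    # Hypothetical shape: which filter ran and whether the candidate passed it.
    name: str
    passed: bool
    detail: str = ""

@dataclass
class StepData:
    # One step of the execution: inputs, outputs, and free-form reasoning.
    name: str
    step_type: str
    input: dict[str, Any] = field(default_factory=dict)
    output: dict[str, Any] = field(default_factory=dict)
    reasoning: str = ""

@dataclass
class ExecutionTrace:
    # A full execution: the use case plus its ordered steps.
    use_case: str
    steps: list[StepData] = field(default_factory=list)

    def to_json(self) -> str:
        # Dataclasses serialize cleanly to the JSON files the Storage layer writes.
        return json.dumps(asdict(self), indent=2)

trace = ExecutionTrace(use_case="competitor_selection")
trace.steps.append(StepData(name="keyword_generation", step_type="generation",
                            input={"product_title": "Water Bottle"}))
```

Keeping the types as plain dataclasses means the JSON storage backend stays trivial, and swapping in a database later only touches the serialization boundary.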

Dashboard (Next.js)

  • List View - Shows all executions
  • Detail View - Visualizes complete decision trail
  • StepView Component - Renders step with expandable sections

API (FastAPI)

  • GET /api/executions - List all executions
  • GET /api/executions/:id - Get execution detail
  • DELETE /api/executions/:id - Delete execution

What Makes X-Ray Different?

| Aspect      | Traditional Logging | X-Ray                           |
| ----------- | ------------------- | ------------------------------- |
| Focus       | Events              | Decision reasoning              |
| Data        | Messages, errors    | Candidates, filters, selections |
| Question    | "What happened?"    | "Why this output?"              |
| Granularity | Function level      | Business logic level            |

Example: Competitor Selection

Given a seller's product, find the best competitor to benchmark against:

Step 1: Keywords

  • Input: Product title, category
  • Output: Search keywords
  • Reasoning: "Extracted material, capacity, features"

Step 2: Search

  • Input: Keywords, limit
  • Output: 50 candidate products
  • Reasoning: "Fetched top results by relevance"

Step 3: Filters

  • Input: 50 candidates, reference product
  • Filters: Price range (0.5x-2x), Min rating (3.8★), Min reviews (100)
  • Evaluations: Each candidate with pass/fail details
  • Output: Selected competitor
  • Reasoning: "Highest review count among qualified candidates"
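The three filters in this example are easy to express as pure functions. A sketch using the thresholds above (0.5x-2x price, 3.8★ minimum rating, 100 minimum reviews) — the function names and per-filter result shape are assumptions for illustration, not the demo's actual code:

```python
def evaluate_candidate(candidate: dict, reference_price: float) -> list[dict]:
    """Run the example's three filters, recording a pass/fail result for each."""
    checks = [
        ("price_range", reference_price * 0.5 <= candidate["price"] <= reference_price * 2.0),
        ("min_rating", candidate["rating"] >= 3.8),
        ("min_reviews", candidate["review_count"] >= 100),
    ]
    return [{"filter": name, "passed": passed} for name, passed in checks]

def passes_filters(candidate: dict, reference_price: float) -> bool:
    """A candidate qualifies only if every filter passes."""
    return all(r["passed"] for r in evaluate_candidate(candidate, reference_price))

candidate = {"id": "B001", "price": 24.99, "rating": 4.3, "review_count": 512}
qualified = passes_filters(candidate, reference_price=19.99)  # all three filters pass
```

Recording the per-filter list (rather than just the final boolean) is what lets the dashboard show exactly which filter disqualified a candidate.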

When debugging, you can immediately see:

  • Which products failed which filters (and why)
  • Whether the problem is bad keywords, too-strict filters, or poor ranking
  • The complete context for every decision

Future Improvements

With more time, I would add:

  • Database storage - PostgreSQL instead of JSON files
  • Real-time updates - WebSocket for live execution streaming
  • Filter builder - Query executions by status, use case, date range
  • Comparison view - Side-by-side comparison of executions
  • Export - Download execution traces
  • TypeScript SDK - For Node.js applications
  • Replay mode - Re-run past executions with different parameters
  • Cost tracking - For LLM token usage

Tech Stack

  • SDK: Python 3.11+, zero dependencies
  • API: FastAPI, Uvicorn
  • Dashboard: Next.js 15, React 19, TailwindCSS v4, TypeScript
  • Storage: JSON files (easily extensible)

License

MIT
