---
title: NurseSim Triage
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: docker
pinned: false
---

# NurseSim-RL: A Healthcare Agent Environment for Clinical Triage

[![AgentBeats A2A](https://img.shields.io/badge/AgentBeats-A2A%20Enabled-purple)](https://agentbeats.dev/ClinyQAi/nursesim-triage)

[![OpenEnv Challenge](https://img.shields.io/badge/OpenEnv-Challenge%202026-blue)](https://rdi.berkeley.edu/agentx-agentbeats)
[![Hugging Face Model](https://img.shields.io/badge/🤗-Model-yellow)](https://huggingface.co/NurseCitizenDeveloper/NurseSim-Triage-Llama-3.2-3B)
[![W&B Report](https://img.shields.io/badge/W%26B-Report-orange)](https://wandb.ai/mrlincs-nursing-citizen-development/huggingface)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)

> **OpenEnv Challenge Entry** | Berkeley RDI AgentX-AgentBeats Competition  
> A Gymnasium-compatible RL environment for training AI agents to perform clinical triage using the Manchester Triage System (MTS).

![NurseSim Demo](docs/demo.gif)

## 🎯 Overview

**NurseSim-RL** simulates the decision-making process of a Triage Nurse in an Accident & Emergency (A&E) department. The agent must assess patients based on their chief complaint and vital signs, then assign an appropriate triage category (1-5) according to the Manchester Triage System.

### Key Features
- **Gymnasium-Compatible:** Standard RL interface for easy integration.
- **Expanded Dataset:** Trained on **2,100+** synthetic patient scenarios across all 5 MTS categories.
- **Safety-Aware Rewards:** Heavy penalties for under-triaging critical patients.
- **Fine-Tuned Agent:** Llama 3.2 3B trained with Unsloth (4-bit QLoRA) - **60% accuracy validated**.
- **NEW: Semantic RL Mode:** NurseEmbed-powered text embeddings for language-conditioned agents.
- **Age-Aware Triage:** Demographic parsing for accurate risk stratification.
- **A2A Protocol:** Agent-to-Agent evaluation via AgentBeats platform.
- **Docker Deployment:** Fully containerized for reproducibility.
- **Dual Mode:** Runs as interactive demo (Gradio) or API server (A2A).

## 🚀 Quick Start

### Run with Docker

```bash
# Pull the image
docker pull nursecitizendeveloper/nursesim-triage:latest

# Run in demo mode (Gradio UI)
docker run -p 7860:7860 nursecitizendeveloper/nursesim-triage:latest

# Run in A2A mode (API only)
docker run -e MODE=a2a -p 7860:7860 nursecitizendeveloper/nursesim-triage:latest
```

### Test the A2A Endpoint

```bash
# Health check
curl https://nursecitizendeveloper-nursesim-triage-demo.hf.space/health

# Get agent card
curl https://nursecitizendeveloper-nursesim-triage-demo.hf.space/.well-known/agent-card.json

# Submit a task
curl -X POST https://nursecitizendeveloper-nursesim-triage-demo.hf.space/process-task \
  -H "Content-Type: application/json" \
  -d '{
    "complaint": "Chest pain",
    "vitals": {
      "heart_rate": 110,
      "blood_pressure": "90/60",
      "spo2": 94,
      "temperature": 37.2
    }
  }'
```

## 🏗️ Project Structure

```
NurseSim-RL/
├── nursesim_rl/           # Core environment package
│   ├── __init__.py
│   ├── TriageEnv.py       # Gymnasium environment
│   └── PatientGenerator.py # Synthetic patient generation
├── notebooks/
│   └── NurseSim_RL_Unsloth_Training.ipynb  # Training notebook
├── data/
│   ├── train.jsonl        # Training dataset (500 examples)
│   └── val.jsonl          # Validation dataset (100 examples)
├── app.py                 # Gradio demo application
├── Dockerfile             # For reproducibility
├── requirements.txt
└── README.md
```

## 🚀 Quick Start

### Installation

```bash
git clone https://github.com/NurseCitizenDeveloper/NurseSim-RL.git
cd NurseSim-RL
pip install -r requirements.txt
```

### Using the Environment

```python
import gymnasium as gym
from nursesim_rl import TriageEnv

env = gym.make("NurseSim-Triage-v0")
obs, info = env.reset()

# Agent takes an action
action = {"triage_category": 2, "intervention": 1}
obs, reward, terminated, truncated, info = env.step(action)
```

### Running the Demo

**Gradio Mode (Human UI):**
```bash
export AGENT_MODE=gradio
export HF_TOKEN=your_hf_token_here
python app.py
```

**AgentBeats A2A Mode (Platform Integration):**
```bash
export AGENT_MODE=a2a
export HF_TOKEN=your_hf_token_here
python agent_main.py
```

## 🤖 AgentBeats Integration

This agent is fully compatible with the [AgentBeats platform](https://agentbeats.org) for automated agent evaluation via the **Agent-to-Agent (A2A) protocol**.

### Dual-Mode Architecture

The agent supports two deployment modes:

| Mode | Purpose | Entry Point | Port |
|------|---------|-------------|------|
| **Gradio** | Human-facing UI for demos | `app.py` | 7860 |
| **A2A** | Platform integration for automated evaluation | `agent_main.py` | 8080 |

Set the mode via the `AGENT_MODE` environment variable.

### A2A Protocol Compliance

- **Agent Card:** `.well-known/agent-card.json` - Metadata and schemas
- **Task Processing:** Structured input/output for triage assessments
- **Lifecycle Methods:** `reset()`, `health_check()`
- **Protocol Version:** A2A v1.0

### Local Testing with AgentBeats Controller

```bash
# Install earthshaker SDK
pip install earthshaker

# Set environment variables
export HF_TOKEN=your_hf_token_here
export AGENT_MODE=a2a

# Run the controller
earthshaker run_ctrl

# Test the agent card endpoint (in another terminal)
curl http://localhost:8080/.well-known/agent-card.json | jq

# Submit a test task via A2A protocol
curl -X POST http://localhost:8080/task \
  -H "Content-Type: application/json" \
  -d '{
    "complaint": "Chest pain and shortness of breath",
    "vitals": {
      "heart_rate": 120,
      "blood_pressure": "85/55",
      "spo2": 89,
      "temperature": 37.8
    }
  }'
```

### Docker Deployment

**Build:**
```bash
docker build -t nursesim-triage:latest .
```

**Run in A2A Mode:**
```bash
docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=a2a -p 8080:8080 nursesim-triage:latest
```

**Run in Gradio Mode:**
```bash
docker run -e HF_TOKEN=$HF_TOKEN -e AGENT_MODE=gradio -p 7860:7860 nursesim-triage:latest
```

## 📊 Training Results & Validation

The agent was fine-tuned using **Unsloth** on a Llama 3.2 3B base model with an expanded dataset of ~2,100 clinical scenarios.

### ✅ Performance Metrics (Validated)
Evaluated on 15 Gold-Standard Clinical Scenarios using GPT-5.2 as a Clinical Judge.

| Metric | Value | Description |
|--------|-------|-------------|
| **Accuracy** | **60%** | Exact match with Manchester Triage Categories (1-5) |
| **Safety** | **70%+** | Pass Rate for critical life-threat detection (Sepsis, Anaphylaxis) |
| **Training Loss** | 0.19 | Final loss after 300 steps |
| **Hardware** | NVIDIA A100 | Google Colab |
| **Training Time** | 25 minutes | Using Unsloth QLoRA |

### 🧠 Key Methodology: Age-Aware Triage
Our validation revealed that **parsing Age and Gender** from the patient description is critical for accurate risk stratification (e.g., separating "Chest Pain" in a 72M vs 20M). The model effectively learned these demographic risk factors, improving accuracy from 16% to 60%.

See our [W&B Report](https://wandb.ai/mrlincs-nursing-citizen-development/huggingface) for detailed training curves.

## 🩺 Clinical Framework: Manchester Triage System

| Category | Priority | Target Time | Example |
|----------|----------|-------------|---------|
| 1 | Immediate | 0 min | Cardiac arrest, Anaphylaxis |
| 2 | Very Urgent | 10 min | Chest pain, Stroke |
| 3 | Urgent | 60 min | Abdominal pain, Fractures |
| 4 | Standard | 120 min | Minor injuries, Mild illness |
| 5 | Non-Urgent | 240 min | Minor cuts, GP-suitable |

## 📚 Resources

- **Hugging Face Space:** [Try the Demo](https://huggingface.co/spaces/NurseCitizenDeveloper/NurseSim-Triage-Demo)
- **Model Card:** [NurseSim-Triage-Llama-3.2-3B](https://huggingface.co/NurseCitizenDeveloper/NurseSim-Triage-Llama-3.2-3B)
- **Training Report:** [W&B Dashboard](https://wandb.ai/mrlincs-nursing-citizen-development/huggingface)
- **Blog Post:** [Training AI Agents for Clinical Triage](https://huggingface.co/blog/NurseCitizenDeveloper/nursesim-rl-training-ai-agents-clinical-triage)
- **AgentBeats Profile:** [NurseSim-Triage Benchmark](https://agentbeats.dev/ClinyQAi/nursesim-triage)
- **Leaderboard:** [Community Results](https://github.com/ClinyQAi/NurseSim-Triage-Leaderboard)
- **Docker Hub:** [nursecitizendeveloper/nursesim-triage](https://hub.docker.com/r/nursecitizendeveloper/nursesim-triage)

## 🤖 AgentBeats Integration

NurseSim-Triage implements the **Agent-to-Agent (A2A) protocol** for automated benchmarking:

### Protocol Details
- **Version:** a2a/v1.0
- **Agent Card:** `/.well-known/agent-card.json`
- **Health Endpoint:** `/health`
- **Task Endpoint:** `/process-task` (POST)

### Evaluation Metrics
- **Triage Accuracy** (0-1): Percentage of correct MTS assignments
- **Safety Score** (0-1): Penalizes dangerous under-triage
- **Response Quality** (0-1): Clinical reasoning coherence
- **Response Time** (ms): Computational efficiency

### Submit Your Agent
1. Register on [AgentBeats](https://agentbeats.dev)
2. Implement the A2A protocol
3. Submit to NurseSim-Triage benchmark
4. View results on the [leaderboard](https://agentbeats.dev/ClinyQAi/nursesim-triage)

## 🐳 Deployment

### Hugging Face Spaces
Deployed on **NVIDIA T4 (Medium)** GPU with:
- 4-bit quantization (`BitsAndBytesConfig`)
- Asynchronous model loading
- Dual-mode support (Gradio + A2A)

### Docker
```bash
# Build locally
docker build -t nursesim-triage .

# Run in demo mode
docker run -p 7860:7860 nursesim-triage

# Run in A2A mode
docker run -e MODE=a2a -p 7860:7860 nursesim-triage
```

### Environment Variables
- `MODE`: `gradio` (default) or `a2a`
- `HF_TOKEN`: Hugging Face API token (for private models)
- `OMP_NUM_THREADS`: OpenMP threads (auto-configured)

## 🏆 OpenEnv Challenge

This project was submitted to the **OpenEnv Challenge 2026** (Berkeley RDI AgentX-AgentBeats Competition).

**Key Contributions:**
- Novel benchmark for clinical AI evaluation
- Safety-focused metrics (penalizes under-triage)
- Open-source training pipeline
- Reproducible Docker deployment
- Community leaderboard

## 📄 License

MIT License - See [LICENSE](LICENSE) for details.

## 🙏 Acknowledgements

**Mentors and Champions of Innovation:**
- **Dr Clare Cable**, Chief Executive, Burdett Trust for Nursing — For championing Relational Intelligence
- **Professor Joanne Bosanquet**, Chief Executive, Foundation of Nursing Studies — For championing person-centred nursing
- **Professor Gemma Stacey**, Programme Director, Nursing Now Challenge — For inspiring global nursing leadership
- **Aisha Holloway**, Chief Nursing Officer, Scotland — For inspiring excellence
- **Josie Rudman MBE** — Mutual Mentor & champion of nurse-led innovation

**Research & Education Partners:**
- **Kumbi Kariwo** — Champion of AI equity and bias mitigation
- **Rohit Sagoo** — Children's Nurse & Innovator in education and practice
- **Dr Hellena Habte-Asres** — Big Data Researcher, Nurse & Innovator
- **Kelly Thobekile Ncube** — Senior Lecturer in Adult Nursing (SFHEA) and Global Health Lecturer Volunteer Fellow

**Technical Community:**
- **OpenEnv Challenge** — Berkeley RDI, PyTorch, Hugging Face, Unsloth
- **Manchester Triage System** — Clinical framework
- **Unsloth AI** — 2x faster fine-tuning
- **AgentBeats** — A2A protocol infrastructure
- **NVIDIA** — T4 GPU infrastructure

---

**Built for the OpenEnv Challenge 2026** 🏆

# Force rebuild trigger