Commit · d4c1bbe
1 Parent(s): ec4d4b3
Use README
README.md CHANGED
@@ -27,7 +27,7 @@ tags:
 ### **Document RAG (Retrieval-Augmented Generation)**
 - Upload medical documents (PDF, Word, TXT, MD, JSON, XML, CSV) and get answers based on your uploaded content
 - Document parsing powered by Gemini MCP for accurate text extraction
-- Hierarchical document indexing with auto-merging retrieval
 - Mitigates hallucination by grounding responses in your documents
 - Toggle RAG on/off - when disabled, provides concise clinical answers without document context

@@ -40,9 +40,9 @@ tags:
 - **Enriches Context**: Combines document RAG + web sources for comprehensive answers

 ### **MedSwin Medical Specialist Models**
-- **MedSwin
 - **MedSwin KD** - Knowledge Distillation model
-- **MedSwin TA** - Task-Aware merged model
 - Models download on-demand for efficient resource usage
 - Fine-tuned on MedAlpaca-7B for medical domain expertise

@@ -54,9 +54,8 @@ tags:
 - Powered by Gemini MCP for translation

 ### 🎤 **Voice Features**
-- **Speech-to-Text**:
-- **Text-to-Speech**:
-- Speech-to-text powered by Gemini MCP for accurate transcription

 ### ⚙️ **Advanced Configuration**
 - Customizable generation parameters (temperature, top-p, top-k)

@@ -81,94 +80,71 @@ tags:
 ## Technical Details

 - **Medical Models**: MedSwin/MedSwin-7B-SFT, MedSwin-7B-KD, MedSwin-Merged-TA-SFT-0.7
-- **Translation**: Gemini MCP (gemini-2.5-flash-lite
-- **Document Parsing**: Gemini MCP (
 - **Speech-to-Text**: Gemini MCP (gemini-2.5-flash-lite)
-- **Summarization**: Gemini MCP (gemini-2.5-flash
 - **Reasoning & Reflection**: Gemini MCP (gemini-2.5-flash)
-- **Text-to-Speech**: maya-research/maya1
 - **Embedding Model**: abhinand/MedEmbed-large-v0.1 (domain-tuned medical embeddings)
-- **RAG Framework**: LlamaIndex with hierarchical node parsing
-- **Web Search**:
-- **MCP
-- **Gemini MCP Server**: mcp-server via MCP protocol

 ## Requirements

 See `requirements.txt` for full dependency list. Key dependencies:
-- **MCP Integration**: `mcp`, `nest-asyncio` (
-- **Fallback Dependencies**: `requests`, `beautifulsoup4` (used when MCP
-- **Core ML**: `transformers`, `torch`
-- **RAG Framework**: `llama-index`
-- **Utilities**: `langdetect`, `gradio`, `spaces`

 ### MCP Configuration

-The application uses Gemini MCP (

 ```bash
-# Gemini
 export GEMINI_API_KEY="your-gemini-api-key"

-# Gemini MCP Server Configuration
 export MCP_SERVER_COMMAND="python"
-export MCP_SERVER_ARGS="

-# Optional Gemini Configuration
-export GEMINI_MODEL="gemini-2.5-flash" # For
-export GEMINI_MODEL_LITE="gemini-2.5-flash-lite" # For
 export GEMINI_TIMEOUT=300000 # Request timeout in milliseconds (default: 5 minutes)
 export GEMINI_MAX_OUTPUT_TOKENS=8192 # Maximum output tokens (default)
-export GEMINI_MAX_FILES=10 # Maximum number of files per request (default)
-export GEMINI_MAX_TOTAL_FILE_SIZE=50 # Maximum total file size in MB (default)
 export GEMINI_TEMPERATURE=0.2 # Temperature for generation 0-2 (default: 0.2)
 ```

-**
-- **Translation**: Multi-language translation using Gemini MCP (gemini-2.5-flash-lite)
-- **Document Parsing**: Extract text from PDF, Word, and other documents using Gemini MCP
-- **Speech-to-Text**: Audio transcription using Gemini MCP (gemini-2.5-flash-lite)
-- **Summarization**: Web content summarization using Gemini MCP (gemini-2.5-flash)
-- **Reasoning & Reflection**: Query analysis and answer quality evaluation using Gemini MCP
-
-**Supported File Types for Document Parsing:**
-- Documents: PDF, DOC, DOCX (treated as images, one page = one image)
-- Text: TXT, MD, JSON, XML, CSV
-- Images: JPG, JPEG, PNG, GIF, WebP, SVG, BMP, TIFF
-- Audio: MP3, WAV, AIFF, AAC, OGG, FLAC (up to 15MB per file)
-- Video: MP4, AVI, MOV, WEBM, FLV, MPG, WMV (up to 10 files per request)
-
-1. **Install MCP Python SDK** (already in requirements.txt):
-```bash
-pip install mcp nest-asyncio
-```

-
 ```bash
-
-pip install mcp-server
 ```

-
 - Visit [Google AI Studio](https://aistudio.google.com/) to get your API key
-- Set it

-
-
-
-export MCP_SERVER_COMMAND="python"
-export MCP_SERVER_ARGS="-m mcp_server"
-```

-**Note**: The application requires Gemini MCP for translation, document parsing, transcription, and summarization. Web search

 ## 🎯 Use Cases

--
--
--
--
--

 ## 🏥 Enterprise-Level Clinical Decision Support

@@ -274,16 +250,15 @@ MedLLM Agent is designed to support **doctors, clinicians, and medical specialis

 ### **Implementation in Clinical Settings**

-**Hospital Systems**:

-
-
-**Medical Education**: Support medical training and education with comprehensive, evidence-based answers.

-

 ---

-
-
-> Introduction: A medical app for MCP-1st-Birthday hackathon, integrating MCP searcher and document RAG with autonomous reasoning, planning, and execution capabilities for enterprise-level clinical decision support.

@@ -27,7 +27,7 @@ tags:
 ### **Document RAG (Retrieval-Augmented Generation)**
 - Upload medical documents (PDF, Word, TXT, MD, JSON, XML, CSV) and get answers based on your uploaded content
 - Document parsing powered by Gemini MCP for accurate text extraction
+- Hierarchical document indexing with auto-merging retrieval for comprehensive context
 - Mitigates hallucination by grounding responses in your documents
 - Toggle RAG on/off - when disabled, provides concise clinical answers without document context

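The added bullet mentions hierarchical indexing with auto-merging retrieval. As a rough illustration of the idea only (this is not LlamaIndex's implementation, and the `Node` class, `auto_merge` function, and `threshold` parameter are hypothetical names introduced here), a minimal two-level sketch could look like this: when more than a given fraction of a parent's child chunks are retrieved, the children are swapped for the larger parent chunk.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """A chunk in a hypothetical two-level hierarchy (parent -> children)."""
    node_id: str
    text: str
    parent_id: Optional[str] = None


def auto_merge(retrieved: List[Node], all_nodes: List[Node], threshold: float = 0.5) -> List[Node]:
    """Replace retrieved children with their parent when enough siblings were hit."""
    by_id = {n.node_id: n for n in all_nodes}

    # Count how many children each parent has in the full index.
    child_total = defaultdict(int)
    for n in all_nodes:
        if n.parent_id is not None:
            child_total[n.parent_id] += 1

    # Count how many children of each parent were actually retrieved.
    child_hits = defaultdict(int)
    for n in retrieved:
        if n.parent_id is not None:
            child_hits[n.parent_id] += 1

    merged, seen = [], set()
    for n in retrieved:
        pid = n.parent_id
        total = child_total.get(pid, 0)
        # Merge upward only if the parent is indexed and the hit ratio is high enough.
        if pid in by_id and total and child_hits[pid] / total > threshold:
            target = by_id[pid]
        else:
            target = n
        if target.node_id not in seen:  # keep first-seen order, drop duplicates
            seen.add(target.node_id)
            merged.append(target)
    return merged


# Illustrative data: two parent sections, three child chunks each.
index = [Node("p1", "section 1"), Node("p2", "section 2")]
index += [Node(f"c{i}", f"chunk {i}", "p1") for i in range(1, 4)]
index += [Node(f"d{i}", f"chunk {i}", "p2") for i in range(1, 4)]
by_id = {n.node_id: n for n in index}

hits = [by_id["c1"], by_id["c2"], by_id["d1"]]
print([n.node_id for n in auto_merge(hits, index)])  # ['p1', 'd1']
```

Two of the three `p1` children were retrieved, so they collapse into `p1`; only one `p2` child was hit, so `d1` is returned unchanged.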
@@ -40,9 +40,9 @@ tags:
 - **Enriches Context**: Combines document RAG + web sources for comprehensive answers

 ### **MedSwin Medical Specialist Models**
+- **MedSwin TA** (default) - Task-Aware merged model
+- **MedSwin SFT** - Supervised Fine-Tuned model
 - **MedSwin KD** - Knowledge Distillation model
 - Models download on-demand for efficient resource usage
 - Fine-tuned on MedAlpaca-7B for medical domain expertise

@@ -54,9 +54,8 @@ tags:
 - Powered by Gemini MCP for translation

 ### 🎤 **Voice Features**
+- **Speech-to-Text**: Voice input transcription using Gemini MCP
+- **Text-to-Speech**: Voice output generation using Maya1 TTS model (optional, fallback to MCP if unavailable)

 ### ⚙️ **Advanced Configuration**
 - Customizable generation parameters (temperature, top-p, top-k)

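The new Text-to-Speech bullet describes an optional local model with a fallback to MCP. How the application wires this up is not shown in the diff; the sketch below only illustrates the general "try backends in order" pattern. The names `first_available`, `local_tts`, and `mcp_tts` are hypothetical stand-ins, not functions from the project.

```python
from typing import Callable, Optional, Sequence


def first_available(backends: Sequence[Callable[[str], bytes]], text: str) -> Optional[bytes]:
    """Try each backend in order; return the first successful result, or None."""
    for backend in backends:
        try:
            return backend(text)
        except Exception:
            # A sketch-level catch-all: real code would log the failure
            # and likely catch narrower exception types.
            continue
    return None


def local_tts(text: str) -> bytes:
    """Hypothetical stand-in for an optional local TTS model that is not installed."""
    raise ImportError("optional TTS package is not available")


def mcp_tts(text: str) -> bytes:
    """Hypothetical stand-in for a remote MCP-backed TTS call."""
    return b"mcp:" + text.encode("utf-8")


audio = first_available([local_tts, mcp_tts], "hello")
print(audio)  # b'mcp:hello'
```

Ordering the list from preferred to least preferred keeps the fallback policy in one place, so adding or removing a backend does not touch the calling code.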
@@ -81,94 +80,71 @@ tags:
 ## Technical Details

 - **Medical Models**: MedSwin/MedSwin-7B-SFT, MedSwin-7B-KD, MedSwin-Merged-TA-SFT-0.7
+- **Translation**: Gemini MCP (gemini-2.5-flash-lite)
+- **Document Parsing**: Gemini MCP (PDF, Word, TXT, MD, JSON, XML, CSV)
 - **Speech-to-Text**: Gemini MCP (gemini-2.5-flash-lite)
+- **Summarization**: Gemini MCP (gemini-2.5-flash)
 - **Reasoning & Reflection**: Gemini MCP (gemini-2.5-flash)
+- **Text-to-Speech**: maya-research/maya1 (optional, with MCP fallback)
 - **Embedding Model**: abhinand/MedEmbed-large-v0.1 (domain-tuned medical embeddings)
+- **RAG Framework**: LlamaIndex with hierarchical node parsing and auto-merging retrieval
+- **Web Search**: MCP tools with automatic fallback to DuckDuckGo
+- **MCP Server**: Bundled Python-based Gemini MCP server (agent.py)

 ## Requirements

 See `requirements.txt` for full dependency list. Key dependencies:
+- **MCP Integration**: `mcp`, `nest-asyncio`, `google-genai` (for Gemini MCP server)
+- **Fallback Dependencies**: `requests`, `beautifulsoup4`, `ddgs` (used when MCP web search unavailable)
+- **Core ML**: `transformers`, `torch`, `accelerate`
+- **RAG Framework**: `llama-index`, `llama_index.llms.huggingface`, `llama_index.embeddings.huggingface`
+- **Utilities**: `langdetect`, `gradio`, `spaces`, `soundfile`
+- **TTS**: Optional - `TTS` package (voice features work with MCP fallback if unavailable)

 ### MCP Configuration

+The application uses a bundled Gemini MCP server (agent.py) for translation, document parsing, transcription, and summarization. Configure via environment variables:

 ```bash
+# Required: Gemini API Key
 export GEMINI_API_KEY="your-gemini-api-key"

+# Optional: Gemini MCP Server Configuration (defaults to bundled agent.py)
 export MCP_SERVER_COMMAND="python"
+export MCP_SERVER_ARGS="/path/to/agent.py" # Default: bundled agent.py

+# Optional: Gemini Model Configuration
+export GEMINI_MODEL="gemini-2.5-flash" # For complex tasks (default)
+export GEMINI_MODEL_LITE="gemini-2.5-flash-lite" # For simple tasks (default)
 export GEMINI_TIMEOUT=300000 # Request timeout in milliseconds (default: 5 minutes)
 export GEMINI_MAX_OUTPUT_TOKENS=8192 # Maximum output tokens (default)
 export GEMINI_TEMPERATURE=0.2 # Temperature for generation 0-2 (default: 0.2)
 ```

+**Setup Steps:**

+1. **Install Dependencies** (already in requirements.txt):
 ```bash
+pip install mcp nest-asyncio google-genai
 ```

+2. **Get Gemini API Key**:
 - Visit [Google AI Studio](https://aistudio.google.com/) to get your API key
+- Set it: `export GEMINI_API_KEY="your-api-key"`

+3. **Run the Application**:
+- The bundled MCP server (agent.py) will be used automatically
+- No additional MCP server installation required

+**Note**: The application requires Gemini MCP for translation, document parsing, transcription, and summarization. Web search supports fallback to direct DuckDuckGo API if MCP web search tools are unavailable.

 ## 🎯 Use Cases

+- **Clinical Decision Support**: Evidence-based answers from documents and current medical literature
+- **Medical Document Q&A**: Query uploaded patient records, research papers, and clinical guidelines
+- **Multi-Language Consultations**: Automatic translation for international patient care
+- **Research Assistance**: Synthesize information from multiple medical sources
+- **Drug Information**: Comprehensive drug information with interaction analysis

 ## 🏥 Enterprise-Level Clinical Decision Support

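The MCP Configuration section above lists environment variables and their defaults. The diff does not show how agent.py reads them, so the following is only a sketch of one way to load and validate those settings; `GeminiConfig` and `load_gemini_config` are hypothetical names, while the variable names, defaults, and the 0-2 temperature range come from the README text.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class GeminiConfig:
    """Settings named in the README; the class itself is illustrative."""
    api_key: str
    model: str
    model_lite: str
    timeout_ms: int
    max_output_tokens: int
    temperature: float


def load_gemini_config(env: Optional[Mapping[str, str]] = None) -> GeminiConfig:
    """Read settings from a mapping (os.environ by default), applying README defaults."""
    env = os.environ if env is None else env

    # GEMINI_API_KEY is the only variable the README marks as required.
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("GEMINI_API_KEY is required")

    temperature = float(env.get("GEMINI_TEMPERATURE", "0.2"))
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("GEMINI_TEMPERATURE must be between 0 and 2")

    return GeminiConfig(
        api_key=api_key,
        model=env.get("GEMINI_MODEL", "gemini-2.5-flash"),
        model_lite=env.get("GEMINI_MODEL_LITE", "gemini-2.5-flash-lite"),
        timeout_ms=int(env.get("GEMINI_TIMEOUT", "300000")),  # milliseconds
        max_output_tokens=int(env.get("GEMINI_MAX_OUTPUT_TOKENS", "8192")),
        temperature=temperature,
    )


# Passing an explicit mapping keeps the example independent of the real environment.
cfg = load_gemini_config({"GEMINI_API_KEY": "your-gemini-api-key"})
print(cfg.model, cfg.timeout_ms / 1000)  # gemini-2.5-flash 300.0
```

Failing fast on a missing key surfaces misconfiguration at startup rather than on the first translation or parsing request.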
@@ -274,16 +250,15 @@ MedLLM Agent is designed to support **doctors, clinicians, and medical specialis

 ### **Implementation in Clinical Settings**

+- **Hospital Systems**: Clinical decision support with EMR integration and institutional medical libraries
+- **Specialty Clinics**: Customize with specialty-specific documents and guidelines
+- **Medical Education**: Comprehensive, evidence-based answers for training and education
+- **Research Institutions**: Accelerate research by synthesizing information from multiple sources

+---

+**⚠️ Important Disclaimer**: This system is designed to **assist** medical professionals with information retrieval and synthesis. It does not replace clinical judgment. All medical decisions must be made by qualified healthcare professionals who consider the full clinical context, patient-specific factors, and their professional expertise.

 ---

+> **Built for MCP-1st-Birthday Hackathon**: Enterprise-level clinical decision support system integrating MCP protocol, document RAG, and autonomous reasoning capabilities.