---
title: Agentic Health Coach Medgemma
emoji: 💬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.33.1
app_file: app.py
pinned: true
tags:
  - agent-demo-track
license: mit
short_description: Agentic MedGemma health coach with vLLM.
---

[YouTube explainer (7 min)](https://youtu.be/NwTKnTHfZAg)

N.B. The Modal backend has been turned off since the hackathon ended. To host your own Modal LLM endpoint, refer to the `.py` files in this repository.

# MedGemma Agent: AI-Powered Medical Assistant

## 🏥 Overview

MedGemma Agent is an AI-powered medical assistant that provides accessible, accurate medical information to patients and non-medical professionals. Built on Google's MedGemma model, the application combines state-of-the-art medical language understanding with multimodal capabilities to deliver clear, concise, and reliable medical insights.

## ✨ Key Features

- **Multimodal Understanding**: Process both text queries and medical images
- **Real-time Responses**: Stream responses for an interactive experience
- **Wikipedia Integration**: Supplementary medical context from Wikipedia
- **User-friendly Interface**: Clean, modern UI with example queries
- **Secure API**: Protected endpoints with API key authentication

## 🚀 Technical Implementation

### Backend Architecture

The application is built using:

- **Modal**: For serverless deployment and GPU acceleration
- **FastAPI**: For robust API endpoints
- **vLLM**: For efficient model inference
- **MedGemma-4B**: Fine-tuned medical language model
- **Wikipedia API**: For additional medical context
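
As a rough sketch of how these pieces could fit together, the snippet below wires up a Modal app with a GPU-backed function. The app name, GPU type, image contents, and function body are illustrative assumptions, not the Space's actual code.

```python
# Hypothetical Modal wiring -- names and parameters are illustrative.
def build_modal_app():
    import modal  # deferred import so the sketch is readable without Modal installed

    # Container image with the inference dependencies pre-installed
    image = modal.Image.debian_slim(python_version="3.11").pip_install(
        "vllm", "fastapi"
    )
    app = modal.App("medgemma-agent", image=image)

    @app.function(gpu="A10G", timeout=600)
    def generate(prompt: str) -> str:
        # Real code would run vLLM inference here and return the completion
        raise NotImplementedError

    return app
```

Deploying a file structured like this would then be a matter of running `modal deploy` on it.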

### Key Components

1. **Model Deployment**
   - Utilizes Modal's GPU-accelerated containers
   - Implements efficient model loading with vLLM
   - Supports bfloat16 precision for optimal performance

2. **API Layer**
   - Streaming responses for real-time interaction
   - Secure API key authentication
   - Base64 image processing for multimodal inputs

3. **Frontend Interface**
   - Built with Gradio for seamless user interaction
   - Custom CSS theming for professional appearance
   - Example queries for common medical scenarios
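
To illustrate the streaming behaviour described under the API layer, here is a stdlib-only generator that formats tokens as server-sent-event chunks; in the real backend something like this would be wrapped in FastAPI's `StreamingResponse`. The chunk schema is an assumption, not the Space's actual wire format.

```python
import json

def stream_tokens(tokens):
    """Yield server-sent-event chunks so a client can render text incrementally."""
    for tok in tokens:
        # Each chunk carries one token delta, JSON-encoded on a "data:" line
        yield f"data: {json.dumps({'delta': tok})}\n\n"
    # Sentinel chunk tells the client the stream is finished
    yield "data: [DONE]\n\n"

chunks = list(stream_tokens(["Strokes ", "need ", "urgent care."]))
```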

## 🛠️ Usage

1. **Text Queries**
   - Ask medical questions in natural language
   - Get clear, patient-friendly explanations
   - Example: "What are the symptoms of a stroke?"

2. **Image Analysis**
   - Upload medical images for analysis
   - Get AI-powered insights about the image
   - Supports common medical image formats
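
To make the request flow concrete, here is a stdlib-only sketch of how a client might package a question plus an optional image. The field names (`prompt`, `image_b64`) are illustrative assumptions, not the Space's actual schema.

```python
import base64
import json

def build_payload(question, image_bytes=None):
    """Build a JSON request body; attach the image as base64 when present."""
    payload = {"prompt": question}
    if image_bytes is not None:
        # Base64 keeps binary image data safe inside a JSON body
        payload["image_b64"] = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps(payload)

body = build_payload("What are the symptoms of a stroke?", b"\x89PNG fake bytes")
```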

## 🔒 Security

- API key authentication for all requests
- Secure image processing
- Protected model endpoints
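
A minimal sketch of the API-key check, assuming the key arrives with each request; `hmac.compare_digest` makes the comparison constant-time. The names and placeholder key are illustrative.

```python
import hmac

EXPECTED_API_KEY = "example-key"  # placeholder -- the real key would live in a secret

def is_authorized(provided_key: str) -> bool:
    # Constant-time comparison avoids leaking the key via timing differences
    return hmac.compare_digest(provided_key, EXPECTED_API_KEY)
```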

## 🏗️ Technical Stack

- **Backend**: Modal, FastAPI, vLLM
- **Frontend**: Gradio
- **Model**: MedGemma-4B (unsloth/medgemma-4b-it-unsloth-bnb-4bit)
- **Additional Tools**: Wikipedia API for medical context

## 🎯 Performance

- Optimized for low-latency responses
- GPU-accelerated inference
- Efficient memory utilization with 4-bit quantization
- Maximum context length of 8192 tokens
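
These settings could map onto vLLM engine arguments roughly as follows. Treat this as a sketch of plausible values (the `dtype` and `quantization` flags are assumptions inferred from the checkpoint named in the stack), not the Space's exact configuration.

```python
# Plausible vLLM settings matching the figures above; values are assumptions.
ENGINE_ARGS = {
    "model": "unsloth/medgemma-4b-it-unsloth-bnb-4bit",
    "dtype": "bfloat16",             # compute precision noted under Model Deployment
    "quantization": "bitsandbytes",  # 4-bit bnb checkpoint for lower memory use
    "max_model_len": 8192,           # maximum context length stated above
}

def load_engine():
    from vllm import LLM  # deferred: needs a GPU environment with vLLM installed
    return LLM(**ENGINE_ARGS)
```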

## 🤝 Contributing

We welcome contributions! Please feel free to submit issues and pull requests.

## 📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

---

Built with ❤️ for the Hugging Face Spaces Hackathon.