---
title: RadExtract
emoji: ποΈ
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: apache-2.0
header: mini
app_port: 7870
tags:
- medical
- nlp
- radiology
- langextract
- gemini
- structured-data
---
# RadExtract: Radiology Report Structuring Demo
A demonstration application powered by [LangExtract](https://github.com/google/langextract) that structures radiology reports using Gemini models. Transform unstructured radiology text into organized, interactive segments with clinical significance annotations.
## Try the Demo
**[Launch RadExtract Demo](https://huggingface.co/spaces/google/radextract)**
Transform unstructured radiology reports into structured data with highlighted findings that are precisely mapped back to the original source text.
## Key Features
- **Structured Output**: Organizes reports into anatomical sections with clinical significance
- **Interactive Highlighting**: Click any finding to see its exact source in the original text
- **Clinical Significance**: Annotates findings as minor, significant, or grounding
- **Character-Level Mapping**: Precise attribution back to source text
- **Multi-Model Support**: Gemini 2.5 Flash (fast) and Pro (comprehensive)
## Quick Start
### Setup
```bash
git clone https://huggingface.co/spaces/google/radextract
cd radextract
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
cp env.list.example env.list
# Edit env.list and set KEY=your_gemini_api_key_here
```
### Local Development
```bash
source venv/bin/activate
export KEY=your_gemini_api_key_here
python app.py
```
Access at: http://localhost:7870
## API Usage
### Example Request
```bash
curl -X POST \
  -H 'X-Model-ID: gemini-2.5-flash' \
  -H 'X-Use-Cache: true' \
  -d 'FINDINGS: Normal heart and lungs. IMPRESSION: Normal study.' \
  http://localhost:7870/predict
```
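The same request can be sketched in Python with only the standard library. The helper names `build_predict_request` and `predict` are hypothetical and not part of RadExtract; the endpoint, headers, and raw-text body simply mirror the curl example above, and the 120-second timeout is an arbitrary assumption.

```python
import json
import urllib.request


def build_predict_request(
    report_text,
    model_id="gemini-2.5-flash",
    use_cache=True,
    base_url="http://localhost:7870",  # assumes the local dev server from Quick Start
):
    """Build a POST request equivalent to the curl example (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        # Like `curl -d`, the report is sent as the raw request body.
        data=report_text.encode("utf-8"),
        headers={
            "X-Model-ID": model_id,
            "X-Use-Cache": "true" if use_cache else "false",
        },
        method="POST",
    )


def predict(report_text, **kwargs):
    """Send the report to a running RadExtract server and parse the JSON reply."""
    request = build_predict_request(report_text, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires a running server):
# result = predict("FINDINGS: Normal heart and lungs. IMPRESSION: Normal study.")
# print(result["text"])
```

Building the request separately from sending it keeps the headers easy to inspect or adjust before any network call is made.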
### Response Format
```json
{
  "segments": [{
    "type": "body",
    "label": "Chest",
    "content": "Normal heart and lungs",
    "intervals": [{"startPos": 10, "endPos": 32}],
    "significance": "minor"
  }],
  "text": "Chest:\n- Normal heart and lungs",
  "annotated_document_json": {...}
}
```
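To illustrate how the `intervals` field ties a segment back to the report, here is a minimal Python sketch. It assumes that `startPos` and `endPos` are character offsets into the submitted text and that `endPos` is exclusive; that reading is consistent with the example above but is not confirmed by the API itself. The function name `source_spans` is hypothetical.

```python
def source_spans(segment, source_text):
    """Return the source substrings referenced by a segment's intervals.

    Assumes startPos/endPos are character offsets into source_text,
    with endPos exclusive (an assumption inferred from the example).
    """
    return [
        source_text[interval["startPos"]:interval["endPos"]]
        for interval in segment.get("intervals", [])
    ]


# Example data copied from the request and response shown above.
report = "FINDINGS: Normal heart and lungs. IMPRESSION: Normal study."
segment = {
    "type": "body",
    "label": "Chest",
    "content": "Normal heart and lungs",
    "intervals": [{"startPos": 10, "endPos": 32}],
    "significance": "minor",
}

spans = source_spans(segment, report)
print(spans)
```

Under these assumptions, the single interval selects the same phrase that appears in the segment's `content` field.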
## Architecture
- **Backend**: Flask on Python 3.10+, type-annotated and checked with mypy
- **NLP Engine**: [LangExtract](https://github.com/google/langextract) for structured extraction
- **AI Models**: Google Gemini 2.5 (Flash/Pro)
- **Frontend**: Vanilla JavaScript with interactive UI
- **Deployment**: Docker + Hugging Face Spaces
- **Package Details**: See [pyproject.toml](https://huggingface.co/spaces/google/radextract/blob/main/pyproject.toml) for dependencies, metadata, and tooling
## Project Structure
```
radextract/
├── app.py                  # Flask API endpoints
├── structure_report.py     # Core structuring logic
├── sanitize.py             # Text preprocessing & normalization
├── prompt_instruction.py   # LangExtract prompt
├── cache_manager.py        # Response caching
├── static/                 # Frontend assets
└── templates/              # HTML templates
```
## Development
### Setup
```bash
git clone https://huggingface.co/spaces/google/radextract
cd radextract
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"
```
### Code Quality
```bash
# Format code
pyink . && isort .
# Type checking
mypy . --ignore-missing-imports
# Run tests
pytest
```
### Docker
```bash
# Build and run
docker build -t radextract .
docker run -p 7870:7870 --env-file env.list radextract
```
## License
Apache License 2.0 - see [LICENSE](LICENSE) for details.
## Related Projects
- **[LangExtract](https://github.com/google/langextract)**: Core NLP library
---
**Built for the medical AI community** | **Hosted on Hugging Face Spaces**
## Disclaimer
This is not an officially supported Google product. If you use RadExtract or LangExtract in production or publications, please cite accordingly and acknowledge usage. Use is subject to the [Apache 2.0 License](LICENSE). For health-related applications, use of LangExtract is also subject to the [Health AI Developer Foundations Terms of Use](https://developers.google.com/health-ai-foundations/terms).