---
title: Text Extractor
emoji: 👀
colorFrom: gray
colorTo: indigo
sdk: gradio
sdk_version: 6.0.2
app_file: app.py
pinned: false
license: mit
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
___________
🧠 OCR Text Extractor + Summarizer
An AI-powered tool that extracts text from images using Tesseract OCR and then summarizes it using a transformer model.
Upload any image (screenshots, photos, scanned documents, notes) → get clean extracted text + an AI summary.
🚀 Features
📤 Upload an image with text
🔎 Extracts text using Tesseract OCR
✨ Summarizes extracted text using HuggingFace transformers
⚡ Fast, simple Gradio UI
🛠️ Works on CPU, no GPU required
🧩 How it Works
Image is processed with Tesseract OCR
Extracted text is cleaned
Text is fed into a pretrained summarization model
The summary is displayed in the Gradio UI alongside the extracted text
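The steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of app.py: the function names (`extract_text`, `clean_text`, `summarize`, `run_pipeline`), the cleaning rules, and the model `sshleifer/distilbart-cnn-12-6` are all assumptions, since this README does not say which model or cleaning logic the Space uses. It also assumes the `summarization` pipeline task as provided by transformers 4.x.

```python
import re


def extract_text(image):
    """Run Tesseract OCR on a PIL image (requires the tesseract binary)."""
    import pytesseract  # imported lazily so the pure helpers work without it

    return pytesseract.image_to_string(image)


def clean_text(raw):
    """Hypothetical cleaning step: drop blank lines and collapse whitespace."""
    lines = (re.sub(r"\s+", " ", line).strip() for line in raw.splitlines())
    return " ".join(line for line in lines if line)


def summarize(text, model_name="sshleifer/distilbart-cnn-12-6"):
    """Summarize text on CPU; the model choice here is an assumption."""
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_name, device=-1)  # -1 = CPU
    result = summarizer(text, max_length=120, min_length=20, truncation=True)
    return result[0]["summary_text"]


def run_pipeline(image):
    """OCR -> clean -> summarize; returns (extracted_text, summary)."""
    text = clean_text(extract_text(image))
    if not text:
        return "", "No text found in the image."
    return text, summarize(text)
```

Loading the model inside `summarize` keeps the sketch short; a real app would more likely create the pipeline once at startup so each request does not reload the weights.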
πŸ—‚οΈ Project Structure
β”œβ”€β”€ app.py
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ packages.txt
└── README.md
📦 Dependencies
Python packages (requirements.txt)
gradio
pillow
pytesseract
transformers
torch
tesseract
System packages (packages.txt)
tesseract-ocr
tesseract-ocr-eng
These install the Tesseract binary and its English language data on HuggingFace Spaces. pytesseract is only a Python wrapper, so it needs them at runtime.
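Because the Python packages and the Tesseract binary are installed separately, a quick environment check can save some debugging. The following is a small sketch, not part of this repository: the `check_environment` helper and the `PYTHON_MODULES` list are hypothetical names, and the list uses import names (pillow is imported as `PIL`).

```python
import importlib.util
import shutil

# Import names for the Python packages listed above (pillow is imported as "PIL").
PYTHON_MODULES = ["gradio", "PIL", "pytesseract", "transformers", "torch"]


def check_environment(modules=PYTHON_MODULES, binary="tesseract"):
    """Return a dict mapping each dependency to True if it can be found."""
    status = {name: importlib.util.find_spec(name) is not None for name in modules}
    # pytesseract shells out to the tesseract executable, so it must be on PATH.
    status[f"{binary} (binary)"] = shutil.which(binary) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{'OK     ' if ok else 'MISSING'} {name}")
```

If the binary is reported missing while pytesseract is importable, OCR calls will fail at runtime even though `pip install` succeeded.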
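▶️ Running Locally
Install the Tesseract system packages listed above first (on Debian/Ubuntu: sudo apt install tesseract-ocr tesseract-ocr-eng); packages.txt is only applied automatically on HuggingFace Spaces.
pip install -r requirements.txt
python app.py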
📸 Demo
Just upload an image → click Submit → done!
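For readers curious how an upload-and-submit flow like this is typically wired in Gradio, here is a hedged sketch. It is not the Space's actual app.py: `make_callback` and `build_demo` are hypothetical helpers, and the OCR and summarization functions are passed in as parameters so the UI wiring can be read (and tested) on its own.

```python
def make_callback(ocr, summarize):
    """Build the Submit handler from an OCR function and a summarizer function."""

    def process(image):
        # Gradio passes None when Submit is clicked without an upload.
        if image is None:
            return "", "Please upload an image first."
        text = ocr(image).strip()
        if not text:
            return "", "No text detected."
        return text, summarize(text)

    return process


def build_demo(process):
    """Create a Gradio Interface with one image input and two text outputs."""
    import gradio as gr  # imported lazily; only needed when building the UI

    return gr.Interface(
        fn=process,
        inputs=gr.Image(type="pil", label="Image"),
        outputs=[gr.Textbox(label="Extracted text"), gr.Textbox(label="Summary")],
        title="OCR Text Extractor + Summarizer",
    )
```

A local run would then look something like `build_demo(make_callback(my_ocr, my_summarizer)).launch()`, where `my_ocr` and `my_summarizer` stand in for whatever functions the app defines.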
🙌 Acknowledgements
Tesseract OCR
HuggingFace Transformers
Gradio for UI
🔗 Try the live Space
👉 https://huggingface.co/spaces/prans-cs55/text_extractor