🧠 Anime-Gen-Llama-2-7B
Anime-Gen-Llama-2-7B is a LoRA fine-tuned version of meta-llama/Llama-2-7b-hf, trained on a custom anime/manga-style dataset to generate structured short stories and panel descriptions from prompts. It was trained with the PEFT library and bitsandbytes for memory-efficient fine-tuning.
Model Details
Model Description
- Developer: Vignesh Ramaswamy Balasundaram
- Finetuned From: meta-llama/Llama-2-7b-hf
- Language: English
- License: Meta LLaMA 2 community license
- Task Type: Causal Language Modeling
- LoRA Adapter: Yes (via peft)
- Model Size: 7B parameters (base model)
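To give a sense of how small a LoRA adapter is next to the 7B base model, here is a rough arithmetic sketch. The hidden size (4096) and layer count (32) are those of Llama-2-7B; the rank and the number of adapted matrices per layer are hypothetical values chosen for illustration, since this card does not state the adapter's actual configuration.

```python
def lora_param_count(d_in, d_out, rank, matrices_per_layer, num_layers):
    """Count trainable parameters added by LoRA.

    Each adapted weight matrix of shape (d_out, d_in) gains two low-rank
    factors: A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    per_matrix = rank * (d_in + d_out)
    return per_matrix * matrices_per_layer * num_layers


HIDDEN_SIZE = 4096             # Llama-2-7B hidden dimension
NUM_LAYERS = 32                # Llama-2-7B transformer layers
ASSUMED_RANK = 8               # assumption: not stated in this card
ASSUMED_TARGETS_PER_LAYER = 2  # assumption: e.g. two attention projections

trainable = lora_param_count(
    HIDDEN_SIZE, HIDDEN_SIZE, ASSUMED_RANK, ASSUMED_TARGETS_PER_LAYER, NUM_LAYERS
)
print(f"{trainable:,} trainable LoRA parameters")  # 4,194,304
```

Under these assumed settings the adapter holds about 4.2 million parameters, well under 1% of the base model. The real figure depends on the rank and target modules recorded in the adapter's own configuration.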
Model Sources
Uses
Direct Use
- Text-to-Anime panel generation
- Short manga-style storytelling
- Prompt-driven narrative generation
Out-of-Scope Use
- Legal, financial, or medical decision-making
- Real-time conversation agents
- General-purpose dialogue
Bias, Risks, and Limitations
Limitations
- Trained on a small, domain-specific dataset (anime-style stories)
- May hallucinate character names or plots
- Not guaranteed to follow strict story structure
Recommendations
- Users should validate outputs for correctness and coherence before using them in production.
- Consider using the model as a creative writing aid rather than as a source of factual content.
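One lightweight way to act on the validation recommendation is to check that a generated story contains sequentially numbered panels. The sketch below assumes outputs mark panels with lines beginning "Panel N:", as in the prompt format shown in this card; the helper names are hypothetical and the check covers structure only, not narrative quality.

```python
import re
from typing import List

# Matches panel markers such as "Panel 1:" at the start of a line
PANEL_RE = re.compile(r"^Panel (\d+):", re.MULTILINE)


def panel_numbers(text: str) -> List[int]:
    """Return the panel numbers in the order they appear in the text."""
    return [int(n) for n in PANEL_RE.findall(text)]


def has_sequential_panels(text: str) -> bool:
    """True if the text has at least one panel and they are numbered 1, 2, 3, ..."""
    numbers = panel_numbers(text)
    return bool(numbers) and numbers == list(range(1, len(numbers) + 1))


sample = (
    "Title: The Final Duel\n"
    "Characters: Yuki, Daichi\n"
    "Panel 1: Yuki draws her sword.\n"
    "Panel 2: Daichi steps back."
)
print(has_sequential_panels(sample))  # True
```

Outputs that fail the check can be regenerated or flagged for manual review.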
How to Get Started
# Requires: pip install transformers peft accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel, PeftConfig

adapter_id = "vignesh0007/Anime-Gen-Llama-2-7B"

# Read the adapter config to find the base model it was trained on
config = PeftConfig.from_pretrained(adapter_id)

# Load the base model; device_map="auto" relies on the accelerate package
base_model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    device_map="auto",
)

# Attach the LoRA adapter weights to the base model
model = PeftModel.from_pretrained(base_model, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

# Prompt with a title, the characters, and the first panel marker
prompt = "Title: The Final Duel\nCharacters: Yuki, Daichi\nPanel 1:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 100 new tokens
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.8, top_p=0.95, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
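The example prompt above follows a "Title / Characters / Panel 1:" layout. A small helper can build prompts in that shape from structured inputs. This is a minimal sketch: `build_prompt` is a hypothetical name, and the layout is inferred from the single example prompt in this card rather than from a documented template.

```python
from typing import Sequence


def build_prompt(title: str, characters: Sequence[str], first_panel: int = 1) -> str:
    """Format a story prompt in the layout used by the example above."""
    return (
        f"Title: {title}\n"
        f"Characters: {', '.join(characters)}\n"
        f"Panel {first_panel}:"
    )


prompt = build_prompt("The Final Duel", ["Yuki", "Daichi"])
print(prompt)
```

The returned string can be passed to the tokenizer in place of the hard-coded prompt.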