---
license: mit
---

# 🧠 AlphaMed

This is the official model checkpoint for the paper:
**[AlphaMed: Incentivizing Medical Reasoning with minimalist Rule-Based RL](https://www.arxiv.org/abs/2505.17952)**

AlphaMed is a medical large language model trained **without supervised fine-tuning on chain-of-thought (CoT) data**, relying solely on rule-based reinforcement learning to elicit step-by-step reasoning on complex medical tasks.

## 🚀 Usage

To use the model, format your input prompt as:

> **Question:** [your medical question here]
> **Please reason step by step, and put the final answer in \boxed{}**

### 🔬 Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load model and tokenizer
model_id = "che111/AlphaMed-3B-instruct-rl"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Format the question using the required prompt template
prompt = (
    "Question: A 45-year-old patient presents with chest pain radiating to the left arm "
    "and elevated troponin levels. What is the most likely diagnosis?\n"
    "Please reason step by step, and put the final answer in \\boxed{}"
)

# Generate output (greedy decoding)
output = pipe(prompt, max_new_tokens=8196, do_sample=False)[0]["generated_text"]
print(output)
```
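Since the model is prompted to place its final answer in `\boxed{}`, you will usually want to pull that answer out of the generated text programmatically. A minimal helper might look like the sketch below (this extraction function is not part of the official release, and the regex assumes the boxed answer contains no nested braces):

```python
import re

def extract_boxed_answer(text):
    """Return the contents of the last \\boxed{...} in the model output, or None."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1] if matches else None

sample = "Step 1: ... Therefore the final answer is \\boxed{acute myocardial infarction}."
print(extract_boxed_answer(sample))  # acute myocardial infarction
```

Taking the *last* match is a small safeguard in case the model mentions `\boxed{}` while reasoning before emitting its final answer.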