Update README.md
README.md
CHANGED
@@ -6,7 +6,7 @@ datasets:
 
 
 
-#
+# FlashResearch-4B-Thinking
 
 <img src='cheap.png' width='700'>
 
@@ -19,7 +19,7 @@ datasets:
 * **Base**: Qwen 4B (dense)
 * **Teacher**: Tongyi DeepResearch 30B A3B (MoE)
 * **Method**: SFT distillation on **33k** curated deep-research examples
-* **Dataset**: [`
+* **Dataset**: [`flashresearch/FlashResearch-DS-33k`](https://huggingface.co/datasets/cheapresearch/CheapResearch-DS-33k)
 * **Primary Use**: Fast, low-cost **DeepResearch** agent runs (browsing, multi-step reasoning, source-grounded answers)
 
 ## Evaluation
@@ -29,7 +29,7 @@ datasets:
 
 ## Training Data
 
-* **Primary dataset**: [`
+* **Primary dataset**: [`flashresearch/FlashResearch-DS-33k`](https://huggingface.co/datasets/flashresearch/FlashResearch-DS-33k)
 
 ## Inference with Alibaba-NLP/DeepResearch (Recommended)
 
@@ -50,7 +50,7 @@ pip install -e . # or pip install -r requirements.txt if provided
 Edit the config to add this model
 
 ```bash
-MODEL_PATH=
+MODEL_PATH=flashresearch/FlashResearch-4B-Thinking
 ```
 
 ### Hardware notes
@@ -76,7 +76,7 @@ If you use this model, please cite:
 title = {CheapResearch 4B Thinking},
 author = {Artem Y.},
 year = {2025},
-url = {https://huggingface.co/
+url = {https://huggingface.co/flashresearch/FlashResearch-4B-Thinking}
 }
 ```
 
@@ -87,7 +87,7 @@ And the dataset:
 title = {CheapResearch-DS-33k},
 author = {Artem Y.},
 year = {2025},
-url = {https://huggingface.co/datasets/
+url = {https://huggingface.co/datasets/flashresearch/FlashResearch-DS-33k}
 }
 ```
 
@@ -119,11 +119,11 @@ tags:
 - vllm
 - cheapresearch
 datasets:
--
+- flashresearch/FlashResearch-DS-33k
 base_model:
 - Qwen/Qwen3-4B-Thinking-2507
 model-index:
-- name:
+- name: FlashResearch-4B-Thinking
 results: []
 ---
 ```