bwshen-mi committed · verified
Commit f25fa7e · 1 Parent(s): dd63325

Update README.md

Files changed (1): README.md +17 -4
README.md CHANGED
@@ -202,8 +202,6 @@ To support high-throughput RL training for large-scale MoE models, we implemente
 
 MiMo-V2-Flash supports FP8 mixed precision inference. We recommend using **SGLang** for optimal performance.
 
-Usage Recommendations: we recommend setting the sampling parameters to `temprature=0.8, top_p=0.95`.
-
 ### Quick Start with SGLang
 
 ```bash
@@ -256,8 +254,7 @@ curl -i http://localhost:9001/v1/chat/completions \
 
 ### Notifications
 
-> [!IMPORTANT]
-> In the thinking mode with multi-turn tool calls, the model returns a `reasoning_content` field alongside `tool_calls`. To continue the conversation, the user must persist all history `reasoning_content` in the `messages` array of each subsequent request.
+#### 1. System prompt
 
 > [!IMPORTANT]
 > The following system prompts are **HIGHLY** recommended; please choose either the English or the Chinese version.
@@ -278,6 +275,22 @@ Chinese
 今天的日期:{date} {week},你的知识截止日期是2024年12月。
 ```
 
+#### 2. Sampling parameters
+
+> [!IMPORTANT]
+> Recommended sampling parameters:
+>
+> `top_p=0.95`
+>
+> `temperature=0.8` for math, writing, web-dev
+>
+> `temperature=0.3` for agentic tasks (e.g., vibe-coding, tool-use)
+
+#### 3. Tool-use practice
+
+> [!IMPORTANT]
+> In thinking mode with multi-turn tool calls, the model returns a `reasoning_content` field alongside `tool_calls`. To continue the conversation, the user must persist all historical `reasoning_content` in the `messages` array of each subsequent request.
+
 -----
 
 ## 7. Citation
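The per-task sampling recommendations in this commit can be sketched as a small request-builder. The helper names and the `MiMo-V2-Flash` model id below are hypothetical; only the parameter values (`top_p=0.95`, `temperature=0.8` vs. `0.3`) and the OpenAI-style `/v1/chat/completions` payload shape come from the README.

```python
# Hypothetical helper: pick the sampling parameters this commit recommends,
# based on the kind of task being run.
AGENTIC_TASKS = {"vibe-coding", "tool-use"}  # the commit's own examples


def sampling_params(task: str) -> dict:
    """temperature=0.3 for agentic tasks, 0.8 otherwise; top_p=0.95 always."""
    temperature = 0.3 if task in AGENTIC_TASKS else 0.8
    return {"temperature": temperature, "top_p": 0.95}


def build_request(task: str, messages: list) -> dict:
    # Payload shape of an OpenAI-compatible /v1/chat/completions request,
    # matching the curl example the README points at (localhost:9001).
    return {"model": "MiMo-V2-Flash", "messages": messages, **sampling_params(task)}
```

Posting the resulting payload to the SGLang endpoint is unchanged from the README's curl example.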
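The tool-use note (persisting `reasoning_content` across turns) amounts to copying the field back into the `messages` array verbatim. A minimal sketch with hypothetical helper names; the `reasoning_content`, `tool_calls`, and `role: "tool"` message shapes follow the OpenAI-style chat API used above.

```python
def persist_assistant_turn(messages: list, assistant_message: dict) -> None:
    # Keep reasoning_content (and tool_calls) in the history so the model
    # sees its prior reasoning on the next request, as required in thinking
    # mode with multi-turn tool calls.
    messages.append({
        "role": "assistant",
        "content": assistant_message.get("content"),
        "reasoning_content": assistant_message.get("reasoning_content"),
        "tool_calls": assistant_message.get("tool_calls"),
    })


def persist_tool_result(messages: list, tool_call_id: str, result: str) -> None:
    # Standard tool-result message, appended after the assistant's tool call.
    messages.append({"role": "tool", "tool_call_id": tool_call_id, "content": result})
```

Each subsequent request then sends the full `messages` list, so every historical `reasoning_content` rides along automatically.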