---
license: apache-2.0
tags:
- text-generation
- language-model
- causal-lm
- cosmicfish
- 120m
- transformer
- rope
- gqa
- swiglu
- rmsnorm
language: en
datasets:
- CosmicSet-1.0
- akkiisfrommars/TreeCorpusCleanedmodel
model_type: CosmicFish
pipeline_tag: text-generation
---

# CosmicFish-120M

A 120M-parameter language model with a modern transformer architecture, developed by Mistyoz AI.

## Quick Start

**The easiest way to chat with CosmicFish is to use our `chat.py` script:**

```bash
# Download the chat script from this repository
wget https://huggingface.co/MistyozAI/CosmicFish-120M/resolve/main/chat.py

# Install dependencies
pip install transformers huggingface-hub termcolor

# Run the chat interface (automatically downloads model)
python chat.py
```

The `chat.py` script handles model loading and generation, and provides the best chat experience with live streaming, a repetition penalty, and conversation commands.

## Model Details

- **Parameters**: 121M
- **Architecture**: CosmicFish (RoPE, GQA, SwiGLU, RMSNorm)
- **Context Length**: 512 tokens
- **Vocabulary**: 50,257 tokens
- **Training Data**: CosmicSet 1.0
- **Developer**: Mistyoz AI
- **Repository**: MistyozAI/CosmicFish-120M

## Usage

### Installation

```bash
# torch is required by the Python examples below
pip install torch transformers huggingface-hub termcolor
```

### Quick Chat Interface

```python
from transformers import GPT2Tokenizer
from huggingface_hub import snapshot_download
import torch
import json
import os

# Download model from Hugging Face Hub
cache_dir = snapshot_download(repo_id="MistyozAI/CosmicFish-120M")

# Load tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# Load config
with open(os.path.join(cache_dir, "config.json")) as f:
    config_dict = json.load(f)

# Load model weights
state_dict = torch.load(os.path.join(cache_dir, "pytorch_model.bin"), map_location="cpu")

# Note: Full model class available in the repository.
# The examples below assume `model` is an instance of that class with these weights loaded.
print("Model downloaded and ready for use!")
```

### Advanced Generation with Repetition Penalty

```python
import torch

def generate_with_repetition_penalty(model, tokenizer, prompt, max_tokens=100,
                                     temperature=0.7, penalty=1.2, max_context=512):
    input_ids = torch.tensor(tokenizer.encode(prompt)).unsqueeze(0)
    generated = input_ids.clone()

    for _ in range(max_tokens):
        with torch.no_grad():
            # Feed at most the last max_context tokens (context length is 512)
            logits, _ = model(generated[:, -max_context:])
            next_token_logits = logits[:, -1, :] / temperature

        # Apply repetition penalty to every token already in the sequence
        if penalty > 1.0:
            for token_id in set(generated[0].tolist()):
                if next_token_logits[0, token_id] > 0:
                    next_token_logits[0, token_id] /= penalty
                else:
                    next_token_logits[0, token_id] *= penalty

        # Sample the next token from the adjusted distribution
        probs = torch.nn.functional.softmax(next_token_logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)

        if next_token.item() == tokenizer.eos_token_id:
            break

        generated = torch.cat([generated, next_token], dim=1)

    return tokenizer.decode(generated[0], skip_special_tokens=True)
```

### Chat Interface

```python
# Requires `model`, `tokenizer`, and generate_with_repetition_penalty from the sections above
def chat_with_model():
    conversation = []

    while True:
        user_input = input("You: ")
        if user_input.lower() in ['quit', 'exit']:
            break

        # Rebuild the prompt from the full conversation history
        context = "Below is a conversation between a human and an AI assistant.\n\n"
        for human, ai in conversation:
            context += f"Human: {human}\nAssistant: {ai}\n\n"
        context += f"Human: {user_input}\nAssistant:"

        # Generate response with repetition penalty
        response = generate_with_repetition_penalty(
            model, tokenizer, context,
            max_tokens=150, temperature=0.7, penalty=1.2
        )

        # Extract just the assistant's response
        response = response.split("Assistant:")[-1].split('\n')[0].strip()
        print(f"CosmicFish: {response}")

        conversation.append((user_input, response))

chat_with_model()
```

## Architecture

CosmicFish uses several modern improvements over the standard transformer:

- **RoPE (Rotary Position Embeddings)**: Encodes position by rotating query and key vectors, rather than adding absolute position embeddings
- **GQA (Grouped-Query Attention)**: Reduces memory usage with 4 query groups
- **SwiGLU**: Gated activation function, more effective than ReLU/GELU
- **RMSNorm**: Simpler, more stable normalization than LayerNorm

## Training

- **Dataset**: CosmicSet 1.0
- **Sequence Length**: 512 tokens
- **Training Steps**: ~300K iterations
- **Hardware**: 1x Nvidia A40

## Performance

- **Speed**: Varies by hardware (not benchmarked)
- **Memory**: ~500MB RAM (FP16)
- **File Size**: 243MB

## Limitations

- Small model size (120M parameters) may produce less accurate responses
- 512-token context limit
- Knowledge is limited to the training data cutoff
- May generate incorrect information
- Cannot browse the internet or access real-time data

## License

Apache 2.0 - see LICENSE file.

## Credit

If you use CosmicFish-120M, please credit Mistyoz AI.
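
## Architecture Sketch

The Architecture section names RMSNorm and SwiGLU without showing what they compute. The following is a minimal, dependency-free sketch of both operations on plain Python lists. It is an illustration only and is not the CosmicFish implementation, which lives in the repository's model class. All names here (`rms_norm`, `silu`, `matvec`, `swiglu`, and the weight arguments) are hypothetical and chosen for this example.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide x by its root-mean-square, then scale by a learned weight.
    Unlike LayerNorm, no mean is subtracted and no bias is added."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def matvec(matrix, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]

def swiglu(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward: down( silu(gate(x)) * up(x) ), biases omitted."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)

# Example with toy 2-dimensional weights (assumed values, not model weights)
x = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
identity = [[1.0, 0.0], [0.0, 1.0]]
y = swiglu(x, w_gate=identity, w_up=identity, w_down=[[1.0, 1.0]])
print(x, y)
```

In a real model these operations run on batched tensors, and the three SwiGLU projections are learned linear layers.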