Instructions to use Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF",
    filename="TomGrc_FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf",
)
output = llm(
    "Once upon a time,",
    max_tokens=512,
    echo=True,
)
print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
# Run inference directly in the terminal:
llama-cli -hf Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
# Run inference directly in the terminal:
llama-cli -hf Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
# Run inference directly in the terminal:
./llama-cli -hf Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
Use Docker
docker model run hf.co/Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
- LM Studio
- Jan
- Ollama
How to use Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF with Ollama:
ollama run hf.co/Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
- Unsloth Studio
How to use Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF to start chatting
- Docker Model Runner
How to use Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF with Docker Model Runner:
docker model run hf.co/Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
- Lemonade
How to use Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull Nexesenex/TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF:Q4_K_S
Run and chat with the model
lemonade run user.TomGrc_FusionNet_7Bx2_MoE_v0.1-iMat.GGUF-Q4_K_S
List all available models
lemonade list
GGUF quants with iMatrix for: https://huggingface.co/TomGrc/FusionNet_7Bx2_MoE_v0.1
This is the second version of a model that benchmarks strongly, but without excessive overfitting: its perplexity is a bit high, yet it still drops at longer context before stabilizing, which means the model is genuinely usable. As far as I know, it is the smartest compromise among the amateur 7Bx2 Mistral MoE models as of February 1, 2024.
llama.cpp benchmarks:
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,Hellaswag,89,400,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,Hellaswag_Bin,85,400,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,Arc-Challenge,61.20401338,,299,2024-02-03 05:40:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,Arc-Easy,76.31578947,,570,2024-02-03 05:40:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,MMLU,44.40433213,,277,2024-02-03 05:40:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,Truthful-QA,51.77478580,,817,2024-02-03 05:40:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,Winogrande,83.5833,,1267,2024-02-03 05:40:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,wikitext,6.7506,512,512,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,wikitext,5.4589,4096,4096,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,wikitext,5.3443,6144,6144,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,wikitext,5.5125,7168,7168,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,wikitext,5.2504,8192,8192,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,wikitext,8.9976,10240,10240,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,wikitext,43.9996,12288,12288,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
- FusionNet_7Bx2_MoE_v0.1-b1924-Q8_0.gguf,-,wikitext,780.5238,16384,16384,2024-02-03 00:00:00,,MOE_14b,Mistral_v0.2,8192,,,GGUF,TomGrc,Nexesenex,
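To make the perplexity trend in the wikitext rows easier to read, here is a minimal Python sketch that finds the context size with the lowest perplexity and shows how fast perplexity grows past it. The values are copied from the rows above. The helper names (`best_context`, `degradation_beyond`) are hypothetical, and treating the `8192` field in each row as the model's native context length is my assumption, since the rows carry no column headers.

```python
# (context size, perplexity) pairs copied from the wikitext rows above.
WIKITEXT_PPL = [
    (512, 6.7506),
    (4096, 5.4589),
    (6144, 5.3443),
    (7168, 5.5125),
    (8192, 5.2504),
    (10240, 8.9976),
    (12288, 43.9996),
    (16384, 780.5238),
]

# Assumption: the 8192 field repeated in every row is the native context length.
NATIVE_CONTEXT = 8192


def best_context(rows):
    """Return the (context, perplexity) pair with the lowest perplexity."""
    return min(rows, key=lambda row: row[1])


def degradation_beyond(rows, limit):
    """Map each context above `limit` to its perplexity ratio versus the value at `limit`."""
    baseline = dict(rows)[limit]
    return {ctx: ppl / baseline for ctx, ppl in rows if ctx > limit}


if __name__ == "__main__":
    ctx, ppl = best_context(WIKITEXT_PPL)
    print(f"lowest perplexity: {ppl} at context {ctx}")
    for ctx, ratio in degradation_beyond(WIKITEXT_PPL, NATIVE_CONTEXT).items():
        print(f"context {ctx}: {ratio:.1f}x the perplexity at {NATIVE_CONTEXT}")
```

On these numbers the minimum falls at the 8192 context, and every longer context tested is worse, so staying at or below 8192 looks like the safe choice unless some context-extension technique is applied.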