rally-2b-rp

Browser-oriented ONNX export of a Gemma 4 Heretic checkpoint packaged for WebGPU / Transformers.js.

Capabilities

  • Supported inputs: text, image, audio, video

Version Notes

  • This repo uses the enhanced browser v2 multimodal contract, even though the repo name does not carry a -v2 suffix.
  • It supports audio and video inputs in addition to text and image.
  • There is no separate -v2 sibling for this tuned repo; this package itself is the multimodal variant.

Provenance

  • Source model: /home/jovyan/work/heretic-to-onnx/build/phala_gpu_tee/models/rally-2b-rp-merged
  • Base model for inherited processor assets: google/gemma-4-E2B-it
  • Architecture family: gemma4_conditional_generation
  • Expected architecture: Gemma4ForConditionalGeneration
  • Target dtype: q4f16
  • Target device: webgpu

Expected ONNX Sessions

  • vision_encoder_q4f16.onnx
  • audio_encoder_q4f16.onnx
  • embed_tokens_q4f16.onnx
  • decoder_model_merged_q4f16.onnx
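
As a quick sanity check before loading, the four file names above can be derived from the session names and the target dtype. The sketch below is illustrative only: expectedSessionFiles and missingSessionFiles are hypothetical helpers (not part of Transformers.js), and the naming pattern <session>_<dtype>.onnx is inferred from the list above rather than documented by the exporter.

```typescript
// Session names taken from the list above; the helpers are hypothetical.
const SESSIONS = [
  "vision_encoder",
  "audio_encoder",
  "embed_tokens",
  "decoder_model_merged",
] as const;

// Build the expected file names for a given dtype suffix (e.g. "q4f16").
function expectedSessionFiles(dtype: string): string[] {
  return SESSIONS.map((name) => `${name}_${dtype}.onnx`);
}

// Return the expected files that are absent from a repo file listing.
// Paths are compared by base name, so "onnx/foo.onnx" matches "foo.onnx".
function missingSessionFiles(listing: string[], dtype: string): string[] {
  const present = new Set(listing.map((p) => p.split("/").pop() ?? p));
  return expectedSessionFiles(dtype).filter((f) => !present.has(f));
}
```

Calling missingSessionFiles with the repo's file listing and "q4f16" returns an empty array when all four sessions are present.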

Usage

Load this repo with Transformers.js on the WebGPU backend, using the transformers.js_config metadata shipped with the model.
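
The sketch below shows one way to prepare for that load. It is a minimal illustration under stated assumptions: buildLoadOptions and hasWebGPU are hypothetical helpers, and the options object only mirrors the device and dtype option names that Transformers.js loaders accept. If the shipped transformers.js_config already sets these values, passing them explicitly may be redundant.

```typescript
// Hypothetical helper type; `device` and `dtype` mirror Transformers.js
// load-option names, everything else here is illustrative.
type LoadOptions = {
  device: "webgpu";
  dtype: Record<string, string>;
};

// Build load options matching this card: WebGPU device and the same
// dtype (q4f16 by default) for each of the four expected sessions.
function buildLoadOptions(dtype: string = "q4f16"): LoadOptions {
  const sessions = [
    "vision_encoder",
    "audio_encoder",
    "embed_tokens",
    "decoder_model_merged",
  ];
  const perSession: Record<string, string> = {};
  for (const name of sessions) perSession[name] = dtype;
  return { device: "webgpu", dtype: perSession };
}

// Feature-detect WebGPU on a navigator-like object.
// In a browser, pass the global `navigator`.
function hasWebGPU(nav: unknown): boolean {
  return typeof nav === "object" && nav !== null && "gpu" in nav;
}
```

In a browser, the result of buildLoadOptions would typically be passed as the second argument to a from_pretrained call alongside the repo id. The model class to use for this architecture depends on the installed Transformers.js version and is not specified in this card. Note that hasWebGPU only checks that the API is exposed; requesting an adapter can still fail on unsupported hardware.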
