# Machine Learning in Elixir: End-to-End with Bumblebee + Hugging Face
<!-- livebook:{"persist_outputs":true} -->
## Overview
## Skills
- `hf_cli.md`: Hugging Face CLI usage
- `hf_jobs.md`: Running workloads on HF Jobs
- `training_trl.md`: TRL model training
- `hf_dataset_viewer.md`: Dataset Viewer API
- `gradio.md`: Gradio UI integration
- *(Full catalog at https://skills.sh/huggingface/skills)*
This Livebook is a complete end-to-end ML template built on the Elixir ML ecosystem
from _Machine Learning in Elixir_ by Sean Moriarity, with **Bumblebee** as the core
integration layer to the **Hugging Face Hub**.
**What we cover:**
| Section | Library | Task |
|---------|---------|------|
| Foundations | `Nx` | Tensors, gradients, JIT compilation |
| Pre-trained NLP | `Bumblebee` | Fill-mask, sentiment, NER, zero-shot |
| Pre-trained Vision | `Bumblebee` | Image classification (ViT, ResNet) |
| Audio | `Bumblebee` | Speech-to-text (Whisper) |
| Generative AI | `Bumblebee` | Text generation (GPT-2) & Stable Diffusion |
| Embeddings | `Bumblebee` | Sentence similarity search |
| Custom Training | `Axon` | Build & train from scratch |
| Fine-tuning | `Bumblebee` | Fine-tune a pre-trained model on labeled data |
| Serving | `Nx.Serving` | Production batched inference |
| Deployment | `Phoenix` | LiveView integration pattern |
| Interactive UI | `Kino` | Live input forms & charts |
---
## Section 0: Install & Configure
```elixir
Mix.install([
{:nx, "~> 0.10"},
{:axon, "~> 0.7"},
{:exla, "~> 0.10"},
{:bumblebee, "~> 0.6"},
{:kino, "~> 0.15"},
{:kino_vega_lite, "~> 0.1"},
{:vega_lite, "~> 0.1"},
{:stb_image, "~> 0.6"},
{:req, "~> 0.5"}
])
Nx.global_default_backend(EXLA.Backend)
IO.puts("Nx version: #{Application.spec(:nx, :vsn)}")
IO.puts("Axon version: #{Application.spec(:axon, :vsn)}")
IO.puts("EXLA backend: #{inspect(Nx.default_backend())}")
IO.puts("Bumblebee loaded: #{Code.ensure_loaded?(Bumblebee)}")
IO.puts("Cache dir: #{Bumblebee.cache_dir()}")
```
---
## Section 1: Nx Foundations
Before diving into Bumblebee, let's ground ourselves in Nx, the numerical
backbone that every Elixir ML library builds on.
### 1.1 Tensors
```elixir
# Scalars, vectors, matrices, and higher-rank tensors
scalar = Nx.tensor(3.14)
vector = Nx.tensor([1.0, 2.0, 3.0])
matrix = Nx.tensor([[1, 2, 3], [4, 5, 6]])
cube = Nx.iota({2, 3, 4})
IO.puts("scalar shape=#{inspect(Nx.shape(scalar))} type=#{inspect(Nx.type(scalar))}")
IO.puts("vector shape=#{inspect(Nx.shape(vector))} type=#{inspect(Nx.type(vector))}")
IO.puts("matrix shape=#{inspect(Nx.shape(matrix))} type=#{inspect(Nx.type(matrix))}")
IO.puts("cube shape=#{inspect(Nx.shape(cube))} type=#{inspect(Nx.type(cube))}")
{scalar, vector, matrix}
```
### 1.2 Operations & Broadcasting
```elixir
a = Nx.tensor([1.0, 2.0, 3.0])
b = Nx.tensor([10.0, 20.0, 30.0])
# Element-wise
IO.puts("add: #{inspect(Nx.add(a, b))}")
IO.puts("multiply: #{inspect(Nx.multiply(a, b))}")
IO.puts("pow: #{inspect(Nx.pow(a, 2))}")
# Reductions
IO.puts("sum: #{Nx.to_number(Nx.sum(a))}")
IO.puts("mean: #{Nx.to_number(Nx.mean(a))}")
IO.puts("std: #{Nx.to_number(Nx.standard_deviation(a))}")
# Dot product
IO.puts("dot: #{Nx.to_number(Nx.dot(a, b))}")
# Matrix multiply
m1 = Nx.tensor([[1.0, 2.0], [3.0, 4.0]])
m2 = Nx.tensor([[5.0, 6.0], [7.0, 8.0]])
IO.puts("matmul: #{inspect(Nx.dot(m1, m2))}")
```
### 1.3 Automatic Differentiation
This is how models learn: by computing gradients of the loss with respect to the parameters.
```elixir
defmodule AutoDiff do
  import Nx.Defn

  # f(x) = x³ + 2x² (public so it can be called from outside the module)
  defn f(x), do: Nx.pow(x, 3) + 2 * Nx.pow(x, 2)

  # Gradient of f with respect to x: f'(x) = 3x² + 4x
  defn grad_f(x), do: grad(x, fn x -> f(x) end)

  # Mean squared error between targets and predictions
  defnp mse_loss(y_true, y_pred) do
    Nx.mean(Nx.pow(y_true - y_pred, 2))
  end

  # Gradient of the MSE loss of the linear model x · w with respect to the weights w
  defn grad_mse(x, y_true, w) do
    grad(w, fn weights ->
      predictions = Nx.dot(x, weights)
      mse_loss(y_true, predictions)
    end)
  end
end
x = Nx.tensor(3.0)
IO.puts("f(3) = #{Nx.to_number(AutoDiff.f(x))}")
IO.puts("f'(3) = #{Nx.to_number(AutoDiff.grad_f(x))}")
IO.puts("expected = 3*9 + 2*2*3 = #{3 * 9 + 2 * 2 * 3}")
```
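As a sanity check on the gradient above, a central finite difference should land close to the analytic derivative f'(x) = 3x² + 4x. The sketch below uses plain Elixir floats rather than Nx; `NumericCheck` is a hypothetical helper name, and the step size `h` is an arbitrary choice.

```elixir
defmodule NumericCheck do
  # Same function as in the section above, written with plain floats
  def f(x), do: :math.pow(x, 3) + 2 * :math.pow(x, 2)

  # Analytic derivative: f'(x) = 3x^2 + 4x
  def analytic_grad(x), do: 3 * x * x + 4 * x

  # Central difference approximation: (f(x + h) - f(x - h)) / 2h
  def numeric_grad(x, h \\ 1.0e-5), do: (f(x + h) - f(x - h)) / (2 * h)
end

IO.puts("analytic f'(3) = #{NumericCheck.analytic_grad(3.0)}")
IO.puts("numeric  f'(3) = #{Float.round(NumericCheck.numeric_grad(3.0), 4)}")
```

If the two values disagree by more than a small tolerance, the function definition and its gradient have probably drifted apart.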
### 1.4 JIT Compilation
```elixir
# JIT compiles for GPU/CPU acceleration, which is critical for inference speed
defmodule FastMath do
import Nx.Defn
defn slow_sigmoid(x) do
1 / (1 + Nx.exp(-x))
end
end
# JIT-compiled version
fast_sigmoid = Nx.Defn.jit(&FastMath.slow_sigmoid/1)
input = Nx.tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])
# Benchmark (the first call also includes JIT compilation time)
{us, result} = :timer.tc(fn -> fast_sigmoid.(input) end)
IO.puts("JIT sigmoid: #{us}μs result=#{inspect(result)}")
```
---
## Section 2: Bumblebee Pre-trained NLP Models
Bumblebee loads pre-trained models from the Hugging Face Hub and wraps them
in `Nx.Serving` for production-ready batched inference.
### 2.1 Fill-Mask (BERT)
```elixir
{:ok, bert_model} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
{:ok, bert_tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})
bert_fill_mask = Bumblebee.Text.fill_mask(bert_model, bert_tokenizer)
results = Nx.Serving.run(bert_fill_mask, "Elixir is a [MASK] language.")
IO.inspect(results, label: "Fill-Mask Results")
```
### 2.2 Sentiment Analysis (DistilBERT)
```elixir
{:ok, sentiment_model} =
Bumblebee.load_model({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})
{:ok, sentiment_tokenizer} =
Bumblebee.load_tokenizer({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})
sentiment_serving = Bumblebee.Text.text_classification(
sentiment_model,
sentiment_tokenizer
)
texts = [
"Machine learning in Elixir is amazing!",
"This tutorial is boring and confusing.",
"The BEAM VM handles concurrent ML workloads well.",
"I love how functional programming simplifies ML pipelines."
]
Enum.each(texts, fn text ->
result = Nx.Serving.run(sentiment_serving, text)
IO.puts("#{inspect(result)} ← \"#{String.slice(text, 0..50)}...\"")
end)
```
### 2.3 Named Entity Recognition (BERT-NER)
```elixir
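{:ok, ner_model} = Bumblebee.load_model({:hf, "dslim/bert-base-NER"})
# dslim/bert-base-NER is a cased model, so pair it with the cased BERT tokenizer
{:ok, ner_tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-cased"})
# aggregation: :same merges adjacent tokens that share an entity label
ner_serving = Bumblebee.Text.token_classification(
ner_model,
ner_tokenizer,
aggregation: :same
)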
ner_text = "Sean Moriarity wrote Machine Learning in Elixir for Pragmatic Bookshelf. He lives in Austin, Texas."
ner_result = Nx.Serving.run(ner_serving, ner_text)
IO.puts("Input: #{ner_text}")
IO.inspect(ner_result, label: "NER Entities")
```
### 2.4 Zero-Shot Classification
No fine-tuning needed: classify arbitrary text into custom categories.
```elixir
{:ok, zs_model} = Bumblebee.load_model({:hf, "facebook/bart-large-mnli"})
{:ok, zs_tokenizer} = Bumblebee.load_tokenizer({:hf, "facebook/bart-large-mnli"})
labels = ["technology", "sports", "politics", "science", "finance"]

# The candidate labels are fixed when the serving is built
zs_serving = Bumblebee.Text.zero_shot_classification(zs_model, zs_tokenizer, labels)

article = """
Nx brings numerical computing to the BEAM, enabling machine learning
pipelines that leverage Elixir's concurrency and fault tolerance.
Bumblebee provides access to thousands of pre-trained models from
the Hugging Face Hub directly in Livebook.
"""

# The serving input is just the text to classify
zs_result = Nx.Serving.run(zs_serving, article)
IO.inspect(zs_result, label: "Zero-Shot Classification")
```
---
## Section 3: Bumblebee Vision Models
### 3.1 Image Classification (ViT / ResNet)
```elixir
{:ok, vit_model} = Bumblebee.load_model({:hf, "google/vit-base-patch16-224"})
{:ok, vit_featurizer} = Bumblebee.load_featurizer({:hf, "google/vit-base-patch16-224"})
vit_serving = Bumblebee.Vision.image_classification(
vit_model,
vit_featurizer
)
# Download a sample image
image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
image_data = Req.get!(image_url).body
# Save and load
File.write!("/tmp/sample.jpg", image_data)
{:ok, image} = StbImage.read_file("/tmp/sample.jpg")
# StbImage stores dimensions in the struct's shape as {height, width, channels}
{height, width, _channels} = image.shape
IO.puts("Image: #{width}x#{height}")
img_result = Nx.Serving.run(vit_serving, image)
IO.inspect(img_result, label: "Image Classification")
```
### 3.2 Batch Image Classification
```elixir
# Nx.Serving automatically batches multiple requests for GPU efficiency
images = [image] # In production, this would be multiple images
batch_result = Nx.Serving.run(vit_serving, images)
IO.inspect(batch_result, label: "Batch Classification")
```
---
## Section 4: Bumblebee Text Generation
### 4.1 GPT-2 Text Generation
```elixir
{:ok, gpt2_model} = Bumblebee.load_model({:hf, "openai-community/gpt2"})
{:ok, gpt2_tokenizer} = Bumblebee.load_tokenizer({:hf, "openai-community/gpt2"})
{:ok, gpt2_generation_config} = Bumblebee.load_generation_config({:hf, "openai-community/gpt2"})
gpt2_serving = Bumblebee.Text.generation(
gpt2_model,
gpt2_tokenizer,
gpt2_generation_config,
compile: [batch_size: 1, sequence_length: 64],
defn_options: [compiler: EXLA]
)
prompt = "Machine learning in Elixir is"
gen_result = Nx.Serving.run(gpt2_serving, prompt)
IO.puts("Prompt: #{prompt}")
IO.puts("Output: #{inspect(gen_result)}")
```
### 4.2 Interactive Text Generation
<!-- livebook:{"attrs":{"source":"# Interactive Text Generation\nalias Kino.Input\n\nprompt_input = Kino.Input.text(\"Prompt\", default: \"The future of ML in Elixir is\")\nmax_tokens = Kino.Input.number(\"Max tokens\", default: 50)\n\nform = Kino.Control.form(\n %{prompt: prompt_input, max_tokens: max_tokens},\n submit: \"Generate\"\n)\n\nKino.listen(form, fn %{data: %{prompt: prompt, max_tokens: max_tokens}} ->\n config = Bumblebee.configure(gpt2_generation_config, max_new_tokens: trunc(max_tokens))\n serving = Bumblebee.Text.generation(gpt2_model, gpt2_tokenizer, config)\n result = Nx.Serving.run(serving, prompt)\n Kino.Text.new(\"#{prompt}#{result.text}\")\nend)\n\nform","title":"GPT-2 Generator"},"chunks":[{"chunk":"","type":"Elixir"}],"kind":"Elixir","source_type":"cell"} -->
```elixir
prompt_input = Kino.Input.text("Prompt", default: "The future of ML in Elixir is")
max_tokens_input = Kino.Input.number("Max tokens", default: 50)

form =
  Kino.Control.form(
    %{prompt: prompt_input, max_tokens: max_tokens_input},
    submit: "Generate"
  )

# Kino.listen/2 discards the callback's return value, so render into a frame
frame = Kino.Frame.new()

Kino.listen(form, fn %{data: %{prompt: prompt, max_tokens: max_tokens}} ->
  config = Bumblebee.configure(gpt2_generation_config, max_new_tokens: trunc(max_tokens || 50))

  # Rebuilding the serving on every submit is simple but slow; fine for a demo
  serving =
    Bumblebee.Text.generation(gpt2_model, gpt2_tokenizer, config,
      defn_options: [compiler: EXLA]
    )

  # The serving returns a list of results, each holding the generated text
  %{results: [%{text: text} | _]} = Nx.Serving.run(serving, prompt)
  Kino.Frame.render(frame, Kino.Text.new("#{prompt}#{text}"))
end)

Kino.Layout.grid([form, frame])
```
---
## Section 5: Bumblebee Embeddings & Similarity
### 5.1 Sentence Embeddings
```elixir
{:ok, emb_model} = Bumblebee.load_model({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
{:ok, emb_tokenizer} = Bumblebee.load_tokenizer({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
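# Mean-pool the token states and L2-normalize, as sentence-transformers models expect
embedding_serving = Bumblebee.Text.text_embedding(
emb_model,
emb_tokenizer,
output_attribute: :hidden_state,
output_pool: :mean_pooling,
embedding_processor: :l2_norm
)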
sentences = [
"Nx provides numerical computing for Elixir",
"Axon is a neural network library built on Nx",
"Bumblebee connects Elixir to the Hugging Face Hub",
"I enjoy cooking Italian food on weekends",
"The weather forecast predicts rain tomorrow"
]
embeddings =
Enum.map(sentences, fn s ->
result = Nx.Serving.run(embedding_serving, s)
result.embedding
end)
IO.puts("Generated #{length(embeddings)} embeddings")
IO.puts("Embedding dim: #{inspect(Nx.shape(hd(embeddings)))}")
```
### 5.2 Cosine Similarity Search
```elixir
defmodule Similarity do
import Nx.Defn
defn cosine_similarity(a, b) do
a_norm = a / Nx.sqrt(Nx.sum(a * a))
b_norm = b / Nx.sqrt(Nx.sum(b * b))
Nx.sum(a_norm * b_norm)
end
def find_most_similar(query_embedding, corpus_embeddings) do
corpus_embeddings
|> Enum.map(fn emb -> Nx.to_number(cosine_similarity(query_embedding, emb)) end)
|> Enum.with_index()
|> Enum.sort_by(fn {score, _idx} -> -score end)
end
end
query = "How do I build neural networks in Elixir?"
query_emb = Nx.Serving.run(embedding_serving, query).embedding
IO.puts("Query: \"#{query}\"\n")
Similarity.find_most_similar(query_emb, embeddings)
|> Enum.each(fn {score, idx} ->
IO.puts(" #{Float.round(score, 4)} #{Enum.at(sentences, idx)}")
end)
```
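The cosine score used above is the dot product of the two vectors divided by the product of their lengths. A minimal sketch on plain Elixir lists makes the two extreme cases visible; `ListSimilarity` is a hypothetical module used only for this illustration, not part of Nx or Bumblebee.

```elixir
defmodule ListSimilarity do
  # Dot product of two equal-length lists of numbers
  def dot(a, b), do: a |> Enum.zip(b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()

  # Euclidean (L2) norm
  def norm(a), do: :math.sqrt(dot(a, a))

  # Cosine similarity: dot(a, b) / (|a| * |b|)
  def cosine(a, b), do: dot(a, b) / (norm(a) * norm(b))
end

# Vectors pointing the same way score close to 1, orthogonal vectors score 0
same_direction = ListSimilarity.cosine([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = ListSimilarity.cosine([1.0, 0.0], [0.0, 1.0])
IO.inspect({same_direction, orthogonal}, label: "cosine")
```

Because the score depends only on direction, scaling an embedding does not change its ranking in the search.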
---
## Section 5.5: Bumblebee Audio (Whisper Speech-to-Text)
Bumblebee wraps OpenAI's Whisper for speech-to-text directly in Elixir.
```elixir
{:ok, whisper_model} = Bumblebee.load_model({:hf, "openai/whisper-tiny"})
{:ok, whisper_featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-tiny"})
{:ok, whisper_tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-tiny"})
{:ok, whisper_generation_config} = Bumblebee.load_generation_config({:hf, "openai/whisper-tiny"})
whisper_serving =
  Bumblebee.Audio.speech_to_text_whisper(
    whisper_model,
    whisper_featurizer,
    whisper_tokenizer,
    whisper_generation_config
  )

# Download a sample audio file
audio_url = "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac"
audio_data = Req.get!(audio_url).body
File.write!("/tmp/sample_audio.flac", audio_data)

# Transcribe. The {:file, path} input is decoded with ffmpeg, which must be installed
whisper_result = Nx.Serving.run(whisper_serving, {:file, "/tmp/sample_audio.flac"})

# The result holds a list of chunks, each with its own text
transcription = Enum.map_join(whisper_result.chunks, & &1.text)
IO.puts("Transcription: #{transcription}")
```
### Interactive Audio Transcription
```elixir
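audio_file_input = Kino.Input.file("Upload audio (WAV/FLAC/MP3)")
audio_form = Kino.Control.form(%{file: audio_file_input}, submit: "Transcribe")

# Kino.listen/2 discards the callback's return value, so render into a frame
audio_frame = Kino.Frame.new()

Kino.listen(audio_form, fn %{data: %{file: file}} ->
  output =
    if file do
      # Resolve the upload to a local path for ffmpeg-based decoding
      path = Kino.Input.file_path(file.file_ref)
      result = Nx.Serving.run(whisper_serving, {:file, path})
      Kino.Text.new(Enum.map_join(result.chunks, & &1.text))
    else
      Kino.Text.new("Please upload an audio file.")
    end

  Kino.Frame.render(audio_frame, output)
end)

Kino.Layout.grid([audio_form, audio_frame])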
```
---
## Section 5.6: Bumblebee Stable Diffusion (Image Generation)
Generate images from text prompts using Stable Diffusion, all within Elixir.
> **Note:** This section requires a GPU with 4GB+ VRAM. On CPU it will be very slow.
```elixir
repository_id = "CompVis/stable-diffusion-v1-4"

{:ok, sd_tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/clip-vit-large-patch14"})
{:ok, sd_clip} = Bumblebee.load_model({:hf, repository_id, subdir: "text_encoder"})
{:ok, sd_unet} = Bumblebee.load_model({:hf, repository_id, subdir: "unet"})
# Only the VAE decoder is needed to turn latents into an image
{:ok, sd_vae} = Bumblebee.load_model({:hf, repository_id, subdir: "vae"}, architecture: :decoder)
{:ok, sd_scheduler} = Bumblebee.load_scheduler({:hf, repository_id, subdir: "scheduler"})

# Image generation serving; the argument order is encoder, UNet, VAE, tokenizer, scheduler
sd_serving =
  Bumblebee.Diffusion.StableDiffusion.text_to_image(
    sd_clip,
    sd_unet,
    sd_vae,
    sd_tokenizer,
    sd_scheduler,
    num_steps: 20,
    guidance_scale: 7.5,
    defn_options: [compiler: EXLA]
  )

# Generate
sd_result =
  Nx.Serving.run(sd_serving, %{
    prompt: "a photograph of a bee programming in elixir, highly detailed, 4k",
    negative_prompt: "blurry, low quality"
  })

# Display the first generated image
sd_result.results |> hd() |> Map.fetch!(:image) |> Kino.Image.new()
```
### Interactive Image Generation
```elixir
prompt_input = Kino.Input.text("Prompt", default: "a cute robot bee coding in Elixir")
neg_input = Kino.Input.text("Negative prompt", default: "blurry, ugly, low quality")

# The number of denoising steps is fixed when sd_serving is built (num_steps above),
# so the form only collects the prompts
sd_form =
  Kino.Control.form(
    %{prompt: prompt_input, negative: neg_input},
    submit: "Generate"
  )

# Kino.listen/2 discards the callback's return value, so render into a frame
sd_frame = Kino.Frame.new()

Kino.listen(sd_form, fn %{data: %{prompt: prompt, negative: negative}} ->
  result = Nx.Serving.run(sd_serving, %{prompt: prompt, negative_prompt: negative})
  images = Enum.map(result.results, &Kino.Image.new(&1.image))
  Kino.Frame.render(sd_frame, Kino.Layout.grid(images, columns: 1))
end)

Kino.Layout.grid([sd_form, sd_frame])
```
---
## Section 6: Custom Training with Axon
Beyond pre-trained models, Axon lets you build and train from scratch.
### 6.1 Synthetic Data
```elixir
defmodule Data do
def make_classification(n \\ 2000, features \\ 4, classes \\ 3, seed \\ 42) do
key = Nx.Random.key(seed)
{centers, key} = Nx.Random.normal(key, 0, 2, shape: {classes, features})
{labels_raw, key} = Nx.Random.randint(key, 0, classes, shape: {n})
{noise, _key} = Nx.Random.normal(key, 0, 0.4, shape: {n, features})
x = Nx.take(centers, labels_raw) |> Nx.add(noise)
y = Nx.equal(Nx.new_axis(labels_raw, 1), Nx.iota({1, classes})) |> Nx.as_type(:f32)
# Normalize
mu = Nx.mean(x, axes: [0])
sigma = Nx.standard_deviation(x, axes: [0])
x_norm = Nx.divide(Nx.subtract(x, mu), sigma)
# Split
split = round(n * 0.8)
{{x_norm[0..(split - 1)//1], y[0..(split - 1)//1]},
{x_norm[split..(n - 1)//1], y[split..(n - 1)//1]}}
end
end
{{x_train, y_train}, {x_test, y_test} = _test_data} = Data.make_classification()
IO.puts("Train: #{Nx.axis_size(x_train, 0)} | Test: #{Nx.axis_size(x_test, 0)}")
IO.puts("Features: #{Nx.axis_size(x_train, 1)} | Classes: #{Nx.axis_size(y_train, 1)}")
```
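The one-hot step in `make_classification` works by comparing each label against a row of class indices. The same idea can be sketched on plain lists; `OneHotSketch` is a hypothetical helper for illustration and is not used by the rest of the notebook.

```elixir
defmodule OneHotSketch do
  # Turns integer class labels into one-hot rows by comparing each label
  # with every class index (the list equivalent of an equality broadcast)
  def encode(labels, n_classes) do
    for label <- labels do
      for class <- 0..(n_classes - 1), do: if(class == label, do: 1.0, else: 0.0)
    end
  end
end

IO.inspect(OneHotSketch.encode([0, 2, 1], 3), label: "one-hot")
```

Each row contains a single 1.0 at the position of its class, which is the target format a softmax output with categorical cross-entropy expects.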
### 6.2 Define Model
```elixir
import Axon
n_features = Nx.axis_size(x_train, 1)
n_classes = Nx.axis_size(y_train, 1)
model =
Axon.input("features", shape: {nil, n_features})
|> Axon.dense(64, activation: :relu, name: "hidden_1")
|> Axon.batch_norm(name: "bn_1")
|> Axon.dropout(rate: 0.2, name: "drop_1")
|> Axon.dense(32, activation: :relu, name: "hidden_2")
|> Axon.batch_norm(name: "bn_2")
|> Axon.dropout(rate: 0.2, name: "drop_2")
|> Axon.dense(n_classes, activation: :softmax, name: "output")
Axon.Display.as_table(model, Nx.template({1, n_features}, :f32)) |> IO.puts()
```
### 6.3 Train
```elixir
# Axon.Loop.trainer/3 expects each batch as an {input, target} tuple
train_data =
  x_train
  |> Nx.to_batched(64)
  |> Enum.zip(Nx.to_batched(y_train, 64))

val_data =
  x_test
  |> Nx.to_batched(64)
  |> Enum.zip(Nx.to_batched(y_test, 64))

trained_state =
  model
  |> Axon.Loop.trainer(:categorical_cross_entropy, Polaris.Optimizers.adam(learning_rate: 0.001))
  |> Axon.Loop.metric(:accuracy, "acc")
  |> Axon.Loop.validate(model, val_data)
  |> Axon.Loop.early_stop("validation_loss", patience: 5, mode: :min)
  |> Axon.Loop.run(train_data, Axon.ModelState.empty(), epochs: 30, compiler: EXLA)
IO.puts("Training complete!")
```
### 6.4 Evaluate & Predict
```elixir
# Build a compiled inference function for the model
{_init_fn, predict_fn} = Axon.build(model, compiler: EXLA)
# Evaluate on test set
test_preds = predict_fn.(trained_state, x_test)
pred_classes = Nx.argmax(test_preds, axis: 1)
true_classes = Nx.argmax(y_test, axis: 1)
accuracy = Nx.mean(Nx.equal(pred_classes, true_classes) |> Nx.as_type(:f32))
IO.puts("Test accuracy: #{Float.round(Nx.to_number(accuracy) * 100, 2)}%")
# Single prediction
sample = x_test[0]
probs = predict_fn.(trained_state, Nx.new_axis(sample, 0)) |> Nx.squeeze()
IO.puts("Sample prediction: class #{Nx.argmax(probs) |> Nx.to_number()} probs=#{inspect(Nx.to_flat_list(probs) |> Enum.map(&Float.round(&1, 4)))}")
```
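Accuracy alone hides which classes get confused with each other. A confusion matrix can be built from the class lists with plain Elixir; this is a minimal sketch, `Confusion` is a hypothetical helper, and in the notebook the inputs would come from `Nx.to_flat_list(true_classes)` and `Nx.to_flat_list(pred_classes)`.

```elixir
defmodule Confusion do
  # Returns an n_classes x n_classes list of lists where the entry at
  # row `a`, column `p` counts samples of actual class `a` predicted as `p`
  def matrix(actual, predicted, n_classes) do
    counts = actual |> Enum.zip(predicted) |> Enum.frequencies()

    for a <- 0..(n_classes - 1) do
      for p <- 0..(n_classes - 1), do: Map.get(counts, {a, p}, 0)
    end
  end
end

# Toy example: one sample of class 1 is misclassified as class 2
IO.inspect(Confusion.matrix([0, 1, 2, 1], [0, 2, 2, 1], 3), label: "confusion")
```

Correct predictions sit on the diagonal, so off-diagonal counts point directly at the class pairs the model mixes up.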
### 6.5 Visualize Predictions
```elixir
# Visualize predictions vs actual on first 2 features
alias VegaLite, as: Vl
scatter_data =
Enum.map(0..(min(Nx.axis_size(x_test, 0), 500) - 1), fn i ->
%{
"f1" => Nx.to_number(x_test[i][0]),
"f2" => Nx.to_number(x_test[i][1]),
"actual" => Nx.to_number(Nx.argmax(y_test[i])),
"predicted" => Nx.to_number(pred_classes[i])
}
end)
Vl.new(
title: "Test Predictions (first 2 features)",
width: 500,
height: 400
)
|> Vl.data_from_values(scatter_data)
|> Vl.layers([
Vl.new()
|> Vl.mark(:circle, opacity: 0.7, size: 40)
|> Vl.encode_field(:x, "f1", type: :quantitative)
|> Vl.encode_field(:y, "f2", type: :quantitative)
|> Vl.encode_field(:color, "actual", type: :nominal, title: "Actual"),
Vl.new()
|> Vl.mark(:point, opacity: 0.3, size: 20, shape: "cross")
|> Vl.encode_field(:x, "f1", type: :quantitative)
|> Vl.encode_field(:y, "f2", type: :quantitative)
|> Vl.encode_field(:color, "predicted", type: :nominal, title: "Predicted")
])
```
---
## Section 7: Nx.Serving for Production Inference
`Nx.Serving` batches requests from multiple clients and runs them efficiently
on the GPU, which is essential for production deployment.
### 7.1 Serving Architecture
```elixir
# Start a named serving process (typically in your Application supervisor)
# In production, you would do:
#
# Nx.Serving.start_link(
# serving: sentiment_serving,
# name: MyApp.SentimentServing,
# batch_size: 16,
# batch_timeout: 100
# )
#
# Then in your Phoenix controller, call the named process with batched_run/2:
#
# Nx.Serving.batched_run(MyApp.SentimentServing, text)
# For demonstration, run directly:
IO.puts("Inline use: call Nx.Serving.run/2 on the serving struct")
IO.puts("For production: start it with Nx.Serving.start_link/1 and call Nx.Serving.batched_run/2")
```
### 7.2 Benchmark Inference
```elixir
# Benchmark a single inference
{us_single, _} = :timer.tc(fn ->
Nx.Serving.run(sentiment_serving, "This is a test sentence for benchmarking.")
end)
IO.puts("Single inference: #{Float.round(us_single / 1000, 2)}ms")
# Benchmark a batch: passing a list to run/2 executes the inputs together
batch = for i <- 1..8, do: "Test sentence number #{i} for batch benchmarking."
{us_batch, _} = :timer.tc(fn -> Nx.Serving.run(sentiment_serving, batch) end)
IO.puts("8 inputs in one batch: #{Float.round(us_batch / 1000, 2)}ms")
IO.puts("Per-input (batched): #{Float.round(us_batch / 8000, 2)}ms")
```
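A single `:timer.tc/1` measurement is noisy, and the first call to a serving typically includes compilation. One way to get a steadier number is to warm up once and report the median of several runs. The helper below is a sketch in plain Elixir; `Bench.median_ms/2` is a hypothetical name, not part of Nx.

```elixir
defmodule Bench do
  # Calls `fun` once to warm up (for example, to trigger JIT compilation),
  # then times `runs` more calls and returns the middle measurement in
  # milliseconds (the upper median when `runs` is even)
  def median_ms(fun, runs \\ 5) when is_function(fun, 0) and runs > 0 do
    fun.()

    times =
      for _ <- 1..runs do
        {us, _result} = :timer.tc(fun)
        us / 1000
      end

    times |> Enum.sort() |> Enum.at(div(runs, 2))
  end
end

# Stand-in workload; in the notebook you would pass something like
# fn -> Nx.Serving.run(sentiment_serving, "some text") end
ms = Bench.median_ms(fn -> Enum.sum(1..100_000) end, 5)
IO.puts("median: #{ms}ms")
```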
---
## Section 7.5: Fine-tuning with Bumblebee
Bumblebee can load a pre-trained encoder together with a freshly initialized
classification head, which you then fine-tune with Axon on your own labeled data.
This is far more practical than training from scratch when you have limited data.
### 7.5.1 Load Pre-trained Model for Fine-tuning
```elixir
# Load the BERT spec with a sequence-classification head
{:ok, ft_spec} =
  Bumblebee.load_spec({:hf, "google-bert/bert-base-uncased"},
    architecture: :for_sequence_classification
  )

# Configure for your number of classes
ft_spec = Bumblebee.configure(ft_spec, num_labels: 3)

# Load with the custom spec: only the encoder weights are pre-trained,
# the classification head is randomly initialized
{:ok, ft_model_info} =
  Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"}, spec: ft_spec)

%{model: ft_model, params: ft_params} = ft_model_info
IO.puts("Model loaded. Pre-trained encoder + fresh classification head.")
IO.puts("Number of labels: #{ft_spec.num_labels}")
```
### 7.5.2 Prepare Labeled Data
```elixir
# Your labeled dataset; in practice, load it from CSV/JSON
training_texts = [
"The BEAM VM provides fault-tolerant concurrent computing",
"Nx brings numerical computing to the Elixir ecosystem",
"Phoenix LiveView enables real-time web applications",
"Bumblebee integrates Hugging Face models into Elixir",
"Axon provides a functional API for neural networks",
"The weather is sunny and warm today",
"Football season starts next month",
"The stock market rallied on positive earnings reports",
# ... add more samples per class
]
training_labels = [0, 0, 0, 0, 0, 1, 1, 2] # 0=tech, 1=sports, 2=finance
# Tokenize
{:ok, ft_tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})
encoded = Bumblebee.apply_tokenizer(ft_tokenizer, training_texts)
IO.inspect(encoded, label: "Tokenized input")
```
### 7.5.3 Training Loop
```elixir
# Keep only the logits so a standard loss function can consume the model output
logits_model = Axon.nx(ft_model, & &1.logits)

# Softmax cross-entropy computed on raw logits against one-hot labels
loss_fn = &Axon.Losses.categorical_cross_entropy(&1, &2, from_logits: true, reduction: :mean)

batch_size = 4
num_labels = 3

# Build {tokenized_inputs, one_hot_labels} batches for Axon.Loop
ft_batches =
  training_texts
  |> Enum.zip(training_labels)
  |> Enum.chunk_every(batch_size)
  |> Enum.map(fn chunk ->
    {texts, labels} = Enum.unzip(chunk)
    inputs = Bumblebee.apply_tokenizer(ft_tokenizer, texts)

    one_hot =
      labels
      |> Nx.tensor()
      |> Nx.new_axis(1)
      |> Nx.equal(Nx.iota({1, num_labels}))
      |> Nx.as_type(:f32)

    {inputs, one_hot}
  end)

# A small learning rate keeps the pre-trained encoder weights from drifting too far.
# Eight samples are only enough to show the mechanics, not to train a useful model.
fine_tuned_state =
  logits_model
  |> Axon.Loop.trainer(loss_fn, Polaris.Optimizers.adam(learning_rate: 2.0e-5), log: 1)
  |> Axon.Loop.run(ft_batches, ft_params, epochs: 3, compiler: EXLA, strict?: false)
```
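Fine-tuning a classification head minimizes softmax cross-entropy between the model's logits and the one-hot labels. To make that quantity concrete, here is a small sketch on plain lists; `LossSketch` is a hypothetical module for illustration only, and real training should use a library loss.

```elixir
defmodule LossSketch do
  # Numerically stable log-softmax: subtract the max before exponentiating
  def log_softmax(logits) do
    m = Enum.max(logits)
    log_sum = :math.log(logits |> Enum.map(fn l -> :math.exp(l - m) end) |> Enum.sum())
    Enum.map(logits, fn l -> l - m - log_sum end)
  end

  # Cross-entropy for one example: minus the log-probability of the true class
  def cross_entropy(logits, one_hot) do
    logits
    |> log_softmax()
    |> Enum.zip(one_hot)
    |> Enum.map(fn {log_prob, y} -> -log_prob * y end)
    |> Enum.sum()
  end
end

# Uniform logits over 3 classes give a loss of ln(3)
IO.inspect(LossSketch.cross_entropy([0.0, 0.0, 0.0], [1.0, 0.0, 0.0]), label: "uniform")
# A confident, correct prediction gives a loss near zero
IO.inspect(LossSketch.cross_entropy([10.0, 0.0, 0.0], [1.0, 0.0, 0.0]), label: "confident")
```

A freshly initialized head behaves roughly like the uniform case, so a starting loss near ln(num_labels) is a reasonable sign that the pipeline is wired correctly.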
> **Tip:** For production fine-tuning, use `Axon.Loop` with proper data
> pipelines, learning rate scheduling, and mixed precision. The code above
> only demonstrates the concept; see the `Bumblebee` examples on GitHub for
> full fine-tuning recipes.
---
## Section 7.6: Model Export (ONNX / GGUF)
### 7.6.1 Export a Bumblebee model to ONNX
```elixir
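# Load a pretrained model with a sequence-classification head
{:ok, spec} =
  Bumblebee.load_spec({:hf, "distilbert/distilbert-base-uncased"},
    architecture: :for_sequence_classification
  )

{:ok, model_info} = Bumblebee.load_model({:hf, "distilbert/distilbert-base-uncased"}, spec: spec)

# Bumblebee does not provide an export function, so ONNX export has to go
# through a separate package such as `axon_onnx` (not installed in this notebook).
# Check that package's documentation for its current API and supported layers.
# These are the two pieces an exporter needs:
%{model: export_model, params: export_params} = model_info
```
An exported file such as `distilbert.onnx` can then be loaded in any ONNX runtime.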
### 7.6.2 Export to GGUF (via `gguf-converter`)
```bash
gguf-converter --onnx distilbert.onnx --output distilbert.gguf
```
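> **Note:** GGUF is primarily used for decoder models (e.g., Llama), so an encoder such as DistilBERT may not be supported. Treat the command above as a placeholder and check your converter's documentation for its actual name and flags.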
---
## Section 7.7: Phoenix LiveView Integration
Deploy ML models into Phoenix web apps using `Nx.Serving` and LiveView.
### 7.7.1 Application Supervision Tree
```elixir
# In your Phoenix app's application.ex:
defmodule MyApp.Application do
use Application
@impl true
def start(_type, _args) do
children = [
MyAppWeb.Telemetry,
{DNSCluster, query: Application.get_env(:my_app, :dns_cluster_query) || :ignore},
{Phoenix.PubSub, name: MyApp.PubSub},
# --- ML servings ---
# Sentiment analysis serving
{Nx.Serving,
serving: sentiment_serving(),
name: MyApp.SentimentServing,
batch_size: 16,
batch_timeout: 100},
# Image classification serving
{Nx.Serving,
serving: image_serving(),
name: MyApp.ImageServing,
batch_size: 8,
batch_timeout: 200},
# Embedding serving (for similarity search)
{Nx.Serving,
serving: embedding_serving(),
name: MyApp.EmbeddingServing,
batch_size: 32,
batch_timeout: 50},
MyAppWeb.Endpoint
]
opts = [strategy: :one_for_one, name: MyApp.Supervisor]
Supervisor.start_link(children, opts)
end
  defp sentiment_serving do
    repo = {:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"}
    {:ok, model} = Bumblebee.load_model(repo)
    {:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
    Bumblebee.Text.text_classification(model, tokenizer)
  end

  defp image_serving do
    repo = {:hf, "google/vit-base-patch16-224"}
    {:ok, model} = Bumblebee.load_model(repo)
    {:ok, featurizer} = Bumblebee.load_featurizer(repo)
    Bumblebee.Vision.image_classification(model, featurizer)
  end

  defp embedding_serving do
    repo = {:hf, "sentence-transformers/all-MiniLM-L6-v2"}
    {:ok, model} = Bumblebee.load_model(repo)
    {:ok, tokenizer} = Bumblebee.load_tokenizer(repo)
    Bumblebee.Text.text_embedding(model, tokenizer)
  end
end
```
### 7.7.2 LiveView for Sentiment Analysis
```elixir
# lib/my_app_web/live/sentiment_live.ex
defmodule MyAppWeb.SentimentLive do
use MyAppWeb, :live_view
def mount(_params, _session, socket) do
{:ok, assign(socket, text: "", result: nil, loading: false)}
end
def handle_event("analyze", %{"text" => text}, socket) do
# batched_run/2 blocks until the serving process replies; concurrent requests share a batch
result = Nx.Serving.batched_run(MyApp.SentimentServing, text)
label = hd(result.predictions)
{:noreply, assign(socket,
text: text,
result: %{
label: label.label,
score: Float.round(label.score * 100, 1)
},
loading: false
)}
end
def handle_event("update_text", %{"text" => text}, socket) do
{:noreply, assign(socket, text: text)}
end
def render(assigns) do
~H"""
<div class="max-w-lg mx-auto p-6">
<h1 class="text-2xl font-bold mb-4">🐝 Sentiment Analysis</h1>
<form phx-submit="analyze">
<textarea
name="text"
phx-change="update_text"
class="w-full p-3 border rounded-lg"
rows="3"
placeholder="Enter text to analyze..."
><%= @text %></textarea>
<button type="submit" class="mt-2 px-4 py-2 bg-purple-600 text-white rounded-lg">
Analyze
</button>
</form>
<%= if @result do %>
<div class="mt-4 p-4 bg-gray-100 rounded-lg">
<p class="text-lg">
<strong><%= @result.label %></strong>
(<%= @result.score %>%)
</p>
</div>
<% end %>
</div>
"""
end
end
```
### 7.7.3 LiveView for Image Classification
```elixir
# lib/my_app_web/live/vision_live.ex
defmodule MyAppWeb.VisionLive do
  use MyAppWeb, :live_view

  def mount(_params, _session, socket) do
    socket =
      socket
      |> assign(predictions: nil)
      # LiveView receives files through its upload channel, not as Plug.Upload params
      |> allow_upload(:image, accept: ~w(.jpg .jpeg .png), max_entries: 1)

    {:ok, socket}
  end

  # Required by LiveView uploads so entries are validated as files are selected
  def handle_event("validate", _params, socket), do: {:noreply, socket}

  def handle_event("classify", _params, socket) do
    results =
      consume_uploaded_entries(socket, :image, fn %{path: path}, _entry ->
        {:ok, image} = StbImage.read_file(path)
        {:ok, Nx.Serving.batched_run(MyApp.ImageServing, image)}
      end)

    predictions =
      case results do
        [result | _] ->
          result.predictions
          |> Enum.take(5)
          |> Enum.map(fn %{label: label, score: score} ->
            %{label: label, score: Float.round(score * 100, 1)}
          end)

        [] ->
          nil
      end

    {:noreply, assign(socket, predictions: predictions)}
  end

  def render(assigns) do
    ~H"""
    <div class="max-w-lg mx-auto p-6">
      <h1 class="text-2xl font-bold mb-4">🖼️ Image Classification</h1>
      <form phx-submit="classify" phx-change="validate">
        <.live_file_input upload={@uploads.image} class="mb-2" />
        <button type="submit" class="px-4 py-2 bg-purple-600 text-white rounded-lg">
          Classify
        </button>
      </form>
      <%= if @predictions do %>
        <div class="mt-4">
          <%= for pred <- @predictions do %>
            <div class="flex justify-between py-1 border-b">
              <span><%= pred.label %></span>
              <span class="font-mono"><%= pred.score %>%</span>
            </div>
          <% end %>
        </div>
      <% end %>
    </div>
    """
  end
end
```
### 7.7.4 API Endpoint (REST)
```elixir
# lib/my_app_web/controllers/prediction_controller.ex
defmodule MyAppWeb.PredictionController do
use MyAppWeb, :controller
  def sentiment(conn, %{"text" => text}) do
    # Named serving processes are called with batched_run/2
    result = Nx.Serving.batched_run(MyApp.SentimentServing, text)

    json(conn, %{
      predictions:
        Enum.map(result.predictions, fn p ->
          %{label: p.label, score: Float.round(p.score, 4)}
        end)
    })
  end

  def embed(conn, %{"text" => text}) do
    result = Nx.Serving.batched_run(MyApp.EmbeddingServing, text)

    json(conn, %{
      embedding: Nx.to_flat_list(result.embedding),
      dimensions: Nx.size(result.embedding)
    })
  end
end
# In router.ex:
# scope "/api", MyAppWeb do
# post "/sentiment", PredictionController, :sentiment
# post "/embed", PredictionController, :embed
# end
```
---
## Section 8: Interactive Playground
### 8.1 Text Classification UI
```elixir
text_input =
  Kino.Input.textarea("Enter text to classify",
    default: "Elixir is the best language for building scalable ML systems!"
  )

class_form = Kino.Control.form(%{text: text_input}, submit: "Classify Sentiment")

# Kino.listen/2 discards the callback's return value, so render into a frame
class_frame = Kino.Frame.new()

Kino.listen(class_form, fn %{data: %{text: text}} ->
  result = Nx.Serving.run(sentiment_serving, text)

  label_text =
    Enum.map_join(result.predictions, " | ", fn %{label: label, score: score} ->
      "#{label}: #{Float.round(score * 100, 1)}%"
    end)

  Kino.Frame.render(class_frame, Kino.Text.new(label_text))
end)

Kino.Layout.grid([class_form, class_frame])
```
### 8.2 Embedding Similarity UI
```elixir
ref_input =
  Kino.Input.textarea("Reference text", default: "Nx provides numerical computing for the BEAM")

query_input = Kino.Input.textarea("Query text", default: "How do I do math in Elixir?")
sim_form = Kino.Control.form(%{ref: ref_input, query: query_input}, submit: "Compute Similarity")

# Kino.listen/2 discards the callback's return value, so render into a frame
sim_frame = Kino.Frame.new()

Kino.listen(sim_form, fn %{data: %{ref: ref, query: query}} ->
  ref_emb = Nx.Serving.run(embedding_serving, ref).embedding
  query_emb = Nx.Serving.run(embedding_serving, query).embedding
  score = Similarity.cosine_similarity(ref_emb, query_emb) |> Nx.to_number()
  Kino.Frame.render(sim_frame, Kino.Text.new("Cosine similarity: #{Float.round(score, 6)}"))
end)

Kino.Layout.grid([sim_form, sim_frame])
```
---
## Section 8.3: Distributed Training on BEAM Nodes
### 8.3.1 Using `Nx` with `:erpc`
```elixir
# Assuming you have a cluster of connected BEAM nodes: node1@host, node2@host, ...
nodes = [:"node1@host", :"node2@host"]

# Run the same tensor computation on every node and collect the results
defmodule DistTrainer do
  # Node.spawn only returns a pid, so :erpc.multicall/4 is used to get results back
  def compute(nodes, tensor) do
    # Copy to the binary backend so the tensor can be sent between nodes
    tensor = Nx.backend_copy(tensor, Nx.BinaryBackend)

    nodes
    |> :erpc.multicall(Nx, :mean, [tensor])
    |> Enum.map(fn {:ok, mean} -> mean end)
  end
end

tensor = Nx.tensor([[1, 2, 3], [4, 5, 6]])
results = DistTrainer.compute(nodes, tensor)
IO.inspect(results, label: "Means from each node")
```
### 8.3.2 Distributed Axon training
```elixir
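# Nx and Axon do not ship an `Nx.Cluster` module or a `defmodel` macro.
# This sketch trains one model replica per node on that node's shard of the
# data. Merging the replicas (e.g. parameter averaging) is left out, and
# ClusterTrainer must be compiled on every node.
defmodule ClusterTrainer do
  # Define a simple model
  def model do
    Axon.input("input", shape: {nil, 784})
    |> Axon.dense(128, activation: :relu)
    |> Axon.dense(10, activation: :softmax)
  end

  # `shards` is a list of {node, batches}, where `batches` is a list of
  # {input, target} tuples holding Nx.BinaryBackend tensors
  def train(shards, epochs \\ 5) do
    shards
    |> Enum.map(fn {node, batches} ->
      Task.async(fn -> :erpc.call(node, __MODULE__, :train_local, [batches, epochs]) end)
    end)
    |> Task.await_many(:infinity)
  end

  # Runs on the remote node and returns the trained state for that shard
  def train_local(batches, epochs) do
    model()
    |> Axon.Loop.trainer(:categorical_cross_entropy, Polaris.Optimizers.adam(learning_rate: 1.0e-3))
    |> Axon.Loop.run(batches, Axon.ModelState.empty(), epochs: epochs)
    |> Nx.backend_copy(Nx.BinaryBackend)
  end
end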
```
> **Tip:** Ensure all nodes have the same code version and the required dependencies (`nx`, `axon`). Use `:net_kernel.connect_node/1` or a clustering tool such as `libcluster` to form the cluster.
---
## Section 9: Summary & Next Steps
```elixir
Kino.Markdown.new("""
## What You've Built
| Pipeline Stage | Implementation | Key Library |
|----------------|----------------|-------------|
| **Tensors** | Creation, ops, broadcasting, gradients | `Nx` |
| **JIT Compile** | GPU-accelerated inference | `EXLA` |
| **Fill-Mask** | BERT masked language modeling | `Bumblebee` |
| **Sentiment** | DistilBERT text classification | `Bumblebee` |
| **NER** | Named entity recognition | `Bumblebee` |
| **Zero-Shot** | Classify without fine-tuning | `Bumblebee` |
| **Image CLS** | Vision Transformer (ViT) | `Bumblebee` |
| **Audio** | Whisper speech-to-text | `Bumblebee` |
| **Stable Diffusion** | Text-to-image generation | `Bumblebee` |
| **Text Gen** | GPT-2 autoregressive generation | `Bumblebee` |
| **Embeddings** | Sentence similarity search | `Bumblebee` |
| **Custom MLP** | Train from scratch with Axon | `Axon` |
| **Fine-tuning** | Fine-tune a pre-trained model on labeled data | `Bumblebee` |
| **Serving** | Production batched inference | `Nx.Serving` |
| **Phoenix** | LiveView + REST API deployment | `Phoenix` |
| **Interactive** | Kino live forms | `Kino` |
### Companion Notebooks
| Format | Path | Deploy To |
|--------|------|-----------|
| Livebook | `ml_e2e_template.livemd` | HF Spaces (Docker), Livebook Teams |
| Jupyter | `colab_kaggle/ml_e2e_python.ipynb` | Google Colab, Kaggle |
| Gradio | `gradio_hf_deploy/app.py` | HF Spaces (sdk: gradio) |
| marimo | `marimo/ml_e2e_marimo.py` | Anywhere Python runs |
### Resources
* [Bumblebee docs](https://hexdocs.pm/bumblebee): Pre-trained models
* [Nx docs](https://hexdocs.pm/nx): Numerical computing
* [Axon docs](https://hexdocs.pm/axon): Neural networks
* [EXLA docs](https://hexdocs.pm/exla): GPU backend
* [Phoenix docs](https://hexdocs.pm/phoenix): Web framework
* [Hugging Face Hub](https://huggingface.co/models): 500k+ models
* [marimo docs](https://docs.marimo.io): Reactive Python notebooks
* _Machine Learning in Elixir_ by Sean Moriarity, Pragmatic Bookshelf
### Deploy
```bash
just check # Verify all files and tools
just livebook # Open Livebook locally
just deploy-livebook # Push to HF Spaces
just marimo # Open marimo editor
just gradio # Run Gradio app
```
""")
```