	# Machine Learning in Elixir — End-to-End with Bumblebee + Hugging Face

	<!-- livebook:{"persist_outputs":true} -->

	## Overview

	## Skills
	- `hf_cli.md` – Hugging Face CLI usage
	- `hf_jobs.md` – Running workloads on HF Jobs
	- `training_trl.md` – TRL model training
	- `hf_dataset_viewer.md` – Dataset Viewer API
	- `gradio.md` – Gradio UI integration
	- (Full catalog at https://skills.sh/huggingface/skills)

	This Livebook is a complete end-to-end ML template built on the Elixir ML ecosystem
	from _Machine Learning in Elixir_ by Sean Moriarity, with Bumblebee as the core
	integration layer to the Hugging Face Hub.

	What we cover:

	| Section | Library | Task |
	|---------|---------|------|
	| Foundations | `Nx` | Tensors, gradients, JIT compilation |
	| Pre-trained NLP | `Bumblebee` | Fill-mask, sentiment, NER, zero-shot |
	| Pre-trained Vision | `Bumblebee` | Image classification (ViT, ResNet) |
	| Audio | `Bumblebee` | Speech-to-text (Whisper) |
	| Generative AI | `Bumblebee` | Text generation (GPT-2) & Stable Diffusion |
	| Embeddings | `Bumblebee` | Sentence similarity search |
	| Custom Training | `Axon` | Build & train from scratch |
	| Fine-tuning | `Bumblebee` | Transfer learning on pre-trained models |
	| Serving | `Nx.Serving` | Production batched inference |
	| Deployment | `Phoenix` | LiveView integration pattern |
	| Interactive UI | `Kino` | Live input forms & charts |

	---

	## Section 0 — Install & Configure

	```elixir
	Mix.install([
	{:nx, "~> 0.10"},
	{:axon, "~> 0.7"},
	{:exla, "~> 0.10"},
	{:bumblebee, "~> 0.6"},
	{:kino, "~> 0.15"},
	{:kino_vega_lite, "~> 0.1"},
	{:vega_lite, "~> 0.1"},
	{:stb_image, "~> 0.6"},
	{:req, "~> 0.5"}
	])

	Nx.global_default_backend(EXLA.Backend)

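	# Nx and Axon have no version/0 function; read the version from the application spec
	IO.puts("Nx version: #{Application.spec(:nx, :vsn)}")
	IO.puts("Axon version: #{Application.spec(:axon, :vsn)}")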
	IO.puts("EXLA backend: #{inspect(Nx.default_backend())}")
	IO.puts("Bumblebee loaded: #{Code.ensure_loaded?(Bumblebee)}")
	IO.puts("Cache dir: #{Bumblebee.cache_dir()}")
	```

	---

	## Section 1 — Nx Foundations

	Before diving into Bumblebee, let's ground ourselves in Nx — the numerical
	backbone that every Elixir ML library builds on.

	### 1.1 Tensors

	```elixir
	# Scalars, vectors, matrices, higher-order
	scalar = Nx.tensor(3.14)
	vector = Nx.tensor([1.0, 2.0, 3.0])
	matrix = Nx.tensor([[1, 2, 3], [4, 5, 6]])
	cube = Nx.iota({2, 3, 4})

	# Nx.type/1 returns a tuple such as {:f, 32}, so it needs inspect/1 to be printed
	IO.puts("scalar shape=#{inspect(Nx.shape(scalar))} type=#{inspect(Nx.type(scalar))}")
	IO.puts("vector shape=#{inspect(Nx.shape(vector))} type=#{inspect(Nx.type(vector))}")
	IO.puts("matrix shape=#{inspect(Nx.shape(matrix))} type=#{inspect(Nx.type(matrix))}")
	IO.puts("cube shape=#{inspect(Nx.shape(cube))} type=#{inspect(Nx.type(cube))}")

	{scalar, vector, matrix}
	```

	### 1.2 Operations & Broadcasting

	```elixir
	a = Nx.tensor([1.0, 2.0, 3.0])
	b = Nx.tensor([10.0, 20.0, 30.0])

	# Element-wise
	IO.puts("add: #{inspect(Nx.add(a, b))}")
	IO.puts("multiply: #{inspect(Nx.multiply(a, b))}")
	IO.puts("pow: #{inspect(Nx.pow(a, 2))}")

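	# Reductions (scalar tensors are converted to numbers before interpolation)
	IO.puts("sum: #{Nx.to_number(Nx.sum(a))}")
	IO.puts("mean: #{Nx.to_number(Nx.mean(a))}")
	IO.puts("std: #{Nx.to_number(Nx.standard_deviation(a))}")

	# Dot product
	IO.puts("dot: #{Nx.to_number(Nx.dot(a, b))}")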

	# Matrix multiply
	m1 = Nx.tensor([[1.0, 2.0], [3.0, 4.0]])
	m2 = Nx.tensor([[5.0, 6.0], [7.0, 8.0]])
	IO.puts("matmul: #{inspect(Nx.dot(m1, m2))}")
	```
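
	The operations above all pair tensors of the same shape. Broadcasting is what lets Nx combine tensors of different shapes. The cell below is a small sketch of it, assuming the setup cell from Section 0 has already run so that `Nx` is available; the variable names are illustrative.

	```elixir
	# Broadcasting: the {3} vector is stretched across both rows of the {2, 3} tensor
	grid = Nx.tensor([[1, 2, 3], [4, 5, 6]])
	row = Nx.tensor([10, 20, 30])

	broadcast_sum = Nx.add(grid, row)
	IO.inspect(Nx.to_list(broadcast_sum), label: "grid + row")

	# A scalar broadcasts to every element
	scaled = Nx.multiply(grid, 2)
	IO.inspect(Nx.to_list(scaled), label: "grid * 2")
	```

	The shapes are aligned from the trailing axis, so a `{3}` tensor matches the last axis of a `{2, 3}` tensor without any explicit reshape.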

	### 1.3 Automatic Differentiation

	This is how models learn — computing gradients of loss with respect to parameters.

	```elixir
	defmodule AutoDiff do
	  import Nx.Defn

	  # f(x) = x³ + 2x²; public so it can be called from outside the module
	  defn f(x), do: Nx.pow(x, 3) + 2 * Nx.pow(x, 2)

	  # Gradient via automatic differentiation: f'(x) = 3x² + 4x
	  def grad_f(x), do: Nx.Defn.grad(x, &f/1)

	  # Mean squared error
	  defnp mse_loss(y_true, y_pred) do
	    Nx.mean(Nx.pow(y_true - y_pred, 2))
	  end

	  # Gradient of the MSE loss of a linear model with respect to its weights
	  def grad_mse(y_true, inputs, w) do
	    Nx.Defn.grad(w, fn weights ->
	      predictions = Nx.dot(inputs, weights)
	      mse_loss(y_true, predictions)
	    end)
	  end
	end

	x = Nx.tensor(3.0)
	IO.puts("f(3) = #{Nx.to_number(AutoDiff.f(x))}")
	IO.puts("f'(3) = #{Nx.to_number(AutoDiff.grad_f(x))}")
	IO.puts("expected = 3*9 + 2*2*3 = #{3 * 9 + 2 * 2 * 3}")
	```
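
	A gradient computed by automatic differentiation can be sanity-checked numerically. The sketch below uses a central finite difference in plain Elixir, with no Nx involved; `NumericGrad` and its step size `h` are illustrative choices, not part of any library.

	```elixir
	# Central finite-difference approximation of a derivative, in plain Elixir.
	# It serves as an independent check on the value returned by automatic differentiation.
	defmodule NumericGrad do
	  def central_diff(fun, x, h \\ 1.0e-4) do
	    (fun.(x + h) - fun.(x - h)) / (2 * h)
	  end
	end

	# The same function as in the cell above, written for ordinary floats
	f = fn x -> :math.pow(x, 3) + 2 * :math.pow(x, 2) end

	approx = NumericGrad.central_diff(f, 3.0)
	IO.puts("numeric f'(3) ≈ #{Float.round(approx, 4)}")
	```

	The approximation should agree with the analytic value 3·9 + 4·3 = 39 up to a small error that shrinks with `h`.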

	### 1.4 JIT Compilation

	```elixir
	# JIT compiles for GPU/CPU acceleration — critical for inference speed
	defmodule FastMath do
	import Nx.Defn

	defn slow_sigmoid(x) do
	1 / (1 + Nx.exp(-x))
	end
	end

	# JIT-compiled version
	fast_sigmoid = Nx.Defn.jit(&FastMath.slow_sigmoid/1)

	input = Nx.tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])

	# Benchmark: the first call includes compilation, so warm up before timing
	_warmup = fast_sigmoid.(input)
	{us, result} = :timer.tc(fn -> fast_sigmoid.(input) end)
	IO.puts("JIT sigmoid: #{us}μs result=#{inspect(result)}")
	```

	---

	## Section 2 — Bumblebee: Pre-trained NLP Models

	Bumblebee loads pre-trained models from the Hugging Face Hub and wraps them
	in `Nx.Serving` for production-ready batched inference.

	### 2.1 Fill-Mask (BERT)

	```elixir
	{:ok, bert_model} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
	{:ok, bert_tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

	bert_fill_mask = Bumblebee.Text.fill_mask(bert_model, bert_tokenizer)

	results = Nx.Serving.run(bert_fill_mask, "Elixir is a [MASK] language.")
	IO.inspect(results, label: "Fill-Mask Results")
	```

	### 2.2 Sentiment Analysis (DistilBERT)

	```elixir
	{:ok, sentiment_model} =
	Bumblebee.load_model({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})

	{:ok, sentiment_tokenizer} =
	Bumblebee.load_tokenizer({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})

	# Bumblebee.Text.text_classification/3 is the public entry point for this task
	sentiment_serving = Bumblebee.Text.text_classification(sentiment_model, sentiment_tokenizer)

	texts = [
	"Machine learning in Elixir is amazing!",
	"This tutorial is boring and confusing.",
	"The BEAM VM handles concurrent ML workloads well.",
	"I love how functional programming simplifies ML pipelines."
	]

	Enum.each(texts, fn text ->
	result = Nx.Serving.run(sentiment_serving, text)
	IO.puts("#{inspect(result)} ← \"#{String.slice(text, 0..50)}...\"")
	end)
	```

	### 2.3 Named Entity Recognition (BERT-NER)

	```elixir
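	{:ok, ner_model} = Bumblebee.load_model({:hf, "dslim/bert-base-NER"})
	# dslim/bert-base-NER is a cased model, so pair it with the cased BERT tokenizer
	{:ok, ner_tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-cased"})

	# aggregation: :same merges adjacent tokens that belong to the same entity
	ner_serving =
	  Bumblebee.Text.token_classification(ner_model, ner_tokenizer, aggregation: :same)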

	ner_text = "Sean Moriarity wrote Machine Learning in Elixir for Pragmatic Bookshelf. He lives in Austin, Texas."
	ner_result = Nx.Serving.run(ner_serving, ner_text)

	IO.puts("Input: #{ner_text}")
	IO.inspect(ner_result, label: "NER Entities")
	```

	### 2.4 Zero-Shot Classification

	No fine-tuning needed — classify arbitrary text into custom categories.

	```elixir
	{:ok, zs_model} = Bumblebee.load_model({:hf, "facebook/bart-large-mnli"})
	{:ok, zs_tokenizer} = Bumblebee.load_tokenizer({:hf, "facebook/bart-large-mnli"})

	# Candidate labels are fixed when the serving is built
	labels = ["technology", "sports", "politics", "science", "finance"]

	zs_serving = Bumblebee.Text.zero_shot_classification(zs_model, zs_tokenizer, labels)

	article = """
	Nx brings numerical computing to the BEAM, enabling machine learning
	pipelines that leverage Elixir's concurrency and fault tolerance.
	Bumblebee provides access to thousands of pre-trained models from
	the Hugging Face Hub directly in Livebook.
	"""

	# The serving input is just the text to classify
	zs_result = Nx.Serving.run(zs_serving, article)

	IO.inspect(zs_result, label: "Zero-Shot Classification")
	```

	---

	## Section 3 — Bumblebee: Vision Models

	### 3.1 Image Classification (ViT / ResNet)

	```elixir
	{:ok, vit_model} = Bumblebee.load_model({:hf, "google/vit-base-patch16-224"})
	{:ok, vit_featurizer} = Bumblebee.load_featurizer({:hf, "google/vit-base-patch16-224"})

	# Bumblebee.Vision.image_classification/3 is the public entry point for this task
	vit_serving = Bumblebee.Vision.image_classification(vit_model, vit_featurizer)

	# Download a sample image
	image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
	image_data = Req.get!(image_url).body

	# Save and load
	File.write!("/tmp/sample.jpg", image_data)
	{:ok, image} = StbImage.read_file("/tmp/sample.jpg")

	# StbImage stores the dimensions in the :shape field as {height, width, channels}
	{height, width, _channels} = image.shape
	IO.puts("Image: #{width}x#{height}")

	img_result = Nx.Serving.run(vit_serving, image)
	IO.inspect(img_result, label: "Image Classification")
	```

	### 3.2 Batch Image Classification

	```elixir
	# Passing a list runs every image through the model as a single batch
	images = [image] # In production, this would be multiple images

	batch_result = Nx.Serving.run(vit_serving, images)
	IO.inspect(batch_result, label: "Batch Classification")
	```

	---

	## Section 4 — Bumblebee: Text Generation

	### 4.1 GPT-2 Text Generation

	```elixir
	{:ok, gpt2_model} = Bumblebee.load_model({:hf, "openai-community/gpt2"})
	{:ok, gpt2_tokenizer} = Bumblebee.load_tokenizer({:hf, "openai-community/gpt2"})
	{:ok, gpt2_generation_config} = Bumblebee.load_generation_config({:hf, "openai-community/gpt2"})

	gpt2_serving = Bumblebee.Text.generation(
	gpt2_model,
	gpt2_tokenizer,
	gpt2_generation_config,
	compile: [batch_size: 1, sequence_length: 64],
	defn_options: [compiler: EXLA]
	)

	prompt = "Machine learning in Elixir is"
	gen_result = Nx.Serving.run(gpt2_serving, prompt)

	IO.puts("Prompt: #{prompt}")
	IO.puts("Output: #{inspect(gen_result)}")
	```

	### 4.2 Interactive Text Generation

	<!-- livebook:{"attrs":{"source":"# Interactive Text Generation\nalias Kino.Input\n\nprompt_input = Kino.Input.text(\"Prompt\", default: \"The future of ML in Elixir is\")\nmax_tokens = Kino.Input.number(\"Max tokens\", default: 50)\n\nform = Kino.Control.form(\n %{prompt: prompt_input, max_tokens: max_tokens},\n submit: \"Generate\"\n)\n\nKino.listen(form, fn %{data: %{prompt: prompt, max_tokens: max_tokens}} ->\n config = Bumblebee.configure(gpt2_generation_config, max_new_tokens: trunc(max_tokens))\n serving = Bumblebee.Text.generation(gpt2_model, gpt2_tokenizer, config)\n result = Nx.Serving.run(serving, prompt)\n Kino.Text.new(\"#{prompt}#{result.text}\")\nend)\n\nform","title":"GPT-2 Generator"},"chunks":[{"chunk":"","type":"Elixir"}],"kind":"Elixir","source_type":"cell"} -->

	```elixir
	prompt_input = Kino.Input.text("Prompt", default: "The future of ML in Elixir is")
	max_tokens_input = Kino.Input.number("Max tokens", default: 50)

	form =
	  Kino.Control.form(
	    %{prompt: prompt_input, max_tokens: max_tokens_input},
	    submit: "Generate"
	  )

	# Listener return values are discarded, so results are rendered into a frame
	gen_frame = Kino.Frame.new()

	Kino.listen(form, fn %{data: %{prompt: prompt, max_tokens: max_tokens}} ->
	  # Fall back to 50 tokens if the number field is left empty
	  config = Bumblebee.configure(gpt2_generation_config, max_new_tokens: trunc(max_tokens || 50))

	  serving =
	    Bumblebee.Text.generation(
	      gpt2_model,
	      gpt2_tokenizer,
	      config,
	      defn_options: [compiler: EXLA]
	    )

	  # Bumblebee.Text.generation/4 returns %{results: [%{text: ...}]}
	  %{results: [%{text: text}]} = Nx.Serving.run(serving, prompt)
	  Kino.Frame.render(gen_frame, Kino.Text.new("#{prompt}#{text}"))
	end)

	Kino.Layout.grid([form, gen_frame])
	```

	---

	## Section 5 — Bumblebee: Embeddings & Similarity

	### 5.1 Sentence Embeddings

	```elixir
	{:ok, emb_model} = Bumblebee.load_model({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
	{:ok, emb_tokenizer} = Bumblebee.load_tokenizer({:hf, "sentence-transformers/all-MiniLM-L6-v2"})

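	# Mean pooling plus L2 normalization is the usual recipe for sentence-transformers models
	embedding_serving =
	  Bumblebee.Text.text_embedding(emb_model, emb_tokenizer,
	    output_attribute: :hidden_state,
	    output_pool: :mean_pooling,
	    embedding_processor: :l2_norm
	  )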

	sentences = [
	"Nx provides numerical computing for Elixir",
	"Axon is a neural network library built on Nx",
	"Bumblebee connects Elixir to the Hugging Face Hub",
	"I enjoy cooking Italian food on weekends",
	"The weather forecast predicts rain tomorrow"
	]

	embeddings =
	Enum.map(sentences, fn s ->
	result = Nx.Serving.run(embedding_serving, s)
	result.embedding
	end)

	IO.puts("Generated #{length(embeddings)} embeddings")
	IO.puts("Embedding dim: #{inspect(Nx.shape(hd(embeddings)))}")
	```

	### 5.2 Cosine Similarity Search

	```elixir
	defmodule Similarity do
	  # defn comes from Nx.Defn, not from Nx itself
	  import Nx.Defn

	  # Cosine similarity between two 1-D embeddings
	  defn cosine_similarity(a, b) do
	    a_norm = a / Nx.sqrt(Nx.sum(a * a))
	    b_norm = b / Nx.sqrt(Nx.sum(b * b))
	    Nx.sum(a_norm * b_norm)
	  end

	  # Returns {score, index} pairs sorted from most to least similar
	  def find_most_similar(query_embedding, corpus_embeddings) do
	    corpus_embeddings
	    |> Enum.map(fn emb -> Nx.to_number(cosine_similarity(query_embedding, emb)) end)
	    |> Enum.with_index()
	    |> Enum.sort_by(fn {score, _idx} -> score end, :desc)
	  end
	end

	query = "How do I build neural networks in Elixir?"
	query_emb = Nx.Serving.run(embedding_serving, query).embedding

	IO.puts("Query: \"#{query}\"\n")

	query_emb
	|> Similarity.find_most_similar(embeddings)
	|> Enum.each(fn {score, idx} ->
	  IO.puts(" #{Float.round(score, 4)} #{Enum.at(sentences, idx)}")
	end)
	```
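
	To make the formula concrete, here is the same cosine similarity written for plain Elixir lists. This is only a reference sketch for checking intuition on tiny vectors; `ListSimilarity` is a hypothetical helper, and the Nx version above is the one to use on real embeddings.

	```elixir
	defmodule ListSimilarity do
	  # Cosine similarity for two equal-length lists of numbers
	  def cosine(a, b) do
	    dot = a |> Enum.zip(b) |> Enum.map(fn {x, y} -> x * y end) |> Enum.sum()
	    dot / (norm(a) * norm(b))
	  end

	  # Euclidean length of a vector
	  defp norm(v), do: v |> Enum.map(&(&1 * &1)) |> Enum.sum() |> :math.sqrt()
	end

	IO.inspect(ListSimilarity.cosine([1.0, 0.0], [2.0, 0.0]), label: "same direction")
	IO.inspect(ListSimilarity.cosine([1.0, 0.0], [0.0, 3.0]), label: "orthogonal")
	IO.inspect(ListSimilarity.cosine([1.0, 0.0], [-1.0, 0.0]), label: "opposite")
	```

	Vector length does not matter, only direction: scores run from 1 for the same direction, through 0 for orthogonal vectors, to -1 for opposite directions.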

	---

	## Section 5.5 — Bumblebee: Audio (Whisper Speech-to-Text)

	Bumblebee wraps OpenAI's Whisper for speech-to-text directly in Elixir.

	```elixir
	{:ok, whisper_model} = Bumblebee.load_model({:hf, "openai/whisper-tiny"})
	{:ok, whisper_featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-tiny"})
	{:ok, whisper_tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-tiny"})
	{:ok, whisper_generation_config} = Bumblebee.load_generation_config({:hf, "openai/whisper-tiny"})

	whisper_serving =
	  Bumblebee.Audio.speech_to_text_whisper(
	    whisper_model,
	    whisper_featurizer,
	    whisper_tokenizer,
	    whisper_generation_config,
	    defn_options: [compiler: EXLA]
	  )

	# Download a sample audio file
	audio_url = "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac"
	audio_data = Req.get!(audio_url).body
	File.write!("/tmp/sample_audio.flac", audio_data)

	# Transcribe. The {:file, path} input is decoded with ffmpeg, which must be on the PATH.
	whisper_result = Nx.Serving.run(whisper_serving, {:file, "/tmp/sample_audio.flac"})

	# The result is a list of chunks, each carrying a piece of the transcription
	transcription = Enum.map_join(whisper_result.chunks, & &1.text)
	IO.puts("Transcription: #{transcription}")
	```

	### Interactive Audio Transcription

	```elixir
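	audio_file_input = Kino.Input.file("Upload audio (WAV/FLAC/MP3)")

	audio_form = Kino.Control.form(%{file: audio_file_input}, submit: "Transcribe")
	# Listener return values are discarded, so results are rendered into a frame
	audio_frame = Kino.Frame.new()

	Kino.listen(audio_form, fn %{data: %{file: file}} ->
	  if file do
	    # Kino exposes uploads through a file reference rather than a path field
	    path = Kino.Input.file_path(file.file_ref)
	    result = Nx.Serving.run(whisper_serving, {:file, path})
	    text = Enum.map_join(result.chunks, & &1.text)
	    Kino.Frame.render(audio_frame, Kino.Text.new(text))
	  else
	    Kino.Frame.render(audio_frame, Kino.Text.new("Please upload an audio file."))
	  end
	end)

	Kino.Layout.grid([audio_form, audio_frame])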
	```

	---

	## Section 5.6 — Bumblebee: Stable Diffusion (Image Generation)

	Generate images from text prompts using Stable Diffusion — all within Elixir.

	> Note: This section requires a GPU with 4GB+ VRAM. On CPU it will be very slow.

	```elixir
	repo = "CompVis/stable-diffusion-v1-4"

	{:ok, sd_clip} = Bumblebee.load_model({:hf, repo, subdir: "text_encoder"})
	{:ok, sd_unet} = Bumblebee.load_model({:hf, repo, subdir: "unet"})
	# Only the decoder half of the VAE is needed for text-to-image
	{:ok, sd_vae} = Bumblebee.load_model({:hf, repo, subdir: "vae"}, architecture: :decoder)
	# The tokenizer comes from the CLIP repository, following the Bumblebee documentation
	{:ok, sd_tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/clip-vit-large-patch14"})
	{:ok, sd_scheduler} = Bumblebee.load_scheduler({:hf, repo, subdir: "scheduler"})

	# Image generation serving; the argument order is encoder, unet, vae, tokenizer, scheduler
	sd_serving =
	  Bumblebee.Diffusion.StableDiffusion.text_to_image(
	    sd_clip,
	    sd_unet,
	    sd_vae,
	    sd_tokenizer,
	    sd_scheduler,
	    num_steps: 20,
	    guidance_scale: 7.5,
	    defn_options: [compiler: EXLA]
	  )

	# Generate
	sd_result =
	  Nx.Serving.run(sd_serving, %{
	    prompt: "a photograph of a bee programming in elixir, highly detailed, 4k",
	    negative_prompt: "blurry, low quality"
	  })

	# Display the first generated image (results holds one entry per image)
	[%{image: sd_image} | _] = sd_result.results
	Kino.Image.new(sd_image)
	```

	### Interactive Image Generation

	```elixir
	prompt_input = Kino.Input.text("Prompt", default: "a cute robot bee coding in Elixir")
	neg_input = Kino.Input.text("Negative prompt", default: "blurry, ugly, low quality")

	# The number of steps is fixed when sd_serving is built, so it is not a form field
	sd_form =
	  Kino.Control.form(
	    %{prompt: prompt_input, negative: neg_input},
	    submit: "Generate"
	  )

	# Listener return values are discarded, so results are rendered into a frame
	sd_frame = Kino.Frame.new()

	Kino.listen(sd_form, fn %{data: %{prompt: prompt, negative: negative}} ->
	  result = Nx.Serving.run(sd_serving, %{prompt: prompt, negative_prompt: negative})
	  images = Enum.map(result.results, &Kino.Image.new(&1.image))
	  Kino.Frame.render(sd_frame, Kino.Layout.grid(images, columns: 1))
	end)

	Kino.Layout.grid([sd_form, sd_frame])
	```

	---

	## Section 6 — Custom Training with Axon

	Beyond pre-trained models, Axon lets you build and train from scratch.

	### 6.1 Synthetic Data

	```elixir
	defmodule Data do
	  # Gaussian blobs: one random center per class plus per-sample noise.
	  # Nx is not imported here because Nx.round/1 would clash with Kernel.round/1.
	  def make_classification(n \\ 2000, features \\ 4, classes \\ 3, seed \\ 42) do
	    key = Nx.Random.key(seed)
	    {centers, key} = Nx.Random.normal(key, 0.0, 2.0, shape: {classes, features})
	    {labels_raw, key} = Nx.Random.randint(key, 0, classes, shape: {n})
	    {noise, _key} = Nx.Random.normal(key, 0.0, 0.4, shape: {n, features})

	    x = centers |> Nx.take(labels_raw) |> Nx.add(noise)
	    # One-hot encode the integer labels
	    y = labels_raw |> Nx.new_axis(1) |> Nx.equal(Nx.iota({1, classes})) |> Nx.as_type(:f32)

	    # Normalize each feature to zero mean and unit variance
	    mu = Nx.mean(x, axes: [0])
	    sigma = Nx.standard_deviation(x, axes: [0])
	    x_norm = Nx.divide(Nx.subtract(x, mu), sigma)

	    # Split 80/20 into train and test
	    split = round(n * 0.8)

	    {{x_norm[0..(split - 1)//1], y[0..(split - 1)//1]},
	     {x_norm[split..(n - 1)//1], y[split..(n - 1)//1]}}
	  end
	end

	{{x_train, y_train}, {x_test, y_test}} = Data.make_classification()
	IO.puts("Train: #{Nx.axis_size(x_train, 0)} | Test: #{Nx.axis_size(x_test, 0)}")
	IO.puts("Features: #{Nx.axis_size(x_train, 1)} | Classes: #{Nx.axis_size(y_train, 1)}")
	```
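
	The one-hot step inside `make_classification` is compact enough to be easy to misread. The cell below isolates it on four hand-written labels as a sketch, assuming the Section 0 setup cell has run; `class_ids` is an illustrative name.

	```elixir
	# Integer class labels for four samples, with three possible classes
	class_ids = Nx.tensor([0, 2, 1, 2])

	one_hot =
	  class_ids
	  # shape {4} becomes {4, 1}
	  |> Nx.new_axis(1)
	  # comparing against [[0, 1, 2]] broadcasts to shape {4, 3}
	  |> Nx.equal(Nx.iota({1, 3}))
	  |> Nx.as_type(:f32)

	IO.inspect(Nx.to_list(one_hot), label: "one-hot")

	# argmax along the class axis recovers the original labels
	recovered = Nx.argmax(one_hot, axis: 1)
	IO.inspect(Nx.to_list(recovered), label: "recovered")
	```

	The same `Nx.argmax/2` call is what turns predicted probabilities back into class indices during evaluation.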

	### 6.2 Define Model

	```elixir
	n_features = Nx.axis_size(x_train, 1)
	n_classes = Nx.axis_size(y_train, 1)

	model =
	  Axon.input("features", shape: {nil, n_features})
	  |> Axon.dense(64, activation: :relu, name: "hidden_1")
	  |> Axon.batch_norm(name: "bn_1")
	  |> Axon.dropout(rate: 0.2, name: "drop_1")
	  |> Axon.dense(32, activation: :relu, name: "hidden_2")
	  |> Axon.batch_norm(name: "bn_2")
	  |> Axon.dropout(rate: 0.2, name: "drop_2")
	  |> Axon.dense(n_classes, activation: :softmax, name: "output")

	# Axon.Display.as_table/2 relies on the optional :table_rex dependency
	model |> Axon.Display.as_table(Nx.template({1, n_features}, :f32)) |> IO.puts()
	```

	### 6.3 Train

	```elixir
	# Axon.Loop expects a stream of {input, target} tuples
	train_data =
	  Nx.to_batched(x_train, 64)
	  |> Stream.zip(Nx.to_batched(y_train, 64))
	  |> Stream.map(fn {xb, yb} -> {%{"features" => xb}, yb} end)

	val_data =
	  Nx.to_batched(x_test, 64)
	  |> Stream.zip(Nx.to_batched(y_test, 64))
	  |> Stream.map(fn {xb, yb} -> {%{"features" => xb}, yb} end)

	# Optimizers live in Polaris, which ships as a dependency of Axon
	optimizer = Polaris.Optimizers.adam(learning_rate: 0.001)

	trained_state =
	  model
	  |> Axon.Loop.trainer(:categorical_cross_entropy, optimizer)
	  |> Axon.Loop.metric(:accuracy, "acc")
	  |> Axon.Loop.validate(model, val_data)
	  |> Axon.Loop.early_stopping("validation_loss", patience: 5, mode: :min)
	  |> Axon.Loop.run(train_data, Axon.ModelState.empty(), epochs: 30, compiler: EXLA)

	IO.puts("Training complete!")
	```
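
	The `patience: 5` setting above means training halts once the validation loss has gone five epochs without improving. The sketch below models that rule in plain Elixir on a list of losses. It is a simplified illustration, not Axon's implementation, and `EarlyStopSketch` is a hypothetical name.

	```elixir
	defmodule EarlyStopSketch do
	  # Returns the 1-based epoch at which training would stop, or nil if it
	  # runs through every epoch. Training stops after `patience` consecutive
	  # epochs without a new best (lowest) loss.
	  def stop_epoch(losses, patience) do
	    losses
	    |> Enum.with_index(1)
	    # :infinity works as the initial best because any number sorts before an atom
	    |> Enum.reduce_while({:infinity, 0}, fn {loss, epoch}, {best, since_best} ->
	      cond do
	        loss < best -> {:cont, {loss, 0}}
	        since_best + 1 >= patience -> {:halt, epoch}
	        true -> {:cont, {best, since_best + 1}}
	      end
	    end)
	    |> case do
	      epoch when is_integer(epoch) -> epoch
	      _state -> nil
	    end
	  end
	end

	val_losses = [1.0, 0.8, 0.9, 0.85, 0.81]
	IO.inspect(EarlyStopSketch.stop_epoch(val_losses, 2), label: "patience 2")
	IO.inspect(EarlyStopSketch.stop_epoch(val_losses, 5), label: "patience 5")
	```

	With a patience of 2 the run stops at epoch 4, two epochs after the best loss at epoch 2; with a patience of 5 the five epochs finish without triggering a stop.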

	### 6.4 Evaluate & Predict

	```elixir
	# Build a compiled prediction function
	{_init_fn, predict_fn} = Axon.build(model, compiler: EXLA)

	# Evaluate on test set
	test_preds = predict_fn.(trained_state, x_test)
	pred_classes = Nx.argmax(test_preds, axis: 1)
	true_classes = Nx.argmax(y_test, axis: 1)

	accuracy = pred_classes |> Nx.equal(true_classes) |> Nx.as_type(:f32) |> Nx.mean()
	IO.puts("Test accuracy: #{Float.round(Nx.to_number(accuracy) * 100, 2)}%")

	# Single prediction
	sample = x_test[0]
	probs = Nx.squeeze(predict_fn.(trained_state, Nx.new_axis(sample, 0)))
	predicted_class = probs |> Nx.argmax() |> Nx.to_number()
	rounded_probs = probs |> Nx.to_flat_list() |> Enum.map(&Float.round(&1, 4))
	IO.puts("Sample prediction: class #{predicted_class} probs=#{inspect(rounded_probs)}")
	```

	### 6.5 Visualize Training

	```elixir
	# Visualize predictions vs actual on first 2 features
	alias VegaLite, as: Vl

	scatter_data =
	  Enum.map(0..(min(Nx.axis_size(x_test, 0), 500) - 1), fn i ->
	    %{
	      "f1" => Nx.to_number(x_test[i][0]),
	      "f2" => Nx.to_number(x_test[i][1]),
	      "actual" => Nx.to_number(Nx.argmax(y_test[i])),
	      "predicted" => Nx.to_number(pred_classes[i])
	    }
	  end)

	Vl.new(
	  title: "Test Predictions (first 2 features)",
	  width: 500,
	  height: 400
	)
	|> Vl.data_from_values(scatter_data)
	|> Vl.layers([
	  Vl.new()
	  |> Vl.mark(:circle, opacity: 0.7, size: 40)
	  |> Vl.encode_field(:x, "f1", type: :quantitative)
	  |> Vl.encode_field(:y, "f2", type: :quantitative)
	  |> Vl.encode_field(:color, "actual", type: :nominal, title: "Actual"),
	  Vl.new()
	  |> Vl.mark(:point, opacity: 0.3, size: 20, shape: "cross")
	  |> Vl.encode_field(:x, "f1", type: :quantitative)
	  |> Vl.encode_field(:y, "f2", type: :quantitative)
	  |> Vl.encode_field(:color, "predicted", type: :nominal, title: "Predicted")
	])
	```

	---

	## Section 7 — Nx.Serving: Production Inference

	`Nx.Serving` batches requests from multiple clients and runs them efficiently
	on GPU — essential for production deployment.

	### 7.1 Serving Architecture

	```elixir
	# Start a named serving process (typically in your Application supervisor)
	# In production, you would do:
	#
	#   Nx.Serving.start_link(
	#     serving: sentiment_serving,
	#     name: MyApp.SentimentServing,
	#     batch_size: 16,
	#     batch_timeout: 100
	#   )
	#
	# Then in your Phoenix controller, call the process by name:
	#
	#   Nx.Serving.batched_run(MyApp.SentimentServing, text)

	# For demonstration, run directly:
	IO.puts("Nx.Serving.run/2 executes a serving inline in the calling process")
	IO.puts("For production, start it with Nx.Serving.start_link/1 and call Nx.Serving.batched_run/2")
	```
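
	A serving does not have to come from Bumblebee. The sketch below builds one from an ordinary function with `Nx.Serving.jit/1` and feeds it an explicit `Nx.Batch`, which is the unit that servings operate on. It assumes the Section 0 setup cell has run; the variable names are illustrative.

	```elixir
	# A tiny serving that doubles its input, built from an ordinary function
	double_serving = Nx.Serving.jit(&Nx.multiply(&1, 2))

	# Nx.Batch groups individual inputs so they run through the function together
	double_batch = Nx.Batch.stack([Nx.tensor([1, 2, 3]), Nx.tensor([4, 5, 6])])

	doubled = Nx.Serving.run(double_serving, double_batch)
	IO.inspect(Nx.to_list(doubled), label: "doubled batch")
	```

	Bumblebee servings wrap the same mechanism and add preprocessing (tokenization, featurization) and postprocessing around the model call.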

	### 7.2 Benchmark Inference

	```elixir
	# Benchmark a single inference
	{us_single, _} = :timer.tc(fn ->
	Nx.Serving.run(sentiment_serving, "This is a test sentence for benchmarking.")
	end)

	IO.puts("Single inference: #{Float.round(us_single / 1000, 2)}ms")

	# Benchmark eight sequential requests (each one is a separate call, not one batch)
	batch = for i <- 1..8, do: "Test sentence number #{i} for batch benchmarking."
	{us_batch, _} = :timer.tc(fn ->
	Enum.map(batch, fn text -> Nx.Serving.run(sentiment_serving, text) end)
	end)

	IO.puts("8 sequential inferences: #{Float.round(us_batch / 1000, 2)}ms")
	IO.puts("Per-request (sequential average): #{Float.round(us_batch / 8000, 2)}ms")
	```
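
	Single timings are noisy, and the first call to a compiled function includes compilation. A small helper that warms up and then averages several runs gives steadier numbers. The sketch below is plain Elixir; `Bench` and its defaults are hypothetical, and the workload is a stand-in for a serving call.

	```elixir
	defmodule Bench do
	  # Average wall-clock time in milliseconds over `runs` calls,
	  # measured after `warmup` untimed calls
	  def avg_ms(fun, runs \\ 10, warmup \\ 1) when runs > 0 do
	    for _ <- 1..warmup//1, do: fun.()
	    {total_us, _results} = :timer.tc(fn -> for _ <- 1..runs, do: fun.() end)
	    total_us / runs / 1000
	  end
	end

	# Stand-in workload; in this notebook you would pass something like
	# fn -> Nx.Serving.run(sentiment_serving, "some text") end
	ms = Bench.avg_ms(fn -> Enum.reduce(1..100_000, 0, &+/2) end)
	IO.puts("average: #{Float.round(ms, 3)}ms")
	```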

	---

	## Section 7.5 — Fine-tuning with Bumblebee

	Models loaded through Bumblebee are regular Axon models, so they support
	transfer learning: keep the pre-trained encoder, attach a fresh classification
	head, and fine-tune on your own labeled data. This is far more practical than
	training from scratch when you have limited data.

	### 7.5.1 Load Pre-trained Model for Fine-tuning

	```elixir
	# Load the BERT spec with a sequence-classification head
	{:ok, ft_spec} =
	  Bumblebee.load_spec({:hf, "google-bert/bert-base-uncased"},
	    architecture: :for_sequence_classification
	  )

	# Configure for your number of classes
	ft_spec = Bumblebee.configure(ft_spec, num_labels: 3)

	# Load with the custom spec: only the encoder weights are pre-trained,
	# the classification head is randomly initialized
	{:ok, ft_model_info} =
	  Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"}, spec: ft_spec)

	%{model: ft_model, params: ft_params} = ft_model_info
	IO.puts("Model loaded. Pre-trained encoder + fresh classification head.")
	IO.puts("Number of labels: #{ft_spec.num_labels}")
	```

	### 7.5.2 Prepare Labeled Data

	```elixir
	# Your labeled dataset — in practice, load from CSV/JSON
	training_texts = [
	"The BEAM VM provides fault-tolerant concurrent computing",
	"Nx brings numerical computing to the Elixir ecosystem",
	"Phoenix LiveView enables real-time web applications",
	"Bumblebee integrates Hugging Face models into Elixir",
	"Axon provides a functional API for neural networks",
	"The weather is sunny and warm today",
	"Football season starts next month",
	"The stock market rallied on positive earnings reports",
	# ... add more samples per class
	]

	training_labels = [0, 0, 0, 0, 0, 1, 1, 2] # 0=tech, 1=sports, 2=finance

	# Tokenize
	{:ok, ft_tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

	encoded = Bumblebee.apply_tokenizer(ft_tokenizer, training_texts)
	IO.inspect(encoded, label: "Tokenized input")
	```

	### 7.5.3 Training Loop

	```elixir
	# Bumblebee models return a map of outputs; the loss only needs the logits
	logits_model = Axon.nx(ft_model, & &1.logits)

	# Cross-entropy on raw logits with one-hot targets
	loss = &Axon.Losses.categorical_cross_entropy(&1, &2, from_logits: true, reduction: :mean)
	optimizer = Polaris.Optimizers.adam(learning_rate: 2.0e-5)

	epochs = 3
	batch_size = 4
	n_samples = length(training_labels)

	# One-hot encode the integer labels: shape {n_samples, 3}
	one_hot =
	  training_labels
	  |> Nx.tensor()
	  |> Nx.new_axis(1)
	  |> Nx.equal(Nx.iota({1, 3}))
	  |> Nx.as_type(:f32)

	# `encoded` is a map of tensors, so slice every entry along the batch axis
	train_batches =
	  for start <- 0..(n_samples - 1)//batch_size do
	    inputs =
	      Map.new(encoded, fn {name, tensor} ->
	        {name, Nx.slice_along_axis(tensor, start, batch_size, axis: 0)}
	      end)

	    {inputs, Nx.slice_along_axis(one_hot, start, batch_size, axis: 0)}
	  end

	# Start from the pre-trained parameters and log the loss on every step
	trained_ft_params =
	  logits_model
	  |> Axon.Loop.trainer(loss, optimizer, log: 1)
	  |> Axon.Loop.run(train_batches, ft_params, epochs: epochs, compiler: EXLA, strict?: false)
	```

	> Tip: For production fine-tuning, use `Axon.Loop` with proper data
	> pipelines, learning rate scheduling, and mixed precision. The above
	> demonstrates the concept — see `Bumblebee` examples on GitHub for
	> full fine-tuning recipes.
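
	The quantity that fine-tuning minimizes is softmax cross-entropy on the model's logits. The sketch below works through it for a single example in plain Elixir, so the arithmetic is visible; `CrossEntropySketch` is a hypothetical module, and real training should use the tensor-based loss.

	```elixir
	defmodule CrossEntropySketch do
	  # Numerically stable log-softmax over a list of logits
	  def log_softmax(logits) do
	    m = Enum.max(logits)
	    log_sum = logits |> Enum.map(&:math.exp(&1 - m)) |> Enum.sum() |> :math.log()
	    Enum.map(logits, &(&1 - m - log_sum))
	  end

	  # Cross-entropy between logits and a one-hot target list
	  def loss(logits, one_hot) do
	    logits
	    |> log_softmax()
	    |> Enum.zip(one_hot)
	    |> Enum.map(fn {log_p, target} -> -log_p * target end)
	    |> Enum.sum()
	  end
	end

	# Three-class example where the correct class (index 0) has the largest logit
	ce = CrossEntropySketch.loss([2.0, 1.0, 0.1], [1, 0, 0])
	IO.puts("cross-entropy: #{Float.round(ce, 4)}")
	```

	The loss is the negative log of the probability assigned to the correct class, so it approaches zero as that probability approaches one.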

	---

	## Section 7.6 — Model Export (ONNX / GGUF)

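	### 7.6.1 Export a Bumblebee model to ONNX

	Bumblebee does not expose an export function of its own. A loaded model is an
	Axon graph plus parameters, so ONNX export has to go through a separate
	package such as `axon_onnx`, which is not in the `Mix.install` list above.

	```elixir
	# Load a pretrained model with a sequence-classification head
	{:ok, spec} =
	  Bumblebee.load_spec({:hf, "google-bert/bert-base-uncased"},
	    architecture: :for_sequence_classification
	  )

	{:ok, model_info} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"}, spec: spec)

	# Export to ONNX with the `axon_onnx` package. Treat this call as an assumption
	# to verify: check its docs for the exact `AxonOnnx.export` arguments and for
	# the Axon versions and layers it supports before relying on it.
	#
	#   AxonOnnx.export(model_info.model, input_templates, model_info.params)
	```

	The resulting ONNX file, called `distilbert.onnx` below, can be loaded in any ONNX runtime.

	### 7.6.2 Export to GGUF (via `gguf-converter`)

	```bash
	# `gguf-converter` stands for whichever conversion tool you use
	gguf-converter --onnx distilbert.onnx --output distilbert.gguf
	```

	> Note: GGUF is primarily for decoder models (e.g., Llama), and encoder models
	> such as BERT may not be supported by your converter. Ensure compatibility.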

	---

	## Section 7.7 — Phoenix LiveView Integration

	Deploy ML models into Phoenix web apps using `Nx.Serving` and LiveView.

	### 7.7.1 Application Supervision Tree

	```elixir
	# In your Phoenix app's application.ex:
	defmodule MyApp.Application do
	use Application

	@impl true
	def start(_type, _args) do
	children = [
	MyAppWeb.Telemetry,
	{DNSCluster, query: Application.get_env(:my_app, :dns_cluster_query) || :ignore},
	{Phoenix.PubSub, name: MyApp.PubSub},

	# ─── ML Servings ───────────────────────────────────────
	# Sentiment analysis serving
	{Nx.Serving,
	serving: sentiment_serving(),
	name: MyApp.SentimentServing,
	batch_size: 16,
	batch_timeout: 100},

	# Image classification serving
	{Nx.Serving,
	serving: image_serving(),
	name: MyApp.ImageServing,
	batch_size: 8,
	batch_timeout: 200},

	# Embedding serving (for similarity search)
	{Nx.Serving,
	serving: embedding_serving(),
	name: MyApp.EmbeddingServing,
	batch_size: 32,
	batch_timeout: 50},

	MyAppWeb.Endpoint
	]

	opts = [strategy: :one_for_one, name: MyApp.Supervisor]
	Supervisor.start_link(children, opts)
	end

	defp sentiment_serving do
	{:ok, model} = Bumblebee.load_model({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})
	{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})
	Bumblebee.Text.text_classification(model, tokenizer)
	end

	defp image_serving do
	{:ok, model} = Bumblebee.load_model({:hf, "google/vit-base-patch16-224"})
	{:ok, featurizer} = Bumblebee.load_featurizer({:hf, "google/vit-base-patch16-224"})
	Bumblebee.Vision.image_classification(model, featurizer)
	end

	defp embedding_serving do
	{:ok, model} = Bumblebee.load_model({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
	{:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
	Bumblebee.Text.text_embedding(model, tokenizer)
	end
	end
	```

	### 7.7.2 LiveView for Sentiment Analysis

	```elixir
	# lib/my_app_web/live/sentiment_live.ex
	defmodule MyAppWeb.SentimentLive do
	use MyAppWeb, :live_view

	def mount(_params, _session, socket) do
	{:ok, assign(socket, text: "", result: nil, loading: false)}
	end

	def handle_event("analyze", %{"text" => text}, socket) do
	# The serving runs as a named process, so call it with batched_run/2,
	# which groups this request with concurrent ones
	result = Nx.Serving.batched_run(MyApp.SentimentServing, text)

	label = hd(result.predictions)
	{:noreply, assign(socket,
	text: text,
	result: %{
	label: label.label,
	score: Float.round(label.score * 100, 1)
	},
	loading: false
	)}
	end

	def handle_event("update_text", %{"text" => text}, socket) do
	{:noreply, assign(socket, text: text)}
	end

	def render(assigns) do
	~H"""
	<div class="max-w-lg mx-auto p-6">
	<h1 class="text-2xl font-bold mb-4">🐝 Sentiment Analysis</h1>

	<form phx-submit="analyze">
	<textarea
	name="text"
	phx-change="update_text"
	class="w-full p-3 border rounded-lg"
	rows="3"
	placeholder="Enter text to analyze..."
	><%= @text %></textarea>
	<button type="submit" class="mt-2 px-4 py-2 bg-purple-600 text-white rounded-lg">
	Analyze
	</button>
	</form>

	<%= if @result do %>
	<div class="mt-4 p-4 bg-gray-100 rounded-lg">
	<p class="text-lg">
	<strong><%= @result.label %></strong>
	— <%= @result.score %>%
	</p>
	</div>
	<% end %>
	</div>
	"""
	end
	end
	```

	### 7.7.3 LiveView for Image Classification

	```elixir
	# lib/my_app_web/live/vision_live.ex
	defmodule MyAppWeb.VisionLive do
	  use MyAppWeb, :live_view

	  def mount(_params, _session, socket) do
	    socket =
	      socket
	      |> assign(predictions: nil)
	      # LiveView receives files through its upload API, not as %Plug.Upload{} params
	      |> allow_upload(:image, accept: ~w(.jpg .jpeg .png), max_entries: 1)

	    {:ok, socket}
	  end

	  # Live uploads need a change handler, even if it does nothing
	  def handle_event("validate", _params, socket), do: {:noreply, socket}

	  def handle_event("classify", _params, socket) do
	    images =
	      consume_uploaded_entries(socket, :image, fn %{path: path}, _entry ->
	        {:ok, StbImage.read_file!(path)}
	      end)

	    predictions =
	      case images do
	        [image | _] ->
	          MyApp.ImageServing
	          |> Nx.Serving.batched_run(image)
	          |> Map.fetch!(:predictions)
	          |> Enum.take(5)
	          |> Enum.map(fn %{label: label, score: score} ->
	            %{label: label, score: Float.round(score * 100, 1)}
	          end)

	        [] ->
	          nil
	      end

	    {:noreply, assign(socket, predictions: predictions)}
	  end

	  def render(assigns) do
	    ~H"""
	    <div class="max-w-lg mx-auto p-6">
	      <h1 class="text-2xl font-bold mb-4">🖼️ Image Classification</h1>

	      <form phx-submit="classify" phx-change="validate">
	        <.live_file_input upload={@uploads.image} class="mb-2" />
	        <button type="submit" class="px-4 py-2 bg-purple-600 text-white rounded-lg">
	          Classify
	        </button>
	      </form>

	      <%= if @predictions do %>
	        <div class="mt-4">
	          <%= for pred <- @predictions do %>
	            <div class="flex justify-between py-1 border-b">
	              <span><%= pred.label %></span>
	              <span class="font-mono"><%= pred.score %>%</span>
	            </div>
	          <% end %>
	        </div>
	      <% end %>
	    </div>
	    """
	  end
	end
	```

	### 7.7.4 API Endpoint (REST)

	```elixir
	# lib/my_app_web/controllers/prediction_controller.ex
	defmodule MyAppWeb.PredictionController do
	use MyAppWeb, :controller

	def sentiment(conn, %{"text" => text}) do
	result = Nx.Serving.batched_run(MyApp.SentimentServing, text)

	json(conn, %{
	predictions: Enum.map(result.predictions, fn p ->
	%{label: p.label, score: Float.round(p.score, 4)}
	end)
	})
	end

	def embed(conn, %{"text" => text}) do
	result = Nx.Serving.batched_run(MyApp.EmbeddingServing, text)

	json(conn, %{
	embedding: Nx.to_flat_list(result.embedding),
	dimensions: Nx.size(result.embedding)
	})
	end
	end

	# In router.ex:
	# scope "/api", MyAppWeb do
	# post "/sentiment", PredictionController, :sentiment
	# post "/embed", PredictionController, :embed
	# end
	```

	---

	## Section 8 — Interactive Playground

	### 8.1 Text Classification UI

	```elixir
	text_input =
	  Kino.Input.textarea("Enter text to classify",
	    default: "Elixir is the best language for building scalable ML systems!"
	  )

	class_form = Kino.Control.form(%{text: text_input}, submit: "Classify Sentiment")
	# Listener return values are discarded, so results are rendered into a frame
	class_frame = Kino.Frame.new()

	Kino.listen(class_form, fn %{data: %{text: text}} ->
	  result = Nx.Serving.run(sentiment_serving, text)

	  label_text =
	    result
	    |> Map.get(:predictions, [])
	    |> Enum.map_join(" | ", fn %{label: label, score: score} ->
	      "#{label}: #{Float.round(score * 100, 1)}%"
	    end)

	  Kino.Frame.render(class_frame, Kino.Text.new(label_text))
	end)

	Kino.Layout.grid([class_form, class_frame])
	```

	### 8.2 Embedding Similarity UI

	```elixir
	ref_input = Kino.Input.textarea("Reference text", default: "Nx provides numerical computing for the BEAM")
	query_input = Kino.Input.textarea("Query text", default: "How do I do math in Elixir?")

	sim_form = Kino.Control.form(%{ref: ref_input, query: query_input}, submit: "Compute Similarity")
	# Listener return values are discarded, so results are rendered into a frame
	sim_frame = Kino.Frame.new()

	Kino.listen(sim_form, fn %{data: %{ref: ref, query: query}} ->
	  ref_emb = Nx.Serving.run(embedding_serving, ref).embedding
	  query_emb = Nx.Serving.run(embedding_serving, query).embedding
	  score = ref_emb |> Similarity.cosine_similarity(query_emb) |> Nx.to_number()

	  Kino.Frame.render(sim_frame, Kino.Text.new("Cosine similarity: #{Float.round(score, 6)}"))
	end)

	Kino.Layout.grid([sim_form, sim_frame])
	```
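
	The score shown by the form is a cosine similarity: the dot product of the two embeddings divided by the product of their norms. A plain-Elixir sketch of that arithmetic on lists, assuming `Similarity.cosine_similarity/2` from earlier in the notebook computes the standard definition; `CosineSketch` is a hypothetical name used only for this illustration:

	```elixir
	defmodule CosineSketch do
	  # Cosine similarity of two equal-length lists of floats.
	  # Raises ArithmeticError if either vector is all zeros (its norm is 0).
	  def cosine(a, b) when length(a) == length(b) do
	    dot = a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
	    dot / (norm(a) * norm(b))
	  end

	  # Euclidean (L2) norm.
	  defp norm(v), do: v |> Enum.reduce(0.0, fn x, acc -> acc + x * x end) |> :math.sqrt()
	end

	CosineSketch.cosine([1.0, 0.0], [1.0, 0.0]) # same direction => 1.0
	CosineSketch.cosine([1.0, 0.0], [0.0, 1.0]) # orthogonal => 0.0
	```

	Values near 1 mean the two texts point in nearly the same direction in embedding space; values near 0 mean they are unrelated.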

	---
	## Section 8.3 — Distributed Training on BEAM Nodes

	### 8.3.1 Using `Nx` with `Node.spawn/4`

	```elixir
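	# Assumes connected BEAM nodes that already have Nx and DistTrainer loaded.
	nodes = [:"node1@host", :"node2@host"]

	defmodule DistTrainer do
	  # Runs on the remote node and sends the mean back to the caller.
	  def remote_mean(caller, node, tensor) do
	    send(caller, {:mean, node, Nx.to_number(Nx.mean(tensor))})
	  end

	  # Node.spawn/4 returns a pid, not a result, so results come back as messages.
	  def compute(nodes, tensor) do
	    # Copy to the binary backend so the tensor data can be sent between nodes.
	    tensor = Nx.backend_copy(tensor, Nx.BinaryBackend)
	    Enum.each(nodes, &Node.spawn(&1, __MODULE__, :remote_mean, [self(), &1, tensor]))

	    for node <- nodes do
	      receive do
	        {:mean, ^node, mean} -> {node, mean}
	      after
	        5_000 -> {node, :timeout}
	      end
	    end
	  end
	end

	tensor = Nx.tensor([[1, 2, 3], [4, 5, 6]])
	results = DistTrainer.compute(nodes, tensor)
	IO.inspect(results, label: "Means from each node")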
	```

	### 8.3.2 Distributed Axon training

	```elixir
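	# Axon has no built-in multi-node trainer, so this sketch trains one replica
	# per node on its own shard of batches and returns every node's final state.
	defmodule ClusterTrainer do
	  def model do
	    Axon.input("input", shape: {nil, 784})
	    |> Axon.dense(128, activation: :relu)
	    |> Axon.dense(10, activation: :softmax)
	  end

	  # Runs on one node; `batches` is an enumerable of {inputs, targets} tuples.
	  def train_shard(batches, epochs) do
	    model()
	    |> Axon.Loop.trainer(:categorical_cross_entropy, :adam)
	    |> Axon.Loop.run(batches, Axon.ModelState.empty(), epochs: epochs)
	    # Copy parameters off the device so they can be sent back to the caller.
	    |> Nx.backend_copy(Nx.BinaryBackend)
	  end

	  # `shards` holds one list of binary-backend batches per node, in `nodes` order.
	  def train(nodes, shards, epochs \\ 5) do
	    nodes
	    |> Enum.zip(shards)
	    |> Enum.map(fn {node, batches} ->
	      Task.async(fn -> :erpc.call(node, __MODULE__, :train_shard, [batches, epochs]) end)
	    end)
	    |> Task.await_many(:infinity)
	  end
	end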
	```

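	> Tip: Every node needs the same compiled code, including `DistTrainer` and `ClusterTrainer`, and the same dependencies (`nx`, `axon`, `exla`); modules defined only in a Livebook cell are not available on remote nodes. Form the cluster with `Node.connect/1` or a clustering library such as `libcluster`.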
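
	Data-parallel training needs each node to receive its own slice of the batches. A minimal sketch of a round-robin split in plain Elixir, assuming the batches fit in memory as a list; `Sharding.split/2` is a hypothetical helper, not part of Nx or Axon:

	```elixir
	defmodule Sharding do
	  # Deals batches out round-robin so shard sizes differ by at most one.
	  # Always returns exactly `shard_count` lists, some possibly empty.
	  def split(batches, shard_count) when shard_count > 0 do
	    groups =
	      batches
	      |> Enum.with_index()
	      |> Enum.group_by(fn {_batch, i} -> rem(i, shard_count) end, fn {batch, _i} -> batch end)

	    for i <- 0..(shard_count - 1), do: Map.get(groups, i, [])
	  end
	end

	Sharding.split([:b0, :b1, :b2, :b3, :b4], 2)
	# => [[:b0, :b2, :b4], [:b1, :b3]]
	```

	Round-robin keeps the shards balanced when the number of batches is not a multiple of the node count, which keeps per-node training time roughly even.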

	---

	## Section 9 — Summary & Next Steps

	```elixir
	Kino.Markdown.new("""
	## What You've Built

	| Pipeline Stage | Implementation | Key Library |
	|----------------|----------------|-------------|
	| Tensors | Creation, ops, broadcasting, gradients | `Nx` |
	| JIT Compile | GPU-accelerated inference | `EXLA` |
	| Fill-Mask | BERT masked language modeling | `Bumblebee` |
	| Sentiment | DistilBERT text classification | `Bumblebee` |
	| NER | Named entity recognition | `Bumblebee` |
	| Zero-Shot | Classify without fine-tuning | `Bumblebee` |
	| Image CLS | Vision Transformer (ViT) | `Bumblebee` |
	| Audio | Whisper speech-to-text | `Bumblebee` |
	| Stable Diffusion | Text-to-image generation | `Bumblebee` |
	| Text Gen | GPT-2 autoregressive generation | `Bumblebee` |
	| Embeddings | Sentence similarity search | `Bumblebee` |
	| Custom MLP | Train from scratch with Axon | `Axon` |
	| Fine-tuning | Boosted training on pre-trained models | `Bumblebee` |
	| Serving | Production batched inference | `Nx.Serving` |
	| Phoenix | LiveView + REST API deployment | `Phoenix` |
	| Interactive | Kino live forms | `Kino` |

	### Companion Notebooks

	| Format | Path | Deploy To |
	|--------|------|-----------|
	| Livebook | `ml_e2e_template.livemd` | HF Spaces (Docker), Livebook Teams |
	| Jupyter | `colab_kaggle/ml_e2e_python.ipynb` | Google Colab, Kaggle |
	| Gradio | `gradio_hf_deploy/app.py` | HF Spaces (sdk: gradio) |
	| marimo | `marimo/ml_e2e_marimo.py` | Anywhere Python runs |

	### Resources

	* [Bumblebee docs](https://hexdocs.pm/bumblebee) — Pre-trained models
	* [Nx docs](https://hexdocs.pm/nx) — Numerical computing
	* [Axon docs](https://hexdocs.pm/axon) — Neural networks
	* [EXLA docs](https://hexdocs.pm/exla) — XLA compiler and backend for Nx (CPU and GPU)
	* [Phoenix docs](https://hexdocs.pm/phoenix) — Web framework
	* [Hugging Face Hub](https://huggingface.co/models) — 500k+ models
	* [marimo docs](https://docs.marimo.io) — Reactive Python notebooks
	* _Machine Learning in Elixir_ — Sean Moriarity, Pragmatic Bookshelf

	### Deploy

	~~~bash
	just check # Verify all files and tools
	just livebook # Open Livebook locally
	just deploy-livebook # Push to HF Spaces
	just marimo # Open marimo editor
	just gradio # Run Gradio app
	~~~
	""")
	```