# Machine Learning in Elixir — End-to-End with Bumblebee + Hugging Face

## Overview

## Skills

- `hf_cli.md` – Hugging Face CLI usage
- `hf_jobs.md` – Running workloads on HF Jobs
- `training_trl.md` – TRL model training
- `hf_dataset_viewer.md` – Dataset Viewer API
- `gradio.md` – Gradio UI integration
- *(Full catalog at https://skills.sh/huggingface/skills)*

This Livebook is a complete end-to-end ML template built on the Elixir ML ecosystem from _Machine Learning in Elixir_ by Sean Moriarity, with **Bumblebee** as the core integration layer to the **Hugging Face Hub**.

**What we cover:**

| Section | Library | Task |
|---------|---------|------|
| Foundations | `Nx` | Tensors, gradients, JIT compilation |
| Pre-trained NLP | `Bumblebee` | Fill-mask, sentiment, NER, zero-shot |
| Pre-trained Vision | `Bumblebee` | Image classification (ViT) |
| Audio | `Bumblebee` | Speech-to-text (Whisper) |
| Generative AI | `Bumblebee` | Text generation (GPT-2) & Stable Diffusion |
| Embeddings | `Bumblebee` | Sentence similarity search |
| Custom Training | `Axon` | Build & train from scratch |
| Fine-tuning | `Bumblebee` + `Axon` | Fine-tune a pre-trained model on labeled data |
| Serving | `Nx.Serving` | Production batched inference |
| Deployment | `Phoenix` | LiveView integration pattern |
| Interactive UI | `Kino` | Live input forms & charts |

---

## Section 0 — Install & Configure

```elixir
Mix.install([
  {:nx, "~> 0.10"},
  {:axon, "~> 0.7"},
  {:exla, "~> 0.10"},
  {:bumblebee, "~> 0.6"},
  {:kino, "~> 0.15"},
  {:kino_vega_lite, "~> 0.1"},
  {:vega_lite, "~> 0.1"},
  {:stb_image, "~> 0.6"},
  {:req, "~> 0.5"},
  # Optional Axon dependency, needed by Axon.Display.as_table/2 in Section 6.2
  {:table_rex, "~> 3.1"}
])

Nx.global_default_backend(EXLA.Backend)

# Nx and Axon have no version/0 function; read the versions from the application specs
IO.puts("Nx version: #{Application.spec(:nx, :vsn)}")
IO.puts("Axon version: #{Application.spec(:axon, :vsn)}")
IO.puts("EXLA backend: #{inspect(Nx.default_backend())}")
IO.puts("Bumblebee loaded: #{Code.ensure_loaded?(Bumblebee)}")
IO.puts("Cache dir: #{Bumblebee.cache_dir()}")
```

---

## Section 1 — Nx Foundations

Before diving into Bumblebee, let's ground ourselves in Nx,
the numerical backbone that every Elixir ML library builds on.

### 1.1 Tensors

```elixir
# Scalars, vectors, matrices, higher-order tensors
scalar = Nx.tensor(3.14)
vector = Nx.tensor([1.0, 2.0, 3.0])
matrix = Nx.tensor([[1, 2, 3], [4, 5, 6]])
cube = Nx.iota({2, 3, 4})

# Nx.type/1 returns a tuple such as {:f, 32}, so it has to go through inspect/1
IO.puts("scalar shape=#{inspect(Nx.shape(scalar))} type=#{inspect(Nx.type(scalar))}")
IO.puts("vector shape=#{inspect(Nx.shape(vector))} type=#{inspect(Nx.type(vector))}")
IO.puts("matrix shape=#{inspect(Nx.shape(matrix))} type=#{inspect(Nx.type(matrix))}")
IO.puts("cube shape=#{inspect(Nx.shape(cube))} type=#{inspect(Nx.type(cube))}")

{scalar, vector, matrix}
```

### 1.2 Operations & Broadcasting

```elixir
a = Nx.tensor([1.0, 2.0, 3.0])
b = Nx.tensor([10.0, 20.0, 30.0])

# Element-wise (the scalar 2 passed to pow is broadcast across the vector)
IO.puts("add: #{inspect(Nx.add(a, b))}")
IO.puts("multiply: #{inspect(Nx.multiply(a, b))}")
IO.puts("pow: #{inspect(Nx.pow(a, 2))}")

# Reductions return scalar tensors; convert them to numbers before interpolating
IO.puts("sum: #{Nx.to_number(Nx.sum(a))}")
IO.puts("mean: #{Nx.to_number(Nx.mean(a))}")
IO.puts("std: #{Nx.to_number(Nx.standard_deviation(a))}")

# Dot product
IO.puts("dot: #{Nx.to_number(Nx.dot(a, b))}")

# Matrix multiply
m1 = Nx.tensor([[1.0, 2.0], [3.0, 4.0]])
m2 = Nx.tensor([[5.0, 6.0], [7.0, 8.0]])
IO.puts("matmul: #{inspect(Nx.dot(m1, m2))}")
```

### 1.3 Automatic Differentiation

This is how models learn: by computing gradients of the loss with respect to the parameters.
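
As a hand-worked reference for the cell that follows, the polynomial it differentiates has a simple closed-form derivative, which gives a number to compare against what Nx reports. The second line is the standard gradient of a mean squared error for a linear model; the symbols $X$ (inputs), $y$ (targets), $w$ (weights) and $n$ (number of samples) are notation introduced here for illustration.

$$
f(x) = x^3 + 2x^2, \qquad f'(x) = 3x^2 + 4x, \qquad f'(3) = 3 \cdot 9 + 4 \cdot 3 = 39
$$

$$
L(w) = \frac{1}{n} \sum_{i=1}^{n} \left(y_i - x_i^\top w\right)^2, \qquad \nabla_w L = -\frac{2}{n} X^\top (y - Xw)
$$
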
```elixir
defmodule AutoDiff do
  import Nx.Defn

  # f(x) = x³ + 2x² (public, so it can be called from outside the module)
  defn f(x), do: Nx.pow(x, 3) + 2 * Nx.pow(x, 2)

  # Gradient of f with respect to x; grad/2 is available inside defn
  defn grad_f(x), do: grad(x, fn x -> f(x) end)

  # Mean squared error between targets and predictions
  defnp mse_loss(y_true, y_pred) do
    Nx.mean(Nx.pow(y_true - y_pred, 2))
  end

  # Gradient of the MSE loss of a linear model x · w with respect to the weights w
  defn grad_mse(y_true, x, w) do
    grad(w, fn weights ->
      predictions = Nx.dot(x, weights)
      mse_loss(y_true, predictions)
    end)
  end
end

x = Nx.tensor(3.0)
IO.puts("f(3) = #{Nx.to_number(AutoDiff.f(x))}")
IO.puts("f'(3) = #{Nx.to_number(AutoDiff.grad_f(x))}")
IO.puts("expected = 3*9 + 4*3 = #{3 * 9 + 4 * 3}")
```

### 1.4 JIT Compilation

```elixir
# JIT compilation targets the CPU or GPU and is critical for inference speed
defmodule FastMath do
  import Nx.Defn

  defn slow_sigmoid(x) do
    1 / (1 + Nx.exp(-x))
  end
end

# JIT-compiled version
fast_sigmoid = Nx.Defn.jit(&FastMath.slow_sigmoid/1)

input = Nx.tensor([[-2.0, -1.0, 0.0, 1.0, 2.0]])

# The first call includes compilation, so warm up before timing
fast_sigmoid.(input)

# Benchmark
{us, result} = :timer.tc(fn -> fast_sigmoid.(input) end)
IO.puts("JIT sigmoid: #{us}μs result=#{inspect(result)}")
```

---

## Section 2 — Bumblebee: Pre-trained NLP Models

Bumblebee loads pre-trained models from the Hugging Face Hub and wraps them in `Nx.Serving` for production-ready batched inference.
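
To make the role of `Nx.Serving` concrete before loading real models, here is a minimal sketch of a serving built by hand around a plain `defn` function. The module name `DoubleServing` and the doubling function are illustrative and not part of the original notebook; the cell assumes the Section 0 setup has already run so that Nx is available.

```elixir
defmodule DoubleServing do
  import Nx.Defn

  # Illustrative computation: multiply every element by two
  defn double(x), do: x * 2

  # A serving wraps a function that receives compiler options and
  # returns the function to run on each batch
  def new do
    Nx.Serving.new(fn opts -> Nx.Defn.jit(&double/1, opts) end)
  end
end

# Stack two inputs into a single batch and run them in one call
demo_batch = Nx.Batch.stack([Nx.tensor([1, 2, 3]), Nx.tensor([4, 5, 6])])
doubled = Nx.Serving.run(DoubleServing.new(), demo_batch)

IO.inspect(doubled, label: "doubled batch")
```

The servings that Bumblebee builds in the following sections are used the same way; as far as the caller is concerned, they mainly add task-specific steps such as tokenization before the computation and decoding after it.
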
### 2.1 Fill-Mask (BERT)

```elixir
{:ok, bert_model} = Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"})
{:ok, bert_tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

bert_fill_mask = Bumblebee.Text.fill_mask(bert_model, bert_tokenizer)

results = Nx.Serving.run(bert_fill_mask, "Elixir is a [MASK] language.")
IO.inspect(results, label: "Fill-Mask Results")
```

### 2.2 Sentiment Analysis (DistilBERT)

```elixir
{:ok, sentiment_model} =
  Bumblebee.load_model({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})

{:ok, sentiment_tokenizer} =
  Bumblebee.load_tokenizer({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})

# The public entry point is Bumblebee.Text.text_classification/3
sentiment_serving = Bumblebee.Text.text_classification(sentiment_model, sentiment_tokenizer)

texts = [
  "Machine learning in Elixir is amazing!",
  "This tutorial is boring and confusing.",
  "The BEAM VM handles concurrent ML workloads well.",
  "I love how functional programming simplifies ML pipelines."
]

Enum.each(texts, fn text ->
  result = Nx.Serving.run(sentiment_serving, text)
  IO.puts("#{inspect(result)} ← \"#{String.slice(text, 0..50)}...\"")
end)
```

### 2.3 Named Entity Recognition (BERT-NER)

```elixir
{:ok, ner_model} = Bumblebee.load_model({:hf, "dslim/bert-base-NER"})

# dslim/bert-base-NER is a cased model, so it needs the cased BERT tokenizer
{:ok, ner_tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-cased"})

# aggregation: :same merges word pieces that belong to the same entity
ner_serving =
  Bumblebee.Text.token_classification(ner_model, ner_tokenizer, aggregation: :same)

ner_text =
  "Sean Moriarity wrote Machine Learning in Elixir for Pragmatic Bookshelf. He lives in Austin, Texas."

ner_result = Nx.Serving.run(ner_serving, ner_text)

IO.puts("Input: #{ner_text}")
IO.inspect(ner_result, label: "NER Entities")
```

### 2.4 Zero-Shot Classification

No fine-tuning needed: classify arbitrary text into custom categories.
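
One way to picture what happens under the hood: zero-shot pipelines built on NLI models such as `facebook/bart-large-mnli` turn each candidate label into a hypothesis sentence, score how strongly the input text entails it, and normalize those scores across the labels. The sketch below reproduces only that last normalization step. The hypothesis template, the variable names, and the logits are illustrative assumptions, not output from the model; the cell assumes the Section 0 setup has run.

```elixir
# Hypothetical candidate labels and the hypothesis sentences derived from them
demo_labels = ["technology", "sports", "politics"]
hypotheses = Enum.map(demo_labels, &"This example is #{&1}.")

# Made-up entailment logits, one per hypothesis (not real model output)
entailment_logits = Nx.tensor([2.0, -1.0, 0.0])

# Numerically stable softmax across the labels
shifted = Nx.subtract(entailment_logits, Nx.reduce_max(entailment_logits))
exps = Nx.exp(shifted)
scores = Nx.divide(exps, Nx.sum(exps))

demo_labels
|> Enum.zip(Nx.to_flat_list(scores))
|> Enum.zip(hypotheses)
|> Enum.each(fn {{label, score}, hypothesis} ->
  IO.puts("#{label}: #{Float.round(score, 3)}  (#{hypothesis})")
end)
```

The label with the largest logit ends up with the largest share of the probability mass. The cell that follows runs the real model on a short article about Nx.
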
```elixir
{:ok, zs_model} = Bumblebee.load_model({:hf, "facebook/bart-large-mnli"})
{:ok, zs_tokenizer} = Bumblebee.load_tokenizer({:hf, "facebook/bart-large-mnli"})

# Candidate labels are fixed when the serving is built
labels = ["technology", "sports", "politics", "science", "finance"]

zs_serving = Bumblebee.Text.zero_shot_classification(zs_model, zs_tokenizer, labels)

article = """
Nx brings numerical computing to the BEAM, enabling machine learning
pipelines that leverage Elixir's concurrency and fault tolerance.
Bumblebee provides access to thousands of pre-trained models from the
Hugging Face Hub directly in Livebook.
"""

# The serving input is just the text to classify
zs_result = Nx.Serving.run(zs_serving, article)
IO.inspect(zs_result, label: "Zero-Shot Classification")
```

---

## Section 3 — Bumblebee: Vision Models

### 3.1 Image Classification (ViT / ResNet)

```elixir
{:ok, vit_model} = Bumblebee.load_model({:hf, "google/vit-base-patch16-224"})
{:ok, vit_featurizer} = Bumblebee.load_featurizer({:hf, "google/vit-base-patch16-224"})

vit_serving = Bumblebee.Vision.image_classification(vit_model, vit_featurizer)

# Download a sample image
image_url =
  "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"

image_data = Req.get!(image_url).body

# Save and load
File.write!("/tmp/sample.jpg", image_data)
{:ok, image} = StbImage.read_file("/tmp/sample.jpg")

# StbImage stores the dimensions in the :shape field as {height, width, channels}
{height, width, _channels} = image.shape
IO.puts("Image: #{width}x#{height}")

img_result = Nx.Serving.run(vit_serving, image)
IO.inspect(img_result, label: "Image Classification")
```

### 3.2 Batch Image Classification

```elixir
# Passing a list runs all inputs together as one batch
images = [image] # In production, this would be multiple images
batch_result = Nx.Serving.run(vit_serving, images)
IO.inspect(batch_result, label: "Batch Classification")
```

---

## Section 4 — Bumblebee: Text Generation

### 4.1 GPT-2 Text Generation

```elixir
{:ok, gpt2_model} = Bumblebee.load_model({:hf, "openai-community/gpt2"})
{:ok, gpt2_tokenizer} = Bumblebee.load_tokenizer({:hf, "openai-community/gpt2"})

{:ok, gpt2_generation_config} =
  Bumblebee.load_generation_config({:hf, "openai-community/gpt2"})

gpt2_serving =
  Bumblebee.Text.generation(gpt2_model, gpt2_tokenizer, gpt2_generation_config,
    compile: [batch_size: 1, sequence_length: 64],
    defn_options: [compiler: EXLA]
  )

prompt = "Machine learning in Elixir is"

# The serving returns %{results: [%{text: ...}]}
gen_result = Nx.Serving.run(gpt2_serving, prompt)
generated = hd(gen_result.results).text

IO.puts("Prompt: #{prompt}")
IO.puts("Output: #{generated}")
```

### 4.2 Interactive Text Generation

```elixir
prompt_input = Kino.Input.text("Prompt", default: "The future of ML in Elixir is")
max_tokens_input = Kino.Input.number("Max tokens", default: 50)

form =
  Kino.Control.form([prompt: prompt_input, max_tokens: max_tokens_input], submit: "Generate")

# Kino.listen/2 ignores the callback's return value, so output is rendered into a frame
frame = Kino.Frame.new()

Kino.listen(form, fn %{data: %{prompt: prompt, max_tokens: max_tokens}} ->
  # The number input is nil when left empty, so fall back to 50
  config = Bumblebee.configure(gpt2_generation_config, max_new_tokens: trunc(max_tokens || 50))

  # Rebuilding the serving applies the new token limit, at the cost of recompilation
  serving =
    Bumblebee.Text.generation(gpt2_model, gpt2_tokenizer, config,
      defn_options: [compiler: EXLA]
    )

  %{results: [%{text: text}]} = Nx.Serving.run(serving, prompt)
  Kino.Frame.render(frame, Kino.Text.new("#{prompt}#{text}"))
end)

Kino.Layout.grid([form, frame])
```

---

## Section 5 — Bumblebee: Embeddings & Similarity

### 5.1 Sentence Embeddings

```elixir
{:ok, emb_model} = Bumblebee.load_model({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
{:ok, emb_tokenizer} = Bumblebee.load_tokenizer({:hf, "sentence-transformers/all-MiniLM-L6-v2"})

# Mean pooling followed by L2 normalization, the usual setup for
# sentence-transformers models
embedding_serving =
  Bumblebee.Text.text_embedding(emb_model, emb_tokenizer,
    output_attribute: :hidden_state,
    output_pool: :mean_pooling,
    embedding_processor: :l2_norm
  )

sentences = [
  "Nx provides numerical computing for Elixir",
  "Axon is a neural network library built on Nx",
  "Bumblebee connects Elixir to the Hugging Face Hub",
  "I enjoy cooking Italian food on weekends",
  "The weather forecast predicts rain tomorrow"
]

embeddings =
  Enum.map(sentences, fn s ->
    result = Nx.Serving.run(embedding_serving, s)
    result.embedding
  end)

IO.puts("Generated #{length(embeddings)} embeddings")
IO.puts("Embedding dim: #{inspect(Nx.shape(hd(embeddings)))}")
```

### 5.2 Cosine Similarity Search

```elixir
defmodule Similarity do
  # defn comes from Nx.Defn, not from Nx
  import Nx.Defn

  defn cosine_similarity(a, b) do
    a_norm = a / Nx.sqrt(Nx.sum(a * a))
    b_norm = b / Nx.sqrt(Nx.sum(b * b))
    Nx.sum(a_norm * b_norm)
  end

  # Returns {score, index} pairs sorted from most to least similar
  def find_most_similar(query_embedding, corpus_embeddings) do
    corpus_embeddings
    |> Enum.map(fn emb -> Nx.to_number(cosine_similarity(query_embedding, emb)) end)
    |> Enum.with_index()
    |> Enum.sort_by(fn {score, _idx} -> -score end)
  end
end

query = "How do I build neural networks in Elixir?"
query_emb = Nx.Serving.run(embedding_serving, query).embedding

IO.puts("Query: \"#{query}\"\n")

Similarity.find_most_similar(query_emb, embeddings)
|> Enum.each(fn {score, idx} ->
  IO.puts(" #{Float.round(score, 4)} #{Enum.at(sentences, idx)}")
end)
```

---

## Section 5.5 — Bumblebee: Audio (Whisper Speech-to-Text)

Bumblebee wraps OpenAI's Whisper for speech-to-text directly in Elixir.

```elixir
{:ok, whisper_model} = Bumblebee.load_model({:hf, "openai/whisper-tiny"})
{:ok, whisper_featurizer} = Bumblebee.load_featurizer({:hf, "openai/whisper-tiny"})
{:ok, whisper_tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/whisper-tiny"})
{:ok, whisper_generation_config} = Bumblebee.load_generation_config({:hf, "openai/whisper-tiny"})

whisper_serving =
  Bumblebee.Audio.speech_to_text_whisper(
    whisper_model,
    whisper_featurizer,
    whisper_tokenizer,
    whisper_generation_config,
    defn_options: [compiler: EXLA]
  )

# Download a sample audio file
audio_url = "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac"
audio_data = Req.get!(audio_url).body
File.write!("/tmp/sample_audio.flac", audio_data)

# Transcribe. {:file, path} inputs are decoded with ffmpeg, which must be installed.
whisper_result = Nx.Serving.run(whisper_serving, {:file, "/tmp/sample_audio.flac"})

# The result is a list of chunks, each with its own text
transcription = Enum.map_join(whisper_result.chunks, & &1.text)
IO.puts("Transcription: #{transcription}")
```

### Interactive Audio Transcription

```elixir
audio_file_input = Kino.Input.file("Upload audio (WAV/FLAC/MP3)")

audio_form =
  Kino.Control.form([file: audio_file_input], submit: "Transcribe")

# Kino.listen/2 ignores the callback's return value, so output is rendered into a frame
audio_frame = Kino.Frame.new()

Kino.listen(audio_form, fn %{data: %{file: file}} ->
  message =
    if file do
      # Uploaded files are referenced by :file_ref; resolve it to a local path
      path = Kino.Input.file_path(file.file_ref)
      result = Nx.Serving.run(whisper_serving, {:file, path})
      Enum.map_join(result.chunks, & &1.text)
    else
      "Please upload an audio file."
    end

  Kino.Frame.render(audio_frame, Kino.Text.new(message))
end)

Kino.Layout.grid([audio_form, audio_frame])
```

---

## Section 5.6 — Bumblebee: Stable Diffusion (Image Generation)

Generate images from text prompts using Stable Diffusion, all within Elixir.

> **Note:** This section requires a GPU with 4GB+ VRAM. On CPU it will be very slow.

```elixir
sd_repo = "CompVis/stable-diffusion-v1-4"

{:ok, sd_unet} = Bumblebee.load_model({:hf, sd_repo, subdir: "unet"})

# Only the VAE decoder is needed to turn latents into images
{:ok, sd_vae} = Bumblebee.load_model({:hf, sd_repo, subdir: "vae"}, architecture: :decoder)
{:ok, sd_clip} = Bumblebee.load_model({:hf, sd_repo, subdir: "text_encoder"})

# The CLIP tokenizer is loaded from its own repository, as in the Bumblebee examples
{:ok, sd_tokenizer} = Bumblebee.load_tokenizer({:hf, "openai/clip-vit-large-patch14"})
{:ok, sd_scheduler} = Bumblebee.load_scheduler({:hf, sd_repo, subdir: "scheduler"})

# Image generation serving.
# Argument order: text encoder, UNet, VAE, tokenizer, scheduler.
sd_serving =
  Bumblebee.Diffusion.StableDiffusion.text_to_image(
    sd_clip,
    sd_unet,
    sd_vae,
    sd_tokenizer,
    sd_scheduler,
    num_steps: 20,
    guidance_scale: 7.5,
    defn_options: [compiler: EXLA]
  )

# Generate
sd_result =
  Nx.Serving.run(sd_serving, %{
    prompt: "a photograph of a bee programming in elixir, highly detailed, 4k",
    negative_prompt: "blurry, low quality"
  })

# The serving returns a list of results; display the first generated image
sd_result.results |> hd() |> Map.fetch!(:image) |> Kino.Image.new()
```

### Interactive Image Generation

```elixir
prompt_input = Kino.Input.text("Prompt", default: "a cute robot bee coding in Elixir")
neg_input = Kino.Input.text("Negative prompt", default: "blurry, ugly, low quality")
steps_input = Kino.Input.number("Steps", default: 20)

sd_form =
  Kino.Control.form([prompt: prompt_input, negative: neg_input, steps: steps_input],
    submit: "Generate"
  )

# Kino.listen/2 ignores the callback's return value, so output is rendered into a frame
sd_frame = Kino.Frame.new()

Kino.listen(sd_form, fn %{data: %{prompt: prompt, negative: negative, steps: steps}} ->
  # num_steps is fixed when a serving is built, so build one for the requested
  # step count (this recompiles and is slow)
  serving =
    Bumblebee.Diffusion.StableDiffusion.text_to_image(
      sd_clip,
      sd_unet,
      sd_vae,
      sd_tokenizer,
      sd_scheduler,
      num_steps: trunc(steps || 20),
      guidance_scale: 7.5,
      defn_options: [compiler: EXLA]
    )

  %{results: [%{image: image} | _]} =
    Nx.Serving.run(serving, %{prompt: prompt, negative_prompt: negative})

  Kino.Frame.render(sd_frame, Kino.Image.new(image))
end)

Kino.Layout.grid([sd_form, sd_frame])
```

---

## Section 6 — Custom Training with Axon

Beyond pre-trained models, Axon lets you build and train from scratch.

### 6.1 Synthetic Data

```elixir
defmodule Data do
  # No `import Nx` here: it would make the call to round/1 below ambiguous

  def make_classification(n \\ 2000, features \\ 4, classes \\ 3, seed \\ 42) do
    key = Nx.Random.key(seed)
    {centers, key} = Nx.Random.normal(key, 0.0, 2.0, shape: {classes, features})
    {labels_raw, key} = Nx.Random.randint(key, 0, classes, shape: {n})
    {noise, _key} = Nx.Random.normal(key, 0.0, 0.4, shape: {n, features})

    # Each sample is its class center plus Gaussian noise
    x = centers |> Nx.take(labels_raw) |> Nx.add(noise)

    # One-hot encode the labels
    y =
      labels_raw
      |> Nx.new_axis(1)
      |> Nx.equal(Nx.iota({1, classes}))
      |> Nx.as_type(:f32)

    # Normalize
    mu = Nx.mean(x, axes: [0])
    sigma = Nx.standard_deviation(x, axes: [0])
    x_norm = Nx.divide(Nx.subtract(x, mu), sigma)

    # Split 80/20 into train and test
    split = round(n * 0.8)

    {{x_norm[0..(split - 1)//1], y[0..(split - 1)//1]},
     {x_norm[split..(n - 1)//1], y[split..(n - 1)//1]}}
  end
end

{{x_train, y_train}, {x_test, y_test}} = Data.make_classification()

IO.puts("Train: #{Nx.axis_size(x_train, 0)} | Test: #{Nx.axis_size(x_test, 0)}")
IO.puts("Features: #{Nx.axis_size(x_train, 1)} | Classes: #{Nx.axis_size(y_train, 1)}")
```

### 6.2 Define Model

```elixir
n_features = Nx.axis_size(x_train, 1)
n_classes = Nx.axis_size(y_train, 1)

model =
  Axon.input("features", shape: {nil, n_features})
  |> Axon.dense(64, activation: :relu, name: "hidden_1")
  |> Axon.batch_norm(name: "bn_1")
  |> Axon.dropout(rate: 0.2, name: "drop_1")
  |> Axon.dense(32, activation: :relu, name: "hidden_2")
  |> Axon.batch_norm(name: "bn_2")
  |> Axon.dropout(rate: 0.2, name: "drop_2")
  |> Axon.dense(n_classes, activation: :softmax, name: "output")

# Axon.Display.as_table/2 needs the optional :table_rex dependency
Axon.Display.as_table(model, Nx.template({1, n_features}, :f32)) |> IO.puts()
```

### 6.3 Train

```elixir
train_data = x_train |> Nx.to_batched(64) |>
  Enum.zip(Nx.to_batched(y_train, 64))
  # Axon.Loop.trainer/3 expects every batch as an {input, target} tuple
  |> Stream.map(fn {xb, yb} -> {%{"features" => xb}, yb} end)

val_data =
  x_test
  |> Nx.to_batched(64)
  |> Enum.zip(Nx.to_batched(y_test, 64))
  |> Stream.map(fn {xb, yb} -> {%{"features" => xb}, yb} end)

trained_state =
  model
  # Optimizers live in Polaris since Axon 0.6
  |> Axon.Loop.trainer(:categorical_cross_entropy, Polaris.Optimizers.adam(learning_rate: 0.001))
  |> Axon.Loop.metric(:accuracy, "acc")
  |> Axon.Loop.validate(model, val_data)
  # Stop once validation accuracy has not improved for 5 epochs
  |> Axon.Loop.early_stop("validation_acc", patience: 5, mode: :max)
  |> Axon.Loop.run(train_data, Axon.ModelState.empty(), epochs: 30, compiler: EXLA)

IO.puts("Training complete!")
```

### 6.4 Evaluate & Predict

```elixir
# Build a compiled prediction function (dropout is disabled in inference mode)
{_init_fn, predict_fn} = Axon.build(model, compiler: EXLA)

# Evaluate on test set
test_preds = predict_fn.(trained_state, x_test)
pred_classes = Nx.argmax(test_preds, axis: 1)
true_classes = Nx.argmax(y_test, axis: 1)

accuracy = Nx.mean(Nx.equal(pred_classes, true_classes) |> Nx.as_type(:f32))
IO.puts("Test accuracy: #{Float.round(Nx.to_number(accuracy) * 100, 2)}%")

# Single prediction (add a batch axis, then drop it again)
sample = x_test[0]
probs = predict_fn.(trained_state, Nx.new_axis(sample, 0)) |> Nx.squeeze()

IO.puts(
  "Sample prediction: class #{Nx.argmax(probs) |> Nx.to_number()} " <>
    "probs=#{inspect(Nx.to_flat_list(probs) |> Enum.map(&Float.round(&1, 4)))}"
)
```

### 6.5 Visualize Predictions

```elixir
# Visualize predictions vs actual on first 2 features
alias VegaLite, as: Vl

scatter_data =
  Enum.map(0..(min(Nx.axis_size(x_test, 0), 500) - 1), fn i ->
    %{
      "f1" => Nx.to_number(x_test[i][0]),
      "f2" => Nx.to_number(x_test[i][1]),
      "actual" => Nx.to_number(Nx.argmax(y_test[i])),
      "predicted" => Nx.to_number(pred_classes[i])
    }
  end)

Vl.new(title: "Test Predictions (first 2 features)", width: 500, height: 400)
|> Vl.data_from_values(scatter_data)
|> Vl.layers([
  Vl.new()
  |> Vl.mark(:circle, opacity: 0.7, size: 40)
  |> Vl.encode_field(:x, "f1", type: :quantitative)
  |> Vl.encode_field(:y, "f2", type: :quantitative) |>
  Vl.encode_field(:color, "actual", type: :nominal, title: "Actual"),
  Vl.new()
  |> Vl.mark(:point, opacity: 0.3, size: 20, shape: "cross")
  |> Vl.encode_field(:x, "f1", type: :quantitative)
  |> Vl.encode_field(:y, "f2", type: :quantitative)
  |> Vl.encode_field(:color, "predicted", type: :nominal, title: "Predicted")
])
```

---

## Section 7 — Nx.Serving: Production Inference

`Nx.Serving` batches requests from multiple clients and runs them efficiently on the GPU, which is essential for production deployment.

### 7.1 Serving Architecture

```elixir
# Start a named serving process (typically in your Application supervisor)
# In production, you would do:
#
# Nx.Serving.start_link(
#   serving: sentiment_serving,
#   name: MyApp.SentimentServing,
#   batch_size: 16,
#   batch_timeout: 100
# )
#
# Then in your Phoenix controller, send requests to the named process:
#
# Nx.Serving.batched_run(MyApp.SentimentServing, text)

# For demonstration, run directly:
IO.puts("Nx.Serving.run/2 executes a serving inline, in the calling process")
IO.puts("For production, start it under a supervisor and call Nx.Serving.batched_run/2 for automatic batching")
```

### 7.2 Benchmark Inference

```elixir
# Warm up first so the timings below exclude JIT compilation
Nx.Serving.run(sentiment_serving, "Warm-up sentence.")

# Benchmark a single inference
{us_single, _} =
  :timer.tc(fn ->
    Nx.Serving.run(sentiment_serving, "This is a test sentence for benchmarking.")
  end)

IO.puts("Single inference: #{Float.round(us_single / 1000, 2)}ms")

# Benchmark eight sequential requests
batch = for i <- 1..8, do: "Test sentence number #{i} for batch benchmarking."

{us_batch, _} =
  :timer.tc(fn ->
    Enum.map(batch, fn text -> Nx.Serving.run(sentiment_serving, text) end)
  end)

IO.puts("8 sequential inferences: #{Float.round(us_batch / 1000, 2)}ms")
IO.puts("Per-request (sequential average): #{Float.round(us_batch / 8000, 2)}ms")
```

---

## Section 7.5 — Fine-tuning with Bumblebee

Bumblebee can load a pre-trained encoder together with a freshly initialized task head, which Axon then fine-tunes on your own labeled data. This is far more practical than training from scratch when you have limited data.
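
A rough count shows why this is tractable with little data. Assuming the standard BERT-base hidden size of 768 and a single linear classification layer on top (an assumption about the head's shape, stated here for illustration), the only parameters that start from random values are the head's weights and biases:

```elixir
# Assumed sizes: BERT-base hidden size and the three classes used in this section
hidden_size = 768
num_labels = 3

# A linear head has one weight per (hidden unit, label) pair plus one bias per label
head_params = hidden_size * num_labels + num_labels

IO.puts("Randomly initialized head parameters: #{head_params}")
```

Everything else, roughly a hundred million encoder weights in BERT-base, starts from the pre-trained checkpoint and only needs small adjustments.
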
### 7.5.1 Load Pre-trained Model for Fine-tuning

```elixir
# Load the BERT spec with a sequence classification head
{:ok, ft_spec} =
  Bumblebee.load_spec({:hf, "google-bert/bert-base-uncased"},
    architecture: :for_sequence_classification
  )

# Configure for your number of classes
ft_spec = Bumblebee.configure(ft_spec, num_labels: 3)

# Load with the custom spec: only the encoder weights are pre-trained,
# the classification head is randomly initialized
{:ok, ft_model_info} =
  Bumblebee.load_model({:hf, "google-bert/bert-base-uncased"}, spec: ft_spec)

%{model: ft_model, params: ft_params} = ft_model_info

# Sum the sizes of all parameter tensors in the (nested) model state
param_count = Nx.Defn.Composite.reduce(ft_params, 0, fn t, acc -> acc + Nx.size(t) end)

IO.puts("Model loaded. Pre-trained encoder + fresh classification head.")
IO.puts("Total params: #{param_count}")
```

### 7.5.2 Prepare Labeled Data

```elixir
# Your labeled dataset (in practice, load from CSV/JSON)
training_texts = [
  "The BEAM VM provides fault-tolerant concurrent computing",
  "Nx brings numerical computing to the Elixir ecosystem",
  "Phoenix LiveView enables real-time web applications",
  "Bumblebee integrates Hugging Face models into Elixir",
  "Axon provides a functional API for neural networks",
  "The weather is sunny and warm today",
  "Football season starts next month",
  "The stock market rallied on positive earnings reports"
  # ... add more samples per class
]

# One label per text: 0=tech, 1=everyday/sports, 2=finance
training_labels = [0, 0, 0, 0, 0, 1, 1, 2]

# Tokenize
{:ok, ft_tokenizer} = Bumblebee.load_tokenizer({:hf, "google-bert/bert-base-uncased"})

# Pad or truncate to a fixed length so every batch has the same shape
ft_tokenizer = Bumblebee.configure(ft_tokenizer, length: 32)

encoded = Bumblebee.apply_tokenizer(ft_tokenizer, training_texts)
IO.inspect(encoded, label: "Tokenized input")
```

### 7.5.3 Training Loop

```elixir
# A minimal loop built on Axon.Loop, following the pattern used in
# Bumblebee's fine-tuning examples
batch_size = 4

# Keep each text paired with its own label, then tokenize batch by batch
train_batches =
  training_texts
  |> Enum.zip(training_labels)
  |> Enum.chunk_every(batch_size)
  |> Enum.map(fn chunk ->
    {texts, labels} = Enum.unzip(chunk)
    {Bumblebee.apply_tokenizer(ft_tokenizer, texts), Nx.tensor(labels)}
  end)

# The Bumblebee model returns a map; the trainer only needs the logits
logits_model = Axon.nx(ft_model, & &1.logits)

# Labels are class indices (sparse) and the model outputs raw logits
loss =
  &Axon.Losses.categorical_cross_entropy(&1, &2,
    reduction: :mean,
    from_logits: true,
    sparse: true
  )

optimizer = Polaris.Optimizers.adam(learning_rate: 2.0e-5)

# Start from the pre-trained parameters instead of an empty state
ft_trained_state =
  logits_model
  |> Axon.Loop.trainer(loss, optimizer, log: 1)
  |> Axon.Loop.run(train_batches, ft_params, epochs: 3, compiler: EXLA, strict?: false)
```

> **Tip:** For production fine-tuning, extend this `Axon.Loop` with proper data
> pipelines, learning rate scheduling, and mixed precision. The above
> demonstrates the concept; see the `Bumblebee` examples on GitHub for
> full fine-tuning recipes.
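
As one piece of the "proper data pipeline" mentioned in the tip, a labeled set is usually shuffled and split per class, so that every label shows up in both the training and the validation portion. The following is a minimal sketch in plain Elixir. The `LabeledSplit` module, its function name, and the sample data are hypothetical and not part of Bumblebee or Axon.

```elixir
defmodule LabeledSplit do
  @doc """
  Splits `{text, label}` pairs into `{train, validation}`, keeping roughly
  `ratio` of each label's samples in the training portion.
  """
  def stratified(samples, ratio \\ 0.8) do
    samples
    |> Enum.group_by(fn {_text, label} -> label end)
    |> Enum.reduce({[], []}, fn {_label, group}, {train, val} ->
      shuffled = Enum.shuffle(group)
      # Keep at least one sample on the training side for tiny classes
      n_train = max(1, floor(length(shuffled) * ratio))
      {train_part, val_part} = Enum.split(shuffled, n_train)
      {train ++ train_part, val ++ val_part}
    end)
  end
end

# Hypothetical toy data: five samples for label 0 and five for label 1
toy_samples = for label <- [0, 1], i <- 1..5, do: {"sample #{i} of class #{label}", label}

{toy_train, toy_val} = LabeledSplit.stratified(toy_samples)
IO.puts("train=#{length(toy_train)} validation=#{length(toy_val)}")
```

Splitting per label rather than over the whole list matters most for small, imbalanced sets like the one in 7.5.2, where a plain random split could leave a class out of validation entirely.
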
---

## Section 7.6 — Model Export (ONNX / GGUF)

### 7.6.1 Export a checkpoint to ONNX

Bumblebee (as of 0.6) has no model export function, so exporting happens outside Elixir. One common route is Hugging Face Optimum's command-line exporter, run against the same checkpoint that Bumblebee loads:

```bash
# Requires Python with Hugging Face Optimum installed
optimum-cli export onnx \
  --model distilbert/distilbert-base-uncased-finetuned-sst-2-english \
  distilbert_onnx/
```

The resulting ONNX file in `distilbert_onnx/` can be loaded in any ONNX runtime.

### 7.6.2 Export to GGUF (via llama.cpp)

GGUF files are produced from Hugging Face checkpoints rather than from ONNX files, typically with the conversion script that ships in the llama.cpp repository:

```bash
# Run from a llama.cpp checkout; the first argument is a local checkpoint directory
python convert_hf_to_gguf.py path/to/hf-model --outfile model.gguf
```

> **Note:** GGUF is primarily for decoder models (e.g., Llama). Ensure compatibility.

---

## Section 7.7 — Phoenix LiveView Integration

Deploy ML models into Phoenix web apps using `Nx.Serving` and LiveView.

### 7.7.1 Application Supervision Tree

```elixir
# In your Phoenix app's application.ex:
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      MyAppWeb.Telemetry,
      {DNSCluster, query: Application.get_env(:my_app, :dns_cluster_query) || :ignore},
      {Phoenix.PubSub, name: MyApp.PubSub},

      # ─── ML Servings ───────────────────────────────────────
      # Sentiment analysis serving
      {Nx.Serving,
       serving: sentiment_serving(),
       name: MyApp.SentimentServing,
       batch_size: 16,
       batch_timeout: 100},

      # Image classification serving
      {Nx.Serving,
       serving: image_serving(),
       name: MyApp.ImageServing,
       batch_size: 8,
       batch_timeout: 200},

      # Embedding serving (for similarity search)
      {Nx.Serving,
       serving: embedding_serving(),
       name: MyApp.EmbeddingServing,
       batch_size: 32,
       batch_timeout: 50},
      MyAppWeb.Endpoint
    ]

    opts = [strategy: :one_for_one, name: MyApp.Supervisor]
    Supervisor.start_link(children, opts)
  end

  defp sentiment_serving do
    {:ok, model} =
      Bumblebee.load_model({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})

    {:ok, tokenizer} =
      Bumblebee.load_tokenizer({:hf, "distilbert/distilbert-base-uncased-finetuned-sst-2-english"})

    Bumblebee.Text.text_classification(model, tokenizer)
  end

  defp image_serving do
    {:ok, model} = Bumblebee.load_model({:hf, "google/vit-base-patch16-224"})
    {:ok, featurizer} = Bumblebee.load_featurizer({:hf, "google/vit-base-patch16-224"})
    Bumblebee.Vision.image_classification(model, featurizer)
  end

  defp embedding_serving do
    {:ok, model} = Bumblebee.load_model({:hf, "sentence-transformers/all-MiniLM-L6-v2"})
    {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "sentence-transformers/all-MiniLM-L6-v2"})

    # Same pooling and normalization as in Section 5.1
    Bumblebee.Text.text_embedding(model, tokenizer,
      output_attribute: :hidden_state,
      output_pool: :mean_pooling,
      embedding_processor: :l2_norm
    )
  end
end
```

### 7.7.2 LiveView for Sentiment Analysis

```elixir
# lib/my_app_web/live/sentiment_live.ex
defmodule MyAppWeb.SentimentLive do
  use MyAppWeb, :live_view

  def mount(_params, _session, socket) do
    {:ok, assign(socket, text: "", result: nil, loading: false)}
  end

  def handle_event("analyze", %{"text" => text}, socket) do
    # batched_run/2 sends the request to the named serving process and blocks
    # until the result is ready; concurrent requests are grouped into one batch
    result = Nx.Serving.batched_run(MyApp.SentimentServing, text)
    label = hd(result.predictions)

    {:noreply,
     assign(socket,
       text: text,
       result: %{label: label.label, score: Float.round(label.score * 100, 1)},
       loading: false
     )}
  end

  def handle_event("update_text", %{"text" => text}, socket) do
    {:noreply, assign(socket, text: text)}
  end

  def render(assigns) do
    ~H"""
<%= @result.label %> — <%= @result.score %>%