Difficulty importing Pipeline - AttributeError: module 'keras._tf_keras.keras' has no attribute '__internal__'

#71

by mqureshi - opened Mar 1, 2024

Discussion

mqureshi

Mar 1, 2024

Can't seem to do:
from transformers import pipeline
What versions of keras, tensorflow, transformers, etc. are you using? Full traceback below:

AttributeError Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/transformers/utils/import_utils.py:1390, in _LazyModule._get_module(self, module_name)
1389 try:
-> 1390 return importlib.import_module("." + module_name, self.__name__)
1391 except Exception as e:

File /opt/conda/lib/python3.10/importlib/__init__.py:126, in import_module(name, package)
125 level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)

File <frozen importlib._bootstrap>:1050, in _gcd_import(name, package, level)

File <frozen importlib._bootstrap>:1027, in _find_and_load(name, import_)

File <frozen importlib._bootstrap>:1006, in _find_and_load_unlocked(name, import_)

File <frozen importlib._bootstrap>:688, in _load_unlocked(spec)

File <frozen importlib._bootstrap_external>:883, in exec_module(self, module)

File <frozen importlib._bootstrap>:241, in _call_with_frames_removed(f, *args, **kwds)

File /opt/conda/lib/python3.10/site-packages/transformers/pipelines/__init__.py:74
73 from .question_answering import QuestionAnsweringArgumentHandler, QuestionAnsweringPipeline
---> 74 from .table_question_answering import TableQuestionAnsweringArgumentHandler, TableQuestionAnsweringPipeline
75 from .text2text_generation import SummarizationPipeline, Text2TextGenerationPipeline, TranslationPipeline

File /opt/conda/lib/python3.10/site-packages/transformers/pipelines/table_question_answering.py:26
25 import tensorflow as tf
---> 26 import tensorflow_probability as tfp
28 from ..models.auto.modeling_tf_auto import (
29 TF_MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING_NAMES,
30 TF_MODEL_FOR_TABLE_QUESTION_ANSWERING_MAPPING_NAMES,
31 )

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/__init__.py:20
17 # Contributors to the python/ dir should not alter this file; instead update
18 # python/__init__.py as necessary.
---> 20 from tensorflow_probability import substrates
21 # from tensorflow_probability.google import staging # DisableOnExport
22 # from tensorflow_probability.google import tfp_google # DisableOnExport

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/substrates/__init__.py:17
15 """TensorFlow Probability alternative substrates."""
---> 17 from tensorflow_probability.python.internal import all_util
18 from tensorflow_probability.python.internal import lazy_loader # pylint: disable=g-direct-tensorflow-import

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/__init__.py:138
137 for pkg_name in _maybe_nonlazy_load:
--> 138 dir(globals()[pkg_name]) # Forces loading the package from its lazy loader.
141 all_util.remove_undocumented(__name__, _lazy_load + _maybe_nonlazy_load)

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/internal/lazy_loader.py:57, in LazyLoader.__dir__(self)
56 def __dir__(self):
---> 57 module = self._load()
58 return dir(module)

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/internal/lazy_loader.py:40, in LazyLoader._load(self)
39 # Import the target module and insert it into the parent's namespace
---> 40 module = importlib.import_module(self.__name__)
41 if self._parent_module_globals is not None:

File /opt/conda/lib/python3.10/importlib/__init__.py:126, in import_module(name, package)
125 level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/experimental/__init__.py:31
30 from tensorflow_probability.python.experimental import auto_batching
---> 31 from tensorflow_probability.python.experimental import bayesopt
32 from tensorflow_probability.python.experimental import bijectors

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/experimental/bayesopt/__init__.py:17
15 """TensorFlow Probability experimental Bayesopt package."""
---> 17 from tensorflow_probability.python.experimental.bayesopt import acquisition
18 from tensorflow_probability.python.internal import all_util

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/experimental/bayesopt/acquisition/__init__.py:19
18 from tensorflow_probability.python.experimental.bayesopt.acquisition.acquisition_function import MCMCReducer
---> 19 from tensorflow_probability.python.experimental.bayesopt.acquisition.expected_improvement import GaussianProcessExpectedImprovement
20 from tensorflow_probability.python.experimental.bayesopt.acquisition.expected_improvement import ParallelExpectedImprovement

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/experimental/bayesopt/acquisition/expected_improvement.py:19
17 import tensorflow.compat.v2 as tf
---> 19 from tensorflow_probability.python.distributions import normal
20 from tensorflow_probability.python.distributions import student_t

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/distributions/__init__.py:110
109 from tensorflow_probability.python.distributions.pert import PERT
--> 110 from tensorflow_probability.python.distributions.pixel_cnn import PixelCNN
111 from tensorflow_probability.python.distributions.plackett_luce import PlackettLuce

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/distributions/pixel_cnn.py:33
32 from tensorflow_probability.python.internal import tensorshape_util
---> 33 from tensorflow_probability.python.layers import weight_norm
36 class PixelCNN(distribution.Distribution):

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/layers/__init__.py:27
26 from tensorflow_probability.python.layers.dense_variational_v2 import DenseVariational
---> 27 from tensorflow_probability.python.layers.distribution_layer import CategoricalMixtureOfOneHotCategorical
28 from tensorflow_probability.python.layers.distribution_layer import DistributionLambda

File /opt/conda/lib/python3.10/site-packages/tensorflow_probability/python/layers/distribution_layer.py:68
50 __all__ = [
51 'CategoricalMixtureOfOneHotCategorical',
52 'DistributionLambda',
(...)
64 'VariationalGaussianProcess',
65 ]
---> 68 tf.keras.__internal__.utils.register_symbolic_tensor_type(dtc._TensorCoercible) # pylint: disable=protected-access
71 def _event_size(event_shape, name=None):

AttributeError: module 'keras._tf_keras.keras' has no attribute '__internal__'

The above exception was the direct cause of the following exception:

RuntimeError Traceback (most recent call last)
Cell In[1], line 2
1 #from tensorflow import keras
----> 2 from transformers import pipeline

File <frozen importlib._bootstrap>:1075, in _handle_fromlist(module, fromlist, import_, recursive)

File /opt/conda/lib/python3.10/site-packages/transformers/utils/import_utils.py:1380, in _LazyModule.__getattr__(self, name)
1378 value = self._get_module(name)
1379 elif name in self._class_to_module.keys():
-> 1380 module = self._get_module(self._class_to_module[name])
1381 value = getattr(module, name)
1382 else:

File /opt/conda/lib/python3.10/site-packages/transformers/utils/import_utils.py:1392, in _LazyModule._get_module(self, module_name)
1390 return importlib.import_module("." + module_name, self.__name__)
1391 except Exception as e:
-> 1392 raise RuntimeError(
1393 f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
1394 f" traceback):\n{e}"
1395 ) from e

RuntimeError: Failed to import transformers.pipelines because of the following error (look up to see its traceback):
module 'keras._tf_keras.keras' has no attribute '__internal__'

vidishanevatia

Mar 13, 2024

Seeing the same error. Any solutions?

osanseviero

Google org Mar 13, 2024

Hi there! Could you please open an issue in transformers with the details about the environment and version?
https://github.com/huggingface/transformers
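
For anyone collecting the environment details requested above, a small script along these lines could print the installed versions to paste into an issue. It is only a sketch: the distribution names in PACKAGES are assumptions about what is installed in the failing environment, and installed_versions is a hypothetical helper, not part of transformers.

```python
from importlib import metadata

# Assumed distribution names; adjust to match the environment being reported.
PACKAGES = ["transformers", "tensorflow", "tensorflow-probability", "keras", "tf-keras"]


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

The script reads package metadata only, so it should work even when importing transformers.pipelines itself fails.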

lysandre

Mar 13, 2024

Hey! It looks like a potential error with tensorflow_probability/keras? Can you try uninstalling it to see if it fixes your issue?

Also happy to follow this in Transformers issues as @osanseviero recommends above.
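
Before uninstalling anything, one way to check whether tensorflow_probability is the failing import is to try each module in isolation. The sketch below assumes nothing beyond the standard library; try_import is a hypothetical helper name used for illustration.

```python
import importlib


def try_import(module_name):
    """Import a module by name; return None on success or a short error string."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:  # broad on purpose: the thread shows an AttributeError at import time
        return f"{type(exc).__name__}: {exc}"
    return None
```

Calling try_import("tensorflow_probability") and then try_import("transformers.pipelines") in a fresh interpreter should show whether the first one fails on its own, which would point at the tensorflow_probability/keras combination rather than at transformers.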

osanseviero

Google org Mar 13, 2024

Yes, this seems related to https://github.com/tensorflow/probability/issues/1774#issuecomment-1979642276, and the fix for it has not been released yet

Rocketknight1

Mar 13, 2024

Hi all, we've just merged a workaround in transformers here. If you install transformers from main with pip install git+https://github.com/huggingface/transformers.git the issue should now be resolved. The fix will also be included in the next transformers release.

If this doesn't resolve the issue for you, please ping me and I'll keep investigating!

hamedelc

Mar 26, 2024

I upgraded tensorflow-probability to version 0.24.0 then installed tensorflow-keras ---> problem is solved!
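
A rough way to check whether an environment matches this workaround is sketched below. It rests on assumptions: that the "tensorflow-keras" mentioned above refers to the tf-keras distribution on PyPI, that 0.24.0 is the minimum tensorflow-probability version needed, and that an environment without tensorflow-probability is unaffected (as the uninstall suggestion earlier in the thread implies). version_tuple and workaround_status are hypothetical helpers.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '0.24.0' into (0, 24, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    while len(parts) < 3:  # pad so '0.24' compares like '0.24.0'
        parts.append(0)
    return tuple(parts)


def workaround_status(tfp_version, tf_keras_version, min_tfp=(0, 24, 0)):
    """Return a list of problems; an empty list means the workaround looks applied."""
    if tfp_version is None:
        # Assumption: without tensorflow-probability the failing import is never reached.
        return []
    problems = []
    if version_tuple(tfp_version) < min_tfp:
        problems.append(f"tensorflow-probability {tfp_version} is older than 0.24.0")
    if tf_keras_version is None:
        problems.append("tf-keras is not installed")
    return problems


def installed(name):
    """Version of an installed distribution, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    issues = workaround_status(installed("tensorflow-probability"), installed("tf-keras"))
    print("\n".join(issues) if issues else "Workaround appears to be in place.")
```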

rubtal

Mar 28, 2024

As @hamedelc commented:

https://stackoverflow.com/a/78233592/13086128

lkv

Google org Oct 8, 2024

Hi @hamedelc , @rubtal , I hope the issue has been resolved. Please let us know if any further assistance is needed. Thanks!
