TGI fails with an error saying to upgrade the transformers version
I tried to run this on Hugging Face hosted TGI. It fails with an error telling me to upgrade transformers.
Do I need to copy the repo and then add a requirements.txt file with the transformers version,
or are you going to fix it?
Hi @Tollring,
Welcome to the Google Gemma family of open-source models. If you would like to run the model by downloading it to your local machine, you have to upgrade to the latest version of transformers by running !pip install -U transformers. The newly released Gemma models don't support older versions of transformers.
Please try and let me know if any additional assistance is required.
Thanks.
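For anyone who wants to confirm what is actually installed before retrying, here is a minimal sketch that compares the installed transformers version against a minimum. The 4.53.0 floor is an assumption about when Gemma 3n support landed, not something stated in this thread, and the helper names (parse_release, meets_minimum, check_package) are made up for illustration.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_release(v: str) -> tuple:
    """Turn '4.54.0.dev0' into (4, 54, 0), dropping dev/rc suffixes."""
    parts = []
    for piece in v.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numeric release components only."""
    return parse_release(installed) >= parse_release(minimum)


def check_package(name: str, minimum: str) -> str:
    """Report whether an installed package satisfies a minimum version."""
    try:
        installed = version(name)
    except PackageNotFoundError:
        return f"{name} is not installed"
    status = "OK" if meets_minimum(installed, minimum) else "too old, upgrade"
    return f"{name} {installed} (need >= {minimum}): {status}"


# ASSUMPTION: 4.53.0 is taken as the first transformers release with Gemma 3n.
print(check_package("transformers", "4.53.0"))
```

The comparison ignores dev and rc suffixes, so a build like 4.53.0.dev0 is treated the same as 4.53.0. That is a simplification; a stricter check would need a full version parser.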
env: transformers==4.54.0.dev0
error:
RuntimeError Traceback (most recent call last)
Cell In[11], line 10
7 # model_id = "/data/bangguo/fastvla/google/gemma-3n-E4B-it"
8 model_id = "google/gemma-3n-e4b-it"
---> 10 model = Gemma3nForConditionalGeneration.from_pretrained(model_id, device_map="cuda", torch_dtype=torch.bfloat16,).eval()
12 processor = AutoProcessor.from_pretrained(model_id)
File ~/miniconda3/envs/fastvla-gemma/lib/python3.10/site-packages/transformers/modeling_utils.py:311, in restore_default_torch_dtype.<locals>._wrapper(*args, **kwargs)
309 old_dtype = torch.get_default_dtype()
310 try:
--> 311 return func(*args, **kwargs)
312 finally:
313 torch.set_default_dtype(old_dtype)
File ~/miniconda3/envs/fastvla-gemma/lib/python3.10/site-packages/transformers/modeling_utils.py:4760, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, weights_only, *model_args, **kwargs)
4752 config = cls._autoset_attn_implementation(
4753 config,
4754 torch_dtype=torch_dtype,
4755 device_map=device_map,
4756 )
4758 with ContextManagers(model_init_context):
4759 # Let's make sure we don't run the init function of buffer modules
-> 4760 model = cls(config, *model_args, **model_kwargs)
4762 # Make sure to tie the weights correctly
4763 model.tie_weights()
File ~/miniconda3/envs/fastvla-gemma/lib/python3.10/site-packages/transformers/models/gemma3n/modeling_gemma3n.py:2196, in Gemma3nForConditionalGeneration.__init__(self, config)
2194 def __init__(self, config: Gemma3nConfig):
2195 super().__init__(config)
-> 2196 self.model = Gemma3nModel(config)
2197 self.lm_head = nn.Linear(config.text_config.hidden_size, config.text_config.vocab_size, bias=False)
2198 self.post_init()
File ~/miniconda3/envs/fastvla-gemma/lib/python3.10/site-packages/transformers/models/gemma3n/modeling_gemma3n.py:1948, in Gemma3nModel.__init__(self, config)
1946 def __init__(self, config: Gemma3nConfig):
1947 super().__init__(config)
-> 1948 self.vision_tower = AutoModel.from_config(config=config.vision_config)
1949 self.vocab_size = config.text_config.vocab_size
1951 language_model = AutoModel.from_config(config=config.text_config)
File ~/miniconda3/envs/fastvla-gemma/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:456, in _BaseAutoModelClass.from_config(cls, config, **kwargs)
454 elif type(config) in cls._model_mapping.keys():
455 model_class = _get_model_class(config, cls._model_mapping)
--> 456 return model_class._from_config(config, **kwargs)
458 raise ValueError(
459 f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
460 f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
461 )
File ~/miniconda3/envs/fastvla-gemma/lib/python3.10/site-packages/transformers/modeling_utils.py:311, in restore_default_torch_dtype.<locals>._wrapper(*args, **kwargs)
309 old_dtype = torch.get_default_dtype()
310 try:
--> 311 return func(*args, **kwargs)
312 finally:
313 torch.set_default_dtype(old_dtype)
File ~/miniconda3/envs/fastvla-gemma/lib/python3.10/site-packages/transformers/modeling_utils.py:2208, in PreTrainedModel._from_config(cls, config, **kwargs)
2205 model = cls(config, **kwargs)
2207 else:
-> 2208 model = cls(config, **kwargs)
2210 # restore default dtype if it was modified
2211 if dtype_orig is not None:
File ~/miniconda3/envs/fastvla-gemma/lib/python3.10/site-packages/transformers/models/timm_wrapper/modeling_timm_wrapper.py:120, in TimmWrapperModel.__init__(self, config)
118 # using num_classes=0 to avoid creating classification head
119 extra_init_kwargs = config.model_args or {}
--> 120 self.timm_model = timm.create_model(config.architecture, pretrained=False, num_classes=0, **extra_init_kwargs)
121 self.post_init()
File ~/miniconda3/envs/fastvla-gemma/lib/python3.10/site-packages/timm/models/_factory.py:122, in create_model(model_name, pretrained, pretrained_cfg, pretrained_cfg_overlay, checkpoint_path, cache_dir, scriptable, exportable, no_jit, **kwargs)
119 pretrained_cfg = pretrained_tag
121 if not is_model(model_name):
--> 122 raise RuntimeError('Unknown model (%s)' % model_name)
124 create_fn = model_entrypoint(model_name)
125 with set_layer_config(scriptable=scriptable, exportable=exportable, no_jit=no_jit):
RuntimeError: Unknown model (mobilenetv5_300m_enc)
It seems that the code for the vision encoder mobilenetv5_300m_enc has not been uploaded to transformers?
#uninstall and update timm
!pip uninstall -y timm
!pip install timm --upgrade
Hi @Ethan-pooh,
Please upgrade timm to the latest version; that should resolve the mobilenetv5_300m_enc issue. Please let me know if further assistance is required.
!pip uninstall -y timm
!pip install -U timm
Thanks.
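To check whether the upgrade took effect, a small sketch along these lines may help. It asks timm's model registry whether mobilenetv5_300m_enc is known, which is the same is_model lookup that raised the RuntimeError in the traceback above. The helper name timm_has_architecture is made up for illustration.

```python
import importlib


def timm_has_architecture(name: str):
    """Return True/False if timm is importable, or None if it is not."""
    try:
        timm = importlib.import_module("timm")
    except ImportError:
        return None
    # timm.is_model is the registry lookup that create_model performs
    # before it raises "Unknown model (...)".
    return timm.is_model(name)


result = timm_has_architecture("mobilenetv5_300m_enc")
if result is None:
    print("timm is not installed")
elif result:
    print("mobilenetv5_300m_enc is registered in this timm install")
else:
    print("mobilenetv5_300m_enc is still unknown; timm is probably too old")
```

In a notebook, restart the kernel after the pip upgrade before running this; otherwise the previously imported timm stays loaded and the check will still report the old registry.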