Instructions to use AdaptLLM/biomed-gemma-3-4b-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use AdaptLLM/biomed-gemma-3-4b-it with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="AdaptLLM/biomed-gemma-3-4b-it")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("AdaptLLM/biomed-gemma-3-4b-it")
model = AutoModelForImageTextToText.from_pretrained("AdaptLLM/biomed-gemma-3-4b-it")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
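For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, which is why the last line above slices the output at the prompt length before decoding. The idea can be illustrated with plain lists; this is only a sketch, and `strip_prompt` and the token IDs are made up for illustration.

```python
def strip_prompt(sequence, prompt_length):
    """Return only the tokens that come after the prompt."""
    return sequence[prompt_length:]


# Hypothetical token IDs, standing in for inputs["input_ids"][0] and outputs[0].
prompt_ids = [2, 105, 2364]
generated_ids = prompt_ids + [9259, 107]

new_tokens = strip_prompt(generated_ids, len(prompt_ids))
print(new_tokens)
```

Decoding only `new_tokens` keeps the chat template and the user prompt out of the printed answer.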

Local Apps

vLLM

How to use AdaptLLM/biomed-gemma-3-4b-it with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "AdaptLLM/biomed-gemma-3-4b-it"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AdaptLLM/biomed-gemma-3-4b-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					},
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					}
				]
			}
		]
	}'
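The same request can be built from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `localhost:8000`; `build_chat_request` is a hypothetical helper name, and the image is listed before the text to follow this card's note on image placement.

```python
import json
import urllib.request


def build_chat_request(base_url, model, image_url, prompt):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Image first, then the text instruction.
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000",
    "AdaptLLM/biomed-gemma-3-4b-it",
    "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
    "Describe this image in one sentence.",
)

# Sending the request requires a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     reply = json.load(response)
#     print(reply["choices"][0]["message"]["content"])
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.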

Use Docker

docker model run hf.co/AdaptLLM/biomed-gemma-3-4b-it

SGLang

How to use AdaptLLM/biomed-gemma-3-4b-it with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "AdaptLLM/biomed-gemma-3-4b-it" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AdaptLLM/biomed-gemma-3-4b-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					},
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "AdaptLLM/biomed-gemma-3-4b-it" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "AdaptLLM/biomed-gemma-3-4b-it",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					},
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					}
				]
			}
		]
	}'

Docker Model Runner
How to use AdaptLLM/biomed-gemma-3-4b-it with Docker Model Runner:
```
docker model run hf.co/AdaptLLM/biomed-gemma-3-4b-it
```

Adapting Multimodal Large Language Models to Domains via Post-Training (EMNLP 2025)

This repo contains the biomedicine MLLM developed from gemma-3-4b-it in our paper: On Domain-Adaptive Post-Training for Multimodal Large Language Models. The corresponding training dataset is biomed-visual-instructions.

The main project page is: Adapt-MLLM-to-Domains

1. To Chat with AdaMLLM

Our model architecture matches that of the base model, gemma-3-4b-it. We provide a usage example below; for more advanced usage, refer to the official google/gemma-3-4b-it model card.

Note: For AdaMLLM, always place the image at the beginning of the input instruction in the messages.
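One way to follow this note consistently is to build every user turn through a small helper. The sketch below assumes the Transformers chat-message format used in this card; `build_user_message`, `image_comes_first`, and the example URL are hypothetical and not part of any library.

```python
def build_user_message(image_url, question):
    """Return a user turn with the image placed before the text instruction."""
    return {
        "role": "user",
        "content": [
            {"type": "image", "url": image_url},
            {"type": "text", "text": question},
        ],
    }


def image_comes_first(message):
    """Check that the first content item of a message is an image."""
    content = message["content"]
    return bool(content) and content[0]["type"] in ("image", "image_url")


# Placeholder URL for illustration only.
msg = build_user_message("https://example.com/scan.png", "What does this scan show?")
print(image_comes_first(msg))
```

The resulting `msg` can be placed in the `messages` list passed to the pipeline or to `apply_chat_template`.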

The code snippets below show how to get started quickly with running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0.

$ pip install -U transformers

Then, copy the snippet from the section that is relevant for your use case.

Running with the `pipeline` API

You can initialize the model and processor for inference with pipeline as follows.

from transformers import pipeline
import torch

pipe = pipeline(
    "image-text-to-text",
    model="AdaptLLM/biomed-gemma-3-4b-it",
    device="cuda",
    torch_dtype=torch.bfloat16
)

With instruction-tuned models, you need to format your inputs with the chat template first. Then you can pass them to the pipeline.

messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are a helpful assistant."}]
    },
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    }
]

output = pipe(text=messages, max_new_tokens=200)
print(output[0]["generated_text"][-1]["content"])

2. Domain-Specific Benchmarks

We provide biomed-VQA-benchmark for evaluating any MLLM.

3. To Reproduce this Domain-Adapted MLLM

Using our training data, biomed-visual-instructions, you can easily reproduce our models based on the LlamaFactory repository.

For reference, we train from google/gemma-3-4b-it for 1 epoch with a learning rate of 1e-5 and a global batch size of 128.
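When reproducing on different hardware, the global batch size is the product of the per-device batch size, the number of devices, and the gradient accumulation steps. The sketch below solves for the accumulation steps; the per-device batch size of 4 and the 8 devices are assumptions for illustration, not the authors' actual setup, and `gradient_accumulation_steps` is a hypothetical helper.

```python
def gradient_accumulation_steps(global_batch_size, per_device_batch_size, num_devices):
    """Return the accumulation steps needed to reach the target global batch size."""
    samples_per_step = per_device_batch_size * num_devices
    if global_batch_size % samples_per_step != 0:
        raise ValueError(
            "global batch size must be divisible by per_device_batch_size * num_devices"
        )
    return global_batch_size // samples_per_step


# Target from this card: global batch size of 128.
# Assumed hardware: 8 devices with a per-device batch size of 4.
steps = gradient_accumulation_steps(128, per_device_batch_size=4, num_devices=8)
print(steps)  # 4
```

Any combination that multiplies out to 128 gives the same effective batch size, so the per-device value can be lowered to fit memory.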

Citation

If you find our work helpful, please cite us.

Adapt MLLM to Domains (EMNLP 2025 Findings)

@article{adamllm,
  title={On Domain-Adaptive Post-Training for Multimodal Large Language Models},
  author={Cheng, Daixuan and Huang, Shaohan and Zhu, Ziyu and Zhang, Xintong and Zhao, Wayne Xin and Luan, Zhongzhi and Dai, Bo and Zhang, Zhenliang},
  journal={arXiv preprint arXiv:2411.19930},
  year={2024}
}

Adapt LLM to Domains (ICLR 2024)

@inproceedings{cheng2024adapting,
  title={Adapting Large Language Models via Reading Comprehension},
  author={Daixuan Cheng and Shaohan Huang and Furu Wei},
  booktitle={The Twelfth International Conference on Learning Representations},
  year={2024},
  url={https://openreview.net/forum?id=y886UXPEZ0}
}

Safetensors

Model size: 5B params

Tensor type: BF16

Model tree for AdaptLLM/biomed-gemma-3-4b-it

Base model: google/gemma-3-4b-pt

Finetuned: google/gemma-3-4b-it

Finetuned: this model

Dataset used to train AdaptLLM/biomed-gemma-3-4b-it

Papers for AdaptLLM/biomed-gemma-3-4b-it

On Domain-Specific Post-Training for Multimodal Large Language Models

Paper • 2411.19930 • Published Nov 29, 2024 • 30

Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 82