Instructions to use AdaptLLM/biomed-gemma-3-4b-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use AdaptLLM/biomed-gemma-3-4b-it with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("image-text-to-text", model="AdaptLLM/biomed-gemma-3-4b-it") messages = [ { "role": "user", "content": [ {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"}, {"type": "text", "text": "What animal is on the candy?"} ] }, ] pipe(text=messages)# Load model directly from transformers import AutoProcessor, AutoModelForImageTextToText processor = AutoProcessor.from_pretrained("AdaptLLM/biomed-gemma-3-4b-it") model = AutoModelForImageTextToText.from_pretrained("AdaptLLM/biomed-gemma-3-4b-it") messages = [ { "role": "user", "content": [ {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"}, {"type": "text", "text": "What animal is on the candy?"} ] }, ] inputs = processor.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use AdaptLLM/biomed-gemma-3-4b-it with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "AdaptLLM/biomed-gemma-3-4b-it" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "AdaptLLM/biomed-gemma-3-4b-it", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }'Use Docker
docker model run hf.co/AdaptLLM/biomed-gemma-3-4b-it
- SGLang
How to use AdaptLLM/biomed-gemma-3-4b-it with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "AdaptLLM/biomed-gemma-3-4b-it" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "AdaptLLM/biomed-gemma-3-4b-it", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "AdaptLLM/biomed-gemma-3-4b-it" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "AdaptLLM/biomed-gemma-3-4b-it", "messages": [ { "role": "user", "content": [ { "type": "text", "text": "Describe this image in one sentence." }, { "type": "image_url", "image_url": { "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg" } } ] } ] }' - Docker Model Runner
How to use AdaptLLM/biomed-gemma-3-4b-it with Docker Model Runner:
docker model run hf.co/AdaptLLM/biomed-gemma-3-4b-it
Adapting Multimodal Large Language Models to Domains via Post-Training (EMNLP 2025)
This repos contains the biomedicine MLLM developed from gemma-3-4b-it in our paper: On Domain-Adaptive Post-Training for Multimodal Large Language Models. The correspoding training dataset is in biomed-visual-instructions.
The main project page is: Adapt-MLLM-to-Domains
1. To Chat with AdaMLLM
Our model architecture aligns with the base model: gemma-3-4b-it. We provide a usage example below, and you may refer to the official google/gemma-3-4b-it for more advanced usage instructions.
Note: For AdaMLLM, always place the image at the beginning of the input instruction in the messages.
Click to expand
Below, there are some code snippets on how to get quickly started with running the model. First, install the Transformers library. Gemma 3 is supported starting from transformers 4.50.0.
$ pip install -U transformers
Then, copy the snippet from the section that is relevant for your use case.
Running with the pipeline API
You can initialize the model and processor for inference with pipeline as follows.
from transformers import pipeline
import torch
pipe = pipeline(
"image-text-to-text",
model="AdaptLLM/biomed-gemma-3-4b-it",
device="cuda",
torch_dtype=torch.bfloat16
)
With instruction-tuned models, you need to use chat templates to process our inputs first. Then, you can pass it to the pipeline.
messages = [
{
"role": "system",
"content": [{"type": "text", "text": "You are a helpful assistant."}]
},
{
"role": "user",
"content": [
{"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
{"type": "text", "text": "What animal is on the candy?"}
]
}
]
output = pipe(text=messages, max_new_tokens=200)
print(output[0]["generated_text"][-1]["content"])
2. Domain-Specific Benchmarks
We provide biomed-VQA-benchmark to evaluate any MLLMs.
3. To Reproduce this Domain-Adapted MLLM
Using our training data, biomed-visual-instructions, you can easily reproduce our models based on the LlamaFactory repository.
For reference, we train from google/gemma-3-4b-it for 1 epoch with a learning rate of 1e-5, and a global batch size of 128.
Citation
If you find our work helpful, please cite us.
Adapt MLLM to Domains (EMNLP 2025 Findings)
@article{adamllm,
title={On Domain-Adaptive Post-Training for Multimodal Large Language Models},
author={Cheng, Daixuan and Huang, Shaohan and Zhu, Ziyu and Zhang, Xintong and Zhao, Wayne Xin and Luan, Zhongzhi and Dai, Bo and Zhang, Zhenliang},
journal={arXiv preprint arXiv:2411.19930},
year={2024}
}
Adapt LLM to Domains (ICLR 2024)
@inproceedings{
cheng2024adapting,
title={Adapting Large Language Models via Reading Comprehension},
author={Daixuan Cheng and Shaohan Huang and Furu Wei},
booktitle={The Twelfth International Conference on Learning Representations},
year={2024},
url={https://openreview.net/forum?id=y886UXPEZ0}
}
- Downloads last month
- 7