Hiera Encoder from Meta's SAM2.1 (Segment Anything Model)
Meta's SAM2 (Segment Anything Model v2) demonstrates state-of-the-art video segmentation capabilities. A core component enabling this is the Hiera module, the trunk of SAM2's image encoder. Through supervised training on object segmentation, it has learned strong hierarchical visual features.
While Meta has released the full SAM2 models and their weights, these releases are plain PyTorch code and are not integrated with Hugging Face Transformers or with common training tooling such as Trainer or DeepSpeed.
This repository extracts the Hiera module from SAM2 and wraps it in Hugging Face-compatible classes built on PretrainedConfig and PreTrainedModel, so that it can be used directly in Hugging Face-style training and inference workflows.
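The wrapping pattern described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: `ToyEncoderConfig`, `ToyEncoderModel`, `round_trip`, and the small convolution standing in for the Hiera trunk are all hypothetical names chosen for the example. It assumes `torch` and `transformers` are installed.

```python
import tempfile

import torch
from torch import nn
from transformers import PretrainedConfig, PreTrainedModel


class ToyEncoderConfig(PretrainedConfig):
    """Hypothetical config; the real repository defines its own fields."""

    model_type = "toy_encoder"

    def __init__(self, in_channels=3, embed_dim=8, **kwargs):
        super().__init__(**kwargs)
        self.in_channels = in_channels
        self.embed_dim = embed_dim


class ToyEncoderModel(PreTrainedModel):
    """Hypothetical wrapper that holds a plain PyTorch module as an attribute."""

    config_class = ToyEncoderConfig

    def __init__(self, config):
        super().__init__(config)
        # Stand-in for the extracted encoder (an assumption for this sketch).
        self.encoder = nn.Conv2d(
            config.in_channels, config.embed_dim, kernel_size=4, stride=4
        )

    def forward(self, pixel_values):
        return self.encoder(pixel_values)


def round_trip(model):
    """Save with save_pretrained, then reload with from_pretrained."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        model.save_pretrained(tmp_dir)
        return ToyEncoderModel.from_pretrained(tmp_dir)


original = ToyEncoderModel(ToyEncoderConfig())
reloaded = round_trip(original)
```

Once a module is wrapped this way, `save_pretrained` and `from_pretrained` handle the config and weights together, which is what tools built around Transformers generally expect.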
Model Details
- Original Model: facebook/sam2.1-hiera-base-plus
- This Model: nkkbr/hiera-base-plus-in-sam2.1
This model exposes only the Hiera encoder extracted from SAM2.1, wrapped for Hugging Face usage.
Installation
You first need to install Meta’s original SAM2 code:
git clone https://github.com/facebookresearch/sam2.git && cd sam2
pip install -e .
Usage
from hiera_encoder import HieraVisionModel
# Load the Hiera module from Hugging Face
model = HieraVisionModel.from_pretrained("nkkbr/hiera-base-plus-in-sam2.1")
# Get the raw Hiera model
model = model.hiera
# Print model parameters
for name, param in model.named_parameters():
    print(f"{name:50} {param.shape}")
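When an encoder like this is used as a backbone in a training workflow, one common option is to freeze it and train only the layers added on top. The sketch below shows that idea on a small stand-in module so that it runs without downloading any weights. The helper names `freeze` and `count_parameters` are hypothetical, and whether freezing suits a given task is a separate judgment call.

```python
from torch import nn


def freeze(module: nn.Module) -> nn.Module:
    """Disable gradients for every parameter and switch to eval mode."""
    for param in module.parameters():
        param.requires_grad_(False)
    return module.eval()


def count_parameters(module: nn.Module):
    """Return (total, trainable) parameter counts."""
    total = sum(p.numel() for p in module.parameters())
    trainable = sum(p.numel() for p in module.parameters() if p.requires_grad)
    return total, trainable


# Stand-in for the loaded encoder (an assumption for this sketch).
backbone = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
counts_before = count_parameters(backbone)
counts_after = count_parameters(freeze(backbone))
```

The same two helpers could be applied to the `model` object loaded above in place of `backbone`.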
Weight Consistency Check
To verify that the weights are identical to those in Meta's original SAM2.1 Hiera module:
import torch
from sam2.sam2_image_predictor import SAM2ImagePredictor
# Load SAM2.1 predictor from Meta's official release
predictor = SAM2ImagePredictor.from_pretrained("facebook/sam2.1-hiera-base-plus")
hiera_model_in_predictor = predictor.model.image_encoder.trunk
# Compare weights
for name, param in model.named_parameters():
    # Put both tensors on the same device before comparing
    reference = hiera_model_in_predictor.state_dict()[name].to(param.device)
    if not torch.equal(param, reference):
        print(f"The parameter {name} has different weights in the two models.")
print("Comparison complete!")
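The loop above only visits names that exist in `model`, so it would not notice an entry that is present in one model and absent from the other. A slightly more general check can be sketched as below. `diff_state_dicts` is a hypothetical helper, not part of this repository or of SAM2, and the two small `nn.Linear` modules are stand-ins so that the example runs without downloading weights.

```python
import torch
from torch import nn


def diff_state_dicts(state_a, state_b):
    """Return (missing, mismatched) entry names for two state dicts.

    missing:    names present in only one of the two dicts
    mismatched: names present in both whose tensors differ in shape or value
    """
    missing = sorted(set(state_a) ^ set(state_b))
    mismatched = []
    for name in sorted(set(state_a) & set(state_b)):
        a = state_a[name]
        b = state_b[name].to(a.device)  # compare on a common device
        if a.shape != b.shape or not torch.equal(a, b):
            mismatched.append(name)
    return missing, mismatched


# Stand-in modules (assumptions for this sketch).
reference = nn.Linear(3, 2)
copy = nn.Linear(3, 2)
copy.load_state_dict(reference.state_dict())

missing, mismatched = diff_state_dicts(reference.state_dict(), copy.state_dict())
```

With the real models, the two arguments would be `model.state_dict()` and `hiera_model_in_predictor.state_dict()`. Two empty lists indicate that every entry matches.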
License
Please refer to the SAM2 repository for license and usage terms.