# AudioLLM Model
This repository contains the trained weights for an AudioLLM model, which combines LLaMA and Whisper models for audio-enhanced language understanding and generation.
## Model Details

- Base LLaMA model: `meta-llama/Llama-3.2-3B-Instruct`
- Base Whisper model: `openai/whisper-large-v3-turbo`
- LoRA rank: 32
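A LoRA rank of 32 means the fine-tuning update applied to each adapted weight matrix is constrained to rank at most 32, so only two small factor matrices are trained instead of the full weight. A minimal NumPy sketch of what that low-rank update looks like (the dimensions, scaling, and initialization here are illustrative assumptions, not values taken from this repository):

```python
import numpy as np

d_out, d_in, r = 128, 128, 32  # illustrative dims; r matches the LoRA rank above
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # frozen base weight
A = rng.standard_normal((d_out, r)) * 0.01  # trainable low-rank factors
B = rng.standard_normal((r, d_in)) * 0.01

delta = A @ B         # the LoRA update: rank at most 32
W_adapted = W + delta # effective weight used at inference time

print(np.linalg.matrix_rank(delta))  # at most 32
```

Because `A` and `B` together have far fewer parameters than `W`, the adapter checkpoint stays small relative to the base LLaMA and Whisper weights.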
## Usage

You can use this model with the `inference.py` script available in this repository:
```python
from inference import load_audio_llm, transcribe_and_generate

# Load the model
model = load_audio_llm(
    repo_id="cdreetz/audio-llama-v1.1",
    llama_path="meta-llama/Llama-3.2-3B-Instruct",
    whisper_path="openai/whisper-large-v3-turbo",
)

# Generate text from an audio file
response = transcribe_and_generate(
    model=model,
    audio_path="path/to/audio.wav",
    prompt="Describe what you hear in this audio:",
)
print(response)
```
For more details, see the included `inference.py` script.