Parakeet-TDT-ExecuTorch-XNNPACK
Pre-exported ExecuTorch .pte file for Parakeet TDT 0.6B with the XNNPACK backend (CPU). Parakeet TDT is a fast speech-to-text model with word-level timestamps.
This export is unquantized (fp32, 2.4 GB). For a quantized version, see the export guide.
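The 2.4 GB figure is consistent with simple arithmetic on the parameter count. The sketch below assumes the nominal 0.6B parameter count from the model name and counts weights only; the real .pte file also contains program metadata, so treat the result as an estimate. The function name is hypothetical and not part of ExecuTorch.

```python
def approx_weight_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Estimate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal parameter count taken from the "0.6B" in the model name (assumption).
NOMINAL_PARAMS = 0.6e9

# fp32 stores each weight in 32 bits.
fp32_gb = approx_weight_size_gb(NOMINAL_PARAMS, 32)

# Illustrative lower bound only: pretends every weight is stored in 4 bits
# and ignores quantization scales and any layers left unquantized.
int4_gb = approx_weight_size_gb(NOMINAL_PARAMS, 4)

print(f"fp32: ~{fp32_gb:.1f} GB, 4-bit lower bound: ~{int4_gb:.1f} GB")
```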
Installation
git clone https://github.com/pytorch/executorch/ ~/executorch
cd ~/executorch && ./install_executorch.sh
make parakeet-cpu
Download
pip install huggingface_hub
huggingface-cli download younghan-meta/Parakeet-TDT-ExecuTorch-XNNPACK --local-dir ~/parakeet
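The same download can be scripted from Python. This is a minimal sketch, assuming the huggingface_hub package is installed and that the repository contains the three files used by the run command in this README (model.pte, tokenizer.model, poem.wav). The helper names download_parakeet and missing_files are hypothetical.

```python
from pathlib import Path

REPO_ID = "younghan-meta/Parakeet-TDT-ExecuTorch-XNNPACK"

# Files referenced by the run command in this README.
EXPECTED_FILES = ("model.pte", "tokenizer.model", "poem.wav")


def download_parakeet(local_dir: str = "~/parakeet") -> str:
    """Download the whole repository into local_dir and return its path."""
    # Imported lazily so the rest of this module works without the package.
    from huggingface_hub import snapshot_download

    target = Path(local_dir).expanduser()
    return snapshot_download(repo_id=REPO_ID, local_dir=str(target))


def missing_files(local_dir: str = "~/parakeet") -> list:
    """Return the expected files that are not present in local_dir."""
    target = Path(local_dir).expanduser()
    return [name for name in EXPECTED_FILES if not (target / name).is_file()]
```

Calling download_parakeet() and then checking that missing_files() returns an empty list is a quick way to confirm the download before running the model.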
Run
cmake-out/examples/models/parakeet/parakeet_runner \
--model_path ~/parakeet/model.pte \
--tokenizer_path ~/parakeet/tokenizer.model \
--audio_path ~/parakeet/poem.wav
Optional flags:
--timestamps segment: timestamp granularity, one of none, token, word, segment, all (default: segment)
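To call the runner from a script, the command above can be assembled and launched with subprocess. This is a sketch under stated assumptions: it uses only the flags listed in this README, assumes ExecuTorch was built in the given directory on a POSIX system, and assumes the runner writes its transcription to standard output. The function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Values listed for --timestamps in this README.
TIMESTAMP_MODES = ("none", "token", "word", "segment", "all")


def build_runner_command(executorch_dir, model_dir, audio_path, timestamps="segment"):
    """Build the argument list for parakeet_runner without executing it."""
    if timestamps not in TIMESTAMP_MODES:
        raise ValueError(f"timestamps must be one of {TIMESTAMP_MODES}, got {timestamps!r}")
    executorch_dir = Path(executorch_dir).expanduser()
    model_dir = Path(model_dir).expanduser()
    runner = executorch_dir / "cmake-out/examples/models/parakeet/parakeet_runner"
    return [
        str(runner),
        "--model_path", str(model_dir / "model.pte"),
        "--tokenizer_path", str(model_dir / "tokenizer.model"),
        "--audio_path", str(Path(audio_path).expanduser()),
        "--timestamps", timestamps,
    ]


def transcribe(executorch_dir, model_dir, audio_path, timestamps="segment"):
    """Run the runner and return whatever it prints to stdout (assumed output channel)."""
    cmd = build_runner_command(executorch_dir, model_dir, audio_path, timestamps)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

For example, transcribe("~/executorch", "~/parakeet", "~/parakeet/poem.wav", timestamps="word") mirrors the shell command above with word-level timestamps. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.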
Export Command
pip install "nemo_toolkit[asr]"
python examples/models/parakeet/export_parakeet_tdt.py \
--backend xnnpack --output-dir ./parakeet_xnnpack
For quantized export (smaller, faster):
python examples/models/parakeet/export_parakeet_tdt.py \
--backend xnnpack --qlinear_encoder 8da4w --qlinear 8da4w --qembedding 8w
More Info
Model tree for younghan-meta/Parakeet-TDT-ExecuTorch-XNNPACK
Base model
nvidia/parakeet-tdt-0.6b-v3