license: apache-2.0
language:
  - ca
  - es
  - en
base_model: BSC-LT/salamandra-7b-instruct-tools
library_name: transformers
pipeline_tag: text-generation
tags:
  - query-parsing
  - semantic-search
  - structured-output
  - json-generation
  - multilingual
  - catalan
  - spanish
  - LoRA
  - fine-tuned
  - AINA
  - R&D
datasets:
  - SIRIS-Lab/impuls-query-parsing
metrics:
  - accuracy
model-index:
  - name: IMPULS-Salamandra-7B-Query-Parser
    results:
      - task:
          type: text-generation
          name: Query Parsing
        metrics:
          - name: JSON Validity
            type: accuracy
            value: 1
          - name: Strict Accuracy
            type: accuracy
            value: 0.51
          - name: Relaxed Accuracy
            type: accuracy
            value: 0.65
          - name: Language Match
            type: accuracy
            value: 0.87
IMPULS-Salamandra-7B-Query-Parser
A fine-tuned version of BSC-LT/salamandra-7b-instruct-tools for converting natural language queries into structured JSON for R&D project semantic search.
Model Description
This model was developed as part of the IMPULS project (AINA Challenge 2024), a collaboration between SIRIS Academic and Generalitat de Catalunya to build a multilingual semantic search system for Catalonia's R&D ecosystem (RIS3-MCAT platform).
The model converts natural language queries in Catalan, Spanish, and English into structured JSON containing:
- Semantic query: Core thematic content for vector search
- Filters: Structured metadata (funding programme, year range, location, organization type)
- Query rewrite: Human-readable interpretation of the query
- Metadata: Language detection and processing notes
Example
Input (Catalan):
projectes d'IA en salut finançats per H2020 des de 2020
Output:
{
  "doc_type": "projects",
  "filters": {
    "programme": "Horizon 2020",
    "year": ">=2020"
  },
  "organisations": [],
  "semantic_query": "intel·ligència artificial salut",
  "query_rewrite": "Projectes sobre IA en salut del programa H2020 des de 2020",
  "meta": {
    "lang": "CA"
  }
}
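A downstream search layer has to turn string filters such as `"year": ">=2020"` into something it can evaluate. The sketch below shows one way to do that. The card only shows the `>=2020` form, so the other operators, the bare-year case, and the helper names `parse_year_filter` and `year_matches` are assumptions made for illustration.

```python
import operator
import re

# Comparison operators this sketch accepts; only ">=" appears in the model card.
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}

# Optional operator followed by a four-digit year.
_YEAR_RE = re.compile(r"^\s*(>=|<=|>|<|=)?\s*(\d{4})\s*$")


def parse_year_filter(expr):
    """Split a filter such as '>=2020' into (operator, year); a bare year means '='."""
    match = _YEAR_RE.match(expr)
    if match is None:
        raise ValueError(f"unsupported year filter: {expr!r}")
    return match.group(1) or "=", int(match.group(2))


def year_matches(expr, year):
    """Return True if `year` satisfies the filter expression."""
    op, bound = parse_year_filter(expr)
    return _OPS[op](year, bound)


print(parse_year_filter(">=2020"))   # ('>=', 2020)
print(year_matches(">=2020", 2019))  # False
```

Raising on unknown formats, rather than guessing, keeps unexpected model outputs visible to the caller.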
Training Details
Base Model
- Model: BSC-LT/salamandra-7b-instruct-tools
- Architecture: LlamaForCausalLM (7B parameters)
Fine-tuning Method
- Technique: LoRA (Low-Rank Adaptation)
- Trainable parameters: 1% of total (50MB adapter)
LoRA Configuration
| Parameter | Value |
|---|---|
| Rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
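The table can be written down as a plain dictionary whose keys follow the argument names of `peft.LoraConfig`. This is a minimal sketch, not the project's training script: whether the authors used `peft` is not stated in the card, `task_type` is an assumption, and the layer width of 4096 in the usage lines is a hypothetical value chosen only to illustrate the arithmetic.

```python
# Values copied from the LoRA configuration table.
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}


def lora_scaling(r, alpha):
    """Factor applied to the low-rank update B @ A in standard LoRA."""
    return alpha / r


def lora_params_per_layer(d_in, d_out, r):
    """Trainable parameters added to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


print(lora_scaling(lora_kwargs["r"], lora_kwargs["lora_alpha"]))  # 2.0
print(lora_params_per_layer(4096, 4096, lora_kwargs["r"]))        # hypothetical 4096-wide layer
```

With rank 16 and alpha 32, the adapter update is scaled by 2 before being added to the frozen weights.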
Training Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 3 |
| Batch size | 16 (effective) |
| Learning rate | 2e-4 |
| Sequence length | 2048 |
| Precision | FP16 (mixed) |
| Optimizer | AdamW |
| LR scheduler | Cosine |
| Warmup ratio | 0.1 |
Training Data
- Dataset: SIRIS-Lab/impuls-query-parsing
- Training split: 682 multilingual queries (synthetic, template-generated)
- Language distribution: ~33% Catalan, ~33% Spanish, ~33% English
- Query types: Discover (88%), Quantify (12%)
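Combining the hyperparameters above with the 682-query training split gives a rough picture of the schedule. The sketch assumes one optimizer step per effective batch of 16, no dropped final batch, and the common linear-warmup plus half-cosine shape; the exact training script is not shown in the card, so the function names and these choices are illustrative.

```python
import math


def training_schedule(num_examples, batch_size, epochs, warmup_ratio):
    """Return (steps_per_epoch, total_steps, warmup_steps)."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return steps_per_epoch, total_steps, warmup_steps


def lr_at(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


steps_per_epoch, total_steps, warmup_steps = training_schedule(682, 16, 3, 0.1)
print(steps_per_epoch, total_steps, warmup_steps)  # 43 129 13
print(lr_at(warmup_steps, total_steps, warmup_steps, 2e-4))
```

Under these assumptions the whole run is only 129 optimizer steps, which is worth keeping in mind when interpreting the results on the 100-query test split.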
Evaluation Data
- Test split: 100 real queries from domain experts (SIRIS Academic)
- Annotation: Manual gold-standard JSON for each query
Evaluation Results
Overall Performance
| Metric | Base Model | Fine-tuned |
|---|---|---|
| JSON Validity | 100% | 100% |
| Strict Accuracy | 15% | 51% |
| Relaxed Accuracy | 29% | 65% |
| Language Match | 53% | 87% |
| Semantic Query Accuracy | 44% | 86% |
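Two of these metrics have a fairly unambiguous reading, sketched below: JSON Validity as the share of outputs that parse as a JSON object, and Language Match as the share of predictions whose `meta.lang` equals the gold label. The card does not define Strict or Relaxed Accuracy, so they are not attempted here; the function names and the toy data are hypothetical.

```python
import json


def json_validity(outputs):
    """Fraction of raw model outputs that parse as a JSON object."""
    valid = 0
    for text in outputs:
        try:
            valid += isinstance(json.loads(text), dict)
        except json.JSONDecodeError:
            pass
    return valid / len(outputs)


def language_match(predictions, golds):
    """Fraction of parsed predictions whose meta.lang equals the gold label."""
    hits = sum(
        pred.get("meta", {}).get("lang") == gold["meta"]["lang"]
        for pred, gold in zip(predictions, golds)
    )
    return hits / len(golds)


# Toy data, not taken from the evaluation set.
raw_outputs = ['{"meta": {"lang": "CA"}}', "not json"]
preds = [{"meta": {"lang": "CA"}}, {"meta": {"lang": "EN"}}]
golds = [{"meta": {"lang": "CA"}}, {"meta": {"lang": "ES"}}]
print(json_validity(raw_outputs))    # 0.5
print(language_match(preds, golds))  # 0.5
```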
Component-level Accuracy
| Component | Accuracy |
|---|---|
| Programme (H2020, FEDER, etc.) | 96% |
| Year extraction | 98% |
| Location | 91% |
| Organizations | 77% |
| Semantic Query | 86% |
Performance by Language
| Language | Relaxed Accuracy |
|---|---|
| English | 72% |
| Catalan | 64% |
| Spanish | 52% |
Comparison with Other Models
| Model | Strict Accuracy | Relaxed Accuracy | JSON Valid |
|---|---|---|---|
| Salamandra-7B (ours) | 51% | 65% | 100% |
| Qwen 2.5-7B | 47% | 65% | 100% |
| Mistral-7B | 24% | 55% | 100% |
Usage
Basic Usage
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SIRIS-Lab/impuls-salamandra-7b-query-parser"

# Load the tokenizer and the model in half precision, placed automatically on the available devices
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

# System prompt (simplified version)
system_prompt = """Convert natural language queries into structured JSON for R&D project search.
Output only valid JSON with the required schema."""

query = "projectes d'hidrogen finançats per H2020 des de 2020"
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": query}
]

# Render the chat template to text, then tokenize it
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# Generate with low-temperature sampling; no gradients are needed for inference
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.1,
        do_sample=True
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
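The decoded `response` is a string, so it still has to be parsed before the filters can be used. The sketch below pulls the first JSON object out of the text and tolerates stray characters around it. The helper name `extract_json` is hypothetical, and the sample string stands in for real model output.

```python
import json


def extract_json(response):
    """Return the first JSON object found in `response`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = response.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            obj, _ = decoder.raw_decode(response, start)
            return obj
        except json.JSONDecodeError:
            # Not a valid object at this brace; try the next one.
            start = response.find("{", start + 1)
    return None


sample = 'Result: {"doc_type": "projects", "meta": {"lang": "CA"}} '
parsed = extract_json(sample)
print(parsed["meta"]["lang"])  # CA
```

Returning `None` instead of raising lets the caller decide whether to retry generation or fall back to a plain semantic search.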
With 4-bit Quantization (Recommended for limited VRAM)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4"
)

model = AutoModelForCausalLM.from_pretrained(
    "SIRIS-Lab/impuls-salamandra-7b-query-parser",
    quantization_config=quantization_config,
    device_map="auto"
)
# Reduces memory from ~14GB to ~3.5GB
Output Schema
{
  "doc_type": "projects",
  "filters": {
    "programme": "string | null",
    "funding_level": "string | null",
    "year": "string | null",
    "location": "string | null",
    "location_level": "region | province | country | null"
  },
  "organisations": [
    {
      "type": "university | research_center | hospital | company | null",
      "name": "string | null",
      "location": "string | null",
      "location_level": "string | null"
    }
  ],
  "semantic_query": "string | null",
  "query_rewrite": "string",
  "meta": {
    "lang": "CA | ES | EN",
    "notes": "string | null"
  }
}
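A parsed output can be checked against this schema before it reaches the search backend. The following is a minimal sketch using only the standard library. Treating the six top-level keys as required and the nested filter keys as optional is an assumption based on the example output earlier in the card, and `validate_parsed_query` is a hypothetical helper name.

```python
REQUIRED_KEYS = {"doc_type", "filters", "organisations",
                 "semantic_query", "query_rewrite", "meta"}
LANGS = {"CA", "ES", "EN"}
ORG_TYPES = {"university", "research_center", "hospital", "company", None}
LOCATION_LEVELS = {"region", "province", "country", None}


def validate_parsed_query(parsed):
    """Return a list of human-readable problems; an empty list means the object looks valid."""
    errors = []
    missing = REQUIRED_KEYS - set(parsed)
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    filters = parsed.get("filters") or {}
    if filters.get("location_level") not in LOCATION_LEVELS:
        errors.append("filters.location_level has an unexpected value")
    for i, org in enumerate(parsed.get("organisations") or []):
        if org.get("type") not in ORG_TYPES:
            errors.append(f"organisations[{i}].type has an unexpected value")
    if (parsed.get("meta") or {}).get("lang") not in LANGS:
        errors.append("meta.lang must be CA, ES or EN")
    if not isinstance(parsed.get("query_rewrite"), str):
        errors.append("query_rewrite must be a string")
    return errors


example = {
    "doc_type": "projects",
    "filters": {"programme": "Horizon 2020", "year": ">=2020"},
    "organisations": [],
    "semantic_query": "intel·ligència artificial salut",
    "query_rewrite": "Projectes sobre IA en salut del programa H2020 des de 2020",
    "meta": {"lang": "CA"},
}
print(validate_parsed_query(example))  # []
```

Collecting all problems in a list, rather than stopping at the first, makes it easier to log why a given query was rejected.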
Hardware Requirements
| Configuration | VRAM Required |
|---|---|
| FP16 (half precision, unquantized) | ~14 GB |
| 4-bit quantization | ~3.5 GB |
Recommended: a GPU with 24 GB or more of VRAM (for example, an A100) for FP16 inference, or 4-bit quantization on consumer GPUs.
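The figures in the table follow from simple arithmetic on the weight count. The sketch below reproduces them for a nominal 7 billion parameters; the exact parameter count is not given in the card, and the estimate covers weights only, so activations, the KV cache, and quantization overhead come on top.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate memory for model weights only, in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


n_params = 7e9  # nominal "7B"; an assumption, not the exact count
print(weight_memory_gb(n_params, 16))  # FP16: 14.0
print(weight_memory_gb(n_params, 4))   # 4-bit: 3.5
```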
Limitations
- Domain-specific: Optimized for R&D project search queries; may not generalize well to other domains
- Schema-bound: Outputs follow a fixed JSON schema; cannot handle arbitrary structured formats
- Language coverage: Relaxed accuracy is highest on English (72%), followed by Catalan (64%); Spanish is lower (52%)
- Complex queries: Struggles with queries requiring numerical aggregation or ranking operations
Intended Use
This model is designed for:
- R&D project discovery platforms (RIS3CAT, Horizon Europe portals)
- Scientific literature search systems
- Multilingual semantic search applications
- Query understanding in Catalan, Spanish, and English
Ethical Considerations
- The model was trained on synthetic queries generated from templates and evaluated on real queries from domain experts
- No personal or sensitive data was used in training
- The model is intended for search query parsing and is not designed for open-ended text generation
Citation
If you use this model, please cite:
@misc{impuls-salamandra-2024,
  author = {{SIRIS Academic}},
  title = {IMPULS-Salamandra-7B-Query-Parser: Multilingual Query Parsing for R\&D Semantic Search},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/SIRIS-Lab/impuls-salamandra-7b-query-parser}}
}
Acknowledgments
- Barcelona Supercomputing Center (BSC) - For the Salamandra base model and AINA infrastructure
- Generalitat de Catalunya - For funding and the RIS3-MCAT platform
- AINA Project - For the AINA Challenge 2024 framework
License
This model is released under the Apache 2.0 License, consistent with the base Salamandra model.
Links
- Training Dataset: SIRIS-Lab/impuls-query-parsing
- Project Repository: github.com/sirisacademic/aina-impulse
- Base Model: BSC-LT/salamandra-7b-instruct-tools
- AINA Project: projecteaina.cat
- SIRIS Academic: sirisacademic.com