From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model Paper • 2512.05277 • Published Dec 2025 • 4
CASP: Compression of Large Multimodal Models Based on Attention Sparsity Paper • 2503.05936 • Published Mar 7, 2025 • 2
AMUSE: Adaptive Multi-Segment Encoding for Dataset Watermarking Paper • 2403.05628 • Published Mar 8, 2024
Towards Secure and Usable 3D Assets: A Novel Framework for Automatic Visible Watermarking Paper • 2409.00314 • Published Aug 31, 2024 • 1
EBJR: Energy-Based Joint Reasoning for Adaptive Inference Paper • 2110.10343 • Published Oct 20, 2021 • 1
E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models Paper • 2203.00748 • Published Mar 1, 2022 • 1
ArchBERT: Bi-Modal Understanding of Neural Architectures and Natural Languages Paper • 2310.17737 • Published Oct 26, 2023
Task-Agnostic Language Model Watermarking via High Entropy Passthrough Layers Paper • 2412.12563 • Published Dec 17, 2024 • 1
GOLD: Generalized Knowledge Distillation via Out-of-Distribution-Guided Language Data Generation Paper • 2403.19754 • Published Mar 28, 2024
LaWa: Using Latent Space for In-Generation Image Watermarking Paper • 2408.05868 • Published Aug 11, 2024 • 3
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models Paper • 2503.02175 • Published Mar 4, 2025 • 3
Spatial Reasoning with Vision-Language Models in Ego-Centric Multi-View Scenes Paper • 2509.06266 • Published Sep 8, 2025 • 11