Daily Papers

by AK and the research community

Apr 9

Submitted by

akhaliq

Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs

·
8 authors

Submitted by

akhaliq

MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators

·
9 authors

Submitted by

akhaliq

SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing

·
10 authors

Submitted by

akhaliq

ByteEdit: Boost, Comply and Accelerate Generative Image Editing

·
14 authors

Submitted by

akhaliq

SpatialTracker: Tracking Any 2D Pixels in 3D Space

·
7 authors

Submitted by

akhaliq

UniFL: Improve Stable Diffusion via Unified Feedback Learning

·
12 authors

Submitted by

akhaliq

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

·
8 authors

Submitted by

akhaliq

BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion

·
5 authors

Submitted by

akhaliq

YaART: Yet Another ART Rendering Technology

·
23 authors

Submitted by

akhaliq

PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations

·
11 authors

Submitted by

akhaliq

MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation

·
6 authors

Submitted by

akhaliq

Aligning Diffusion Models by Optimizing Human Utility

·
5 authors

Submitted by

akhaliq

Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models

·
5 authors

Submitted by

akhaliq

DATENeRF: Depth-Aware Text-based Editing of NeRFs

·
7 authors

Submitted by

akhaliq

Koala: Key frame-conditioned long video-LLM

·
8 authors