Workflow : I2V & T2V storytelling with multi-event temporal control with Prompt Relay

#106
by RuneXX - opened

I2V & T2V Multi-Event Temporal Control with Prompt Relay

With the new ComfyUI node for Prompt Relay by Kijai, you can create multi-event or multi-scene videos with each segment having its own prompt and length.
A neat and easy way to set the pacing, timing and length of the intervals/scenes of your video, or to cue up a sequence of actions with temporal control.
It also "forces" you to be more structured and story-based when you prompt, and with the timed segments you can get some nice short stories or sequences of events.

The node seems to be a work in progress, going by the message on the repo, and so is this workflow ...
Updates and tweaks might come ;-) But seems to work quite nicely already... feel free to test it out ;-)

You need this node: https://github.com/kijai/ComfyUI-PromptRelay
(and you can read more about the concept here: https://gordonchen19.github.io/Prompt-Relay/ )

A workflow here (and there is a workflow at the repo of the node as well):
https://huggingface.co/RuneXX/LTX-2.3-Workflows/tree/main/Movie-Maker

Looks amazing, but.. I can't find the node in the manager, and it does not install through git pull

Unpacking objects: 100% (13/13), 13.72 KiB | 246.00 KiB/s, done.
From https://github.com/kijai/ComfyUI-PromptRelay
 * branch            HEAD       -> FETCH_HEAD
fatal: refusing to merge unrelated histories
Owner

Sounds like something went wrong with the git pull.
Just delete that node folder, and try git clone again... hopefully that fixes it

I fixed it. it is amazing to be honest.. thank you so much.
Just a question: can you upgrade this workflow to include a starting image for each segment?

Yeah the prompt relay thing is pretty neat, can do some really nice stuff

Just a question: can you upgrade this workflow to include a starting image for each segment?

I was actually making a workflow a bit like that (before the Prompt Relay came along), based on the "music video creator" workflow from earlier, where you set length, image and prompt per segment.

But using the Prompt Relay might also actually work... If each image is set to the exact frame start of each prompt segment.
Worth a try ;-) will do
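
To line each image up with the start of its prompt segment, the start frames can be worked out from the segment lengths. A minimal sketch, assuming 24 fps and segment lengths given in seconds; the `Segment` class, the function name and the example prompts are made up for illustration and are not part of the Prompt Relay node:

```python
from dataclasses import dataclass


@dataclass
class Segment:
    prompt: str      # text prompt for this segment
    seconds: float   # segment length in seconds


def segment_start_frames(segments, fps=24):
    """Return the 0-based frame index at which each segment begins."""
    starts = []
    elapsed = 0.0  # seconds elapsed before the current segment
    for seg in segments:
        starts.append(round(elapsed * fps))
        elapsed += seg.seconds
    return starts


# Hypothetical three-segment story (assumed values, not from the workflow).
story = [
    Segment("A woman opens the door and steps inside.", 4),
    Segment("Scene cut: she sits down at the piano.", 3),
    Segment("Close-up of her hands playing.", 5),
]

print(segment_start_frames(story))  # [0, 96, 168]
```

Each returned index is the frame where an image for that segment would be injected. If the workflow runs at a different frame rate, pass it as `fps`.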

(for simpler "stories" you can already prompt for a scene change with the Prompt Relay - see video 2 in the example above. But for a completely new camera angle, image input would give more control)

yes please, thank you.. and I agree, for simple long video with camera going back and forth, it is easy with a text prompt only.. but for a complex camera angle, a guiding image would be perfect

Gave it a try. Unfortunately it's perhaps not going to be 100% what you had in mind, since the frame inject node only gives a new frame image, and LTX might interpret that either as a new scene OR as a transition to that image (first/last frame logic). But still an interesting workflow ;-) will upload in a few

Depending on what you had in mind with back and forth, it might switch between, say, 2 persons, or it might transition/move the camera slowly between 2 persons.
With some prompting you can probably control it though. Start the prompt for a segment with "Scene cut/new scene", which LTX seems to understand ;-)

(the other workflow based on the "music creator" is a totally new scene at each segment - that can be different or similar to previous scene. Will upload that too, just testing out a few things for consistent audio)

works great! I could not pull this off properly with FMLF, with t2v, i2v list wf continues ..I kept trying but something would always be wrong...here it was done on 3rd attempt when prompting was dialed in properly.

(VERY low resolution render, since I was just testing stuff... but you get the idea)

Adding some CFG might be a good choice for the more "complex" prompts with lots of things happening. Will upload a Dev model workflow for that. It still uses the distilled lora, but with more steps and cfg 3. It will be a little slower (compared to distilled only), but not a lot, since it's still with the distilled lora. Set to 20-ish steps (but you can adjust the steps higher when needed)

Also noticed that the VBVR reasoning lora for LTX can help with logic.. but that goes for any wf really. Since the prompt relay invites you to be a bit more creative, the VBVR did help me get nice results a few times when the prompt was a bit "complex" (like opening doors, walking into this or that etc)

In case someone wants to try VBVR from https://huggingface.co/Video-Reason/ (but no Comfy-compatible version yet at their repo... I think)
Comfy versions:
https://huggingface.co/siraxe/VBVR-LTX2.3-diffsynth_comfyui
https://huggingface.co/LiconStudio/Ltx2.3-VBVR-lora-I2V/

The tiny VAE memory use is a problem: the VAE currently stores its temporary results on the GPU. This PR, when merged, should address it: https://github.com/Comfy-Org/ComfyUI/pull/13617

This is now merged into main Comfy.
For those who had some memory struggles, try updating Comfy

Prompt Relay with Custom Audio

Workflow added for prompt relay + custom audio. Single input image (but will add one for multi image as well).
And Kijai added an advanced option to the node to play around with for those who want to tinker and experiment ;-)

https://huggingface.co/RuneXX/LTX-2.3-Workflows/tree/main/Movie-Maker/Prompt-Relay-Custom-Audio

The tiny VAE memory use is a problem: the VAE currently stores its temporary results on the GPU. This PR, when merged, should address it: https://github.com/Comfy-Org/ComfyUI/pull/13617

This is now merged into main Comfy.
For those who had some memory struggles, try updating Comfy

yeah it works fine now! 😃

Hello RuneXX.

Thank you as always for your helpful workflows.
Regarding LTX-2.3_-_I2V_T2V_Short-Story_PromptRelay-Timeline_custom_audio.json, even when I specify the width and height in the video settings, the generated video has a long side of 1536 pixels.

Could you please check this?

the generated video has a long side of 1536 pixels.
Could you please check this?

Ah... a little accident there, will fix that

Updated with correct resizing

The new workflow is working correctly.
Thank you!

Prompt Relay + Custom Audio with multi reference image

Added a variant of the custom audio workflow that can take multi image references.
Although you can get much the same with a single reference image and prompting for "scene cut / new scene" etc. with a prompt describing the new scene, the multi ref image can give a bit more manual control.

A "music video" type of example, with song as custom audio.. but can be used for any type of storytelling :
