Overview
Video to Video is the 2D-tab operation for transforming an existing video clip into a new one — restyle, recompose against references, change the look, retime motion, or build a composite. Unlike Image to Video (which generates motion from a still) or Text to Video (which generates from prompt alone), V2V always anchors on a real source clip you supply.
Each route handles V2V differently: some preserve motion strictly, some treat the source as a composition anchor, some accept a single reference and others accept several. The cascade exposes all four routes so you can pick the right one for the shot.
The cascade
Video to Video uses the four-step Operation → Platform → Vendor → Model cascade. You pick the operation (Video to Video), then narrow by platform, then vendor, then the specific model variant. The dropdowns filter each other so you only see valid combinations.
Operation: Video to Video
Platform: fal | piapi | runway
Vendor: Bytedance | Lightricks | Runway
Model: <variants per vendor>
The same cascade shape is used for the adjacent Motion Control operation and the new Text to Video operation — once you learn it for V2V, you can move between them without re-learning the UI.
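The way the dropdowns filter each other can be pictured as narrowing a flat list of routes. The sketch below is illustrative only: the `ROUTES` list mirrors the four routes named in this guide, but the data structure, the `options` helper, and the exact model labels are assumptions, not the app's actual implementation.

```python
# Hypothetical route table: (platform, vendor, model). Model labels are
# illustrative and may not match the names shown in the Model dropdown.
ROUTES = [
    ("fal", "Bytedance", "Seedance 2.0"),
    ("fal", "Bytedance", "Seedance 2.0 Fast"),
    ("fal", "Lightricks", "LTX 2.3 Ref"),
    ("piapi", "Bytedance", "Seedance 2.0 Omni Reference"),
    ("runway", "Runway", "Gen-4 Aleph"),
]


def options(platform=None, vendor=None):
    """Return the values each dropdown can still offer for the current picks."""
    rows = [
        r for r in ROUTES
        if (platform is None or r[0] == platform)
        and (vendor is None or r[1] == vendor)
    ]
    return {
        "platforms": sorted({r[0] for r in rows}),
        "vendors": sorted({r[1] for r in rows}),
        "models": [r[2] for r in rows],
    }


print(options(platform="fal")["vendors"])  # ['Bytedance', 'Lightricks']
print(options(platform="piapi", vendor="Bytedance")["models"])
```

Picking a platform removes vendors that have no route on it, which is why only valid combinations remain visible.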
Vendors at a glance
Four V2V routes are wired today. The summary table covers each route's style, reference caps, and resolution support, the things that most often decide which one fits a given shot. The output-duration range is the same for all four and is noted after the table.
| Vendor | Style | Video refs | Image refs | Audio refs | Resolution |
|---|---|---|---|---|---|
| fal · Seedance 2.0 (Bytedance, standard + fast) | Motion-preserving V2V | up to 3 (1 source + 2 extra) | up to 9 | up to 3 | 480p · 720p |
| fal · LTX 2.3 Ref (Lightricks) | Single-source restyle | 1 (source only) | — | — | LTX native |
| piapi · Seedance Omni Reference (Bytedance, composition mode) | Multi-reference composition | up to 3 (1 source + 2 extra) | up to 9 | up to 3 | 480p · 720p · 1080p |
| runway · Gen-4 Aleph (Runway) | Runway V2V | 1 (source) | — | — | Runway native |
All four routes accept a duration in the 4–15 second range, set via the 2D tab's Duration column. Each route also has its own per-clip cap on reference material, enforced automatically: very long reference clips are trimmed before upload, so you never need to pre-trim them yourself.
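As a rough illustration of how these limits combine, the sketch below checks one row against the reference caps from the summary table and the 4–15 second duration range. The route keys, the `Caps` class, and the `problems` helper are hypothetical names chosen for this example; how the app itself reports or handles an over-cap row is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caps:
    videos: int  # includes the source clip
    images: int
    audio: int


# Caps copied from the summary table; the keys are hypothetical identifiers.
CAPS = {
    "fal/seedance-2.0": Caps(3, 9, 3),
    "fal/ltx-2.3-ref": Caps(1, 0, 0),
    "piapi/seedance-omni-reference": Caps(3, 9, 3),
    "runway/gen-4-aleph": Caps(1, 0, 0),
}
MIN_SECONDS, MAX_SECONDS = 4, 15  # duration range stated in this guide


def problems(route, videos, images, audio, seconds):
    """List every way a row breaks the route's limits (empty list means OK)."""
    cap = CAPS[route]
    found = []
    if videos < 1:
        found.append("a source video is required")
    counts = (("video", videos, cap.videos),
              ("image", images, cap.images),
              ("audio", audio, cap.audio))
    for name, count, limit in counts:
        if count > limit:
            found.append(f"{count} {name} refs exceeds the cap of {limit}")
    if not MIN_SECONDS <= seconds <= MAX_SECONDS:
        found.append(f"duration {seconds}s is outside {MIN_SECONDS}-{MAX_SECONDS}s")
    return found


print(problems("fal/ltx-2.3-ref", videos=1, images=1, audio=0, seconds=10))
```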
fal · Seedance 2.0 (and Fast)
Seedance 2.0 V2V on fal is a true motion-preserving video-to-video model — the source clip's motion is the dominant anchor, and the prompt + extra references shape the look. Two variants: Standard for quality, Fast for cost and turnaround.
Accepts 1 source video + up to 2 extra reference videos, up to 9 reference images, and up to 3 audio references. Output caps at 720p; if you need 1080p, see Resolution & upscaling.
Pick this when: you want the source clip's motion to come through faithfully and only the look to change. Examples: rotoscope-style stylisation, plate-to-plate restyle, on-model character reskin.
fal · LTX 2.3 Ref V2V
Lightricks' LTX 2.3 Reference V2V is a single-input model: one source video, no separate image or audio reference slots. It exposes a video_strength knob (via the Extra Params column) so you can dial how strongly the source's motion and composition carry through versus the prompt.
Pick this when: the shot is simple, you don't have extra reference media, and you want LTX's distinctive aesthetic. It's also fast and cheap, which makes it a good first pass before committing to a heavier vendor.
piapi · Seedance 2.0 Omni Reference
Same underlying model family as fal's Seedance, but served through piapi in composition mode. Up to 3 videos (1 source + 2 extra), up to 9 images, up to 3 audio refs. Native 1080p support — the one capability where piapi beats fal's Seedance route.
Composition mode is not the same as motion-preserving V2V. Seedance Omni Reference treats every reference — videos included — as compositional input, with the prompt deciding what to keep, what to mix, and what to anchor. If you give it a plate and a character, expect the model to compose them rather than to track the plate's motion strictly.
References are addressable in the prompt as @video1, @video2, ..., @image1, @image2, and so on. This is how you direct which reference plays which role. If you don't tag, the system adds a default anchor directive so all references contribute.
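Before submitting a composition prompt, it can help to confirm that every tag points at a reference you actually supplied. The sketch below is one way to do that, assuming tags are numbered from 1 as in the examples above; the `tags_in` and `unknown_tags` helpers are hypothetical, and the wording of the system's default anchor directive is not reproduced because this guide does not specify it.

```python
import re

# Matches @video1, @image2, and so on.
TAG = re.compile(r"@(video|image)(\d+)")


def tags_in(prompt):
    """Return the reference tags used in a prompt, in order of first use."""
    seen = []
    for match in TAG.finditer(prompt):
        tag = match.group(0)
        if tag not in seen:
            seen.append(tag)
    return seen


def unknown_tags(prompt, n_videos, n_images):
    """Return tags whose number is outside the references supplied."""
    limits = {"video": n_videos, "image": n_images}
    return [
        f"@{kind}{num}"
        for kind, num in TAG.findall(prompt)
        if not 1 <= int(num) <= limits[kind]
    ]


prompt = "Place the dancer from @image1 in the street from @video1, lit like @image2."
print(tags_in(prompt))             # ['@image1', '@video1', '@image2']
print(unknown_tags(prompt, 1, 1))  # ['@image2']
```

An empty result from `tags_in` corresponds to the untagged case, where the default anchor directive applies.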
Pick this when: you want to combine multiple references into one shot, you need 1080p direct, or you want the most expressive prompt-driven control over which reference becomes the subject vs the environment.
runway · Gen-4 Aleph
Runway's Gen-4 Aleph V2V model — single source video, no separate reference slots. Strong for cinematic restyle and stylised motion. Resolution and duration follow Runway's published Aleph spec.
Pick this when: you're already in a Runway-driven workflow, you want Aleph's particular look, or you're chaining V2V with other Runway models (Gen-4, Gen-4 Turbo) in the same shot.
How to choose
A few rules of thumb:
- Need the source motion to carry through? Use fal · Seedance 2.0 V2V. It's the only one in this list that's primarily a motion-preserving model.
- Combining multiple references into one shot? Both fal · Seedance 2.0 and piapi · Seedance Omni Reference take up to 3 videos and 9 images; piapi addresses each one by @videoN/@imageN tag in the prompt for fine-grained role assignment.
- Need 1080p direct? Use piapi · Seedance Omni Reference. Other vendors output lower resolution and rely on a separate upscale pass.
- Simple single-source restyle, fast and cheap? Use fal · LTX 2.3 Ref V2V or fal · Seedance Fast V2V.
- Already a Runway shop? Use runway · Gen-4 Aleph. It's the cleanest hand-off if downstream shots are also Runway-driven.
When you drop reference files into the 2D tab's reference column, files are sorted by extension: video files (.mp4, .mov, .webm) become video references, image files (.png, .jpg, .webp, ...) become image references, and audio files (.mp3, .wav, .m4a, ...) become audio references. You don't need to put them in separate columns.
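The extension-based sorting can be sketched in a few lines. This is an illustration rather than the app's code: it covers only the extensions named above (the app's image and audio lists are longer), and the case-insensitive match and the "unknown" bucket are assumptions.

```python
from pathlib import Path

# Only the extensions named in this guide; the app's full lists are longer.
KINDS = {
    "video": {".mp4", ".mov", ".webm"},
    "image": {".png", ".jpg", ".webp"},
    "audio": {".mp3", ".wav", ".m4a"},
}


def sort_references(paths):
    """Group file paths into video, image, and audio references by extension."""
    groups = {"video": [], "image": [], "audio": [], "unknown": []}
    for path in paths:
        ext = Path(path).suffix.lower()  # assumption: matching ignores case
        kind = next((k for k, exts in KINDS.items() if ext in exts), "unknown")
        groups[kind].append(path)
    return groups


print(sort_references(["plate.mov", "hero.png", "vo.wav", "notes.txt"]))
```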
Resolution & upscaling
Only piapi · Seedance Omni Reference outputs 1080p directly today. For the other vendors, the canonical pattern is to chain a Video Upscale pass after the V2V row:
- Row N: Video to Video at the vendor's native resolution (e.g. 720p with fal Seedance).
- Row N+1: Video Upscale → Topaz, picking a model appropriate to the source motion (Proteus, Artemis, Nyx, Gaia, or Starlight). The 2D pipeline's job-chaining picks up Row N's output as Row N+1's input automatically.
Topaz upscale runs as a separate operation in the same 2D tab and is documented under the Pipeline guide.
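The two-row pattern can be summarised as a small planning function. The sketch below is a hedged illustration: the route keys, the row dictionaries, and `plan_rows` are hypothetical, the heights come from the summary table for the two Seedance routes only, and no Topaz model is chosen because that depends on the footage.

```python
# Output heights listed in the summary table; keys are hypothetical identifiers.
SUPPORTED = {
    "fal/seedance-2.0": [480, 720],
    "piapi/seedance-omni-reference": [480, 720, 1080],
}


def plan_rows(route, target_height):
    """Return the 2D-tab rows needed to reach target_height on a route."""
    heights = SUPPORTED[route]
    # Prefer the smallest native height that already meets the target.
    reachable = [h for h in heights if h >= target_height]
    native = min(reachable) if reachable else max(heights)
    rows = [{"operation": "Video to Video", "route": route, "height": native}]
    if native < target_height:
        # Row N+1 takes Row N's output as its input via job-chaining.
        rows.append({"operation": "Video Upscale", "vendor": "Topaz",
                     "height": target_height})
    return rows


for row in plan_rows("fal/seedance-2.0", 1080):
    print(row)
```

With a 1080p target, the fal route yields a 720p V2V row followed by an upscale row, while the piapi route yields a single 1080p V2V row.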
Related operations
- Motion Control — the adjacent 2D-tab operation. Same four-step cascade. Currently wired for Kling (Kuaishou) motion-control models. Use this when you have an explicit motion path to drive against, rather than a reference clip to restyle.
- Image to Video — for shots that start from a still frame and need motion generated. Covered in the Pipeline guide.
- Text to Video — for shots that need no source media at all. Routes through every T2V-capable vendor via the same Platform → Vendor → Model cascade. Useful for loops, sprites, B-roll, and reference-element generation.
- Video Upscale — Topaz video upscaling models, served via fal. Chain after any V2V row that lands below your target resolution.