Create Videos with Wan 2.7

Wan 2.7 — A Full Upgrade for AI Video Generation

Wan 2.7 is the latest iteration of the Wan video generation model, delivering meaningful improvements in character consistency, physical motion, and prompt comprehension. It supports 5–15 second video generation at both 720p and 1080p resolution, making it a practical choice for everything from quick social clips to polished marketing content.

What's New in Wan 2.7

Compared with Wan 2.6, the upgrades target the pain points creators encounter most often: character drift, unnatural motion, and misread prompts.

Character Consistency

Wan 2.7 maintains stable facial features, body proportions, and clothing details across every frame. Characters no longer "drift" mid-clip — even in complex multi-action sequences.

Smoother Physical Motion

Motion dynamics are more physically plausible. Acceleration, deceleration, gravity, and object interactions feel natural rather than artificial, especially in walking, running, and hand gesture sequences.

Better Prompt Understanding

The model parses longer, more detailed prompts with higher fidelity. Spatial relationships, sequential actions, and nuanced scene descriptions are interpreted more accurately than in previous versions.

720p & 1080p · 5–15 Seconds

Generate videos from 5 to 15 seconds in length at 720p or 1080p resolution. Shorter clips maximize per-frame quality; longer clips suit narrative and cinematic use cases.

How to Write Prompts for Wan 2.7

A well-structured prompt is the single most important factor in getting high-quality, consistent output from Wan 2.7. We recommend organizing your prompt into three distinct parts: Subject, Description, and Camera Language.

Part 1 — Subject

The subject line tells the model who or what is in the video. When working with a reference image — especially a character — it is critical to include identity-locking phrases to prevent the model from altering facial features or body proportions.

Example

the same beautiful young woman, identical face and body proportions as reference

Why this matters

Adding "the same ..." or "identical face" explicitly anchors the model to the reference image's identity. Without these phrases, the model may generate a similar-looking but noticeably different person, with subtle changes in jawline, eye shape, or proportions that break continuity across clips.
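If you assemble prompts in a script, a small check can catch subject lines that are missing an identity-locking phrase before you submit them. The sketch below is only an illustration: the helper names are hypothetical, and the phrase list comes from this guide rather than from any Wan API.

```python
import re

# Assumption: these two phrases are the ones recommended in this guide.
# They are not keywords defined by the model or by any official API.
IDENTITY_PATTERNS = (
    re.compile(r"\bthe same\b", re.IGNORECASE),
    re.compile(r"\bidentical face\b", re.IGNORECASE),
)


def has_identity_lock(subject: str) -> bool:
    """Return True if the subject line contains an identity-locking phrase."""
    return any(pattern.search(subject) for pattern in IDENTITY_PATTERNS)


def ensure_identity_lock(subject: str) -> str:
    """Append the guide's reference phrase when no locking phrase is present."""
    if has_identity_lock(subject):
        return subject
    trimmed = subject.rstrip(" ,.")
    return f"{trimmed}, identical face and body proportions as reference"
```

For example, `ensure_identity_lock("a young woman in a red coat")` appends the reference phrase, while a subject that already starts with "the same ..." is returned unchanged.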

Part 2 — Description

This is the most important part of your prompt. The description tells the model what is happening — the scene, actions, poses, expressions, environment, and any changes over time. Be specific and sequential.

Scene & Environment

“standing on a rooftop terrace at golden hour, city skyline in the background, warm orange light”

Action & Movement

“slowly turns to face the camera, wind gently blowing through her hair”

Pose & Gesture

“one hand resting on the railing, the other brushing hair behind her ear”

Expression & Emotion

“soft confident smile, relaxed eyes, calm and composed demeanor”

The more spatial and temporal detail you provide, the more faithfully the model reproduces your creative intent. Vague prompts like “a woman standing outside” leave too much room for interpretation.
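One way to keep descriptions consistently detailed is to fill in the four facets above separately and join them. This is a minimal sketch; `build_description` is a hypothetical helper, and the comma-joined output format is our assumption, not a requirement of the model.

```python
def build_description(scene: str, action: str, pose: str, expression: str) -> str:
    """Join the four description facets into one comma-separated string.

    Empty or whitespace-only facets are skipped, and trailing commas or
    periods are trimmed so the joined text reads cleanly.
    """
    facets = [scene, action, pose, expression]
    cleaned = [f.strip().rstrip(".,") for f in facets if f and f.strip()]
    return ", ".join(cleaned)


description = build_description(
    scene="standing on a rooftop terrace at golden hour, city skyline in the background, warm orange light",
    action="slowly turns to face the camera, wind gently blowing through her hair",
    pose="one hand resting on the railing, the other brushing hair behind her ear",
    expression="soft confident smile, relaxed eyes, calm and composed demeanor",
)
```

Keeping each facet as a separate argument makes it obvious when one of them has been left vague or empty.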

Part 3 — Camera Language

Camera language controls the cinematographic framing and feel of the output. Include shot type, angle, and depth of field to guide composition.

Example

medium close-up, low angle, shallow depth of field

Category	Options
Shot Type	extreme close-up · close-up · medium close-up · medium shot · full shot · wide shot
Angle	eye level · low angle · high angle · dutch angle · bird's eye · worm's eye
Depth of Field	shallow depth of field (subject in focus, blurred background) · deep focus (everything sharp)
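If you generate prompts programmatically, the options in the table can be kept as fixed vocabularies so typos are caught early. The sketch below assumes you want to restrict yourself to the listed terms; the model itself accepts free text, and `camera_language` is a hypothetical helper name.

```python
# Vocabularies copied from the table above.
SHOT_TYPES = {
    "extreme close-up", "close-up", "medium close-up",
    "medium shot", "full shot", "wide shot",
}
ANGLES = {
    "eye level", "low angle", "high angle",
    "dutch angle", "bird's eye", "worm's eye",
}
DEPTHS_OF_FIELD = {"shallow depth of field", "deep focus"}


def camera_language(shot: str, angle: str, depth: str) -> str:
    """Validate each term against the table and return the camera line."""
    checks = (
        (shot, SHOT_TYPES, "shot type"),
        (angle, ANGLES, "angle"),
        (depth, DEPTHS_OF_FIELD, "depth of field"),
    )
    for value, allowed, label in checks:
        if value not in allowed:
            raise ValueError(f"unknown {label}: {value!r}")
    return f"{shot}, {angle}, {depth}"
```

Calling `camera_language("medium close-up", "low angle", "shallow depth of field")` reproduces the example line shown earlier in this section.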

Full Prompt Example

Here is a complete prompt that combines all three parts — subject, description, and camera language — into a single, production-ready input.

prompt

Subject

The same beautiful young woman, identical face and body proportions as reference, wearing a cream-colored knit sweater and light blue jeans.

Description

She stands on a sunlit wooden pier by the ocean at golden hour. She slowly walks toward the camera with a relaxed, confident stride, the ocean breeze lifting strands of her hair. She pauses, tilts her head slightly, and breaks into a warm, genuine smile. Waves crash softly in the background, and the warm light casts a soft glow across her face.

Camera

Medium close-up, low angle, shallow depth of field, golden backlight with soft lens flare.

Subject anchoring

"the same ... identical face" locks identity to the reference image

Rich description

Sequential actions (walks → pauses → tilts → smiles) give the model temporal structure

Cinematic camera

Low angle + shallow DoF + backlight creates a professional, editorial look
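The three-part structure maps naturally onto a small data class. The following is a sketch under our own assumptions: `WanPrompt` and `render` are hypothetical names, and joining the parts into one space-separated paragraph is a formatting choice, not something the model requires.

```python
from dataclasses import dataclass


@dataclass
class WanPrompt:
    """Container for the three prompt parts described in this guide."""
    subject: str
    description: str
    camera: str

    def render(self) -> str:
        """Join the non-empty parts, ending each with sentence punctuation."""
        sentences = []
        for part in (self.subject, self.description, self.camera):
            text = part.strip()
            if not text:
                continue
            if text[-1] not in ".!?":
                text += "."
            sentences.append(text)
        return " ".join(sentences)


prompt = WanPrompt(
    subject="The same beautiful young woman, identical face and body proportions as reference",
    description="She slowly walks toward the camera, pauses, and breaks into a warm smile",
    camera="Medium close-up, low angle, shallow depth of field",
)
text = prompt.render()
```

Keeping the parts separate until the last step makes it easy to reuse one subject line across several clips while varying only the description and camera.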

Tips for Best Results

Always Include Identity-Locking Phrases

When using a reference image, always add "the same ..." or "identical face and body proportions as reference" in your subject line. This is the single most effective way to prevent character drift and maintain consistency.

Describe Actions in Sequence

Instead of describing a static scene, write actions in the order they should happen: "she walks forward, pauses, and turns to face the camera." This gives the model a clear temporal roadmap for the 5–15 second clip.
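When actions come from a list (for example, a storyboard), they can be joined in order into a single sentence like the one above. This is a minimal sketch; `sequence_actions` is a hypothetical helper.

```python
from typing import Sequence


def sequence_actions(subject: str, actions: Sequence[str]) -> str:
    """Build one sentence that lists the actions in the order given."""
    if not actions:
        raise ValueError("at least one action is required")
    if len(actions) == 1:
        body = actions[0]
    elif len(actions) == 2:
        body = f"{actions[0]} and {actions[1]}"
    else:
        # Three or more actions: comma-separated, with "and" before the last.
        body = ", ".join(actions[:-1]) + f", and {actions[-1]}"
    return f"{subject} {body}."


sentence = sequence_actions(
    "she", ["walks forward", "pauses", "turns to face the camera"]
)
```

The list order is preserved exactly, so the order of the storyboard is the order the model reads.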

Be Specific About Environment

Include lighting conditions (golden hour, overcast, neon-lit), setting details (cobblestone street, glass office, sandy beach), and atmospheric elements (fog, rain, wind). These details dramatically affect output quality.

Match Duration to Complexity

For a single action with minimal scene change, 5 seconds is usually sufficient. For sequences with multiple actions or camera movements, use 10–15 seconds to give the model enough temporal space.

Choose Resolution by Use Case

720p renders faster and works well for drafts, social stories, and rapid iteration. 1080p is best for final output — marketing videos, portfolio pieces, and any content that will be viewed on larger screens.
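The duration and resolution tips above can be written down as a simple rule of thumb. The thresholds in this sketch are our own assumptions (the guide only says 5 seconds for a single action and 10–15 seconds for multi-action sequences), and `suggest_settings` is a hypothetical helper, not part of any Wan API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderSettings:
    duration_s: int
    resolution: str


def suggest_settings(num_actions: int, camera_moves: bool, final_output: bool) -> RenderSettings:
    """Suggest a duration and resolution from the guide's rules of thumb."""
    if num_actions < 1:
        raise ValueError("num_actions must be at least 1")
    if num_actions == 1 and not camera_moves:
        duration = 5    # single action, minimal scene change
    elif num_actions <= 3:
        duration = 10   # assumption: short sequences fit in 10 seconds
    else:
        duration = 15   # assumption: longer sequences get the maximum length
    # 720p for drafts and rapid iteration, 1080p for final output.
    resolution = "1080p" if final_output else "720p"
    return RenderSettings(duration, resolution)
```

For instance, a draft of a single action suggests 5 seconds at 720p, while a final render with four actions and camera movement suggests 15 seconds at 1080p.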

Max Resolution: 1080p
Video Duration: 5–15s
Fast Preview: 720p
Latest Model: Wan 2.7

Frequently Asked Questions

What video lengths does Wan 2.7 support?

Wan 2.7 generates videos between 5 and 15 seconds. Shorter clips (5s) are great for quick social content, while longer clips (10–15s) are better suited for narrative sequences and cinematic storytelling.

What resolutions are available?

Wan 2.7 supports both 720p and 1080p. Use 720p for faster iteration and drafts. Use 1080p for final, production-quality output.

How do I keep the character looking the same across clips?

Include identity-locking phrases in your prompt's subject line, such as "the same beautiful young woman, identical face and body proportions as reference." This anchors the model to the reference image and prevents facial or body drift.

What makes Wan 2.7 better than Wan 2.6?

Wan 2.7 delivers significantly improved character consistency, smoother and more physically plausible motion, and better understanding of complex, detailed prompts. The overall output quality is a noticeable step up from 2.6.

How detailed should my prompts be?

More spatial and temporal detail generally brings the output closer to your intent. We recommend structuring prompts into three parts: Subject (who/what), Description (scene, action, pose, expression), and Camera Language (shot type, angle, depth of field). Vague prompts produce generic results.

Can I use reference images with Wan 2.7?

Yes. Wan 2.7 supports reference image input. When using a reference, always pair it with identity-locking language in your prompt to ensure the generated character matches the reference as closely as possible.

Have more questions? Contact us anytime.