Zai-org's SCAIL-2 Breathes Motion Into Still Characters Sans Skeleton

SCAIL-2 is a new open-source model that animates still character images directly from a driving video, without relying on skeleton maps or inpainting masks. This end-to-end approach avoids the information loss that occurs when motion is converted into intermediate pose representations. The model also handles character replacement and supports multi-character scenarios from a single interface.

Zai-org, the team behind GLM, developed SCAIL-2 by building a synthetic training pipeline that uses several off-the-shelf models to generate 60,000 motion pairs. The team designed a Unified Motion Transfer Interface, with specialized masking channels and a dedicated RoPE design, to unify different animation tasks under one training process. Because the model was trained to reverse the driving process, it learned capabilities beyond those of its teacher models.

End-to-end animation without intermediates

Key capabilities
  • End-to-end driving at 512p and 704p resolutions.
  • Cross-identity character replacement with detailed prompts.
  • Animal-to-character motion transfer without human skeletons.
  • Zero-shot support for SAM3D body mesh inputs.
  • Multi-reference generation using optional extra images.
  • Bias-Aware DPO LoRA for hand and face detail improvement.
  • Wan VAE and T5 bundled in the checkpoint.
  • ComfyUI integration with community workflows available.

Video creators and animators can use SCAIL-2 to transfer complex movements from any video source onto a reference character image. The removal of skeleton-based restrictions means driving sources can include animals or non-human motion that previous tools could not process. Users benefit from a single pipeline that handles both animation and character replacement without switching between different specialized tools.

Training data and model limitations

SCAIL-2 addresses a core weakness of SCAIL-1, which identified pose representation and injection as key bottlenecks but still depended on intermediate representations. MotionPair-60K, the synthetic dataset created for training, combines data from multiple off-the-shelf models, including MoCha and Wan-Animate, with data from the team’s own SCAIL-Preview tool.

Multi-reference inference works in zero-shot mode, but the model was not explicitly optimized for it, and video quality may degrade when additional reference images are provided.

"SCAIL-2 is an open-source model for end-to-end controlled character animation. It animates a reference character with a driving video, and also supports character replacement and multi-character scenarios without relying on intermediate pose representations." — Source: Hugging Face