🎨 Omni Editor 2.0

Please give us a ❤️ if you find this Space helpful. Free trials are refreshed every few days.


🎨 Unlimited AI Image Generation & Editing

Access unlimited image generation, video creation, and advanced editing features: no time limits, no ads, no watermarks.

🎨 What Omni Model Can Do

This Space demonstrates just a small fraction of what the Omni model can do. Beyond what you see here, it can also perform:

👗 Virtual Try-On 🌅 Background Replacement 💇 Hairstyle Changer 📄 Poster Editing 🎨 Style Transfer + dozens more...

No fine-tuning required - just modify the prompt and input parameters!

🤖 Omni Creator 2.0: 8B Unified Multi-Modal Diffusion Transformer

An 8B-parameter native MM-DiT that unifies T2I generation, pixel-level editing, and I2V generation. It uses CLIP/T5-style text encoders and visual conditioners to deliver high-fidelity multi-modal results on a shared transformer backbone.

RoPE + AdaLN-Zero DiT Blocks

Multi-head attention with timestep-conditioned modulation

Adaptive Multi-Modal Gating

Learned fusion of text, up to three reference images, and temporal context

HPC-Ready Optimization

FP8 + RoPE + RMSNorm for production-scale inference

📊 AdaLN-Zero + AMG (Full Stack)
# 1) Modal fusion (text / up to 3 images / temporal)
α = Softmax(MLP([c_txt; c_img; c_tmp]));  C = α_txt·c_txt + Σᵢ α_img,i·c_img,i + α_tmp·c_tmp
# 2) Timestep-conditioned modulation
h = TEmbed(t) ⊕ C;  (γ, β, λ) = MLP(h)
# 3) RoPE self-attention + gated residual
Q = RoPE(W_q·LN(x));  K = RoPE(W_k·LN(x));  V = W_v·LN(x)
A = Softmax(QKᵀ/√d + B_rel)·V
x ← x + λ ⊙ A·(1 + γ) + β + CrossAttn(x, C)
# 4) AdaLN-Zero-modulated SwiGLU FFN
u = SwiGLU(W₁·LN(x));  x ← x + λ ⊙ (W₂·u)·(1 + γ) + β
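The gating and modulation steps above can be sketched in a few lines of NumPy. This is a minimal toy sketch, not the model's implementation: all weight matrices, the sizes, and the sinusoidal timestep embedding are illustrative placeholders, a single image context stands in for the up-to-three image slots, and RoPE, cross-attention, and the FFN are omitted. The zero-initialized modulation MLP shows the "Zero" in AdaLN-Zero: at initialization γ = β = λ = 0, so the block starts as an identity mapping.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16          # model width (toy size)
n = 8           # number of latent tokens

def layer_norm(x, eps=1e-6):
    return (x - x.mean(-1, keepdims=True)) / np.sqrt(x.var(-1, keepdims=True) + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# --- 1) Adaptive multi-modal gating: learned weights over modality contexts ---
c_txt, c_img, c_tmp = (rng.normal(size=d) for _ in range(3))
W_gate = rng.normal(size=(3, 3 * d)) * 0.02          # stand-in for the gating MLP
alpha = softmax(W_gate @ np.concatenate([c_txt, c_img, c_tmp]))
C = alpha[0] * c_txt + alpha[1] * c_img + alpha[2] * c_tmp

# --- 2) Timestep-conditioned modulation (AdaLN-Zero) ---
t_embed = np.sin(np.arange(d) * 0.5)                 # toy timestep embedding
h = t_embed + C
W_mod = np.zeros((3 * d, d))                         # zero-init => identity block at start
gamma, beta, lam = np.split(W_mod @ h, 3)

# --- 3) Self-attention with gated residual (RoPE omitted for brevity) ---
x = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
xn = layer_norm(x)
Q, K, V = xn @ Wq, xn @ Wk, xn @ Wv
A = softmax(Q @ K.T / np.sqrt(d)) @ V
x = x + lam * A * (1.0 + gamma) + beta               # AdaLN-Zero gated residual

print(alpha)        # modality weights, sum to 1
print(x.shape)      # (8, 16)
```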
📅 Product Timeline
Sep 2025 Omni Creator 1.0 Released

Supported single-image and multi-image editing.

Dec 15 2025 Omni Creator 2.0 Released

Unified image editing, T2I, T2V, I2V, face swap, watermark removal, and more into one full-modal generation & editing suite.

Planned: Omni Creator 3.0

Bring video generation into the unified stack for true end-to-end multimodal creation.

⚡ OmniScheduler: Unified Hybrid Flow-Diffusion Sampler

A sampling framework that couples flow matching, Karras sigma schedules, and Heun/RK4 ODE solvers for fast 4–8-step generation.

🎯 Few-Step Flow Matching

Supports velocity/epsilon/sample prediction with flow-to-velocity conversion, enabling 4–8 step inference.

🔄 Multi-Stage Sampling

Coarse-to-fine pipeline: ~70% of the steps produce a coarse draft, with optional refine passes for details.

📈 RK4 Hybrid ODE

A 4th-order Runge-Kutta solver combined with flow-matching conditioning for accurate integration of the sampling trajectory.
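The RK4 update that the refine stage relies on is the classic one and can be sketched generically; here `v` stands in for the learned velocity field v_θ, and the sanity check uses dx/dt = x (whose exact solution is eᵗ) rather than a real model.

```python
import numpy as np

def rk4_step(v, x, t, dt):
    """One classic Runge-Kutta 4 step for dx/dt = v(x, t)."""
    k1 = v(x, t)
    k2 = v(x + 0.5 * dt * k1, t + 0.5 * dt)
    k3 = v(x + 0.5 * dt * k2, t + 0.5 * dt)
    k4 = v(x + dt * k3, t + dt)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

# Sanity check on dx/dt = x: integrate from t=0 to t=1 in only 4 steps
x, t, dt = np.array([1.0]), 0.0, 0.25
for _ in range(4):
    x = rk4_step(lambda x, t: x, x, t, dt)
    t += dt
print(x)   # ≈ e, even with this coarse step size
```

The 4th-order accuracy is what makes a few refine steps enough: with just 4 evaluations per step the result already matches e to about four decimal places.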

🔢 Flow Matching Formulation

Given data distribution p₁(x) and noise distribution p₀(z) = N(0, I), the Rectified Flow defines:

x_t = (1 - t) · z + t · x    where t ∈ [0, 1]

The velocity field v_θ(x_t, t) is trained to match the conditional velocity:

L_FM = E_{t,x,z}[ ||v_θ(x_t, t) - (x - z)||² ]
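A Monte Carlo estimate of this objective is a one-liner; the sketch below is illustrative (the `fm_loss` name and the toy batch are not from the source), and the "oracle" velocity field peeks at the (x, z) pairing purely to show that the loss vanishes at the optimum.

```python
import numpy as np

rng = np.random.default_rng(0)

def fm_loss(v_theta, x, z, t):
    """Monte Carlo estimate of L_FM = E[ ||v_theta(x_t, t) - (x - z)||^2 ]."""
    x_t = (1.0 - t)[:, None] * z + t[:, None] * x    # rectified-flow interpolant
    target = x - z                                    # conditional velocity
    pred = v_theta(x_t, t)
    return np.mean(np.sum((pred - target) ** 2, axis=-1))

# Toy check: the oracle velocity field drives the loss to exactly zero
x = rng.normal(size=(32, 4))                          # "data" batch
z = rng.normal(size=(32, 4))                          # paired noise
t = rng.uniform(size=32)
oracle = lambda x_t, t: x - z                         # peeks at the pairing
print(fm_loss(oracle, x, z, t))    # 0.0
```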

At inference, we solve the ODE: dx/dt = v_θ(x_t, t) from t=0 to t=1
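Solving this ODE with plain Euler steps gives the simplest few-step sampler. The sketch below is a generic Euler integrator, not OmniScheduler itself; the constant-velocity "model" in the check is a toy that transports z₀ exactly onto a target x₁ along the straight rectified-flow path.

```python
import numpy as np

def euler_sample(v_theta, z0, steps=8):
    """Integrate dx/dt = v_theta(x_t, t) from t=0 (noise) to t=1 (data) with Euler."""
    x, dt = z0.copy(), 1.0 / steps
    for k in range(steps):
        t = k * dt
        x = x + dt * v_theta(x, t)
    return x

# Toy check: a constant velocity field v = x1 - z0 transports z0 exactly onto x1
rng = np.random.default_rng(0)
z0 = rng.normal(size=(4,))
x1 = rng.normal(size=(4,))
out = euler_sample(lambda x, t: x1 - z0, z0, steps=8)
print(np.allclose(out, x1))   # True
```

The constant-velocity case is exactly the regime rectified flow aims for: the straighter the learned trajectories, the fewer steps a first-order solver needs.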

π π-Flow Policy Network (Coarse Trajectory)
Instead of evaluating the model dozens of times, a lightweight policy network can predict a multi-step velocity trajectory in one forward pass.
# One-shot trajectory prediction (coarse stage)
v_{0:S-1} = π_φ(z₀, c, t_grid)
x_{k+1} = x_k + v_k · Δt   (k = 0..S-1)
How it integrates with OmniScheduler:
- Stage 1 (coarse): apply the predicted velocities directly (policy rollout) to rapidly move along the flow-matching path.
- Stage 2 (refine): optionally switch to Heun/RK4 higher-order updates for detail recovery and stability.
- Multi-modal conditioning: the policy is conditioned on the aggregated text/visual context plus the time embedding, and outputs velocity fields matching the latent shape.
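The two-stage scheme above can be sketched as a policy rollout followed by Heun refinement. This is a toy sketch under stated assumptions: `policy_rollout` and `heun_refine` are hypothetical names, the "policy" simply tiles one constant velocity instead of running a network, and the 70/30 split of the time grid mirrors the coarse/refine ratio described above.

```python
import numpy as np

def policy_rollout(policy, z0, t_grid):
    """Stage 1 (coarse): apply all S predicted velocities from one forward pass."""
    vs = policy(z0, t_grid)                 # (S, *latent_shape) in a single call
    x = z0.copy()
    for k in range(len(t_grid) - 1):
        dt = t_grid[k + 1] - t_grid[k]
        x = x + dt * vs[k]                  # x_{k+1} = x_k + v_k * dt
    return x

def heun_refine(v_theta, x, t_grid):
    """Stage 2 (refine): second-order Heun updates with the full model."""
    for k in range(len(t_grid) - 1):
        t, dt = t_grid[k], t_grid[k + 1] - t_grid[k]
        v0 = v_theta(x, t)
        x_pred = x + dt * v0                # Euler predictor
        v1 = v_theta(x_pred, t + dt)
        x = x + 0.5 * dt * (v0 + v1)        # trapezoidal corrector
    return x

# Toy run: a "policy" that tiles one constant velocity along the grid
rng = np.random.default_rng(0)
z0, x1 = rng.normal(size=(4,)), rng.normal(size=(4,))
coarse_grid = np.linspace(0.0, 0.7, 5)      # ~70% of the path, policy rollout
refine_grid = np.linspace(0.7, 1.0, 3)      # remaining ~30%, Heun refinement
policy = lambda z, tg: np.tile(x1 - z0, (len(tg), 1))
x = policy_rollout(policy, z0, coarse_grid)
x = heun_refine(lambda x, t: x1 - z0, x, refine_grid)
print(np.allclose(x, x1))   # True
```

Because the policy's velocities are consumed without further model calls, stage 1 costs one forward pass regardless of S; only the short refine grid pays per-step model evaluations.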
🎲 Multi-Stage Sampling Pipeline

📷 Input (Text + Images) → Stage 1: Coarse (≈70%, Euler/Heun, few steps) → 💎 Stage 2: Refine (≈30%, optional RK4) → 🎨 Output (HD Image/Video)

📈 State-of-the-Art Model Comparison (2025)

| Model | Params | Architecture | Training | Inference | NFE | Acceleration |
|---|---|---|---|---|---|---|
| FLUX.2-Dev | 32B | DiT + MM | Flow Matching | Euler/DPM | 50 | FP8 + FlashAttn |
| Qwen-Image | 20B | DiT + MLLM | Rectified Flow | FlowMatch Euler | 30–50 | Lightning LoRA |
| Qwen-Image-Edit | 20B | DiT + Dual-Branch | Flow Matching | Euler | 28–50 | Lightning LoRA |
| HunyuanVideo | 13B+ | AsymmDiT | Diffusion | Multi-step | 50+ | FP8 + Multi-frame |
| Wan2.2 | 5B/14B | DiT + MoE | Diffusion | Multi-step | 30–50 | MoE Routing + FP8 |
| Z-Image-Turbo | 6B | Distilled DiT | Progressive Distill | Few-step | 4–8 | Distillation |
| Mochi | 10B | Video DiT | Diffusion | Multi-step | 50+ | ComfyUI Parallel |
| ⭐ Omni Creator 2.0 | 8B | MM-DiT + AMG | π-Flow + FM | RK4 Hybrid | 4–8 | Policy Distill + Multi-Stage |
Abbreviations: DiT = Diffusion Transformer | MM = Multi-Modal | MLLM = Multimodal LLM | MoE = Mixture of Experts | FM = Flow Matching | NFE = Number of Function Evaluations | AMG = Adaptive Multi-Modal Gating