🎨 Omni Editor 2.0
🎨 Unlimited AI Image Generation & Editing
Access unlimited image generation, video creation, and advanced editing features. No time limits, no ads, no watermarks.
🎨 What Omni Model Can Do
This Space demonstrates just a small fraction of what the Omni model can do. Beyond what you see here, it can perform a wide range of additional editing and generation tasks.
No fine-tuning required: just modify the prompt and input parameters!
🤖 Omni Creator 2.0: 8B Unified Multi-Modal Diffusion Transformer
An 8B-parameter native MM-DiT that unifies T2I, pixel-level editing, and I2V generation. It uses CLIP/T5-style text encoders and visual conditioners to deliver high-fidelity multi-modal results on a shared transformer backbone.
Multi-head attention with timestep-conditioned modulation
Learned fusion of text, 3× images, and temporal context
FP8 + RoPE + RMSNorm for production-scale inference
```
# 1) Learned fusion of text, image, and temporal contexts
α = Softmax(MLP([c_txt; c_img; c_tmp]))
C = α_txt·c_txt + Σ_i α_i·c_img,i + α_tmp·c_tmp

# 2) Time-conditioned modulation
h = TEmbed(t) ⊕ C;  (γ, β, λ) = MLP(h)

# 3) RoPE self-attn + gated residual
Q = RoPE(W_q·LN(x));  K = RoPE(W_k·LN(x));  V = W_v·LN(x)
A = Softmax(Q·Kᵀ/√d + B_rel)·V
x ← x + λ ⊙ A·(1+γ) + β + CrossAttn(x, C)

# 4) AdaLN-zero modulated SwiGLU FFN
u = SwiGLU(W_1·LN(x));  x ← x + λ ⊙ (W_2·u)·(1+γ) + β
```
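The softmax-gated fusion of the text, image, and temporal contexts (the α and C computation above) can be sketched in a few lines. This is a minimal NumPy sketch: the single-layer stand-in for the gating MLP, the context dimension, and the random contexts are illustrative assumptions, not the model's actual weights.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d = 16  # hypothetical context dimension

# conditioning contexts: text, 3 reference images, temporal
c_txt = rng.normal(size=d)
c_img = rng.normal(size=(3, d))
c_tmp = rng.normal(size=d)

# toy stand-in for the gating MLP: one linear layer -> 5 gate logits
W = rng.normal(size=(5, 5 * d)) * 0.1
feats = np.concatenate([c_txt, c_img.ravel(), c_tmp])
alpha = softmax(W @ feats)  # [alpha_txt, alpha_img_1..3, alpha_tmp]

# fused conditioning: C = alpha_txt*c_txt + sum_i alpha_i*c_img_i + alpha_tmp*c_tmp
C = alpha[0] * c_txt + (alpha[1:4, None] * c_img).sum(axis=0) + alpha[4] * c_tmp
print(C.shape)  # (16,)
```

The gate weights always sum to 1, so C stays on the same scale as the individual contexts regardless of how many conditioners are active.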
Supports both single-image and multi-image editing.
Unifies image editing, T2I, T2V, I2V, face swap, watermark removal, and more into one full-modal generation & editing suite.
Brings video generation into the unified stack for true end-to-end multimodal creation.
⚡ OmniScheduler: Unified Hybrid Flow-Diffusion Sampler
Sampling framework that couples Flow Matching, Karras sigmas, and Heun/RK4 ODE solvers for fast 4–8 step generation.
🎯 Few-Step Flow Matching
Supports velocity/epsilon/sample prediction with flow-to-velocity conversion, enabling 4–8 step inference.
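One such conversion can be illustrated on the rectified-flow path x_t = (1 − t)·z + t·x, where the true velocity is v = x − z: a perfect epsilon prediction ε̂ = z gives v̂ = (x_t − ε̂)/t. A minimal sketch under that assumption (the function name is illustrative, not the scheduler's actual API):

```python
import numpy as np

def eps_to_velocity(x_t, eps_hat, t):
    """Convert an epsilon prediction to a velocity prediction on the
    rectified-flow path x_t = (1 - t) * z + t * x, where v = x - z."""
    return (x_t - eps_hat) / t

# sanity check with a known (z, x) pair
rng = np.random.default_rng(1)
z, x = rng.normal(size=4), rng.normal(size=4)
t = 0.3
x_t = (1 - t) * z + t * x
v = eps_to_velocity(x_t, z, t)  # perfect eps prediction -> exact velocity
print(np.allclose(v, x - z))  # True
```

The identity follows from x_t − z = t·(x − z); dividing by t recovers the velocity exactly, so a model trained with any of the three parameterizations can drive the same ODE solver.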
🔄 Multi-Stage Sampling
Coarse-to-fine pipeline: roughly 70% of the steps produce a coarse draft, with optional refinement passes for fine details.
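A coarse-to-fine split over the Karras sigma schedule mentioned above can be sketched as follows; the sigma range, ρ = 7, and the 70/30 split are illustrative assumptions:

```python
import numpy as np

def karras_sigmas(n, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    """Karras et al. (2022) noise schedule: linear ramp in sigma^(1/rho)."""
    ramp = np.linspace(0, 1, n)
    return (sigma_max ** (1 / rho)
            + ramp * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho

sigmas = karras_sigmas(10)
n_coarse = int(0.7 * len(sigmas))        # ~70% of steps for the coarse draft
coarse, refine = sigmas[:n_coarse], sigmas[n_coarse:]
print(len(coarse), len(refine))  # 7 3
```

The schedule is monotonically decreasing, so the coarse pass handles the high-noise sigmas (global layout) and the refine pass the low-noise tail (texture and detail).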
📈 RK4 Hybrid ODE
A fourth-order Runge-Kutta solver combined with flow-matching conditioning for accurate trajectory evolution.
Given a data distribution p₁(x) and a noise distribution p₀(z) = N(0, I), Rectified Flow defines the linear interpolation path:

x_t = (1 − t)·z + t·x,  with z ~ p₀, x ~ p₁

The velocity field v_θ(x_t, t) is trained to match the conditional velocity of this path:

v_θ(x_t, t) ≈ dx_t/dt = x − z

At inference, we solve the ODE dx/dt = v_θ(x_t, t) from t = 0 to t = 1.
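As a sanity check of the fourth-order Runge-Kutta update used for this ODE, here is a sketch integrating a toy velocity field v(x, t) = x (whose exact solution is x(1) = e·x(0)) in place of a learned v_θ, which is not available here:

```python
import math

def rk4_step(v, x, t, h):
    """One classic fourth-order Runge-Kutta step for dx/dt = v(x, t)."""
    k1 = v(x, t)
    k2 = v(x + 0.5 * h * k1, t + 0.5 * h)
    k3 = v(x + 0.5 * h * k2, t + 0.5 * h)
    k4 = v(x + h * k3, t + h)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

v = lambda x, t: x       # toy velocity field with a known exact solution
x, t, n = 1.0, 0.0, 4    # few-step regime: 4 RK4 steps from t=0 to t=1
h = 1.0 / n
for _ in range(n):
    x = rk4_step(v, x, t, h)
    t += h
print(abs(x - math.e))  # error well below 1e-3 even with only 4 steps
```

This is why a higher-order solver helps in the few-step regime: RK4's O(h⁴) global error keeps the trajectory accurate with 4 to 8 evaluations where a first-order Euler step would need far more.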
📈 State-of-the-Art Model Comparison (2025)
| Model | Params | Architecture | Training | Inference | NFE | Acceleration |
|---|---|---|---|---|---|---|
| FLUX.2-Dev | 32B | DiT + MM | Flow Matching | Euler/DPM | 50 | FP8 + FlashAttn |
| Qwen-Image | 20B | DiT + MLLM | Rectified Flow | FlowMatch Euler | 30-50 | Lightning LoRA |
| Qwen-Image-Edit | 20B | DiT + Dual-Branch | Flow Matching | Euler | 28-50 | Lightning LoRA |
| HunyuanVideo | 13B+ | AsymmDiT | Diffusion | Multi-step | 50+ | FP8 + Multi-frame |
| Wan2.2 | 5B/14B | DiT + MoE | Diffusion | Multi-step | 30-50 | MoE Routing + FP8 |
| Z-Image-Turbo | 6B | Distilled DiT | Progressive Distill | Few-step | 4-8 | Distill |
| Mochi | 10B | Video DiT | Diffusion | Multi-step | 50+ | ComfyUI Parallel |
| ⭐ Omni Creator 2.0 | 8B | MM-DiT + AMG | π-Flow + FM | RK4 Hybrid | 4-8 | Policy Distill + Multi-Stage |