xPilot — AI Social Media Marketing Copilot

Model Developers

OpenAI

Creator of the GPT language model series and TTS voice synthesis

Anthropic

Creator of Claude language models, focused on safety & alignment

ByteDance

Creator of Seedream, Seedance & Dreamina visual AI models

Alibaba

Creator of Wan series and Qwen series AI models

Google

Creator of the Gemini language model series

xAI

Creator of the Grok language model series

Black Forest Labs

Creator of the FLUX image editing model series

Kuaishou

Creator of the Kling video generation model

Mistral

European developer of the Mistral language model series, known for its open-weight releases

Cheng et al.

Research team behind the MMAudio video-to-audio model (UIUC / Sony Research)

🖼️

Image Generation Models

Generate high-quality images from text descriptions with various styles and resolutions.

Model	Developer	Description	Tier
Seedream 4.5	ByteDance	Latest flagship · Native bilingual · 4K ultra-HD	Standard
Seedream 4	ByteDance	High-quality image generation · Bilingual	Fast
Dreamina 3.1	ByteDance	High-fidelity aesthetics · Artistic style	Premium
Qwen Image	Alibaba	20B parameters · Excellent Chinese text rendering	Standard
Wan 2.6 Image	Alibaba	Wan series image model · High resolution	Fast

✏️

Image Editing Models

Upload existing images for editing, enhancement, or style transformation.

Model	Developer	Description	Tier
FLUX Kontext Pro	Black Forest Labs	Context-aware editing · Best for image & text editing	Premium
FLUX Kontext Pro Multi	Black Forest Labs	Multi-image context editing · Style consistency	Premium
UNO	ByteDance	Universal image editing · Image + text	Standard
Real-ESRGAN	Xintao Wang et al.	Image super-resolution · Quality enhancement	Fast

🎬

Video Generation Models (Text-to-Video)

Auto-generate short videos from text descriptions. Some models support synchronized audio generation.

Model	Developer	Description	Tier
Wan 2.2 — 480p Ultra Fast	Alibaba	Ultra-fast generation · ~5s per video	Fast
Wan 2.2 — 720p	Alibaba	High-definition resolution	Standard
Wan 2.6 (Audio)	Alibaba	Latest Wan series · Audio support · Best quality	Standard
Seedance 1.5 Pro (Audio)	ByteDance	Cinematic quality · Audio support	Premium
Kling Video O3	Kuaishou	Best motion quality · Premium dynamics	Premium

🎞️

Video Generation Models (Image-to-Video)

Transform static images into dynamic videos, bringing images to life.

Model	Developer	Description	Tier
Wan 2.2 i2v — 480p Fast	Alibaba	Image-to-video · Fast generation	Fast
Wan 2.2 i2v — 720p	Alibaba	Image-to-video · HD resolution	Standard
Seedance 1.5 Pro i2v (Audio)	ByteDance	Image-to-video · Cinematic · Audio support	Premium

📝

Text Generation Models

Multiple leading AI language models for social content creation, rewriting, and optimization.

Model	Developer	Description	Tier
GPT-4o	OpenAI	Earlier flagship · Strong all-round performance	Premium
GPT-4o Mini	OpenAI	Lightweight · Cost-effective	Fast
GPT-5	OpenAI	Latest flagship model	Premium
Claude Sonnet 4	Anthropic	Excellent writing quality	Premium
Claude 3.5 Haiku	Anthropic	Fast · Cost-efficient	Fast
Gemini 2.5 Flash	Google	Ultra-fast · Low cost	Fast
Gemini 2.5 Pro	Google	High performance reasoning	Premium
Grok 3	xAI	Real-time aware	Premium
Grok 3 Mini	xAI	Lightweight and fast	Fast
Mistral Small	Mistral	Efficient European model	Fast
Mistral Medium	Mistral	Balanced performance	Standard

🎙️

Voice Synthesis Models

Convert text to natural speech with multiple voice options and speed control.

Model	Developer	Description	Tier
TTS-1	OpenAI	High-quality text-to-speech · 6 voice options	Standard

Available voices: Alloy · Echo · Fable · Onyx · Nova · Shimmer

🎵

Background Music Generation Models

Auto-generate synchronized background music from video content and text descriptions. No extra assets are needed.

Model	Developer	Description	Tier
MMAudio V2	Cheng et al.	Video-to-audio · Multimodal sync · High-quality BGM generation	Standard

🗣️

Video Narration Models

The AI analyzes the video content and generates voiced narration automatically. This feature chains two models in sequence: Gemini 2.5 Flash analyzes the video frames and writes the narration script, then TTS-1 converts that script to speech.

Model	Developer	Description	Tier
Gemini 2.5 Flash (Analysis)	Google	Video content analysis · Auto-generate narration scripts	Fast
TTS-1 (Synthesis)	OpenAI	Narration voice synthesis · 6 voice options	Standard

Narration styles: Professional · Casual · Dramatic · Documentary · Enthusiastic
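
The two-stage narration flow described above can be sketched roughly as follows. This is a minimal illustration, not xPilot's implementation: `NarrationRequest`, `narrate`, and the `analyze` / `synthesize` callables are hypothetical names standing in for the Gemini 2.5 Flash and TTS-1 calls, whose actual integration is not documented on this page. Only the style and voice lists are taken from this page.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Lists taken from this page; everything else below is a hypothetical sketch.
NARRATION_STYLES = ("Professional", "Casual", "Dramatic", "Documentary", "Enthusiastic")
VOICES = ("Alloy", "Echo", "Fable", "Onyx", "Nova", "Shimmer")

# Assumed stage signatures: the real model calls are not shown on this page.
Analyzer = Callable[[Sequence[bytes], str], str]  # (frames, style) -> script
Synthesizer = Callable[[str, str], bytes]         # (script, voice) -> audio


@dataclass(frozen=True)
class NarrationRequest:
    frames: Sequence[bytes]
    style: str = "Professional"
    voice: str = "Alloy"


def narrate(request: NarrationRequest, analyze: Analyzer, synthesize: Synthesizer) -> bytes:
    """Run the analysis stage first, then pass its script to speech synthesis."""
    if request.style not in NARRATION_STYLES:
        raise ValueError(f"unknown narration style: {request.style!r}")
    if request.voice not in VOICES:
        raise ValueError(f"unknown voice: {request.voice!r}")
    script = analyze(request.frames, request.style)  # stage 1: analysis model
    return synthesize(script, request.voice)         # stage 2: TTS model
```

Passing the two stages in as callables keeps the ordering explicit and lets the flow be exercised with stub functions, without contacting any model service.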

Model Tier Guide

Fast

Fastest generation, lowest cost. Ideal for quick iteration and daily use.

Standard

Best balance of speed and quality. Recommended for most use cases.

Premium

Highest quality output. Best for professional work and important content.
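
As a rough illustration of how the tiers can guide model choice, the sketch below filters the text generation models listed on this page by tier. The model-to-tier mapping is copied from the Text Generation Models table; the `TEXT_MODEL_TIERS` dictionary and the `models_in_tier` helper are hypothetical names and not part of any xPilot API.

```python
TIERS = ("Fast", "Standard", "Premium")

# Mapping copied from the Text Generation Models table on this page.
TEXT_MODEL_TIERS = {
    "GPT-4o": "Premium",
    "GPT-4o Mini": "Fast",
    "GPT-5": "Premium",
    "Claude Sonnet 4": "Premium",
    "Claude 3.5 Haiku": "Fast",
    "Gemini 2.5 Flash": "Fast",
    "Gemini 2.5 Pro": "Premium",
    "Grok 3": "Premium",
    "Grok 3 Mini": "Fast",
    "Mistral Small": "Fast",
    "Mistral Medium": "Standard",
}


def models_in_tier(tier: str, catalog: dict[str, str] = TEXT_MODEL_TIERS) -> list[str]:
    """Return the model names in `catalog` assigned to `tier`, in table order."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    return [name for name, model_tier in catalog.items() if model_tier == tier]


if __name__ == "__main__":
    for tier in TIERS:
        print(tier, models_in_tier(tier))
```

For quick drafts, `models_in_tier("Fast")` would be the natural starting point; the same helper works for any other table on this page if its rows are supplied as the `catalog` argument.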
