fal.ai Models

Total: 945 • New: 0 • Active: 14 • Deprecated: 931

Model Name	Description	Status	First Seen	Action
Z-Image Turbo image-to-image	Generate images from text and images using Z-Image Turbo, Tongyi-MAI's super-fast 6B model. turbo z-image fast	OK	1d	→
Z-Image Turbo image-to-image	Generate images from text and images using custom LoRA and Z-Image Turbo, Tongyi-MAI's super-fast 6B model. turbo z-image fast lora	OK	1d	→
Z-Image Turbo image-to-image	Generate images from text and edge, depth or pose images using Z-Image Turbo, Tongyi-MAI's super-fast 6B model.	OK	1d	→
Z-Image Turbo image-to-image	Generate images from text and edge, depth or pose images using custom LoRA and Z-Image Turbo, Tongyi-MAI's super-fast 6B model. turbo z-image fast lora	OK	1d	→
Longcat Image image-to-image	LongCat Image Edit is a 6B parameter image editing model excelling at multilingual text rendering, photorealism and deployment efficiency.	OK	3d	→
Longcat Image text-to-image	LongCat Image is a 6B parameter model excelling at multilingual text rendering, photorealism and deployment efficiency.	OK	3d	→
Kling AI Avatar v2 Standard image-to-video	Kling AI Avatar v2 Standard: Endpoint for creating avatar videos with realistic humans, animals, cartoons, or stylized characters	OK	4d	→
Kling AI Avatar v2 Pro image-to-video	Kling AI Avatar v2 Pro: The premium endpoint for creating avatar videos with realistic humans, animals, cartoons, or stylized characters	OK	4d	→
Z Image Trainer training	Train LoRAs on Z-Image Turbo, a super fast text-to-image model of 6B parameters developed by Tongyi-MAI. turbo z-image fast trainer	OK	5d	→
Kling Video v2.6 Image to Video image-to-video	Kling 2.6 Pro: Top-tier image-to-video with cinematic visuals, fluid motion, and native audio generation.	Deprecated	5d	→
Kling Video v2.6 Text to Video text-to-video	Kling 2.6 Pro: Top-tier text-to-video with cinematic visuals, fluid motion, and native audio generation.	Deprecated	5d	→
Bytedance image-to-image	A new-generation image creation model from ByteDance, Seedream 4.5 integrates image generation and image editing capabilities into a single, unified architecture. stylized transform	OK	6d	→
Bytedance text-to-image	A new-generation image creation model from ByteDance, Seedream 4.5 integrates image generation and image editing capabilities into a single, unified architecture. stylized transform	OK	6d	→
Sam 3 image-to-3d	SAM 3D enables precise 3D reconstruction of objects from real images, while accurately reconstructing their geometry and texture. 3d object	OK	6d	→
Sam 3 image-to-3d	SAM 3D allows for accurate 3D reconstruction of human body shape and position from a single image. 3d human pose	OK	6d	→
Sam 3 3d-to-3d	SAM 3D enables full scene reconstructions, placing objects and humans in a shared context together. align 3D	OK	6d	→
Vidu text-to-image	Use Vidu Text-to-Image to turn your prompts into reality.	Deprecated	6d	→
Vidu image-to-image	Vidu Reference-to-Image creates images by combining reference images with a prompt. images-to-image reference-to-image	Deprecated	6d	→
Pixverse text-to-video	Generate high quality video clips from text and image prompts using PixVerse v5.5 text-to-video	Deprecated	7d	→
Pixverse image-to-video	Generate high quality video clips from text and image prompts using PixVerse v5.5 image-to-video	Deprecated	7d	→
Pixverse image-to-video	Pixverse Transition	Deprecated	7d	→
Pixverse image-to-video	Pixverse Effects	Deprecated	7d	→
Kling O1 Image image-to-image	Perform precise image edits using strong reference control, transforming subjects, styles, and local details while preserving visual consistency. edit realism typography	Deprecated	7d	→
Z Image text-to-image	Text-to-Image endpoint with LoRA support for Z-Image Turbo, a super fast text-to-image model of 6B parameters developed by Tongyi-MAI. z-image lora fast	Deprecated	7d	→
Video Background Removal video-to-video	Remove background from videos filmed on a green screen.	Deprecated	7d	→
Video Background Removal video-to-video	Remove background from any video with people and objects. No green screen needed.	Deprecated	7d	→
Kling O1 Reference Video to Video video-to-video	Kling O1 Omni generates new shots guided by an input reference video, preserving cinematic language such as motion and camera style to produce seamless scene continuity.	Deprecated	7d	→
Kling O1 Edit Video video-to-video	Edit an existing video using natural-language instructions, transforming subjects, settings, and style while retaining the original motion structure.	Deprecated	7d	→
Kling O1 Reference Image to Video image-to-video	Transform images, elements, and text into consistent, high-quality video scenes, ensuring stable character identity, object details, and environments.	Deprecated	7d	→
Kling O1 First Frame Last Frame to Video image-to-video	Generate a video by taking a start frame and an end frame, animating the transition between them while following text-driven style and scene guidance.	Deprecated	7d	→
Video Background Removal video-to-video	Remove background from any video with people and objects. No green screen needed.	Deprecated	7d	→
Ovis Image text-to-image	Ovis-Image is a 7B text-to-image model specifically optimized for quick, high quality text rendering. ovis-image artistic	Deprecated	9d	→
Lucy Edit [Fast] video-to-video	Lucy Edit Fast is a rapid, localized video editing model that lets you modify specific elements like objects or backgrounds in just 10 seconds. edit	Deprecated	12d	→
LTX Video 2.0 Retake video-to-video	Change sections of a video using LTX-2	Deprecated	12d	→
LTX Video 2.0 Pro image-to-video	Create high-fidelity video with audio from images with LTX-2 Pro	Deprecated	12d	→
LTX Video 2.0 Fast image-to-video	Create high-fidelity video with audio from images with LTX-2 Fast	Deprecated	12d	→
LTX Video 2.0 Pro text-to-video	Create high-fidelity video with audio from text with LTX-2 Pro.	Deprecated	12d	→
LTX Video 2.0 Fast text-to-video	Create high-fidelity video with audio from text with LTX-2 Fast	Deprecated	12d	→
Z Image text-to-image	Z-Image Turbo is a super fast text-to-image model of 6B parameters developed by Tongyi-MAI. turbo z-image fast	Deprecated	12d	→
LTX Video 2.0 Retake video-to-video	Change sections of a video using LTX-2	Deprecated	12d	→
Flux 2 Lora Gallery image-to-image	Add a background to images with white/clean background stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery image-to-image	Virtually furnishes an empty apartment stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery text-to-image	Ballpoint pen sketch drawing style stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery text-to-image	Transforms images into comic book style stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery image-to-image	Extends a face into a full body portrait stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery text-to-image	HDR surrealistic effect with intense colors stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery image-to-image	Generates same object from different angles (azimuth/elevation) stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery text-to-image	Makes images more photorealistic and natural stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery text-to-image	Generates satellite/aerial view style images stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery image-to-image	Virtual clothing try-on (2 images: person + garment) stylized transform	Deprecated	13d	→
Flux 2 Lora Gallery text-to-image	Applies sepia vintage effect to images stylized transform	Deprecated	13d	→
Flux 2 Pro image-to-image	Text-to-image generation with FLUX.2 [pro] from Black Forest Labs. Optimized for maximum quality, exceptional photorealism and artistic images.	Deprecated	13d	→
Flux 2 Pro text-to-image	Image editing with FLUX.2 [pro] from Black Forest Labs. Ideal for high-quality image manipulation, style transfer, and sequential editing workflows	Deprecated	13d	→
Flux 2 text-to-image	Text-to-image generation with FLUX.2 [dev] from Black Forest Labs. Enhanced realism, crisper text generation, and native editing capabilities.	Deprecated	13d	→
Flux 2 image-to-image	Image-to-image editing with FLUX.2 [dev] from Black Forest Labs. Precise modifications using natural language descriptions and hex color control.	Deprecated	13d	→
Flux 2 text-to-image	Text-to-image generation with LoRA support for FLUX.2 [dev] from Black Forest Labs. Custom style adaptation and fine-tuned model variations.	Deprecated	13d	→
Flux 2 text-to-image	Image-to-image editing with LoRA support for FLUX.2 [dev] from Black Forest Labs. Specialized style transfer and domain-specific modifications.	Deprecated	13d	→
Flux 2 Flex text-to-image	Text-to-image generation with FLUX.2 [flex] from Black Forest Labs. Features adjustable inference steps and guidance scale for fine-tuned control. Enhanced typography and text rendering capabilities. stylized transform	Deprecated	13d	→
Flux 2 Flex image-to-image	Image editing with FLUX.2 [flex] from Black Forest Labs. Supports multi-reference editing with customizable inference steps and enhanced text rendering.	Deprecated	13d	→
Flux 2 Trainer training	Fine-tune FLUX.2 [dev] from Black Forest Labs with custom datasets. Create specialized LoRA adaptations for specific styles and domains.	Deprecated	13d	→
Flux 2 Trainer training	Fine-tune FLUX.2 [dev] from Black Forest Labs with custom datasets. Create specialized LoRA adaptations for specific editing tasks.	Deprecated	13d	→
Crystal Upscaler image-to-image	An advanced image enhancement tool designed specifically for facial details and portrait photography, utilizing Clarity AI's upscaling technology. image-to-image	Deprecated	14d	→
Chrono Edit Lora Gallery image-to-image	Upscales and cleans up the image. upscale details	Deprecated	17d	→
Chrono Edit Lora Gallery image-to-image	You can make edits simply by drawing a quick sketch on the input image. paint edit sketch	Deprecated	17d	→
Chrono Edit Lora image-to-image	LoRA endpoint for the Chrono Edit model. image-to-image image-editing	Deprecated	17d	→
Hunyuan Video V1.5 text-to-video	Hunyuan Video 1.5 is Tencent's latest and best video model hunyuan-video text-to-video	Deprecated	18d	→
Sam 3 vision	SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. embeddings mask real-time	Deprecated	18d	→
Segment Anything Model 3 image-to-image	SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. segmentation mask real-time	Deprecated	18d	→
Sam 3 video-to-video	SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. segmentation mask real-time	Deprecated	18d	→
Sam 3 video-to-video	SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. segmentation mask real-time rle	Deprecated	18d	→
Sam 3 image-to-image	SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. segmentation rle real-time	Deprecated	18d	→
Nano Banana Pro text-to-image	Nano Banana Pro (a.k.a Nano Banana 2) is Google's new state-of-the-art image generation and editing model realism typography	Deprecated	18d	→
Nano Banana Pro image-to-image	Nano Banana Pro (a.k.a Nano Banana 2) is Google's new state-of-the-art image generation and editing model realism typography	Deprecated	18d	→
Gemini 3 Pro Image Preview text-to-image	Nano Banana Pro (a.k.a Nano Banana 2) is Google's new state-of-the-art image generation and editing model realism typography	Deprecated	18d	→
Gemini 3 Pro Image Preview image-to-image	Nano Banana Pro (a.k.a Nano Banana 2) is Google's new state-of-the-art image generation and editing model realism typography	Deprecated	18d	→
Lynx image-to-video	Generate subject consistent videos using Lynx from ByteDance! image-to-video subject	Deprecated	21d	→
Maya1 text-to-speech	Maya1 is a state-of-the-art speech model by Maya Research for expressive voice generation, built to capture real human emotion and precise voice design. text-to-speech tts	Deprecated	24d	→
OpenRouter Chat Completions [OpenAI Compatible] llm	Run any LLM (Large Language Model) with fal, powered by OpenRouter. This endpoint is compatible with the OpenAI API.	Deprecated	25d	→
OpenRouter llm	Run any LLM (Large Language Model) with fal, powered by OpenRouter.	Deprecated	25d	→
OpenRouter [Vision] vision	Run any VLM (Vision Language Model) with fal, powered by OpenRouter.	Deprecated	25d	→
OpenRouter Embeddings [OpenAI Compatible] llm	The OpenRouter Embeddings API with fal, powered by OpenRouter, provides unified access to a wide range of large language models - including GPT, Claude, Gemini, and many others through a single API interface.	Deprecated	25d	→
OpenRouter Responses [OpenAI Compatible] llm	The OpenRouter Responses API with fal, powered by OpenRouter, provides unified access to a wide range of large language models - including GPT, Claude, Gemini, and many others through a single API interface.	Deprecated	25d	→
Fibo Mashup image-to-image	Combine three images to create an amazing mashup image with Bria's FIBO model. bria fibo image-to-image	Deprecated	25d	→
Editto video-to-video	Edit videos with instruction-based prompting using the Editto model! video-edit wan-vace	Deprecated	27d	→
Qwen Image Edit Plus Lora Gallery image-to-image	Add a realistic scene behind the object with white background stylized transform	Deprecated	27d	→
Qwen Image Edit Plus Lora Gallery image-to-image	Generate full portrait from a cropped face photo stylized transform	Deprecated	27d	→
Qwen Image Edit Plus Lora Gallery image-to-image	Create group photos stylized transform	Deprecated	27d	→
Qwen Image Edit Plus Lora Gallery image-to-image	Blend products into backgrounds with automatic perspective and lighting correction stylized transform	Deprecated	27d	→
Qwen Image Edit Plus Lora Gallery image-to-image	Create cinematic transitions and scene progressions (camera movements, framing changes) stylized transform	Deprecated	27d	→
Qwen Image Edit Plus Lora Gallery image-to-image	Remove unwanted elements (objects, people, text) while maintaining image consistency stylized transform	Deprecated	27d	→
Qwen Image Edit Plus Lora Gallery image-to-image	Remove existing lighting and apply soft, even illumination stylized transform	Deprecated	27d	→
Qwen Image Edit Plus Lora Gallery image-to-image	Apply designs/graphics onto people's shirts stylized transform	Deprecated	27d	→
Qwen Image Edit Plus Lora Gallery image-to-image	Precise camera position and angle control (rotation, zoom, vertical movement) stylized transform	Deprecated	27d	→
Flashvsr video-to-video	Upscale your videos using FlashVSR with the fastest speeds! upscale video-to-video	Deprecated	28d	→
Pixverse image-to-video	Generate high quality video clips by swapping person, objects and background using Pixverse Swap.	Deprecated	28d	→
Infinity Star text-to-video	Use InfinityStar’s unified 8B spacetime autoregressive engine to turn any text prompt into crisp 720p videos - 10× faster than diffusion models. text-to-video	Deprecated	11/7	→
Sana Video text-to-video	Leverage Sana's ultra-fast processing speed to generate high-quality assets that transform your text prompts into production-ready videos text-to-video	Deprecated	11/7	→
Crystal Upscaler image-to-image	An advanced image enhancement tool designed specifically for facial details and portrait photography, utilizing Clarity AI's upscaling technology. image-to-image	Deprecated	11/5	→
Image Outpaint image-to-image	Directional outpainting. Choose edges to expand: left, right, top, or center (uniform all sides). Only expanded areas are generated; an optional zoom-out pulls the frame back by the chosen amount. outpainting	Deprecated	11/5	→
Workflow Utilities video-to-video	Add automatic subtitles to videos auto-subtitle captioning	Deprecated	11/4	→
Reve image-to-image	Reve’s fast edit model lets you upload an existing image and then transform it via a text prompt at lightning speed! image-to-image	Deprecated	11/4	→
Reve image-to-image	Reve’s fast remix model lets you upload reference images and then combine/transform them via a text prompt at lightning speed! image-to-image	Deprecated	11/4	→
Fashion Size Estimator vision	Fashion Size Estimator model analyzes human body images to predict clothing size recommendations and estimate key body measurements including height, bust, waist, and hip dimensions. utility editing	Deprecated	11/3	→
Bytedance Upscaler video-to-video	Upscale videos with Bytedance's video upscaler. upscaler video bytedance	Deprecated	11/3	→
Flux Vision Upscaler image-to-image	Flux Vision Upscaler for magnifying/upscaling images with high fidelity and creativity.	Deprecated	11/2	→
Emu 3.5 Image text-to-image	Generate images from text using Emu 3.5 Image	Deprecated	11/2	→
Emu 3.5 Image image-to-image	Edit images with a text prompt using Emu 3.5 Image	Deprecated	11/2	→
Sima Video Upscaler Lite video-to-video	Upscale your videos at real-time speeds with Sima Labs! upscale video-to-video	Deprecated	10/31	→
Sima Upscaler image-to-image	Upscale your images at blazingly fast speeds with Sima Labs! upscale image-to-image	Deprecated	10/31	→
Chrono Edit image-to-image	NVIDIA's Logically Consistent and Physics-Aware Image Editing Model image-editing	Deprecated	10/31	→
Minimax Music text-to-audio	Generate music from text prompts using the MiniMax Music 2.0 model, which leverages advanced AI techniques to create high-quality, diverse musical compositions. music audio	Deprecated	10/30	→
LongCat Video Distilled text-to-video	Generate long videos in 720p/30fps from text using LongCat Video Distilled	Deprecated	10/30	→
LongCat Video Distilled image-to-video	Generate long videos in 720p/30fps from images using LongCat Video Distilled	Deprecated	10/30	→
LongCat Video text-to-video	Generate long videos from text using LongCat Video	Deprecated	10/30	→
LongCat Video image-to-video	Generate long videos from images using LongCat Video	Deprecated	10/30	→
LongCat Video image-to-video	Generate long videos in 720p/30fps from images using LongCat Video	Deprecated	10/30	→
LongCat Video text-to-video	Generate long videos in 720p/30fps from text using LongCat Video	Deprecated	10/30	→
Qwen Image Edit Trainer training	LoRA trainer for Qwen Image Edit	Deprecated	10/30	→
Qwen Image Edit Plus Trainer training	LoRA trainer for Qwen Image Edit Plus	Deprecated	10/30	→
Omnipart unknown	Image-to-3D endpoint for OmniPart, a part-aware 3D generator with semantic decoupling and structural cohesion.	Deprecated	10/30	→
Fibo json-to-image	SOTA Open source model trained on licensed data, transforming intent into structured control for precise, high-quality AI image generation in enterprise and agentic workflows. bria fibo prompt-adherence	Deprecated	10/29	→
Fibo text-to-json	Structured Prompt Generation endpoint for Fibo, Bria's SOTA Open source model bria fibo structured-prompting	Deprecated	10/29	→
Video As Prompt video-to-video	A model for unified semantic control in video generation. It animates a static reference image using the motion and semantics from a reference video as a prompt. video-as-prompt semantic control	Deprecated	10/29	→
MiniMax Speech 2.6 [HD] text-to-speech	Generate speech from text prompts and different voices using the MiniMax Speech-2.6 HD model, which leverages advanced AI techniques to create high-quality text-to-speech. text-to-speech	Deprecated	10/29	→
MiniMax Speech 2.6 [Turbo] text-to-speech	Generate speech from text prompts and different voices using the MiniMax Speech-2.6 Turbo model, which leverages advanced AI techniques to create high-quality text-to-speech. text-to-speech	Deprecated	10/29	→
Bytedance image-to-3d	Image to 3D endpoint for Bytedance's high-quality Seed3D 3d model generator. seed3d.quality bytedance 3d	Deprecated	10/29	→
LongCat Video Distilled text-to-video	Generate long videos from text using LongCat Video Distilled	Deprecated	10/29	→
LongCat Video Distilled image-to-video	Generate long videos from images using LongCat Video Distilled	Deprecated	10/29	→
MiniMax Hailuo 2.3 [Pro] (Text to Video) text-to-video	MiniMax Hailuo-2.3 Text To Video API (Pro, 1080p): Advanced text-to-video generation model with 1080p resolution text-to-video	Deprecated	10/28	→
MiniMax Hailuo 2.3 [Standard] (Text to Video) text-to-video	MiniMax Hailuo-2.3 Text To Video API (Standard, 768p): Advanced text-to-video generation model with 768p resolution text-to-video	Deprecated	10/28	→
MiniMax Hailuo 2.3 Fast [Pro] (Image to Video) image-to-video	MiniMax Hailuo-2.3-Fast Image To Video API (Pro, 1080p): Advanced fast image-to-video generation model with 1080p resolution image-to-video	Deprecated	10/28	→
MiniMax Hailuo 2.3 [Standard] (Image to Video) image-to-video	MiniMax Hailuo-2.3 Image To Video API (Standard, 768p): Advanced image-to-video generation model with 768p resolution image-to-video	Deprecated	10/28	→
MiniMax Hailuo 2.3 Fast [Standard] (Image to Video) image-to-video	MiniMax Hailuo-2.3-Fast Image To Video API (Standard, 768p): Advanced fast image-to-video generation model with 768p resolution image-to-video	Deprecated	10/28	→
MiniMax Hailuo 2.3 [Pro] (Image to Video) image-to-video	MiniMax Hailuo-2.3 Image To Video API (Pro, 1080p): Advanced image-to-video generation model with 1080p resolution image-to-video	Deprecated	10/28	→
Demucs audio-to-audio	SOTA stemming model for voice, drums, bass, guitar and more. audio	Deprecated	10/27	→
Piflow text-to-image	Use the faster speed of piflow to generate images with the same quality as slower models. text-to-image	Deprecated	10/27	→
Birefnet video-to-video	Video background removal version of bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS) utility editing	Deprecated	10/26	→
Audio Understanding audio-to-audio	An audio understanding model to analyze audio content and answer questions about what's happening in the audio based on user prompts. utility audio	Deprecated	10/24	→
Bytedance text-to-video	Text to Video endpoint for Seedance 1.0 Pro Fast, a next-generation video model designed to deliver maximum performance at minimal cost bytedance fast motion	Deprecated	10/24	→
Bytedance image-to-video	Image to Video endpoint for Seedance 1.0 Pro Fast, a next-generation video model designed to deliver maximum performance at minimal cost bytedance seedance pro fast	Deprecated	10/24	→
Vidu text-to-video	Use the latest Vidu Q2 models for much better quality and control over your videos.	Deprecated	10/24	→
Vidu image-to-video	Use the latest Vidu Q2 models for much better quality and control over your videos. image-to-video	Deprecated	10/24	→
Vidu image-to-video	Use the latest Vidu Q2 models for much better quality and control over your videos. image-to-video	Deprecated	10/24	→
Vidu video-to-video	Use the latest Vidu Q2 models for much better quality and control over your videos.	Deprecated	10/24	→
LTX Video 2.0 Pro text-to-video	Create high-fidelity video with audio from text with LTX-2 Pro.	Deprecated	10/23	→
LTX Video 2.0 Pro image-to-video	Create high-fidelity video with audio from images with LTX-2 Pro	Deprecated	10/23	→
LTX Video 2.0 Fast text-to-video	Create high-fidelity video with audio from text with LTX-2 Fast	Deprecated	10/23	→
LTX Video 2.0 Fast image-to-video	Create high-fidelity video with audio from images with LTX-2 Fast	Deprecated	10/23	→
Kling Video image-to-video	Kling 2.5 Turbo Standard: Top-tier image-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision. stylized transform	Deprecated	10/22	→
GPT Image 1 Mini text-to-image	GPT Image 1 mini combines OpenAI's advanced language capabilities, powered by GPT-5, with GPT Image 1 Mini for efficient image generation. text-to-image	Deprecated	10/22	→
GPT Image 1 Mini image-to-image	GPT Image 1 mini combines OpenAI's advanced language capabilities, powered by GPT-5, with GPT Image 1 Mini for efficient image generation. image-to-image	Deprecated	10/22	→
Music Generation text-to-audio	Generate royalty-free instrumental music from electronic, hip hop, and indie rock to cinematic and classical genres. Perfect for games, films, social content, podcasts, and more. speech audio music	Deprecated	10/21	→
Sound Effect Generation text-to-audio	Create professional-grade sound effects from animal and vehicle to nature, sci-fi, and otherworldly sounds. Perfect for films, games, and digital content. speech audio effects	Deprecated	10/21	→
Krea Wan 14b- Text to Video text-to-video	Fast Text-to-Video endpoint for Krea's Wan 14b model. text to video fast	Deprecated	10/20	→
Qwen 3 Guard llm	Use Qwen 3 Guard to detect and classify text as safe or harmful, delivering precise and reliable safety categorization. filter safety utility	Deprecated	10/20	→
Meshy 5 Remesh 3d-to-3d	Meshy-5 remesh allows you to remesh and export existing 3D models into various formats 3d-to-3d	Deprecated	10/18	→
Meshy 5 Retexture 3d-to-3d	Meshy-5 retexture applies new, high-quality textures to existing 3D models using either text prompts or reference images. It supports PBR material generation for realistic, production-ready results. 3d-to-3d	Deprecated	10/18	→
Reve image-to-image	Reve’s edit model lets you upload an existing image and then transform it via a text prompt image-to-image	Deprecated	10/17	→
Reve text-to-image	Reve’s text-to-image model generates detailed visual output that closely follows your instructions, with strong aesthetic quality and accurate text rendering. text-to-image	Deprecated	10/17	→
Reve image-to-image	Reve’s remix model lets you upload reference images and then combine/transform them via a text prompt image-to-image	Deprecated	10/17	→
Wan Alpha text-to-video	Generate videos with transparent backgrounds transparent alpha	Deprecated	10/16	→
Mirelo SFX V1.5 video-to-video	Generate synced sounds for any video, and return it with its new sound track (like MMAudio) video-to-video sfx	Deprecated	10/15	→
Mirelo SFX V1.5 video-to-audio	Generate synced sounds for any video, and return the new sound track (like MMAudio) video-to-audio sfx	Deprecated	10/15	→
Image2Pixel image-to-image	Turn images into pixel-perfect retro art post-processing pixel-art	Deprecated	10/14	→
Kandinsky5 text-to-video	Kandinsky 5.0 is a diffusion model for fast, high-quality text-to-video generation.	Deprecated	10/13	→
Kandinsky5 text-to-video	Kandinsky 5.0 Distilled is a lightweight diffusion model for fast, high-quality text-to-video generation.	Deprecated	10/13	→
DreamOmni2 image-to-image	DreamOmni2 is a unified multimodal model for text and image guided image editing.	Deprecated	10/10	→
Moondream3 Preview [Caption] vision	Moondream 3 is a vision language model that brings frontier-level visual reasoning with native object detection, pointing, and OCR capabilities to real-world applications requiring fast, inexpensive inference at scale. Vision	Deprecated	10/10	→
Moondream 3 Preview [Query] vision	Moondream 3 is a vision language model that brings frontier-level visual reasoning with native object detection, pointing, and OCR capabilities to real-world applications requiring fast, inexpensive inference at scale. Vision	Deprecated	10/10	→
Moondream3 Preview [Point] vision	Moondream 3 is a vision language model that brings frontier-level visual reasoning with native object detection, pointing, and OCR capabilities to real-world applications requiring fast, inexpensive inference at scale. Vision	Deprecated	10/10	→
Moondream3 Preview [Detect] vision	Moondream 3 is a vision language model that brings frontier-level visual reasoning with native object detection, pointing, and OCR capabilities to real-world applications requiring fast, inexpensive inference at scale. Vision	Deprecated	10/10	→
Kling Video video-to-audio	Generate audio from input videos using Kling	Deprecated	10/9	→
Sora 2 video-to-video	Video-to-video remix endpoint for Sora 2, OpenAI’s advanced model that transforms existing videos based on new text or image prompts allowing rich edits, style changes, and creative reinterpretations while preserving motion and structure video to video audio sora	Deprecated	10/9	→
Meshy 6 Preview image-to-3d	Meshy-6-Preview is the latest model from Meshy. It generates realistic and production ready 3D models. image-to-3d	Deprecated	10/8	→
Meshy 5 Multi image-to-3d	Meshy-5 multi image generates realistic and production ready 3D models from multiple images. multi-image-to-3d	Deprecated	10/8	→
Meshy 6 Preview text-to-3d	Meshy-6-Preview is the latest model from Meshy. It generates realistic and production ready 3D models. text-to-3d	Deprecated	10/8	→
Hunyuan Part 3d-to-3d	Use the capabilities of Hunyuan Part to generate point clouds from your 3D files. 3D-to-3D point-cloud	Deprecated	10/8	→
Wan 2.1 VACE Long Reframe video-to-video	Reframe entire videos scene-by-scene using Wan VACE 2.1	Deprecated	10/8	→
Index TTS 2.0 text-to-speech	Generate natural, clear speech using Index TTS 2.0 from IndexTeam text-to-speech	Deprecated	10/7	→
Sora 2 text-to-video	Text-to-video endpoint for Sora 2 Pro, OpenAI's state-of-the-art video model capable of creating richly detailed, dynamic clips with audio from natural language or images. text-to-video audio sora-2-pro	Deprecated	10/6	→
Sora 2 image-to-video	Image-to-video endpoint for Sora 2 Pro, OpenAI's state-of-the-art video model capable of creating richly detailed, dynamic clips with audio from natural language or images. image-to-video audio sora-2-pro	Deprecated	10/6	→
Sora 2 image-to-video	Image-to-video endpoint for Sora 2, OpenAI's state-of-the-art video model capable of creating richly detailed, dynamic clips with audio from natural language or images. image-to-video audio sora	Deprecated	10/6	→
Sora 2 text-to-video	Text-to-video endpoint for Sora 2, OpenAI's state-of-the-art video model capable of creating richly detailed, dynamic clips with audio from natural language or images. text to video audio sora	Deprecated	10/6	→
Ovi Text to Video text-to-video	A unified paradigm for audio-video generation	Deprecated	10/3	→
Lucidflux image-to-image	LucidFlux for upscaling images with very high fidelity image-to-image	Deprecated	10/3	→
Qwen Image Edit Plus Lora image-to-image	LoRA endpoint for the Qwen Image Edit Plus model. image-to-image image-editing	Deprecated	10/3	→
Ovi image-to-video	Ovi can generate videos with audio from image and text inputs. image-to-audio-video image-to-video	Deprecated	10/3	→
Fabric 1.0 Fast image-to-video	VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video	Deprecated	10/1	→
Qwen Image Edit image-to-image	Image to Image Endpoint for Qwen's Image Editing model. Has superior text editing capabilities. stylized transform	Deprecated	9/30	→
Hunyuan Image text-to-image	Leverage the state-of-the-art capabilities of Hunyuan Image 3.0 to generate visual content that effectively conveys the messaging of your written material. text-to-image	Deprecated	9/28	→
Hyper3d image-to-3d	Rodin by Hyper3D generates realistic and production ready 3D models from text or images. image-to-3d	Deprecated	9/27	→
Lynx image-to-video	Generate subject consistent videos using Lynx from ByteDance! image-to-video subject	Deprecated	9/26	→
Wan 2.5 Text to Image text-to-image	Wan 2.5 text-to-image model.	Deprecated	9/26	→
Wan 2.5 Image to Image image-to-image	Wan 2.5 image-to-image model.	Deprecated	9/26	→
Wan 2.5 Text to Video text-to-video	Wan 2.5 text-to-video model.	Deprecated	9/24	→
Wan 2.5 Image to Video image-to-video	Wan 2.5 image-to-video model.	Deprecated	9/24	→
Bytedance OmniHuman v1.5 image-to-video	Omnihuman v1.5 is a new and improved version of Omnihuman. It generates video using an image of a human figure paired with an audio file. It produces vivid, high-quality videos where the character’s emotions and movements maintain a strong correlation with the audio. image-to-video lipsync	Deprecated	9/23	→
Product Photoshoot image-to-image	Create product advertisements with an example image of the product	Deprecated	9/23	→
Kling v2.5 Text to Video text-to-video	Kling 2.5 Turbo Pro: Top-tier text-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision. animation stylized	Deprecated	9/23	→
Kling Video image-to-video	Kling 2.5 Turbo Pro: Top-tier image-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision. stylized transform	Deprecated	9/23	→
Qwen Image Edit Plus image-to-image	Endpoint for Qwen's Image Editing Plus model. Has superior text editing capabilities and multi-image support. image-editing image-to-image high-quality-text	Deprecated	9/23	→
Infinitalk video-to-video	Infinitalk model generates a talking avatar video from an image and audio file. The avatar lip-syncs to the provided audio with natural facial expressions. video-to-video	Deprecated	9/22	→
SeedVR2 image-to-image	Use SeedVR2 to upscale your images upscale image-to-image	Deprecated	9/22	→
SeedVR2 video-to-video	Upscale your videos using SeedVR2 with temporal consistency! upscale video-to-video	Deprecated	9/22	→
Wan VACE Video Edit video-to-video	Edit videos using plain language and Wan VACE video-edit wan-vace	Deprecated	9/22	→
Wan-2.2 Animate Move video-to-video	Wan-Animate is a video model that generates high-fidelity character videos by replicating the expressions and movements of characters from reference videos. video to video motion	Deprecated	9/21	→
Wan-2.2 Animate Replace video-to-video	Wan-Animate Replace is a model that can integrate animated characters into reference videos, replacing the original character while preserving the scene’s lighting and color tone for seamless environmental integration. video to video motion	Deprecated	9/21	→
Fabric 1.0 image-to-video	VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video	Deprecated	9/20	→
Headshot Generator image-to-image	Generate professional headshot photos with customizable backgrounds. headshot profile-photo	Deprecated	9/19	→
Object Removal image-to-image	Remove unwanted objects seamlessly from any image. remove object-removal	Deprecated	9/19	→
Perspective Change image-to-image	Easily adjust the perspective of any image to different angles. change-angle perspective	Deprecated	9/19	→
Photography Effects image-to-image	Apply diverse photography styles and effects to transform your images. style-transfer photography	Deprecated	9/19	→
Portrait Enhance image-to-image	Enhance and refine portrait photos with improved clarity and detail. image-edit enhancement	Deprecated	9/19	→
Photo Restoration image-to-image	Restore old or damaged photos by fixing colors, scratches, and resolution. photo-restoration image-enhance	Deprecated	9/19	→
Style Transfer image-to-image	Apply artistic styles like impressionism, cubism, or surrealism to your images. style-transfer	Deprecated	9/19	→
Relighting image-to-image	Adjust and enhance images with different lighting styles. relighting	Deprecated	9/19	→
Texture Transform image-to-image	Transform objects with different surface textures like marble, wood, or fabric. texture-transform	Deprecated	9/19	→
Virtual Try-on image-to-image	Try on clothes virtually by combining person and clothing images. fashion try-on virtual-try-on	Deprecated	9/19	→
Product Photography image-to-image	Generate professional product photography with realistic lighting and backgrounds. product marketing	Deprecated	9/19	→
Product Holding image-to-image	Place products naturally in a person’s hands for realistic marketing visuals. product marketing	Deprecated	9/19	→
Lucy Edit [Dev] video-to-video	Lucy Edit Dev	Deprecated	9/18	→
Lucy Edit [Pro] video-to-video	Lucy Edit Pro	Deprecated	9/18	→
Isaac 01 vision	Isaac-01 is a multimodal vision-language model from Perceptron for various vision language tasks. multimodal vision	Deprecated	9/18	→
Wan 2.2 VACE Fun A14B video-to-video	VACE Fun for Wan 2.2 A14B from Alibaba-PAI	Deprecated	9/17	→
Wan 2.2 VACE Fun A14B video-to-video	VACE Fun for Wan 2.2 A14B from Alibaba-PAI	Deprecated	9/17	→
Wan 2.2 VACE Fun A14B video-to-video	VACE Fun for Wan 2.2 A14B from Alibaba-PAI	Deprecated	9/17	→
Wan 2.2 VACE Fun A14B video-to-video	VACE Fun for Wan 2.2 A14B from Alibaba-PAI	Deprecated	9/17	→
Wan 2.2 VACE Fun A14B video-to-video	VACE Fun for Wan 2.2 A14B from Alibaba-PAI	Deprecated	9/17	→
Qwen Image Edit image-to-image	Inpainting Endpoint for the Qwen Edit Image editing model. image-to-image inpainting qwen-image	Deprecated	9/17	→
FLUX.1 SRPO [dev] text-to-image	FLUX.1 SRPO [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.	Deprecated	9/16	→
FLUX.1 SRPO [dev] image-to-image	FLUX.1 SRPO [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.	Deprecated	9/16	→
FLUX.1 SRPO [dev] text-to-image	FLUX.1 SRPO [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.	Deprecated	9/15	→
FLUX.1 SRPO [dev] image-to-image	FLUX.1 SRPO [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.	Deprecated	9/15	→
Pshuman image-to-3d	Use the 6D pose estimation capabilities of PSHuman to generate 3D files from a single image. image-to-3D	Deprecated	9/13	→
Kling AI Avatar Pro image-to-video	Kling AI Avatar Pro: The premium endpoint for creating avatar videos with realistic humans, animals, cartoons, or stylized characters stylized transform	Deprecated	9/13	→
Kling AI Avatar image-to-video	Kling AI Avatar Standard: Endpoint for creating avatar videos with realistic humans, animals, cartoons, or stylized characters stylized transform	Deprecated	9/13	→
Kling TTS text-to-speech	Generate speech from text prompts and different voices using the Kling TTS model, which leverages advanced AI techniques to create high-quality text-to-speech. audio	Deprecated	9/13	→
MiniMax (Hailuo AI) Music v1.5 text-to-audio	Generate music from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality, diverse musical compositions. music	Deprecated	9/12	→
Decart Lucy 14b image-to-video	Lucy-14B delivers lightning fast performance that redefines what's possible with image-to-video AI	Deprecated	9/10	→
Qwen Image Edit Lora image-to-image	LoRA inference endpoint for the Qwen Image Editing model. image-to-image image-editing lora	Deprecated	9/10	→
Stable Audio 2.5 audio-to-audio	Generate high quality music and sound effects using Stable Audio 2.5 from StabilityAI audio	Deprecated	9/10	→
Stable Audio 2.5 text-to-audio	Generate high quality music and sound effects using Stable Audio 2.5 from StabilityAI audio	Deprecated	9/10	→
Stable Audio 2.5 audio-to-audio	Generate high quality music and sound effects using Stable Audio 2.5 from StabilityAI audio	Deprecated	9/10	→
Hunyuan Image text-to-image	Use the amazing capabilities of Hunyuan Image 2.1 to generate images that express the feelings of your text. text-to-image	Deprecated	9/9	→
Elevenlabs text-to-audio	Generate realistic audio dialogues using Eleven-v3 from ElevenLabs. audio	Deprecated	9/9	→
Vidu image-to-image	Vidu Reference-to-Image creates images by using reference images and combining them with a prompt. images-to-image	Deprecated	9/9	→
Bytedance text-to-image	A new-generation image creation model from ByteDance, Seedream 4.0 integrates image generation and image editing capabilities into a single, unified architecture. stylized transform	Deprecated	9/9	→
Bytedance image-to-image	A new-generation image creation model from ByteDance, Seedream 4.0 integrates image generation and image editing capabilities into a single, unified architecture. stylized transform	Deprecated	9/9	→
Hunyuan Video Foley video-to-video	Use the capabilities of the Hunyuan Foley model to bring life to your videos by adding sound effects to them. video-to-video add-sound	Deprecated	9/8	→
Avatars Audio to Video audio-to-video	High-quality avatar videos that feel real, generated from your audio	Deprecated	9/4	→
Avatars Text to Video text-to-video	High-quality avatar videos that feel real, generated from your text	Deprecated	9/4	→
Chatterbox text-to-speech	Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. Use the first TTS from Resemble AI. text-to-speech multilingual	Deprecated	9/4	→
Wan image-to-image	Wan 2.2's 14B model edits high-resolution, photorealistic images with powerful prompt understanding and fine-grained visual detail image-to-image	Deprecated	9/3	→
Elevenlabs text-to-audio	Generate sound effects using ElevenLabs advanced sound effects model. sound	Deprecated	9/2	→
Sync Lipsync video-to-video	Generate high-quality realistic lipsync animations from audio while preserving unique details like natural teeth and unique facial features using the state-of-the-art Sync Lipsync 2 Pro model. animation lip sync high-quality	Deprecated	9/2	→
Bytedance image-to-video	Seedance lite reference-to-video allows the use of 1 to 4 images as reference to create a high-quality video. reference-to-video image-to-video	Deprecated	9/1	→
Uso image-to-image	Use USO to perform subject-driven generations using a reference image. image-to-image	Deprecated	8/30	→
Sonauto V2 text-to-audio	Replace sections of an existing audio with newly generated content music text-to-music text-to-audio	Deprecated	8/28	→
Sonauto V2 audio-to-audio	Extend an existing song music text-to-music text-to-audio	Deprecated	8/28	→
Wan 2.2 Fun Control video-to-video	Generate pose or depth controlled video using Alibaba-PAI's Wan 2.2 Fun wan pose depth	Deprecated	8/28	→
Decart image-to-video	Lucy-5B is a model that can create 5-second I2V videos in under 5 seconds, achieving >1x RTF end-to-end	Deprecated	8/28	→
Pixverse text-to-video	Generate high quality video clips from text and image prompts using PixVerse v5	Deprecated	8/27	→
Pixverse v5 Image to Video image-to-video	Generate high quality video clips from text and image prompts using PixVerse v5 stylized transform	Deprecated	8/27	→
Pixverse image-to-video	Create seamless transition between images using PixVerse v5 stylized transform	Deprecated	8/27	→
VibeVoice 1.5B text-to-speech	Generate long, expressive multi-voice speech using Microsoft's powerful TTS text-to-speech multi-speaker podcast	Deprecated	8/27	→
VibeVoice 7B text-to-speech	Generate long, expressive multi-voice speech using Microsoft's powerful TTS text-to-speech multi-speaker podcast	Deprecated	8/27	→
Wan-2.2 Speech-to-Video 14B audio-to-video	Wan-S2V is a video model that generates high-quality videos from static images and audio, with realistic facial expressions, body movements, and professional camera work for film and television applications audio-to-video talking-head	Deprecated	8/27	→
Nano Banana text-to-image	Google's state-of-the-art image generation and editing model image-generation	Deprecated	8/26	→
Nano Banana image-to-image	Google's state-of-the-art image generation and editing model image-editing	Deprecated	8/26	→
Gemini 2.5 Flash Image text-to-image	Nano Banana is Google's state-of-the-art image generation and editing model text-to-image	Deprecated	8/26	→
Gemini 2.5 Flash Image image-to-image	Gemini 2.5 Flash Image is Google's state-of-the-art image generation and editing model image-editing	Deprecated	8/26	→
Video video-to-video	Upscale videos up to 8K output resolution. Trained on fully licensed and commercially safe data. video-upscaling upscale	Deprecated	8/26	→
Qwen Image image-to-image	Qwen-Image (Image-to-Image) transforms and edits input images with high fidelity, enabling precise style transfer, enhancement, and creative modification. image-to-image	Deprecated	8/25	→
Infinitalk image-to-video	Infinitalk model generates a talking avatar video from an image and audio file. The avatar lip-syncs to the provided audio with natural facial expressions. stylized transform	Deprecated	8/23	→
Infinitalk text-to-video	Infinitalk model generates a talking avatar video from text and an audio file. The avatar lip-syncs to the provided audio with natural facial expressions.	Deprecated	8/23	→
Elevenlabs text-to-audio	Generate text-to-speech audio using Eleven-v3 from ElevenLabs. audio	Deprecated	8/20	→
Nextstep 1 image-to-image	Endpoint for NextStep-1 Autoregressive Image Editing model.	Deprecated	8/20	→
Reimagine image-to-image	Reimagine uses a structure reference for generating new images while preserving the structure of an input image, guided by text prompts. Perfect for transforming sketches, illustrations, or photos into new illustrations. Trained exclusively on licensed data bria	Deprecated	8/20	→
Mirelo SFX video-to-video	Generate synced sounds for any video, and return it with its new sound track video-to-video sfx	Deprecated	8/19	→
Mirelo SFX video-to-audio	Generate synced sounds for any video, and return the new sound track sfx	Deprecated	8/19	→
Qwen Image Edit image-to-image	Endpoint for Qwen's Image Editing model. Has superior text editing capabilities. image-editing image-to-image high-quality-text	Deprecated	8/18	→
Qwen Image Trainer training	Qwen Image LoRA training lora personalization	Deprecated	8/14	→
Marey Realism V1.5 text-to-video	Generate a video from a text prompt with Marey, a generative video model trained exclusively on fully licensed data.	Deprecated	8/14	→
Marey Realism V1.5 image-to-video	Generate a video starting from an image as the first frame with Marey, a generative video model trained exclusively on fully licensed data.	Deprecated	8/14	→
Marey Realism V1.5 video-to-video	Pull motion from a reference video and apply it to new subjects or scenes.	Deprecated	8/14	→
Marey Realism V1.5 video-to-video	Ideal for matching human movement. Your input video determines human poses, gestures, and body movements that will appear in the generated video.	Deprecated	8/14	→
Stable Avatar audio-to-video	Stable Avatar generates audio-driven video avatars up to five minutes long stable-avatar talking-head audio-to-video	Deprecated	8/14	→
ControlNet SDXL image-to-image	Generate Images with ControlNet. diffusion controlnet manipulation	Deprecated	8/13	→
MusePose video-to-video	Animate a reference image with a driving video using MusePose.	Deprecated	8/13	→
Segment Anything Model image-to-image	SAM. segmentation mask	Deprecated	8/13	→
LLaVA v1.5 13B vision	Vision multimodal vision	Deprecated	8/13	→
LTX Video-0.9.5 image-to-video	Generate videos from prompts and images using LTX Video-0.9.5 video image-to-video	Deprecated	8/13	→
Hidream E1 Full image-to-image	Edit images with natural language	Deprecated	8/13	→
LTX Video-0.9.7 image-to-video	Deprecated. Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead. video image-to-video	Deprecated	8/13	→
LTX Video-0.9.7 text-to-video	Deprecated. Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead. video text-video	Deprecated	8/13	→
LTX Video-0.9.7 video-to-video	Deprecated. Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead. video image-to-video text-to-video	Deprecated	8/13	→
Ltx Video V097 video-to-video	Deprecated. Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead.	Deprecated	8/13	→
LTX Video-0.9.7 LoRA text-to-video	Deprecated. Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead. video ltx-video text-to-video	Deprecated	8/13	→
Stable Diffusion with LoRAs text-to-image	Run Any Stable Diffusion model with customizable LoRA weights. diffusion lora customization	Deprecated	8/13	→
Remove Background image-to-image	Remove the background from an image. background removal utility editing	Deprecated	8/13	→
Upscale Images image-to-image	Upscale images by a given factor. upscaling high-res	Deprecated	8/13	→
Inpainting sdxl and sd image-to-image	Inpaint images with SD and SDXL editing diffusion	Deprecated	8/13	→
Animatediff SparseCtrl LCM text-to-video	Animate Your Drawings with Latent Consistency Models! lcm animation stylized	Deprecated	8/13	→
Optimized Latent Consistency (SDv1.5) image-to-image	Produce high-quality images with minimal inference steps. Optimized for 512x512 input image size. diffusion lcm real-time	Deprecated	8/13	→
Fooocus text-to-image	Default parameters with automated optimizations and quality improvements. stylized	Deprecated	8/13	→
ControlNet SDXL image-to-image	Generate Images with ControlNet. diffusion controlnet editing manipulation	Deprecated	8/13	→
ControlNet SDXL image-to-image	Generate Images with ControlNet. diffusion controlnet editing manipulation	Deprecated	8/13	→
PuLID image-to-image	Tuning-free ID customization. editing customization personalization	Deprecated	8/13	→
Marigold Depth Estimation image-to-image	Create depth maps using Marigold depth estimation. depth utility	Deprecated	8/13	→
Stable Audio Open text-to-audio	Open source text-to-audio model. music	Deprecated	8/13	→
DiffusionEdge text-to-image	Diffusion based high quality edge detection detection	Deprecated	8/13	→
TripoSR image-to-3d	State of the art Image to 3D Object generation	Deprecated	8/13	→
Latent Consistency (SDXL & SDv1.5) text-to-image	Produce high-quality images with minimal inference steps. diffusion lcm real-time	Deprecated	8/13	→
Clarity Upscaler image-to-image	Clarity upscaler for upscaling images with very high fidelity. upscaling	Deprecated	8/13	→
AnimateDiff video-to-video	Re-animate your videos! animation stylized	Deprecated	8/13	→
MiniMax (Hailuo AI) Video 01 text-to-video	Generate video clips from your prompts using MiniMax model motion transformation	Deprecated	8/13	→
Fooocus Inpainting text-to-image	Default parameters with automated optimizations and quality improvements. stylized editing	Deprecated	8/13	→
AnimateDiff Turbo video-to-video	Re-animate your videos in lightning speed! animation stylized turbo	Deprecated	8/13	→
Midas Depth Estimation image-to-image	Create depth maps using Midas depth estimation. depth utility	Deprecated	8/13	→
Stable Video Diffusion Turbo image-to-video	Generate short video clips from your images using SVD v1.1 at Lightning Speed turbo	Deprecated	8/13	→
Face Retoucher image-to-image	Automatically retouches faces to smooth skin and remove blemishes. editing	Deprecated	8/13	→
Fooocus Image Prompt text-to-image	Default parameters with automated optimizations and quality improvements. stylized	Deprecated	8/13	→
Illusion Diffusion text-to-image	Create illusions conditioned on image. composition stylized	Deprecated	8/13	→
AnimateDiff Turbo text-to-video	Animate your ideas in lightning speed! animation stylized turbo	Deprecated	8/13	→
LLaVA v1.6 34B vision	Vision multimodal vision	Deprecated	8/13	→
Any LLM llm	Use any large language model from our selected catalogue (powered by OpenRouter) chat claude gpt streaming	Deprecated	8/13	→
Fooocus text-to-image	Fooocus extreme speed mode as a standalone app.	Deprecated	8/13	→
Latent Consistency Models (v1.5/XL) image-to-image	Run SDXL at the speed of light lcm diffusion turbo real-time editing	Deprecated	8/13	→
Latent Consistency Models (v1.5/XL) text-to-image	Run SDXL at the speed of light lcm diffusion turbo real-time	Deprecated	8/13	→
Latent Consistency Models (v1.5/XL) image-to-image	Run SDXL at the speed of light lcm diffusion turbo real-time editing	Deprecated	8/13	→
Whisper speech-to-text	Whisper is a model for speech transcription and translation. transcription translation speech	Deprecated	8/13	→
AnimateDiff text-to-video	Animate your ideas! animation stylized	Deprecated	8/13	→
AMT Interpolation video-to-video	Interpolate between video frames interpolation editing	Deprecated	8/13	→
Playground v2.5 image-to-image	State-of-the-art open-source model in aesthetic quality artistic style	Deprecated	8/13	→
Hyper SDXL text-to-image	Hyper-charge SDXL's performance and creativity. diffusion real-time	Deprecated	8/13	→
Stable Diffusion XL Lightning image-to-image	Run SDXL at the speed of light diffusion lightning	Deprecated	8/13	→
Playground v2.5 image-to-image	State-of-the-art open-source model in aesthetic quality inpaint artistic style	Deprecated	8/13	→
Stable Diffusion XL Lightning image-to-image	Run SDXL at the speed of light diffusion lightning editing	Deprecated	8/13	→
Birefnet Background Removal image-to-image	Bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS) background removal segmentation high-res utility	Deprecated	8/13	→
Creative Upscaler image-to-image	Create creative upscaled images. upscaling	Deprecated	8/13	→
ControlNet SDXL text-to-image	Generate Images with ControlNet. diffusion controlnet manipulation	Deprecated	8/13	→
T2V Turbo - Video Crafter text-to-video	Generate short video clips from your prompts turbo	Deprecated	8/13	→
PhotoMaker image-to-image	Customizing Realistic Human Photos via Stacked ID Embedding editing customization realism personalization	Deprecated	8/13	→
Face to Sticker image-to-image	Create stickers from faces. sticker editing	Deprecated	8/13	→
Fooocus text-to-image	Fooocus extreme speed mode as a standalone app. stylized	Deprecated	8/13	→
Moondream vision	Answer questions from the images. multimodal vision	Deprecated	8/13	→
NSFW Filter vision	Predict the probability of an image being NSFW. filter safety utility	Deprecated	8/13	→
Wizper (Whisper v3 -- fal.ai edition) speech-to-text	[Experimental] Whisper v3 Large -- but optimized by our inference wizards. Same WER, double the performance! transcription speech	Deprecated	8/13	→
Sad Talker image-to-video	Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation animation	Deprecated	8/13	→
AuraSR image-to-image	Upscale your images with AuraSR. upscaling high-res	Deprecated	8/13	→
Stable Diffusion XL Lightning text-to-image	Run SDXL at the speed of light diffusion lightning real-time	Deprecated	8/13	→
MuseTalk image-to-video	MuseTalk is a real-time high quality audio-driven lip-syncing model. Use MuseTalk to animate a face with your own audio. animation lip sync real-time	Deprecated	8/13	→
Layer Diffusion XL text-to-image	SDXL with an alpha channel.	Deprecated	8/13	→
Stable Diffusion v1.5 text-to-image	Stable Diffusion v1.5 diffusion	Deprecated	8/13	→
Stable Diffusion XL image-to-image	Run SDXL at the speed of light diffusion high-res lora ip-adapter controlnet	Deprecated	8/13	→
Stable Diffusion XL image-to-image	Run SDXL at the speed of light diffusion high-res lora ip-adapter controlnet	Deprecated	8/13	→
Stable Diffusion with LoRAs image-to-image	Run Any Stable Diffusion model with customizable LoRA weights. diffusion lora customization fine-tuning	Deprecated	8/13	→
Stable Diffusion with LoRAs image-to-image	Run Any Stable Diffusion model with customizable LoRA weights. diffusion lora customization fine-tuning	Deprecated	8/13	→
IP Adapter Face ID image-to-image	High quality zero-shot personalization ip-adapter personalization customization editing	Deprecated	8/13	→
Hyper SDXL image-to-image	Hyper-charge SDXL's performance and creativity. diffusion	Deprecated	8/13	→
Dreamshaper text-to-image	Dreamshaper model. stylized diffusion	Deprecated	8/13	→
Realistic Vision text-to-image	Generate realistic images. realism diffusion	Deprecated	8/13	→
Hyper SDXL image-to-image	Hyper-charge SDXL's performance and creativity. diffusion editing	Deprecated	8/13	→
Playground v2.5 text-to-image	State-of-the-art open-source model in aesthetic quality artistic style	Deprecated	8/13	→
Lightning Models text-to-image	Collection of SDXL Lightning models. diffusion lightning	Deprecated	8/13	→
Omni Zero image-to-image	Any pose, any style, any identity style transfer	Deprecated	8/13	→
CCSR Upscaler image-to-image	SOTA Image Upscaler upscaling	Deprecated	8/13	→
SD 1.5 Depth ControlNet image-to-image	SD 1.5 ControlNet diffusion editing manipulation controlnet	Deprecated	8/13	→
DWPose Pose Prediction image-to-image	Predict poses from images. pose utility	Deprecated	8/13	→
Stable Video Diffusion Turbo text-to-video	Generate short video clips from your images using SVD v1.1 at Lightning Speed lcm diffusion turbo	Deprecated	8/13	→
Luma Dream Machine image-to-video	Generate video clips from your images using Luma Dream Machine v1.5 motion transformation	Deprecated	8/13	→
Luma Photon text-to-image	Generate images from your prompts using Luma Photon. Photon is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation.	Deprecated	8/13	→
SoteDiffusion text-to-image	Anime finetune of Würstchen V3. lcm stylized	Deprecated	8/13	→
Stable Diffusion V3 image-to-image	Stable Diffusion 3 Medium (Image to Image) is a Multimodal Diffusion Transformer (MMDiT) model that improves image quality, typography, prompt understanding, and efficiency. diffusion editing style	Deprecated	8/13	→
Stable Diffusion XL text-to-image	Run SDXL at the speed of light diffusion lora embeddings high-res style	Deprecated	8/13	→
Florence-2 Large vision	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks captioning multimodal vision	Deprecated	8/13	→
Florence-2 Large image-to-image	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks multimodal vision segmentation	Deprecated	8/13	→
Florence-2 Large vision	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks captioning multimodal vision	Deprecated	8/13	→
Florence-2 Large vision	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks multimodal vision	Deprecated	8/13	→
Florence-2 Large image-to-image	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks multimodal vision	Deprecated	8/13	→
Florence-2 Large image-to-image	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks multimodal vision segmentation	Deprecated	8/13	→
Florence-2 Large vision	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks ocr multimodal vision	Deprecated	8/13	→
Florence-2 Large vision	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks captioning multimodal vision	Deprecated	8/13	→
Florence-2 Large image-to-image	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks ocr multimodal vision	Deprecated	8/13	→
Florence-2 Large image-to-image	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks multimodal vision	Deprecated	8/13	→
Florence-2 Large image-to-image	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks multimodal vision	Deprecated	8/13	→
Florence-2 Large vision	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks multimodal vision	Deprecated	8/13	→
Florence-2 Large image-to-image	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks multimodal vision detection	Deprecated	8/13	→
Florence-2 Large image-to-image	Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks detection multimodal vision	Deprecated	8/13	→
Stable Cascade text-to-image	Stable Cascade: Image generation on a smaller & cheaper latent space. diffusion lcm	Deprecated	8/13	→
Era 3D image-to-image	A powerful image to novel multiview model with normals.	Deprecated	8/13	→
Live Portrait image-to-video	Transfer expression from a video to a portrait. expression animation	Deprecated	8/13	→
FLUX.1 [dev] image-to-image	FLUX.1 Image-to-Image is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. style transfer	Deprecated	8/13	→
AMT Frame Interpolation image-to-video	Interpolate between image frames interpolation editing	Deprecated	8/13	→
Kolors text-to-image	Photorealistic Text-to-Image realism diffusion	Deprecated	8/13	→
SDXL ControlNet Union image-to-image	An efficient SDXL multi-controlnet inpainting model. diffusion controlnet composition	Deprecated	8/13	→
SDXL ControlNet Union image-to-image	An efficient SDXL multi-controlnet image-to-image model. diffusion controlnet composition	Deprecated	8/13	→
SDXL ControlNet Union text-to-image	An efficient SDXL multi-controlnet text-to-image model. diffusion controlnet composition	Deprecated	8/13	→
FLUX.1 [dev] with LoRAs text-to-image	Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs. lora personalization	Deprecated	8/13	→
PixArt-Σ text-to-image	Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation diffusion	Deprecated	8/13	→
Sana text-to-image	Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, with the ability to generate 4K images in less than a second.	Deprecated	8/13	→
FLUX.1 Subject text-to-image	Super fast endpoint for the FLUX.1 [schnell] model with subject input capabilities, enabling rapid and high-quality image generation for personalization, specific styles, brand identities, and product-specific outputs. personalization customization	Deprecated	8/13	→
Fooocus Upscale or Vary text-to-image	Default parameters with automated optimizations and quality improvements. upscaling vary stylized	Deprecated	8/13	→
FLUX.1 [dev] with LoRAs image-to-image	FLUX LoRA Image-to-Image is a high-performance endpoint that transforms existing images using FLUX models, leveraging LoRA adaptations to enable rapid and precise image style transfer, modifications, and artistic variations. lora style transfer	Deprecated	8/13	→
FLUX.1 [dev] with Controlnets and Loras image-to-image	A specialized FLUX endpoint combining differential diffusion control with LoRA, ControlNet, and IP-Adapter support, enabling precise, region-specific image transformations through customizable change maps. lora controlnet ip-adapter	Deprecated	8/13	→
FLUX.1 [dev] with Controlnets and Loras image-to-image	FLUX General Inpainting is a versatile endpoint that enables precise image editing and completion, supporting multiple AI extensions including LoRA, ControlNet, and IP-Adapter for enhanced control over inpainting results and sophisticated image modifications. lora controlnet ip-adapter	Deprecated	8/13	→
FLUX.1 [dev] with Controlnets and Loras image-to-image	FLUX General Image-to-Image is a versatile endpoint that transforms existing images with support for LoRA, ControlNet, and IP-Adapter extensions, enabling precise control over style transfer, modifications, and artistic variations through multiple guidance methods. lora controlnet ip-adapter	Deprecated	8/13	→
Segment Anything Model 2 video-to-video	SAM 2 is a model for segmenting images and videos in real-time. segmentation mask real-time	Deprecated	8/13	→
Segment Anything Model 2 image-to-image	SAM 2 is a model for segmenting images and videos in real-time. segmentation mask real-time	Deprecated	8/13	→
Stable Diffusion V3 text-to-image	Stable Diffusion 3 Medium (Text to Image) is a Multimodal Diffusion Transformer (MMDiT) model that improves image quality, typography, prompt understanding, and efficiency. diffusion style	Deprecated	8/13	→
FLUX.1 [dev] with Controlnets and Loras text-to-image	A versatile endpoint for the FLUX.1 [dev] model that supports multiple AI extensions including LoRA, ControlNet conditioning, and IP-Adapter integration, enabling comprehensive control over image generation through various guidance methods. lora controlnet ip-adapter	Deprecated	8/13	→
ControlNeXt SVD video-to-video	Animate a reference image with a driving video using ControlNeXt. animation stylized	Deprecated	8/13	→
Stable Video Diffusion text-to-video	Generate short video clips from your prompts using SVD v1.1	Deprecated	8/13	→
Image Preprocessors image-to-image	PIDI (Pidinet) preprocessor. detection preprocess utility controlnet	Deprecated	8/13	→
Image Preprocessors image-to-image	M-LSD line segment detection preprocessor. preprocess utility controlnet	Deprecated	8/13	→
Image Preprocessors image-to-image	TEED (Tiny and Efficient Edge Detector) preprocessor. preprocess detection utility controlnet	Deprecated	8/13	→
Image Preprocessors image-to-image	ZoeDepth preprocessor. depth preprocess utility controlnet	Deprecated	8/13	→
Image Preprocessors image-to-image	Segment Anything Model (SAM) preprocessor. segmentation preprocess utility mask controlnet	Deprecated	8/13	→
Image Preprocessors image-to-image	Line art preprocessor. preprocess utility sketch controlnet	Deprecated	8/13	→
Image Preprocessors image-to-image	MiDaS depth estimation preprocessor. depth preprocess utility controlnet	Deprecated	8/13	→
Image Preprocessors image-to-image	Depth Anything v2 preprocessor. depth preprocess utility controlnet	Deprecated	8/13	→
Image Preprocessors image-to-image	Scribble preprocessor. preprocess utility editing controlnet sketch	Deprecated	8/13	→
Image Preprocessors image-to-image	Holistically-Nested Edge Detection (HED) preprocessor. preprocess detection utility controlnet	Deprecated	8/13	→
High Quality Stable Video Diffusion image-to-video	Generate short video clips from your images using SVD v1.1	Deprecated	8/13	→
FLUX.1 [dev] with Controlnets and Loras image-to-image	A general purpose endpoint for the FLUX.1 [dev] model, implementing the RF-Inversion pipeline. This can be used to edit a reference image based on a prompt. rf-inversion editing lora	Deprecated	8/13	→
FLUX.1 [dev] Inpainting with LoRAs text-to-image	Super fast endpoint for the FLUX.1 [dev] inpainting model with LoRA support, enabling rapid and high-quality image inpainting using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs. lora personalization	Deprecated	8/13	→
Live Portrait image-to-image	Transfer expression from a video to a portrait. expression animation	Deprecated	8/13	→
FLUX.1 [pro] text-to-image	FLUX.1 [pro] new is an accelerated version of FLUX.1 [pro], maintaining professional-grade image quality while delivering significantly faster generation speeds.	Deprecated	8/13	→
LTX Video (preview) text-to-video	Generate videos from prompts using LTX Video	Deprecated	8/13	→
Kling 1.0 text-to-video	Generate video clips from your prompts using Kling 1.0 (pro) motion	Deprecated	8/13	→
Kling 1.0 image-to-video	Generate video clips from your images using Kling 1.0 motion	Deprecated	8/13	→
Kling 1.5 image-to-video	Generate video clips from your images using Kling 1.5 (pro)	Deprecated	8/13	→
Kling 1.0 image-to-video	Generate video clips from your images using Kling 1.0 (pro) motion	Deprecated	8/13	→
Any VLM vision	Use any vision language model from our selected catalogue (powered by OpenRouter) multimodal vision streaming	Deprecated	8/13	→
CogVideoX-5B video-to-video	Generate videos from videos and prompts using CogVideoX-5B editing	Deprecated	8/13	→
F5 TTS text-to-audio	F5 TTS speech	Deprecated	8/13	→
CogVideoX-5B image-to-video	Generate videos from images and prompts using CogVideoX-5B	Deprecated	8/13	→
Hunyuan Video text-to-video	Hunyuan Video is an Open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. This endpoint generates videos from text descriptions. motion	Deprecated	8/13	→
Stable Diffusion 3.5 Medium text-to-image	Stable Diffusion 3.5 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency. diffusion typography style	Deprecated	8/13	→
Stable Diffusion 3.5 Large text-to-image	Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency. diffusion typography style	Deprecated	8/13	→
Birefnet Background Removal image-to-image	Bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). background removal segmentation high-res utility	Deprecated	8/13	→
PuLID Flux image-to-image	An endpoint for personalized image generation using Flux as per given description. personalization style transfer	Deprecated	8/13	→
MiniMax (Hailuo AI) Video 01 image-to-video	Generate video clips from your images using MiniMax Video model motion transformation	Deprecated	8/13	→
FLUX.1 [dev] Differential Diffusion image-to-image	FLUX.1 Differential Diffusion is a rapid endpoint that enables swift, granular control over image transformations through change maps, delivering fast and precise region-specific modifications while maintaining FLUX.1 [dev]'s high-quality output. transformation	Deprecated	8/13	→
Train Flux LoRAs For Portraits training	FLUX LoRA training optimized for portrait generation, with bright highlights, excellent prompt following and highly detailed results. lora personalization	Deprecated	8/13	→
Mochi 1 text-to-video	Mochi 1 preview is an open state-of-the-art video generation model with high-fidelity motion and strong prompt adherence in preliminary evaluation.	Deprecated	8/13	→
IC-Light-v2 for Image Relighting image-to-image	An endpoint for re-lighting photos and changing their backgrounds per a given description relighting editing	Deprecated	8/13	→
Kolors Image to Image image-to-image	Photorealistic Image-to-Image realism editing diffusion	Deprecated	8/13	→
FLUX.1 [pro] Redux image-to-image	FLUX.1 [pro] Redux is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. style transfer	Deprecated	8/13	→
FLUX.1 [pro] Fill image-to-image	FLUX.1 [pro] Fill is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. editing	Deprecated	8/13	→
FLUX1.1 [pro] ultra Redux image-to-image	FLUX1.1 [pro] ultra Redux is a high-performance endpoint for the FLUX1.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. style transfer high-res	Deprecated	8/13	→
LTX Video (preview) image-to-video	Generate videos from images using LTX Video	Deprecated	8/13	→
FLUX.1 [dev] Depth with LoRAs image-to-image	Generate high-quality images from depth maps using Flux.1 [dev] depth estimation model. The model produces accurate depth representations for scene understanding and 3D visualization. depth lora utility composition	Deprecated	8/13	→
FLUX.1 [dev] Redux image-to-image	FLUX.1 [dev] Redux is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.	Deprecated	8/13	→
FLUX1.1 [pro] Redux image-to-image	FLUX1.1 [pro] Redux is a high-performance endpoint for the FLUX1.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. style transfer	Deprecated	8/13	→
FLUX.1 [schnell] text-to-image	FLUX.1 [schnell] is a 12 billion parameter flow transformer that generates high-quality images from text in 1 to 4 steps, suitable for personal and commercial use.	Deprecated	8/13	→
Kling 1.5 text-to-video	Generate video clips from your prompts using Kling 1.5 (pro)	Deprecated	8/13	→
FLUX.1 [schnell] Redux image-to-image	FLUX.1 [schnell] Redux is a high-performance endpoint for the FLUX.1 [schnell] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. style transfer	Deprecated	8/13	→
OmniGen v1 text-to-image	OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It can be used for various tasks such as Image Editing, Personalized Image Generation, Virtual Try-On, Multi Person Generation and more! multimodal editing try-on	Deprecated	8/13	→
AuraFlow text-to-image	AuraFlow v0.3 is an open-source flow-based text-to-image generation model that achieves state-of-the-art results on GenEval. The model is currently in beta. typography style	Deprecated	8/13	→
Luma Photon Flash text-to-image	Generate images from your prompts using Luma Photon Flash. Photon Flash is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation.	Deprecated	8/13	→
Kling 1.0 text-to-video	Generate video clips from your prompts using Kling 1.0 motion	Deprecated	8/13	→
Ideogram V2 Remix image-to-image	Reimagine existing images with Ideogram V2's remix feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance. realism typography	Deprecated	8/13	→
Ideogram V2 Turbo Remix image-to-image	Rapidly create image variations with Ideogram V2 Turbo Remix. Fast and efficient reimagining of existing images while maintaining creative control through prompt guidance. realism typography	Deprecated	8/13	→
Ideogram V2 Turbo Edit image-to-image	Edit images faster with Ideogram V2 Turbo. Quick modifications and adjustments while preserving the high-quality standards and realistic outputs of Ideogram. realism typography	Deprecated	8/13	→
Video Upscaler video-to-video	The video upscaler endpoint uses RealESRGAN on each frame of the input video to upscale the video to a higher resolution. video generation video to video ai video high fidelity motion	Deprecated	8/13	→
Ideogram V2 Turbo text-to-image	Accelerated image generation with Ideogram V2 Turbo. Create high-quality visuals, posters, and logos with enhanced speed while maintaining Ideogram's signature quality. realism typography	Deprecated	8/13	→
Ideogram V2 text-to-image	Generate high-quality images, posters, and logos with Ideogram V2. Features exceptional typography handling and realistic outputs optimized for commercial and creative use. realism typography	Deprecated	8/13	→
Luma Dream Machine text-to-video	Generate video clips from your prompts using Luma Dream Machine v1.5 motion transformation	Deprecated	8/13	→
MMAudio V2 video-to-video	MMAudio generates synchronized audio given video and/or text inputs. It can be combined with video models to get videos with audio. ai video fast	Deprecated	8/13	→
Trellis image-to-3d	Generate 3D models from your images using Trellis. A native 3D generative model enabling versatile and high-quality 3D asset creation. stylized	Deprecated	8/13	→
Ideogram V2 Edit image-to-image	Transform existing images with Ideogram V2's editing capabilities. Modify, adjust, and refine images while maintaining high fidelity and realistic outputs with precise prompt control. realism typography	Deprecated	8/13	→
MiniMax (Hailuo AI) Video 01 Live text-to-video	Generate video clips from your prompts using MiniMax model motion transformation	Deprecated	8/13	→
Hyper3D Rodin image-to-3d	Rodin by Hyper3D generates realistic and production ready 3D models from text or images. stylized	Deprecated	8/13	→
Recraft 20b text-to-image	Recraft 20b is a new and affordable text-to-image model. image generation vector art typography style	Deprecated	8/13	→
MiniMax (Hailuo AI) Video 01 Live image-to-video	Generate video clips from your images using MiniMax Video model motion transformation	Deprecated	8/13	→
MiniMax (Hailuo AI) Music text-to-audio	Generate music from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality, diverse musical compositions. music	Deprecated	8/13	→
Leffa Virtual TryOn image-to-image	Leffa Virtual TryOn is a high quality image based Try-On endpoint which can be used for commercial try on. try-on fashion clothing	Deprecated	8/13	→
FLUX1.1 [pro] ultra text-to-image	FLUX1.1 [pro] ultra is the newest version of FLUX1.1 [pro], maintaining professional-grade image quality while delivering up to 2K resolution with improved photo realism. high-res realism	Deprecated	8/13	→
Leffa Pose Transfer image-to-image	Leffa Pose Transfer is an endpoint for changing pose of an image with a reference image. pose utility	Deprecated	8/13	→
try-on image-to-image	Image based high quality Virtual Try-On try-on fashion clothing	Deprecated	8/13	→
Bria RMBG 2.0 image-to-image	Bria RMBG 2.0 enables seamless removal of backgrounds from images, ideal for professional editing tasks. Trained exclusively on licensed data for safe and risk-free commercial use. Model weights for commercial use are available here: https://share-eu1.hsforms.com/2GLpEVQqJTI2Lj7AMYwgfIwf4e04 background removal image segmentation high resolution utility rembg	Deprecated	8/13	→
Bria Product Shot image-to-image	Place any product in any scenery with just a prompt or reference image while maintaining high integrity of the product. Trained exclusively on licensed data for safe and risk-free commercial use and optimized for eCommerce. product photography	Deprecated	8/13	→
Bria Text-to-Image HD text-to-image	Bria's Text-to-Image model for HD images. Trained exclusively on licensed data for safe and risk-free commercial use. Available also as source code and weights. For access to weights: https://bria.ai/contact-us image generation	Deprecated	8/13	→
FLUX.1 [dev] Fill with LoRAs image-to-image	FLUX.1 [dev] Fill is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. editing lora	Deprecated	8/13	→
Bria Eraser image-to-image	Bria Eraser enables precise removal of unwanted objects from images while maintaining high-quality outputs. Trained exclusively on licensed data for safe and risk-free commercial use. Access the model's source code and weights: https://bria.ai/contact-us image editing object removal	Deprecated	8/13	→
Bria Background Replace image-to-image	Bria Background Replace allows for efficient swapping of backgrounds in images via text prompts or reference image, delivering realistic and polished results. Trained exclusively on licensed data for safe and risk-free commercial use image editing	Deprecated	8/13	→
PlayAI Text-to-Speech v3 text-to-speech	Blazing-fast text-to-speech. Generate audio with improved emotional tones and extensive multilingual support. Ideal for high-volume processing and efficient workflows.	Deprecated	8/13	→
Bria GenFill image-to-image	Bria GenFill enables high-quality object addition or visual transformation. Trained exclusively on licensed data for safe and risk-free commercial use. Access the model's source code and weights: https://bria.ai/contact-us image editing	Deprecated	8/13	→
Bria Text-to-Image Base text-to-image	Bria's Text-to-Image model, trained exclusively on licensed data for safe and risk-free commercial use. Available also as source code and weights. For access to weights: https://bria.ai/contact-us image generation	Deprecated	8/13	→
Bria Text-to-Image Fast text-to-image	Bria's Text-to-Image model with perfect harmony of latency and quality. Trained exclusively on licensed data for safe and risk-free commercial use. Available also as source code and weights. For access to weights: https://bria.ai/contact-us image generation	Deprecated	8/13	→
Bria Expand Image image-to-image	Bria Expand expands images beyond their borders in high quality. Trained exclusively on licensed data for safe and risk-free commercial use. Access the model's source code and weights: https://bria.ai/contact-us outpainting	Deprecated	8/13	→
PlayAI Text-to-Speech Dialog text-to-audio	Generate natural-sounding multi-speaker dialogues, and audio. Perfect for expressive outputs, storytelling, games, animations, and interactive media. audio	Deprecated	8/13	→
Dubbing video-to-video	This endpoint delivers seamlessly localized videos by generating lip-synced dubs in multiple languages, ensuring natural and immersive multilingual experiences animation lip sync dubbing	Deprecated	8/13	→
Sad Talker image-to-video	Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation animation	Deprecated	8/13	→
MMAudio V2 Text to Audio text-to-audio	MMAudio generates synchronized audio given text inputs. It can generate sounds described by a prompt. audio fast	Deprecated	8/13	→
Switti 512 text-to-image	Switti is a scale-wise transformer for fast text-to-image generation that outperforms existing T2I AR models and competes with state-of-the-art T2I diffusion models while being faster than distilled diffusion models.	Deprecated	8/13	→
Switti 1024 text-to-image	Switti is a scale-wise transformer for fast text-to-image generation that outperforms existing T2I AR models and competes with state-of-the-art T2I diffusion models while being faster than distilled diffusion models.	Deprecated	8/13	→
Train Flux LoRA training	Train styles, people and other subjects at blazing speeds. lora personalization	Deprecated	8/13	→
Auto-Captioner video-to-video	Automatically generates text captions for your videos from the audio as per text colour/font specifications captioning video	Deprecated	8/13	→
Kling 1.6 image-to-video	Generate video clips from your images using Kling 1.6 (std)	Deprecated	8/13	→
Kling 1.6 text-to-video	Generate video clips from your prompts using Kling 1.6 (std)	Deprecated	8/13	→
Kling 1.6 image-to-video	Generate video clips from your images using Kling 1.6 (pro)	Deprecated	8/13	→
MoonDreamNext Detection image-to-image	MoonDreamNext Detection is a multimodal vision-language model for gaze detection, bbox detection, point detection, and more. multimodal	Deprecated	8/13	→
MoonDreamNext vision	MoonDreamNext is a multimodal vision-language model for captioning, gaze detection, bbox detection, point detection, and more. multimodal vision	Deprecated	8/13	→
Sa2VA 8B Image vision	Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels multimodal vision	Deprecated	8/13	→
Sa2VA 4B Image vision	Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels multimodal vision	Deprecated	8/13	→
Sa2VA 4B Video vision	Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels multimodal vision	Deprecated	8/13	→
Sa2VA 8B Video vision	Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels multimodal vision	Deprecated	8/13	→
sync.so -- lipsync 1.9.0-beta video-to-video	Generate realistic lipsync animations from audio using advanced algorithms for high-quality synchronization. animation lip sync	Deprecated	8/13	→
TransPixar V1 text-to-video	Transform text into stunning videos with TransPixar - an AI model that generates both RGB footage and alpha channels, enabling seamless compositing and creative video effects.	Deprecated	8/13	→
CogVideoX-5B text-to-video	Generate videos from prompts using CogVideoX-5B	Deprecated	8/13	→
Train Hunyuan LoRA training	Train Hunyuan Video LoRAs on people, objects, characters and more! lora personalization	Deprecated	8/13	→
Hunyuan Video LoRA Inference text-to-video	Hunyuan Video is an Open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability	Deprecated	8/13	→
FLUX.1 [pro] Depth Fine-tuned image-to-image	Generate high-quality images from depth maps using Flux.1 [pro] depth estimation model with a fine-tuned LoRA. The model produces accurate depth representations for scene understanding and 3D visualization. depth utility composition	Deprecated	8/13	→
FLUX.1 [pro] Fill Fine-tuned image-to-image	FLUX.1 [pro] Fill Fine-tuned is a high-performance endpoint for the FLUX.1 [pro] model with a fine-tuned LoRA that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. editing	Deprecated	8/13	→
FLUX1.1 [pro] ultra Fine-tuned text-to-image	FLUX1.1 [pro] ultra fine-tuned is the newest version of FLUX1.1 [pro] with a fine-tuned LoRA, maintaining professional-grade image quality while delivering up to 2K resolution with improved photo realism. high-res realism	Deprecated	8/13	→
FLUX1.1 [pro] text-to-image	FLUX1.1 [pro] is an enhanced version of FLUX.1 [pro] with improved image generation capabilities, delivering superior composition, detail, and artistic fidelity compared to its predecessor.	Deprecated	8/13	→
Train Flux LoRAs For Pro Models training	FLUX LoRA for Pro endpoints. lora personalization	Deprecated	8/13	→
FLUX.1 [pro] Depth image-to-image	Generate high-quality images from depth maps using Flux.1 [pro] depth estimation model. The model produces accurate depth representations for scene understanding and 3D visualization. depth utility composition	Deprecated	8/13	→
FLUX.1 [pro] Canny Fine-tuned image-to-image	Utilize Flux.1 [pro] Controlnet with a fine-tuned LoRA to generate high-quality images with precise control over composition, style, and structure through advanced edge detection and guidance mechanisms. controlnet detection editing composition	Deprecated	8/13	→
FLUX.1 [dev] Canny with LoRAs image-to-image	Utilize Flux.1 [dev] Controlnet to generate high-quality images with precise control over composition, style, and structure through advanced edge detection and guidance mechanisms. controlnet detection lora editing composition	Deprecated	8/13	→
FLUX.1 [pro] Canny image-to-image	Utilize Flux.1 [pro] Controlnet to generate high-quality images with precise control over composition, style, and structure through advanced edge detection and guidance mechanisms. controlnet detection editing composition	Deprecated	8/13	→
MoonDreamNext Batch vision	MoonDreamNext Batch is a multimodal vision-language model for batch captioning. multimodal	Deprecated	8/13	→
MiniMax (Hailuo AI) Video 01 Subject Reference image-to-video	Generate video clips maintaining consistent, realistic facial features and identity across dynamic video content subject transformation	Deprecated	8/13	→
FFmpeg API Compose video-to-video	Compose videos from multiple media sources using FFmpeg API. ffmpeg	Deprecated	8/13	→
FFmpeg API Waveform json	Get waveform data from audio files using FFmpeg API. ffmpeg	Deprecated	8/13	→
FFmpeg API Metadata json	Get encoding metadata from video and audio files using FFmpeg API. ffmpeg	Deprecated	8/13	→
Kling Kolors Virtual TryOn v1.5 image-to-image	Kling Kolors Virtual TryOn v1.5 is a high quality image based Try-On endpoint which can be used for commercial try on. try-on fashion clothing	Deprecated	8/13	→
Luma Ray 2 text-to-video	Ray2 is a large-scale video generative model capable of creating realistic visuals with natural, coherent motion. motion transformation	Deprecated	8/13	→
YuE: Lyrics to Song text-to-audio	YuE is a groundbreaking series of open-source foundation models designed for music generation, specifically for transforming lyrics into full songs. music	Deprecated	8/13	→
DeepSeek Janus-Pro text-to-image	DeepSeek Janus-Pro is a novel text-to-image model that unifies multimodal understanding and generation through an autoregressive framework stylized	Deprecated	8/13	→
PixVerse v3.5: Image to Video image-to-video	Generate high quality video clips from text and image prompts using PixVerse v3.5	Deprecated	8/13	→
PixVerse v3.5 Fast text-to-video	Generate high quality video clips quickly from text prompts using PixVerse v3.5 Fast	Deprecated	8/13	→
PixVerse v3.5: Image to Video Fast image-to-video	Generate high quality video clips from text and image prompts quickly using PixVerse v3.5 Fast	Deprecated	8/13	→
PixVerse v3.5 text-to-video	Generate high quality video clips from text prompts using PixVerse v3.5	Deprecated	8/13	→
Hunyuan Video LoRA Inference (Video-to-Video) video-to-video	Hunyuan Video is an Open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. Use this endpoint to generate videos from videos. video to video motion lora	Deprecated	8/13	→
Hunyuan Video (Video-to-Video) video-to-video	Hunyuan Video is an Open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. Use this endpoint to generate videos from videos. video to video motion	Deprecated	8/13	→
Lumina Image 2 text-to-image	Lumina-Image-2.0 is a 2 billion parameter flow-based diffusion transformer which features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency. diffusion typography style	Deprecated	8/13	→
CodeFormer image-to-image	Fix distorted or blurred photos of people with CodeFormer. image-restoration faces utility	Deprecated	8/13	→
Hunyuan Video Image-to-Video LoRA Inference image-to-video	Image to Video for the Hunyuan Video model using a custom trained LoRA. motion	Deprecated	8/13	→
Ideogram Upscale image-to-image	Ideogram Upscale enhances the resolution of the reference image by up to 2X and might enhance the reference image too. Optionally refine outputs with a prompt for guided improvements. upscaling high-res	Deprecated	8/13	→
Imagen3 Fast text-to-image	Imagen3 Fast is a high-quality text-to-image model that generates realistic images from text prompts.	Deprecated	8/13	→
Imagen3 text-to-image	Imagen3 is a high-quality text-to-image model that generates realistic images from text prompts.	Deprecated	8/13	→
Ben-Video-Bg-Rm video-to-video	A model for high quality and smooth background removal for videos. segmentation background removal	Deprecated	8/13	→
MiniMax (Hailuo AI) Video 01 Director text-to-video	Generate video clips more accurately with respect to natural language descriptions and using camera movement instructions for shot control. motion transformation camera-controls	Deprecated	8/13	→
FLUX.1 [dev] Control LoRA Depth text-to-image	FLUX Control LoRA Depth is a high-performance endpoint that uses a control image to transfer structure to the generated image, using a depth map. lora style transfer	Deprecated	8/13	→
FLUX.1 [dev] Control LoRA Canny text-to-image	FLUX Control LoRA Canny is a high-performance endpoint that uses a control image to transfer structure to the generated image, using a Canny edge map. lora style transfer	Deprecated	8/13	→
ben-v2-image image-to-image	A fast and high quality model for image background removal. background removal	Deprecated	8/13	→
FLUX.1 [dev] Control LoRA Depth image-to-image	FLUX Control LoRA Depth is a high-performance endpoint that uses a control image using a depth map to transfer structure to the generated image and another initial image to guide color. lora style transfer	Deprecated	8/13	→
FLUX.1 [dev] Control LoRA Canny image-to-image	FLUX Control LoRA Canny is a high-performance endpoint that uses a control image using a Canny edge map to transfer structure to the generated image and another initial image to guide color. lora style transfer	Deprecated	8/13	→
GOT OCR 2.0 vision	GOT-OCR2 works on a wide range of tasks, including plain document OCR, scene text OCR, formatted document OCR, and even OCR for tables, charts, mathematical formulas, geometric shapes, molecular formulas and sheet music. optical character recognition high-res utility	Deprecated	8/13	→
Luma Ray 2 (Image to Video) image-to-video	Ray2 is a large-scale video generative model capable of creating realistic visuals with natural, coherent motion. motion transformation	Deprecated	8/13	→
Kokoro TTS (Italian) text-to-audio	A high-quality Italian text-to-speech model delivering smooth and expressive speech synthesis. speech	Deprecated	8/13	→
Zonos-Audio-Clone text-to-audio	Clone the voice of any person and speak anything in their voice using Zonos' voice cloning. voice cloning	Deprecated	8/13	→
Kokoro TTS (Japanese) text-to-audio	A fast and natural-sounding Japanese text-to-speech model optimized for smooth pronunciation. speech	Deprecated	8/13	→
Kokoro TTS text-to-audio	Kokoro is a lightweight text-to-speech model that delivers comparable quality to larger models while being significantly faster and more cost-efficient. speech	Deprecated	8/13	→
Kokoro TTS (British English) text-to-audio	A high-quality British English text-to-speech model offering natural and expressive voice synthesis. speech	Deprecated	8/13	→
Kokoro TTS (French) text-to-audio	An expressive and natural French text-to-speech model for both European and Canadian French. speech	Deprecated	8/13	→
Kokoro TTS (Spanish) text-to-audio	A natural-sounding Spanish text-to-speech model optimized for Latin American and European Spanish. speech	Deprecated	8/13	→
Kokoro TTS (Brazilian Portuguese) text-to-audio	A natural and expressive Brazilian Portuguese text-to-speech model optimized for clarity and fluency. speech	Deprecated	8/13	→
Kokoro TTS (Hindi) text-to-audio	A fast and expressive Hindi text-to-speech model with clear pronunciation and accurate intonation. speech	Deprecated	8/13	→
Kokoro TTS (Mandarin Chinese) text-to-audio	A highly efficient Mandarin Chinese text-to-speech model that captures natural tones and prosody. speech	Deprecated	8/13	→
Flow-Edit text-to-image	The model provides high-quality image editing capabilities. editing	Deprecated	8/13	→
Skyreels V1 (Image-to-Video) image-to-video	SkyReels V1 is the first and most advanced open-source human-centric video foundation model. By fine-tuning HunyuanVideo on O(10M) high-quality film and television clips motion	Deprecated	8/13	→
Post Processing image-to-image	Post Processing is an endpoint that can enhance images using a variety of techniques including grain, blur, sharpen, and more. stylized utility	Deprecated	8/13	→
NAFNet-denoise image-to-image	Use NAFNet to fix issues like blurriness and noise in your images. This model specializes in image restoration and can help enhance the overall quality of your photography. image-restoration deblur denoise	Deprecated	8/13	→
NAFNet-deblur image-to-image	Use NAFNet to fix issues like blurriness and noise in your images. This model specializes in image restoration and can help enhance the overall quality of your photography. image-restoration deblur denoise	Deprecated	8/13	→
Veo 2 text-to-video	Veo 2 creates videos with realistic motion and high quality output. Explore different styles and find your own with extensive camera controls. motion transformation	Deprecated	8/13	→
DRCT-Super-Resolution image-to-image	Upscale your images with DRCT-Super-Resolution. upscaling high-res	Deprecated	8/13	→
MiniMax (Hailuo AI) Video 01 Director - Image to Video image-to-video	Generate video clips more accurately with respect to initial image, natural language descriptions, and using camera movement instructions for shot control. motion transformation camera-controls	Deprecated	8/13	→
Segment Anything Model 2 image-to-image	SAM 2 is a model for segmenting images automatically. It can return individual masks or a single mask for the entire image. segmentation mask	Deprecated	8/13	→
Video Prompt Generator llm	Generate video prompts using a variety of techniques including camera direction, style, pacing, special effects and more. motion transformation chat claude gpt	Deprecated	8/13	→
Wan-2.1 Image-to-Video image-to-video	Wan-2.1 is an image-to-video model that generates high-quality videos with high visual quality and motion diversity from images image to video motion	Deprecated	8/13	→
Wan-2.1 Text-to-Video text-to-video	Wan-2.1 is a text-to-video model that generates high-quality videos with high visual quality and motion diversity from text prompts text to video motion	Deprecated	8/13	→
DDColor image-to-image	Bring colors into old or new black and white photos with DDColor. image-recolorization faces utility	Deprecated	8/13	→
EVF-SAM2 Segmentation image-to-image	EVF-SAM2 combines natural language understanding with advanced segmentation capabilities, allowing you to precisely mask image regions using intuitive positive and negative text prompts. segmentation mask	Deprecated	8/13	→
Ideogram V2A text-to-image	Generate high-quality images, posters, and logos with Ideogram V2A. Features exceptional typography handling and realistic outputs optimized for commercial and creative use. realism typography	Deprecated	8/13	→
ElevenLabs Sound Effects text-to-audio	Generate sound effects using ElevenLabs advanced sound effects model. sound	Deprecated	8/13	→
ElevenLabs TTS Turbo v2.5 text-to-speech	Generate high-speed text-to-speech audio using ElevenLabs TTS Turbo v2.5. audio	Deprecated	8/13	→
ElevenLabs Audio Isolation audio-to-audio	Isolate audio tracks using ElevenLabs advanced audio isolation technology. audio	Deprecated	8/13	→
ElevenLabs Speech to Text speech-to-text	Generate text from speech using ElevenLabs advanced speech-to-text model. speech	Deprecated	8/13	→
Ideogram V2A Turbo text-to-image	Accelerated image generation with Ideogram V2A Turbo. Create high-quality visuals, posters, and logos with enhanced speed while maintaining Ideogram's signature quality. realism typography	Deprecated	8/13	→
Wan-2.1 1.3B Text-to-Video text-to-video	Wan-2.1 1.3B is a text-to-video model that generates high-quality videos with high visual quality and motion diversity from text prompts at faster speeds. text to video motion	Deprecated	8/13	→
ElevenLabs TTS Multilingual v2 text-to-audio	Generate multilingual text-to-speech audio using ElevenLabs TTS Multilingual v2. audio	Deprecated	8/13	→
Ideogram V2A Turbo Remix image-to-image	Rapidly create image variations with Ideogram V2A Turbo Remix. Fast and efficient reimagining of existing images while maintaining creative control through prompt guidance. realism typography	Deprecated	8/13	→
Kling 1.6 text-to-video	Generate video clips from your prompts using Kling 1.6 (pro)	Deprecated	8/13	→
Ideogram V2A Remix image-to-image	Create variations of existing images with Ideogram V2A Remix while maintaining creative control through prompt guidance. realism typography	Deprecated	8/13	→
SWIN2SR image-to-image	Enhance low-resolution images with the superior quality of Swin2SR for sharper, clearer results. image-enhancement	Deprecated	8/13	→
DocRes image-to-image	Enhance low-resolution, blurred, or shadowed documents with the superior quality of DocRes for sharper, clearer results. image-enhancement	Deprecated	8/13	→
DocRes-dewarp image-to-image	Enhance warped or folded documents with the superior quality of DocRes for sharper, clearer results. image-enhancement	Deprecated	8/13	→
DiffRhythm: Lyrics to Song text-to-audio	DiffRhythm is a blazing fast model for transforming lyrics into full songs. It boasts the capability to generate full songs in less than 30 seconds. music	Deprecated	8/13	→
Topaz Video Upscale video-to-video	Professional-grade video upscaling using Topaz technology. Enhance your videos with high-quality upscaling. upscaling high-res	Deprecated	8/13	→
CogView text-to-image	Generate high quality images from text prompts using CogView4. Longer text prompts will result in better quality images. stylized	Deprecated	8/13	→
Juggernaut Flux Base text-to-image	Juggernaut Base Flux by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism, while instantly boosting LoRAs and LyCORIS with full compatibility. image generation	Deprecated	8/13	→
Juggernaut Flux Pro text-to-image	Juggernaut Pro Flux by RunDiffusion is the flagship Juggernaut model rivaling some of the most advanced image models available, often surpassing them in realism. It combines Juggernaut Base with RunDiffusion Photo and features enhancements like reduced background blurriness. image generation	Deprecated	8/13	→
LTX Video-0.9.5 video-to-video	Generate videos from prompts, images, and videos using LTX Video-0.9.5 video image-to-video text-to-video	Deprecated	8/13	→
LTX Video-0.9.5 text-to-video	Generate videos from prompts using LTX Video-0.9.5 video text-video	Deprecated	8/13	→
LTX Video-0.9.5 video-to-video	Generate videos from prompts and videos using LTX Video-0.9.5 video video-to-video	Deprecated	8/13	→
Juggernaut Flux Base image-to-image	Juggernaut Base Flux by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism, while instantly boosting LoRAs and LyCORIS with full compatibility. image generation	Deprecated	8/13	→
Juggernaut Flux Base LoRA text-to-image	Juggernaut Base Flux LoRA by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism to all your LoRAs and LyCORIS with full compatibility. image generation	Deprecated	8/13	→
Rundiffusion Photo Flux text-to-image	RunDiffusion Photo Flux provides insane realism. With this enhancer, textures and skin details burst to life, turning your favorite prompts into vivid, lifelike creations. Recommended to keep it at 0.65 to 0.80 weight. Supports resolutions up to 1536x1536. image generation lora	Deprecated	8/13	→
Juggernaut Flux Pro image-to-image	Juggernaut Pro Flux by RunDiffusion is the flagship Juggernaut model rivaling some of the most advanced image models available, often surpassing them in realism. It combines Juggernaut Base with RunDiffusion Photo and features enhancements like reduced background blurriness. image generation	Deprecated	8/13	→
Juggernaut Flux Lightning text-to-image	Juggernaut Lightning Flux by RunDiffusion provides blazing-fast, high-quality images rendered at five times the speed of Flux. Perfect for mood boards and mass ideation, this model excels in both realism and prompt adherence. image generation	Deprecated	8/13	→
Hunyuan Video Image-to-Video Inference image-to-video	Image to Video for the high-quality Hunyuan Video I2V model. motion	Deprecated	8/13	→
Kling 1.6 text-to-video	Generate video clips from your prompts using Kling 1.6 (std)	Deprecated	8/13	→
Kling 1.6 text-to-video	Generate video clips from your prompts using Kling 1.6 (pro)	Deprecated	8/13	→
Kling 1.0 text-to-video	Generate video clips from your prompts using Kling 1.0 motion	Deprecated	8/13	→
Kling 1.5 text-to-video	Generate video clips from your prompts using Kling 1.5 (pro)	Deprecated	8/13	→
Wan-2.1 Image-to-Video with LoRAs image-to-video	Add custom LoRAs to Wan-2.1, an image-to-video model that generates high-quality videos with high visual quality and motion diversity from images image to video motion lora	Deprecated	8/13	→
Easel AI Advanced Face Swap image-to-image	Swap faces of one or two people at once, while preserving user and scene details! face swap utility editing	Deprecated	8/13	→
Veo 2 (Image to Video) image-to-video	Veo 2 creates videos from images with realistic motion and very high quality output. motion transformation	Deprecated	8/13	→
Wan-2.1 Pro Text-to-Video text-to-video	Wan-2.1 Pro is a premium text-to-video model that generates high-quality 1080p videos at 30fps with up to 6 seconds duration, delivering exceptional visual quality and motion diversity from text prompts text to video motion	Deprecated	8/13	→
Wan-2.1 Pro Image-to-Video image-to-video	Wan-2.1 Pro is a premium image-to-video model that generates high-quality 1080p videos at 30fps with up to 6 seconds duration, delivering exceptional visual quality and motion diversity from images image to video motion	Deprecated	8/13	→
Vidu Template to Video image-to-video	Vidu Template to Video lets you create different effects by applying motion templates to your images. motion template	Deprecated	8/13	→
Vidu Reference to Video image-to-video	Vidu Reference to Video creates videos by using reference images and combining them with a prompt. motion reference	Deprecated	8/13	→
Vidu Start-End to Video image-to-video	Vidu Start-End to Video generates smooth transition videos between specified start and end images. motion transition	Deprecated	8/13	→
Vidu Image to Video image-to-video	Vidu Image to Video generates high-quality videos with exceptional visual quality and motion diversity from a single image motion image to video	Deprecated	8/13	→
Wan Effects image-to-video	Wan Effects generates high-quality videos with popular effects from images motion effects	Deprecated	8/13	→
CSM-1B text-to-audio	CSM (Conversational Speech Model) is a speech generation model from Sesame that generates RVQ audio codes from text and audio inputs. conversational text to speech	Deprecated	8/13	→
Pika Image to Video (v2.1) image-to-video	Pika v2.1 creates videos from images with high quality output. editing effects animation	Deprecated	8/13	→
Pika Scenes (v2.2) image-to-video	Pika Scenes v2.2 creates videos from images with high quality output. editing effects animation	Deprecated	8/13	→
Pika Image to Video (v2.2) image-to-video	Pika v2.2 creates videos from images with high quality output. editing effects animation	Deprecated	8/13	→
Pika Text to Video Turbo (v2) text-to-video	Pika v2 Turbo creates videos from a text prompt with high quality output. editing effects animation	Deprecated	8/13	→
Pika Text to Video (v2.1) text-to-video	Pika v2.1 creates videos from a text prompt with high quality output. editing effects animation	Deprecated	8/13	→
Invisible Watermark image-to-image	Invisible Watermark is a model that can add an invisible watermark to an image. utility editing	Deprecated	8/13	→
Pika Text to Video (v2.2) text-to-video	Pika v2.2 creates videos from a text prompt with high quality output. editing effects animation	Deprecated	8/13	→
Pika Image to Video Turbo (v2) image-to-video	Pika v2 Turbo creates videos from images with high quality output. editing effects animation	Deprecated	8/13	→
Pika Effects (v1.5) image-to-video	Pika Effects are AI-powered video effects designed to modify objects, characters, and environments in a fun, engaging, and visually compelling manner. editing effects animation	Deprecated	8/13	→
Luma Ray 2 Flash text-to-video	Ray2 Flash is a fast video generative model capable of creating realistic visuals with natural, coherent motion. motion transformation	Deprecated	8/13	→
Luma Ray 2 Flash (Image to Video) image-to-video	Ray2 Flash is a fast video generative model capable of creating realistic visuals with natural, coherent motion. motion transformation	Deprecated	8/13	→
Gemini Flash Edit Multi Image image-to-image	Gemini Flash Edit Multi Image is a model that can edit multiple images using a text prompt and a reference image. editing	Deprecated	8/13	→
Hunyuan3D image-to-3d	Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation. stylized	Deprecated	8/13	→
Hunyuan3D image-to-3d	Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation. stylized	Deprecated	8/13	→
Hunyuan3D image-to-3d	Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation. stylized	Deprecated	8/13	→
Hunyuan3D image-to-3d	Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation. stylized	Deprecated	8/13	→
Hunyuan3D image-to-3d	Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation. stylized	Deprecated	8/13	→
Gemini Flash Edit Multi Image image-to-image	Gemini Flash Edit is a model that can edit a single image using a text prompt and a reference image. editing	Deprecated	8/13	→
Hunyuan3D image-to-3d	Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation. stylized	Deprecated	8/13	→
MixDehazer image-to-image	An advanced dehaze model to remove atmospheric haze, restoring clarity and detail in images through intelligent neural network processing.	Deprecated	8/13	→
Thera image-to-image	Fix low-resolution images with the speed and quality of Thera.	Deprecated	8/13	→
Wan-2.1 LoRA Trainer training	Train custom LoRAs for Wan-2.1 I2V 480P lora training	Deprecated	8/13	→
Wan-2.1 Text-to-Video with LoRAs text-to-video	Add custom LoRAs to Wan-2.1, a text-to-video model that generates high-quality videos with high visual quality and motion diversity from text prompts text to video motion lora	Deprecated	8/13	→
LatentSync video-to-video	LatentSync is a video-to-video model that generates lip sync animations from audio using advanced algorithms for high-quality synchronization. animation lip sync	Deprecated	8/13	→
Kling LipSync Audio-to-Video text-to-video	Kling LipSync is an audio-to-video model that generates realistic lip movements from audio input. audio to video lipsync	Deprecated	8/13	→
Kling LipSync Text-to-Video text-to-video	Kling LipSync is a text-to-video model that generates realistic lip movements from text input. text to video lipsync	Deprecated	8/13	→
music generator text-to-audio	CassetteAI’s model generates a 30-second sample in under 2 seconds and a full 3-minute track in under 10 seconds. At 44.1 kHz stereo audio, expect a level of professional consistency with no breaks, no squeaks, and no random interruptions in your creations. music cassetteai	Deprecated	8/13	→
Sana Sprint text-to-image	Sana Sprint is a text-to-image model capable of generating 4K images with exceptional speed. text to image 4k high-speed	Deprecated	8/13	→
Sana v1.5 4.8B text-to-image	Sana v1.5 4.8B is a powerful text-to-image model that generates ultra-high quality 4K images with remarkable detail. text to image 4k high-quality	Deprecated	8/13	→
Sana v1.5 1.6B text-to-image	Sana v1.5 1.6B is a lightweight text-to-image model that delivers 4K image generation with impressive efficiency. text to image 4k lightweight	Deprecated	8/13	→
Orpheus TTS text-to-speech	Orpheus TTS is a state-of-the-art, Llama-based Speech-LLM designed for high-quality, empathetic text-to-speech generation. This model has been finetuned to deliver human-level speech synthesis, achieving exceptional clarity, expressiveness, and real-time performance. text to speech voice synthesis high-fidelity	Deprecated	8/13	→
Ghiblify Images image-to-image	Reimagine and transform your ordinary photos into enchanting Studio Ghibli style artwork stylized transform	Deprecated	8/13	→
PixVerse v4: Text to Video Fast text-to-video	Generate high quality and fast video clips from text and image prompts using PixVerse v4 fast	Deprecated	8/13	→
PixVerse v3.5: Transition image-to-video	Create seamless transitions between images using PixVerse v3.5	Deprecated	8/13	→
PixVerse v4: Text to Video text-to-video	Generate high quality video clips from text and image prompts using PixVerse v4	Deprecated	8/13	→
PixVerse v3.5: Effects image-to-video	Generate high quality video clips with different effects using PixVerse v3.5	Deprecated	8/13	→
PixVerse v4: Image to Video image-to-video	Generate high quality video clips from text and image prompts using PixVerse v4	Deprecated	8/13	→
PixVerse v4: Image to Video Fast image-to-video	Generate fast high quality video clips from text and image prompts using PixVerse v4	Deprecated	8/13	→
StarVector image-to-image	AI vectorization model that transforms raster images into scalable SVG graphics, preserving visual details while enabling infinite scaling and easy editing capabilities. image-to-image	Deprecated	8/13	→
FLUX.1 [dev] text-to-image	FLUX.1 [dev] is a 12 billion parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use.	Deprecated	8/13	→
Sync Lipsync 2.0 video-to-video	Generate realistic lipsync animations from audio using advanced algorithms for high-quality synchronization with Sync Lipsync 2.0 model animation lip sync	Deprecated	8/13	→
Sound Effects Generator text-to-audio	Create stunningly realistic sound effects in seconds - CassetteAI's Sound Effects Model generates high-quality SFX up to 30 seconds long in just 1 second of processing time sound sfx sound-effects cassetteai	Deprecated	8/13	→
Speech-to-Text speech-to-text	Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.	Deprecated	8/13	→
Speech-to-Text speech-to-text	Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription. streaming	Deprecated	8/13	→
Speech-to-Text speech-to-text	Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription. streaming	Deprecated	8/13	→
Speech-to-Text speech-to-text	Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.	Deprecated	8/13	→
Video Sound Effects Generator video-to-video	Add sound effects to your videos sound-effects sfx cassetteai	Deprecated	8/13	→
finegrain eraser image-to-image	Finegrain Eraser removes objects—along with their shadows, reflections, and lighting artifacts—using only natural language, seamlessly filling the scene with contextually accurate content. utility editing	Deprecated	8/13	→
finegrain eraser image-to-image	Finegrain Eraser removes any object selected with a bounding box—along with its shadows, reflections, and lighting artifacts—seamlessly reconstructing the scene with contextually accurate content. utility editing	Deprecated	8/13	→
finegrain eraser image-to-image	Finegrain Eraser removes any object selected with a mask—along with its shadows, reflections, and lighting artifacts—seamlessly reconstructing the scene with contextually accurate content. utility editing	Deprecated	8/13	→
Hidream I1 Fast text-to-image	HiDream-I1 fast is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within 16 steps.	Deprecated	8/13	→
Hidream I1 Dev text-to-image	HiDream-I1 dev is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.	Deprecated	8/13	→
Hidream I1 Full text-to-image	HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.	Deprecated	8/13	→
Vace video-to-video	Vace is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources. video-to-video image-to-video text-to-video	Deprecated	8/13	→
Cartoonify image-to-image	Transform images into 3D cartoon artwork using an AI model that applies cartoon stylization while preserving the original image's composition and details. stylized transform	Deprecated	8/13	→
Kling 2.0 Master text-to-video	Generate video clips from your prompts using Kling 2.0 Master	Deprecated	8/13	→
Kling 2.0 Master image-to-video	Generate video clips from your images using Kling 2.0 Master	Deprecated	8/13	→
Tavus LipSync v2 video-to-video	Generate lip sync using Tavus' state-of-the-art model for high-quality synchronization.	Deprecated	8/13	→
Framepack image-to-video	Framepack is an efficient Image-to-video model that autoregressively generates videos. image to video motion	Deprecated	8/13	→
Turbo Flux Trainer training	A blazing fast FLUX dev LoRA trainer for subjects and styles.	Deprecated	8/13	→
Wan-2.1 First-Last-Frame-to-Video image-to-video	Wan-2.1 flf2v generates dynamic videos by intelligently bridging a given first frame to a desired end frame through smooth, coherent motion sequences. image to video motion	Deprecated	8/13	→
Instant Character image-to-image	InstantCharacter creates high-quality, consistent characters from text prompts, supporting diverse poses, styles, and appearances with strong identity control. personalization customization	Deprecated	8/13	→
Plushify image-to-image	Turn any image into a cute plushie!	Deprecated	8/13	→
FASHN Virtual Try-On V1.5 image-to-image	FASHN v1.5 delivers precise virtual try-on capabilities, accurately rendering garment details like text and patterns at 576x864 resolution from both on-model and flat-lay photo references. try-on fashion clothing	Deprecated	8/13	→
Juggernaut Flux Lora image-to-image	Juggernaut Base Flux LoRA Inpainting by RunDiffusion is a drop-in replacement for Flux [Dev] inpainting that delivers sharper details, richer colors, and enhanced realism to all your LoRAs and LyCORIS with full compatibility.	Deprecated	8/13	→
Pipecat's Smart Turn model speech-to-text	An open source, community-driven and native audio turn detection model by Pipecat AI.	Deprecated	8/13	→
MAGI-1 (Distilled) text-to-video	MAGI-1 distilled is a faster video generation model with exceptional understanding of physical interactions and cinematic prompts text-to-video	Deprecated	8/13	→
Dia text-to-speech	Dia directly generates realistic dialogue from transcripts. Audio conditioning enables emotion control. Produces natural nonverbals like laughter and throat clearing. text-to-speech	Deprecated	8/13	→
Framepack image-to-video	Framepack is an efficient Image-to-video model that autoregressively generates videos. image to video motion	Deprecated	8/13	→
Dia Tts audio-to-audio	Clone dialog voices from a sample audio and generate dialogs from text prompts using the Dia TTS which leverages advanced AI techniques to create high-quality text-to-speech. speech	Deprecated	8/13	→
MAGI-1 (Distilled) image-to-video	MAGI-1 distilled generates videos faster from images with exceptional understanding of physical interactions and prompting image-to-video	Deprecated	8/13	→
MAGI-1 (Distilled) video-to-video	MAGI-1 distilled extends videos faster with an exceptional understanding of physical interactions and prompts video-to-video video-extend	Deprecated	8/13	→
Pixverse image-to-video	Generate high quality video clips with different effects using PixVerse v4 image-to-video	Deprecated	8/13	→
MAGI-1 image-to-video	MAGI-1 generates videos from images with exceptional understanding of physical interactions and prompting image-to-video	Deprecated	8/13	→
MAGI-1 text-to-video	MAGI-1 is a video generation model with exceptional understanding of physical interactions and cinematic prompts text-to-video	Deprecated	8/13	→
MAGI-1 video-to-video	MAGI-1 extends videos with an exceptional understanding of physical interactions and prompts video-to-video	Deprecated	8/13	→
gpt-image-1 text-to-image	OpenAI's latest image generation and editing model: gpt-image-1. Currently powered with bring-your-own-key.	Deprecated	8/13	→
gpt-image-1 image-to-image	OpenAI's latest image generation and editing model: gpt-image-1. Currently powered with bring-your-own-key.	Deprecated	8/13	→
Uno image-to-image	An AI model that transforms input images into new ones based on text prompts, blending reference visuals with your creative directions. image-to-image	Deprecated	8/13	→
Image2svg image-to-image	Image2SVG transforms raster images into clean vector graphics, preserving visual quality while enabling scalable, customizable SVG outputs with precise control over detail levels. utility editing	Deprecated	8/13	→
Tripo3D image-to-3d	State of the art Image to 3D Object generation. Generate 3D model from a single image! image-to-3d stylized	Deprecated	8/13	→
Step1X Edit image-to-image	Step1X-Edit transforms your photos with simple instructions into stunning, professional-quality edits—rivaling top proprietary tools. editing	Deprecated	8/13	→
Moondream2 vision	Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint. image-to-image	Deprecated	8/13	→
Moondream2 vision	Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint. image-to-image	Deprecated	8/13	→
Moondream2 vision	Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint. Vision	Deprecated	8/13	→
Moondream2 vision	Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint. Vision	Deprecated	8/13	→
F Lite (texture mode) text-to-image	F Lite is a 10B parameter diffusion model created by Fal and Freepik, trained exclusively on copyright-safe and SFW content. This is a high texture density variant of the model.	Deprecated	8/13	→
F Lite text-to-image	F Lite is a 10B parameter diffusion model created by Fal and Freepik, trained exclusively on copyright-safe and SFW content.	Deprecated	8/13	→
Ideogram V3 Edit image-to-image	Transform existing images with Ideogram V3's editing capabilities. Modify, adjust, and refine images while maintaining high fidelity and realistic outputs with precise prompt control. realism typography	Deprecated	8/13	→
Ideogram image-to-image	Reimagine existing images with Ideogram V3's remix feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance. realism typography	Deprecated	8/13	→
Ideogram Replace Background image-to-image	Replace the backgrounds of existing images with Ideogram V3's replace background feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance.	Deprecated	8/13	→
Ideogram Text to Image text-to-image	Generate high-quality images, posters, and logos with Ideogram V3. Features exceptional typography handling and realistic outputs optimized for commercial and creative use. realism typography	Deprecated	8/13	→
Ideogram image-to-image	Extend existing images with Ideogram V3's reframe feature. Create expanded versions and adaptations while preserving the main image and adding new creative directions through prompt guidance. realism typography	Deprecated	8/13	→
Trellis image-to-3d	Generate 3D models from multiple images using Trellis. A native 3D generative model enabling versatile and high-quality 3D asset creation. stylized	Deprecated	8/13	→
Hidream I1 Full image-to-image	HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds. image-to-image hidream	Deprecated	8/13	→
MiniMax (Hailuo AI) Text to Image text-to-image	Generate high quality images from text prompts using MiniMax Image-01. Longer text prompts will result in better quality images. stylized realism	Deprecated	8/13	→
Minimax Image Subject Reference image-to-image	Generate images from text and a reference image using MiniMax Image-01 for consistent character appearance. stylized transform	Deprecated	8/13	→
MiniMax Speech-02 HD text-to-speech	Generate speech from text prompts and different voices using the MiniMax Speech-02 HD model, which leverages advanced AI techniques to create high-quality text-to-speech. speech	Deprecated	8/13	→
MiniMax Speech-02 Turbo text-to-speech	Generate fast speech from text prompts and different voices using the MiniMax Speech-02 Turbo model, which leverages advanced AI techniques to create high-quality text-to-speech. speech	Deprecated	8/13	→
MiniMax Voice Cloning text-to-speech	Clone a voice from a sample audio and generate speech from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality text-to-speech. speech	Deprecated	8/13	→
Easel Avatar text-to-image	Create scenes with one or two people using just selfies and text prompt (without LoRAs) avatars loras image-generation	Deprecated	8/13	→
Recraft V3 text-to-image	Recraft V3 is a text-to-image model with the ability to generate long texts, vector art, images in brand style, and much more. As of today, it is SOTA in image generation, proven by Hugging Face's industry-leading Text-to-Image Benchmark by Artificial Analysis. vector typography style	Deprecated	8/13	→
Recraft V3 image-to-image	Recraft V3 is a text-to-image model with the ability to generate long texts, vector art, images in brand style, and much more. As of today, it is SOTA in image generation, proven by Hugging Face's industry-leading Text-to-Image Benchmark by Artificial Analysis. vector typography style	Deprecated	8/13	→
Recraft V3 Create Style training	Recraft V3 Create Style is capable of creating unique styles for Recraft V3 based on your images. style vector personalization	Deprecated	8/13	→
Recraft Crisp Upscale image-to-image	Enhances a given raster image using the 'crisp upscale' tool, boosting resolution with a focus on refining small details and faces. upscaling	Deprecated	8/13	→
Recraft Creative Upscale image-to-image	Enhances a given raster image using the 'creative upscale' tool, increasing image resolution, making the image sharper and cleaner. upscaling	Deprecated	8/13	→
LTX Video Trainer training	Train LTX Video 0.9.7 for custom styles and effects. ltx-video fine-tuning	Deprecated	8/13	→
ACE-Step text-to-audio	Generate music with lyrics from text using ACE-Step text-to-audio text-to-music	Deprecated	8/13	→
Vidu Image to Video image-to-video	Vidu Q1 Image to Video generates high-quality 1080p videos with exceptional visual quality and motion diversity from a single image stylized transform	Deprecated	8/13	→
Vidu Text to Video text-to-video	Vidu Q1 Text to Video generates high-quality 1080p videos with exceptional visual quality and motion diversity stylized transform	Deprecated	8/13	→
Vidu Start End to Video image-to-video	Vidu Q1 Start-End to Video generates smooth transition 1080p videos between specified start and end images. stylized transform	Deprecated	8/13	→
Rembg Enhance (Remove Background Enhance) image-to-image	Rembg-enhance is optimized for 2D vector images, 3D graphics, and photos by leveraging matting technology. background removal image editing utility segmentation high resolution rembg	Deprecated	8/13	→
ACE-Step text-to-audio	Generate music from a simple prompt using ACE-Step text-to-audio text-to-music	Deprecated	8/13	→
ACE-Step audio-to-audio	Generate music from lyrics and example audio using ACE-Step audio-to-audio audio-edit	Deprecated	8/13	→
ACE-Step audio-to-audio	Modify a portion of provided audio with lyrics and/or style using ACE-Step audio-to-audio audio-inpaint audio-repaint	Deprecated	8/13	→
ACE-Step audio-to-audio	Extend the beginning or end of provided audio with lyrics and/or style using ACE-Step audio-to-audio audio-outpaint audio-extend	Deprecated	8/13	→
Framepack F1 image-to-video	Framepack is an efficient Image-to-video model that autoregressively generates videos. image to video motion	Deprecated	8/13	→
Hunyuan Custom image-to-video	HunyuanCustom revolutionizes video generation with unmatched identity consistency across multiple input types. Its innovative fusion modules and alignment networks outperform competitors, maintaining subject integrity while responding flexibly to text, image, audio, and video conditions. image-to-video	Deprecated	8/13	→
Pixverse image-to-video	Generate high quality video clips with different effects using PixVerse v4.5 image-to-video	Deprecated	8/13	→
Pixverse text-to-video	Generate high quality video clips from text and image prompts using PixVerse v4.5 stylized transform	Deprecated	8/13	→
Pixverse text-to-video	Generate high quality and fast video clips from text and image prompts using PixVerse v4.5 fast stylized transform	Deprecated	8/13	→
Pixverse image-to-video	Generate high quality video clips from text and image prompts using PixVerse v4.5 stylized transform	Deprecated	8/13	→
Pixverse image-to-video	Generate fast high quality video clips from text and image prompts using PixVerse v4.5 stylized transform	Deprecated	8/13	→
Pixverse image-to-video	Create seamless transition between images using PixVerse v4.5 stylized transform	Deprecated	8/13	→
LTX Video-0.9.7 LoRA image-to-video	Generate videos from prompts and images using LTX Video-0.9.7 and custom LoRA video ltx-video image-to-video	Deprecated	8/13	→
LTX Video-0.9.7 LoRA video-to-video	Generate videos from prompts, images, and videos using LTX Video-0.9.7 and custom LoRA video ltx-video video-to-video multicondition-to-video image-to-video	Deprecated	8/13	→
Flux Lora text-to-image	Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs. lora personalization	Deprecated	8/13	→
Easel Gifswap image-to-image	Swap faces on GIFs utility editing	Deprecated	8/13	→
LTX Video-0.9.7 13B Distilled text-to-video	Generate videos from prompts using LTX Video-0.9.7 13B Distilled and custom LoRA video ltx-video text-to-video	Deprecated	8/13	→
LTX Video-0.9.7 13B text-to-video	Generate videos from prompts using LTX Video-0.9.7 13B and custom LoRA video ltx-video text-to-video	Deprecated	8/13	→
LTX Video-0.9.7 13B image-to-video	Generate videos from prompts and images using LTX Video-0.9.7 13B and custom LoRA video ltx-video image-to-video	Deprecated	8/13	→
LTX Video-0.9.7 13B video-to-video	Extend videos using LTX Video-0.9.7 13B and custom LoRA video ltx-video video-to-video extend-video	Deprecated	8/13	→
LTX Video-0.9.7 13B video-to-video	Generate videos from prompts, images, and videos using LTX Video-0.9.7 13B and custom LoRA video ltx-video video-to-video multicondition-to-video image-to-video	Deprecated	8/13	→
LTX Video-0.9.7 13B Distilled image-to-video	Generate videos from prompts and images using LTX Video-0.9.7 13B Distilled and custom LoRA video ltx-video image-to-video	Deprecated	8/13	→
LTX Video-0.9.7 13B Distilled video-to-video	Generate videos from prompts, images, and videos using LTX Video-0.9.7 13B Distilled and custom LoRA video ltx-video video-to-video multicondition-to-video image-to-video	Deprecated	8/13	→
LTX Video-0.9.7 13B Distilled video-to-video	Extend videos using LTX Video-0.9.7 13B Distilled and custom LoRA video ltx-video video-to-video extend-video	Deprecated	8/13	→
DreamO text-to-image	DreamO is an image customization framework designed to support a wide range of tasks while facilitating seamless integration of multiple conditions. stylized realism	Deprecated	8/13	→
Kling 1.6 Elements image-to-video	Generate video clips from your multiple image references using Kling 1.6 (pro)	Deprecated	8/13	→
Kling 1.6 Elements image-to-video	Generate video clips from your multiple image references using Kling 1.6 (standard)	Deprecated	8/13	→
Imagen 4 text-to-image	Google’s highest quality image generation model	Deprecated	8/13	→
Imagen 4 Ultra text-to-image	Google’s highest quality image generation model	Deprecated	8/13	→
Lyria2 text-to-audio	Lyria 2 is Google's latest music generation model; you can generate any type of music with it. music stylized	Deprecated	8/13	→
Bagel text-to-image	Bagel is a 7B parameter multimodal model from Bytedance-Seed that can generate both text and images. text-to-image multimodal	Deprecated	8/13	→
Bagel image-to-image	Bagel is a 7B parameter multimodal model from Bytedance-Seed that can generate both images and text. image-to-image image-editing	Deprecated	8/13	→
Bagel image-to-json	Bagel is a 7B parameter multimodal model from Bytedance-Seed that can generate both text and images. image-to-text vlm	Deprecated	8/13	→
Wan VACE 14B video-to-video	VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources. image-to-video video-to-video text-to-video	Deprecated	8/13	→
Hunyuan Portrait image-to-video	HunyuanPortrait is a diffusion-based framework for generating lifelike, temporally consistent portrait animations. animation lip sync	Deprecated	8/13	→
Avatars audio-to-video	Generate high-quality videos with UGC-like avatars from audio lipsync audio-to-video	Deprecated	8/13	→
Avatars text-to-video	Generate high-quality videos with UGC-like avatars from text lipsync text-to-video	Deprecated	8/13	→
Lipsync video-to-video	Generate realistic lipsync from any audio using VEED's latest model lipsync video-to-video	Deprecated	8/13	→
FLUX.1 Kontext [pro] image-to-image	FLUX.1 Kontext [pro] handles both text and reference images as inputs, seamlessly enabling targeted, local edits and complex transformations of entire scenes.	Deprecated	8/13	→
FLUX.1 Kontext [dev] image-to-image	Frontier image editing model.	Deprecated	8/13	→
FLUX.1 Kontext [pro] text-to-image	The FLUX.1 Kontext [pro] text-to-image delivers state-of-the-art image generation results with unprecedented prompt following, photorealistic rendering, and flawless typography.	Deprecated	8/13	→
Kling 2.1 (standard) image-to-video	Kling 2.1 Standard is a cost-efficient endpoint for the Kling 2.1 model, delivering high-quality image-to-video generation	Deprecated	8/13	→
Kling 2.1 (pro) image-to-video	Kling 2.1 Pro is an advanced endpoint for the Kling 2.1 model, offering professional-grade videos with enhanced visual fidelity, precise camera movements, and dynamic motion control, perfect for cinematic storytelling.	Deprecated	8/13	→
Kling 2.1 Master image-to-video	Kling 2.1 Master: The premium endpoint for Kling 2.1, designed for top-tier image-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision. _marquee-video-model	Deprecated	8/13	→
Kling 2.1 Master text-to-video	Kling 2.1 Master: The premium endpoint for Kling 2.1, designed for top-tier text-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision.	Deprecated	8/13	→
FLUX.1 Kontext [max] text-to-image	FLUX.1 Kontext [max] text-to-image is a new premium model that brings maximum performance across all aspects, with greatly improved prompt adherence.	Deprecated	8/13	→
FLUX.1 Kontext [max] image-to-image	FLUX.1 Kontext [max] combines greatly improved prompt adherence and typography generation with premium consistency for editing, without compromising on speed.	Deprecated	8/13	→
Hunyuan Avatar image-to-video	HunyuanAvatar is a High-Fidelity Audio-Driven Human Animation model for Multiple Characters. stylized transform	Deprecated	8/13	→
FLUX.1 Kontext [max] image-to-image	Experimental version of FLUX.1 Kontext [max] with multi image handling capabilities	Deprecated	8/13	→
Image Editing image-to-image	See how you or others might look at different ages, from younger to older, while preserving core facial features. stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Replace your photo's background with any scene you desire, from beach sunsets to urban landscapes, with perfect lighting and shadows stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Transform your photos into vibrant cool cartoons with bold outlines and rich colors. stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Perfect your photos with professional color grading, balanced tones, and vibrant yet natural colors stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Change facial expressions in photos to any emotion you desire, from smiles to serious looks. stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Enhance facial features with professional retouching while maintaining a natural, realistic look stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Experiment with different hairstyles, from bald to any style you can imagine, while maintaining natural lighting and realistic results. stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Remove unwanted objects or people from your photos while seamlessly blending the background. stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Turn your casual photos into stunning professional studio portraits with perfect lighting and high-end photography style. stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Place your subject in any scene you imagine, from enchanted forests to urban settings, with professional composition and lighting stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Transform your photos into artistic masterpieces inspired by famous styles like Van Gogh's Starry Night or any artistic style you choose. stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Transform your photos to any time of day, from golden hour to midnight, with appropriate lighting and atmosphere. stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Add realistic weather effects like snowfall, rain, or fog to your photos while maintaining the scene's mood. stylized transform	Deprecated	8/13	→
Chatterbox text-to-speech	Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. Use the first TTS from Resemble AI. text-to-speech	Deprecated	8/13	→
Chatterbox speech-to-speech	Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. Use the first TTS from Resemble AI. speech-to-speech	Deprecated	8/13	→
Image Editing image-to-image	Restore and enhance old or damaged photos by removing imperfections, adding color while preserving the original character and details of the image. stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Remove all text and writing from images while preserving the background and natural appearance. stylized transform	Deprecated	8/13	→
PlayAI Inpaint audio-to-audio	A novel way to perform audio editing, ensuring smooth transitions and consistent speaker characteristics for edits. audio inpaint	Deprecated	8/13	→
FLUX.1 [dev] text-to-image	FLUX.1 [dev] is a 12 billion parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use.	Deprecated	8/13	→
FLUX.1 [schnell] text-to-image	Fastest inference in the world for the 12 billion parameter FLUX.1 [schnell] text-to-image model.	Deprecated	8/13	→
FLUX.1 [dev] image-to-image	FLUX.1 [dev] is a 12 billion parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use.	Deprecated	8/13	→
FLUX.1 [dev] Redux image-to-image	FLUX.1 [dev] Redux is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.	Deprecated	8/13	→
FLUX.1 [schnell] Redux image-to-image	FLUX.1 [schnell] Redux is a high-performance endpoint for the FLUX.1 [schnell] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.	Deprecated	8/13	→
Chatterboxhd text-to-speech	Generate expressive, natural speech with Resemble AI's Chatterbox. Features unique emotion control, instant voice cloning from short audio, and built-in watermarking.	Deprecated	8/13	→
Chatterboxhd speech-to-speech	Transform voices using Resemble AI's Chatterbox. Convert audio to new voices or your own samples, with expressive results and built-in perceptual watermarking.	Deprecated	8/13	→
Luma Photon Reframe image-to-image	Extend and reframe images with Luma Photon Reframe. This advanced tool intelligently expands your visuals, seamlessly blending new content to enhance creativity and adaptability, offering unmatched personalization and quality for creators at a fraction of the cost. outpainting reframe	Deprecated	8/13	→
Luma Photon Flash Reframe image-to-image	This advanced tool intelligently expands your visuals, seamlessly blending new content to enhance creativity and adaptability, offering unmatched speed and quality for creators at a fraction of the cost. flash reframe outpainting	Deprecated	8/13	→
Luma Ray 2 Reframe video-to-video	Adjust and enhance videos with Ray-2 Reframe. This advanced tool seamlessly reframes videos to your desired aspect ratio, intelligently inpainting missing regions to ensure realistic visuals and coherent motion, delivering exceptional quality and creative flexibility. reframe outpaint	Deprecated	8/13	→
Luma Ray 2 Flash Reframe video-to-video	Adjust and enhance videos with Ray-2 Reframe. This advanced tool seamlessly reframes videos to your desired aspect ratio, intelligently inpainting missing regions to ensure realistic visuals and coherent motion, delivering exceptional quality and creative flexibility. reframe outpaint flash	Deprecated	8/13	→
Image Editing image-to-image	Transform any person into their baby version, while preserving the original pose and expression with childlike features. stylized transform	Deprecated	8/13	→
Wan Vace 1 3b video-to-video	VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources. video-to-video	Deprecated	8/13	→
Image Editing image-to-image	The reframe endpoint intelligently adjusts an image's aspect ratio while preserving the main subject's position, composition, pose, and perspective stylized transform	Deprecated	8/13	→
Veo 3 text-to-video	Veo 3 by Google, the most advanced AI video generation model in the world. With sound on!	Deprecated	8/13	→
Luma Photon image-to-image	Edit images from your prompts using Luma Photon. Photon is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation. image-to-image	Deprecated	8/13	→
Luma Photon image-to-image	Edit images from your prompts using Luma Photon. Photon is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation. image-to-image	Deprecated	8/13	→
Ffmpeg Api Merge Audio-Video video-to-video	Merge videos with standalone audio files or audio from video files. ffmpeg	Deprecated	8/13	→
Ffmpeg Api image-to-image	ffmpeg endpoint for first, middle and last frame extraction from videos utility editing	Deprecated	8/13	→
Bytedance text-to-image	Seedream 3.0 is a bilingual (Chinese and English) text-to-image model that excels at text-to-image generation.	Deprecated	8/13	→
Wan-2.1 LoRA Trainer training	Train custom LoRAs for Wan-2.1 FLF2V 720P lora training	Deprecated	8/13	→
Wan-2.1 LoRA Trainer training	Train custom LoRAs for Wan-2.1 I2V 720P lora training	Deprecated	8/13	→
Wan-2.1 LoRA Trainer training	Train custom LoRAs for Wan-2.1 T2V 14B lora training	Deprecated	8/13	→
Wan-2.1 LoRA Trainer training	Train custom LoRAs for Wan-2.1 T2V 1.3B lora training	Deprecated	8/13	→
Imagen 4 text-to-image	Google’s highest quality image generation model	Deprecated	8/13	→
Recraft image-to-image	Converts a given raster image to SVG format using the Recraft model. stylized transform	Deprecated	8/13	→
Seedance 1.0 Lite text-to-video	Seedance 1.0 Lite	Deprecated	8/13	→
Seedance 1.0 Lite image-to-video	Seedance 1.0 Lite	Deprecated	8/13	→
Hunyuan 3D 2.1 image-to-3d	Hunyuan3D-2.1 is a scalable 3D asset creation system that advances state-of-the-art 3D generation through Physically-Based Rendering (PBR). image-to-3d	Deprecated	8/13	→
Seedance 1.0 Pro image-to-video	Seedance 1.0 Pro, a high quality video generation model developed by Bytedance.	Deprecated	8/13	→
Seedance 1.0 Pro text-to-video	Seedance 1.0 Pro, a high quality video generation model developed by Bytedance.	Deprecated	8/13	→
Object Removal image-to-image	Removes objects and their visual effects using natural language, replacing them with contextually appropriate content utility editing	Deprecated	8/13	→
Object Removal image-to-image	Removes mask-selected objects and their visual effects, seamlessly reconstructing the scene with contextually appropriate content. utility editing	Deprecated	8/13	→
Object Removal image-to-image	Removes box-selected objects and their visual effects, seamlessly reconstructing the scene with contextually appropriate content. utility editing	Deprecated	8/13	→
Bria 3.2 Text-to-Image text-to-image	Bria’s Text-to-Image model, trained exclusively on licensed data for safe and risk-free commercial use. Excels in Text-Rendering and Aesthetics. image generation	Deprecated	8/13	→
PASD image-to-image	Pixel-Aware Diffusion Model for Realistic Image Super-Resolution and Personalized Stylization utility editing	Deprecated	8/13	→
MiniMax Hailuo 02 [Standard] (Text to Video) text-to-video	MiniMax Hailuo-02 Text To Video API (Standard, 768p): Advanced video generation model with 768p resolution	Deprecated	8/13	→
MiniMax Hailuo 02 [Pro] (Text to Video) text-to-video	MiniMax Hailuo-02 Text To Video API (Pro, 1080p): Advanced video generation model with 1080p resolution	Deprecated	8/13	→
MiniMax Hailuo 02 [Pro] (Image to Video) image-to-video	MiniMax Hailuo-02 Image To Video API (Pro, 1080p): Advanced image-to-video generation model with 1080p resolution	Deprecated	8/13	→
MiniMax Hailuo 02 [Standard] (Image to Video) image-to-video	MiniMax Hailuo-02 Image To Video API (Standard, 768p, 512p): Advanced image-to-video generation model with 768p and 512p resolutions	Deprecated	8/13	→
Tripo3D image-to-3d	State of the art Multiview to 3D Object generation. Generate 3D models from multiple images! stylized multiview	Deprecated	8/13	→
Chain Of Zoom image-to-image	Extreme Super-Resolution via Scale Autoregression and Preference Alignment	Deprecated	8/13	→
Wan VACE 14B video-to-video	VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources. image-to-video video-to-video text-to-video	Deprecated	8/13	→
Wan VACE 14B video-to-video	VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources. image-to-video video-to-video text-to-video	Deprecated	8/13	→
Wan VACE 14B video-to-video	VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources. image-to-video video-to-video text-to-video	Deprecated	8/13	→
Wan VACE 14B video-to-video	VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources. image-to-video video-to-video text-to-video	Deprecated	8/13	→
Wan VACE 14B video-to-video	VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources. reframe	Deprecated	8/13	→
Video Understanding vision	A video understanding model to analyze video content and answer questions about what's happening in the video based on user prompts. utility vision	Deprecated	8/13	→
Ai Avatar image-to-video	MultiTalk model generates a multi-person conversation video from an image and audio files. Creates a realistic scene where multiple people speak in sequence. stylized transform	Deprecated	8/13	→
Ai Avatar image-to-video	MultiTalk model generates a multi-person conversation video from an image and text inputs. Converts text to speech for each person, generating a realistic conversation scene. stylized transform	Deprecated	8/13	→
Ai Avatar image-to-video	MultiTalk model generates a talking avatar video from an image and audio file. The avatar lip-syncs to the provided audio with natural facial expressions. stylized transform	Deprecated	8/13	→
Ai Avatar image-to-video	MultiTalk model generates a talking avatar video from an image and text. Converts text to speech automatically, then generates the avatar speaking with lip-sync. stylized transform	Deprecated	8/13	→
FASHN Virtual Try-On V1.6 image-to-image	FASHN v1.6 delivers precise virtual try-on capabilities, accurately rendering garment details like text and patterns at 864x1296 resolution from both on-model and flat-lay photo references. try-on fashion clothing	Deprecated	8/13	→
Omnigen V2 text-to-image	OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It can be used for various tasks such as Image Editing, Personalized Image Generation, Virtual Try-On, Multi Person Generation and more! multimodal editing try-on	Deprecated	8/13	→
Flux Kontext Lora image-to-image	Fast endpoint for the FLUX.1 Kontext [dev] model with LoRA support, enabling rapid and high-quality image editing using pre-trained LoRA adaptations for specific styles, brand identities, and product-specific outputs. image-editing image-to-image	Deprecated	8/13	→
Flux Kontext Lora text-to-image	Super fast text-to-image endpoint for the FLUX.1 Kontext [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs. text-to-image	Deprecated	8/13	→
Image Editing image-to-image	Transform your photos into cool plushies while keeping the original character's likeness stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Transform your photos into wojak style while keeping the original character's likeness stylized transform	Deprecated	8/13	→
Image Editing image-to-image	Transform your character's hair into broccoli style while keeping the original character's likeness stylized transform	Deprecated	8/13	→
Flux Kontext Trainer training	LoRA trainer for FLUX.1 Kontext [dev]	Deprecated	8/13	→
Bytedance image-to-image	SeedEdit 3.0 is an image editing model independently developed by ByteDance. It excels at accurately following editing instructions and effectively preserving image content, especially when handling real images. image-editing image-to-image	Deprecated	8/13	→
Luma Ray 2 Modify video-to-video	Ray2 Modify is a video generative model capable of restyling or retexturing the entire shot, from turning live-action into CG or stylized animation, to changing wardrobe, props, or the overall aesthetic, to swapping environments or time periods, giving you control over background, location, or even weather. modify restyle	Deprecated	8/13	→
Video video-to-video	Automatically remove backgrounds from videos, perfect for creating clean, professional content without a green screen. background-removal	Deprecated	8/13	→
Image Editing image-to-image	Generate YouTube thumbnails with custom text stylized transform	Deprecated	8/13	→
Pixverse video-to-video	Generate realistic lipsync animations from audio using advanced algorithms for high-quality synchronization with the PixVerse Lipsync model animation lip sync	Deprecated	8/13	→
Pixverse video-to-video	The PixVerse Extend model is a video extending tool that uses high-quality video extending techniques utility editing	Deprecated	8/13	→
Pixverse video-to-video	The PixVerse Extend model is a video extending tool that uses high-quality video extending techniques utility editing	Deprecated	8/13	→
Post Processing image-to-image	Apply Gaussian or Kuwahara blur effects with adjustable radius and sigma parameters stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Create chromatic aberration by shifting red, green, and blue channels horizontally or vertically with customizable shift amounts. stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Adjust color temperature, brightness, contrast, saturation, and gamma values for color correction. stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Apply various color tints (sepia, red, green, blue, cyan, magenta, yellow, purple, orange, warm, cool, lime, navy, vintage, rose, teal, maroon, peach, lavender, olive) with adjustable strength. stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Reduce color saturation using different methods (luminance Rec.709, luminance Rec.601, average, lightness) with adjustable factor. stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Blend two images together using smooth linear interpolation with a configurable blend factor. stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Apply dodge and burn effects with multiple modes and adjustable intensity. stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Apply film grain effect with different styles (modern, analog, kodak, fuji, cinematic, newspaper) and customizable intensity and scale stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Apply a parabolic distortion effect with configurable coefficient and vertex position. stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Apply sharpening effects with three modes: basic unsharp mask, smart sharpening with edge preservation, and Contrast Adaptive Sharpening (CAS). stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Apply solarization effect by inverting pixel values above a threshold stylized transform	Deprecated	8/13	→
Post Processing image-to-image	Add a darkening vignette effect around the edges of the image with adjustable strength stylized transform	Deprecated	8/13	→
ThinkSound video-to-video	Generate realistic audio for a video with an optional text prompt and combine audio-generation video-to-audio	Deprecated	8/13	→
ThinkSound video-to-video	Generate realistic audio from a video with an optional text prompt audio-generation video-to-audio	Deprecated	8/13	→
Image Editing image-to-image	Add details to faces, enhance face features, remove blur. stylized transform realism	Deprecated	8/13	→
Pixverse video-to-video	Add immersive sound effects and background music to your videos using PixVerse sound effects generation audio utility	Deprecated	8/13	→
Bria image-to-image	Structure Reference allows generating new images while preserving the structure of an input image, guided by text prompts. Perfect for transforming sketches, illustrations, or photos into new illustrations. Trained exclusively on licensed data for safe and risk-free commercial use.	Deprecated	8/13	→
Vidu image-to-video	Generate video clips from your multiple image references using Vidu Q1 stylized transform	Deprecated	8/13	→
Ffmpeg Api json	Get EBU R128 loudness normalization from audio files using FFmpeg API. ffmpeg	Deprecated	8/13	→
Veo 3 Fast text-to-video	Faster and more cost effective version of Google's Veo 3!	Deprecated	8/13	→
Veo 3 Fast [Image to Video] image-to-video	Generate videos from your images via Veo 3 Fast	Deprecated	8/13	→
Calligrapher image-to-image	Use the text and font retaining capabilities of Calligrapher to modify text on your books, clothes, and much more. image-to-image	Deprecated	8/13	→
Fashion Photoshoot image-to-image	Instant fashion photoshoot with a selfie and an outfit image-to-image	Deprecated	8/13	→
any-llm Enterprise llm	Run any large language model with fal, powered by OpenRouter. This endpoint only supports models that do not train on private data. Read more in OpenRouter's Privacy and Logging documentation. chat claude gpt	Deprecated	8/13	→
LTX-Video 13B 0.9.8 Distilled video-to-video	Generate long videos from prompts, images, and videos using LTX Video-0.9.8 13B Distilled and custom LoRA video ltx-video video-to-video multicondition-to-video image-to-video	Deprecated	8/13	→
LTX-Video 13B 0.9.8 Distilled text-to-video	Generate long videos from prompts using LTX Video-0.9.8 13B Distilled and custom LoRA video ltx-video text-to-video	Deprecated	8/13	→
LTX-Video 13B 0.9.8 Distilled image-to-video	Generate long videos from prompts and images using LTX Video-0.9.8 13B Distilled and custom LoRA video ltx-video image-to-video	Deprecated	8/13	→
Luma Ray 2 Flash Modify video-to-video	Ray2 Flash Modify is a video generative model capable of restyling or retexturing the entire shot, from turning live-action into CG or stylized animation, to changing wardrobe, props, or the overall aesthetic, to swapping environments or time periods, giving you control over background, location, or even weather. modify restyle	Deprecated	8/13	→
Lipsync video-to-video	Realistic lipsync video - optimized for speed, quality, and consistency.	Deprecated	8/13	→
MiniMax Voice Design text-to-speech	Design a personalized voice from a text description, and generate speech from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality text-to-speech. speech	Deprecated	8/13	→
FILM image-to-image	Interpolate images with FILM - Frame Interpolation for Large Motion interpolation	Deprecated	8/13	→
FILM video-to-video	Interpolate videos with FILM - Frame Interpolation for Large Motion interpolation	Deprecated	8/13	→
RIFE image-to-image	Interpolate images with RIFE - Real-Time Intermediate Flow Estimation interpolation	Deprecated	8/13	→
RIFE video-to-video	Interpolate videos with RIFE - Real-Time Intermediate Flow Estimation interpolation	Deprecated	8/13	→
LTX-Video 13B 0.9.8 Distilled video-to-video	Extend videos using LTX Video-0.9.8 13B Distilled and custom LoRA ltx-video extend	Deprecated	8/13	→
Hidream E1 1 image-to-image	Edit images with natural language	Deprecated	8/13	→
Image Editing image-to-image	Retouch photos of faces. Remove blemishes and improve the skin.	Deprecated	8/13	→
Sky Raccoon text-to-image	Generate images from a text prompt. text-to-image	Deprecated	8/13	→
OmniHuman image-to-video	OmniHuman generates video using an image of a human figure paired with an audio file. It produces vivid, high-quality videos where the character’s emotions and movements maintain a strong correlation with the audio. image-to-video lipsync	Deprecated	8/13	→
NSFW Checker vision	Predict whether an image is NSFW or SFW. filter safety utility	Deprecated	8/13	→
Hunyuan World image-to-3d	Hunyuan World 1.0 turns a single image into a panorama or a 3D world. It creates realistic scenes from the image, allowing you to explore and view it from different angles.	Deprecated	8/13	→
Hunyuan World image-to-image	Hunyuan World 1.0 turns a single image into a panorama or a 3D world. It creates realistic scenes from the image, allowing you to explore and view it from different angles.	Deprecated	8/13	→
Wan-2.2 Text-to-Video A14B text-to-video	Wan-2.2 text-to-video is a video model that generates high-quality videos with high visual quality and motion diversity from text prompts. text to video motion	Deprecated	8/13	→
Wan v2.2 5B text-to-video	Wan 2.2's 5B model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and powerful prompt understanding	Deprecated	8/13	→
Flux Kontext Lora image-to-image	Fast inpainting endpoint for the FLUX.1 Kontext [dev] model with LoRA support, enabling rapid and high-quality image inpainting with reference images, while using pre-trained LoRA adaptations for specific styles, brand identities, and product-specific outputs. image-editing image-inpainting image-to-image	Deprecated	8/13	→
Wan v2.2 5B image-to-video	Wan 2.2's 5B model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and powerful prompt understanding	Deprecated	8/13	→
FLUX.1 Krea [dev] text-to-image	FLUX.1 Krea [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.	Deprecated	8/13	→
FLUX.1 Krea [dev] Redux image-to-image	FLUX.1 Krea [dev] Redux is a high-performance endpoint for the FLUX.1 Krea [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.	Deprecated	8/13	→
FLUX.1 Krea [dev] image-to-image	FLUX.1 Krea [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.	Deprecated	8/13	→
FLUX.1 Krea [dev] text-to-image	FLUX.1 Krea [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.	Deprecated	8/13	→
FLUX.1 Krea [dev] Redux image-to-image	FLUX.1 Krea [dev] Redux is a high-performance endpoint for the FLUX.1 Krea [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.	Deprecated	8/13	→
FLUX.1 Krea [dev] image-to-image	FLUX.1 Krea [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use.	Deprecated	8/13	→
Wan text-to-video	Wan-2.2 turbo text-to-video is a video model that generates high-quality videos with high visual quality and motion diversity from text prompts. text to video motion	Deprecated	8/13	→
Wan image-to-video	Wan-2.2 Turbo image-to-video is a video model that generates high-quality videos with high visual quality and motion diversity from text prompts.	Deprecated	8/13	→
Veo3 image-to-video	Veo 3 is the latest state-of-the-art video generation model from Google DeepMind	Deprecated	8/13	→
Fashion Try On image-to-image	Instant fashion try on with a full-body pic and an outfit	Deprecated	8/13	→
FLUX.1 Krea [dev] with LoRAs image-to-image	FLUX LoRA Image-to-Image is a high-performance endpoint that transforms existing images using FLUX models, leveraging LoRA adaptations to enable rapid and precise image style transfer, modifications, and artistic variations. lora style transfer	Deprecated	8/13	→
FLUX.1 Krea [dev] with LoRAs text-to-image	Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs. lora personalization	Deprecated	8/13	→
FLUX.1 Krea [dev] Inpainting with LoRAs image-to-image	Super fast endpoint for the FLUX.1 [dev] inpainting model with LoRA support, enabling rapid and high-quality image inpainting using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs. lora personalization	Deprecated	8/13	→
Flux Krea Lora text-to-image	Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs. lora personalization	Deprecated	8/13	→
Train Flux Krea LoRA training	Train styles, people and other subjects at blazing speeds using the FLUX.1 Krea [dev] base model. lora personalization	Deprecated	8/13	→
Wan video-to-video	Wan-2.2 video-to-video is a video model that generates high-quality videos with high visual quality and motion diversity from text prompts and source videos.	Deprecated	8/13	→
Qwen Image text-to-image	Qwen-Image is an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing. text-to-image	Deprecated	8/13	→
Wan text-to-image	Wan 2.2's 14B model generates high-resolution, photorealistic images with powerful prompt understanding and fine-grained visual detail	Deprecated	8/13	→
Wan text-to-image	Wan 2.2's 5B model generates high-resolution, photorealistic images with powerful prompt understanding and fine-grained visual detail	Deprecated	8/13	→
Wan text-to-image	Wan 2.2's 14B model with LoRA support generates high-fidelity images with enhanced prompt alignment and style adaptability.	Deprecated	8/13	→
Wan text-to-video	Wan 2.2's 5B FastVideo model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and powerful prompt understanding text to video motion	Deprecated	8/13	→
Bytedance text-to-image	Dreamina showcases superior picture effects, with significant improvements in picture aesthetics, precise and diverse styles, and rich details. text-to-image	Deprecated	8/13	→
Minimax image-to-video	Create blazing fast and economical videos with MiniMax Hailuo-02 Image To Video API at 512p resolution stylized transform	Deprecated	8/13	→
Wan text-to-video	Wan 2.2's 5B distill model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and powerful prompt understanding	Deprecated	8/13	→
Wan v2.2 A14B Image-to-Video A14B with LoRAs image-to-video	Wan-2.2 image-to-video is a video model that generates high-quality videos with high visual quality and motion diversity from text prompts and images. This endpoint supports LoRAs made for Wan 2.2 image-to-video motion lora	Deprecated	8/13	→
Wan-2.2 Text-to-Video A14B with LoRAs text-to-video	Wan-2.2 text-to-video is a video model that generates high-quality videos with high visual quality and motion diversity from text prompts. This endpoint supports LoRAs made for Wan 2.2.	Deprecated	8/13	→
Ideogram V3 Character Remix image-to-image	Transform your consistent character into different art styles, settings, or scenarios while maintaining their distinctive appearance and identity character-consistency	Deprecated	8/13	→
Ideogram V3 Character image-to-image	Generate consistent character appearances across multiple images. Maintain facial features, proportions, and distinctive traits for cohesive storytelling and branding character-consistency	Deprecated	8/13	→
Ideogram V3 Character Edit image-to-image	Modify consistent characters while preserving their core identity. Edit poses, expressions, or clothing without losing recognizable character features character-consistency	Deprecated	8/13	→
Wan 2.2 14B Image Trainer training	Wan 2.2 text to image LoRA trainer. Fine-tune Wan 2.2 for subjects and styles with unprecedented detail. lora personalization	Deprecated	8/13	→
Ffmpeg Api video-to-video	Use ffmpeg capabilities to merge 2 or more videos.	Deprecated	8/13	→
Bytedance image-to-video	Transform your images into stylized videos using this workflow. image-to-video effects	Deprecated	8/13	→
EchoMimic V3 audio-to-video	EchoMimic V3 generates a talking avatar model from a picture, audio and text prompt. echomimic talking-head audio-to-video	Deprecated	8/13	→