| Model Name | Description | Status | First Seen | Action |
|---|---|---|---|---|
| Z-Image Turbo image-to-image | Generate images from text and images using Z-Image Turbo, Tongyi-MAI's super-fast 6B model. Tags: turbo z-image fast | OK | 1d | → |
| Z-Image Turbo image-to-image | Generate images from text and images using custom LoRA and Z-Image Turbo, Tongyi-MAI's super-fast 6B model. Tags: turbo z-image fast lora | OK | 1d | → |
| Z-Image Turbo image-to-image | Generate images from text and edge, depth or pose images using Z-Image Turbo, Tongyi-MAI's super-fast 6B model. | OK | 1d | → |
| Z-Image Turbo image-to-image | Generate images from text and edge, depth or pose images using custom LoRA and Z-Image Turbo, Tongyi-MAI's super-fast 6B model. Tags: turbo z-image fast lora | OK | 1d | → |
| Longcat Image image-to-image | LongCat image Edit is a 6B parameter image editing model excelling at multilingual text rendering, photorealism and deployment efficiency. | OK | 3d | → |
| Longcat Image text-to-image | LongCat image is a 6B parameter model excelling at multilingual text rendering, photorealism and deployment efficiency. | OK | 3d | → |
| Kling AI Avatar v2 Standard image-to-video | Kling AI Avatar v2 Standard: Endpoint for creating avatar videos with realistic humans, animals, cartoons, or stylized characters | OK | 4d | → |
| Kling AI Avatar v2 Pro image-to-video | Kling AI Avatar v2 Pro: The premium endpoint for creating avatar videos with realistic humans, animals, cartoons, or stylized characters | OK | 4d | → |
| Z Image Trainer training | Train LoRAs on Z-Image Turbo, a super fast text-to-image model of 6B parameters developed by Tongyi-MAI. Tags: turbo z-image fast trainer | OK | 5d | → |
| Kling Video v2.6 Image to Video image-to-video | Kling 2.6 Pro: Top-tier image-to-video with cinematic visuals, fluid motion, and native audio generation. | Deprecated | 5d | → |
| Kling Video v2.6 Text to Video text-to-video | Kling 2.6 Pro: Top-tier text-to-video with cinematic visuals, fluid motion, and native audio generation. | Deprecated | 5d | → |
| Bytedance image-to-image | A new-generation image creation model from ByteDance, Seedream 4.5 integrates image generation and image editing capabilities into a single, unified architecture. Tags: stylized transform | OK | 6d | → |
| Bytedance text-to-image | A new-generation image creation model from ByteDance, Seedream 4.5 integrates image generation and image editing capabilities into a single, unified architecture. Tags: stylized transform | OK | 6d | → |
| Sam 3 image-to-3d | SAM 3D enables precise 3D reconstruction of objects from real images, while accurately reconstructing their geometry and texture. Tags: 3d object | OK | 6d | → |
| Sam 3 image-to-3d | SAM 3D allows for accurate 3D reconstruction of human body shape and position from a single image. Tags: 3d human pose | OK | 6d | → |
| Sam 3 3d-to-3d | SAM 3D enables full scene reconstructions, placing objects and humans in a shared context together. Tags: align 3D | OK | 6d | → |
| Vidu text-to-image | Use Vidu Text-to-Image to turn your prompts into reality. | Deprecated | 6d | → |
| Vidu image-to-image | Vidu Reference-to-Image creates images by combining reference images with a prompt. Tags: images-to-imag reference-to-image | Deprecated | 6d | → |
| Pixverse text-to-video | Generate high quality video clips from text and image prompts using PixVerse v5.5. Tags: text-to-video | Deprecated | 7d | → |
| Pixverse image-to-video | Generate high quality video clips from text and image prompts using PixVerse v5.5. Tags: image-to-video | Deprecated | 7d | → |
| Pixverse image-to-video | Pixverse Transition | Deprecated | 7d | → |
| Pixverse image-to-video | Pixverse Effects | Deprecated | 7d | → |
| Kling O1 Image image-to-image | Perform precise image edits using strong reference control, transforming subjects, styles, and local details while preserving visual consistency. Tags: edit realism typography | Deprecated | 7d | → |
| Z Image text-to-image | Text-to-Image endpoint with LoRA support for Z-Image Turbo, a super fast text-to-image model of 6B parameters developed by Tongyi-MAI. Tags: z-image lora fast | Deprecated | 7d | → |
| Video Background Removal video-to-video | Remove background from videos filmed on a green screen. | Deprecated | 7d | → |
| Video Background Removal video-to-video | Remove background from any video with people and objects. No green screen needed. | Deprecated | 7d | → |
| Kling O1 Reference Video to Video video-to-video | Kling O1 Omni generates new shots guided by an input reference video, preserving cinematic language such as motion and camera style to produce seamless scene continuity. | Deprecated | 7d | → |
| Kling O1 Edit Video video-to-video | Edit an existing video using natural-language instructions, transforming subjects, settings, and style while retaining the original motion structure. | Deprecated | 7d | → |
| Kling O1 Reference Image to Video image-to-video | Transform images, elements, and text into consistent, high-quality video scenes, ensuring stable character identity, object details, and environments. | Deprecated | 7d | → |
| Kling O1 First Frame Last Frame to Video image-to-video | Generate a video by taking a start frame and an end frame, animating the transition between them while following text-driven style and scene guidance. | Deprecated | 7d | → |
| Video Background Removal video-to-video | Remove background from any video with people and objects. No green screen needed. | Deprecated | 7d | → |
| Ovis Image text-to-image | Ovis-Image is a 7B text-to-image model specifically optimized for quick, high quality text rendering. Tags: ovis-image artistic | Deprecated | 9d | → |
| Lucy Edit [Fast] video-to-video | Lucy Edit Fast is a rapid, localized video editing model that lets you modify specific elements like objects or backgrounds in just 10 seconds. Tags: edit | Deprecated | 12d | → |
| LTX Video 2.0 Retake video-to-video | Change sections of a video using LTX-2 | Deprecated | 12d | → |
| LTX Video 2.0 Pro image-to-video | Create high-fidelity video with audio from images with LTX-2 Pro | Deprecated | 12d | → |
| LTX Video 2.0 Fast image-to-video | Create high-fidelity video with audio from images with LTX-2 Fast | Deprecated | 12d | → |
| LTX Video 2.0 Pro text-to-video | Create high-fidelity video with audio from text with LTX-2 Pro. | Deprecated | 12d | → |
| LTX Video 2.0 Fast text-to-video | Create high-fidelity video with audio from text with LTX-2 Fast | Deprecated | 12d | → |
| Z Image text-to-image | Z-Image Turbo is a super fast text-to-image model of 6B parameters developed by Tongyi-MAI. Tags: turbo z-image fast | Deprecated | 12d | → |
| LTX Video 2.0 Retake video-to-video | Change sections of a video using LTX-2 | Deprecated | 12d | → |
| Flux 2 Lora Gallery image-to-image | Add a background to images with white/clean background. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery image-to-image | Virtually furnishes an empty apartment. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery text-to-image | Ballpoint pen sketch drawing style. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery text-to-image | Transforms images into comic book style. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery image-to-image | Extends a face into a full body portrait. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery text-to-image | HDR surrealistic effect with intense colors. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery image-to-image | Generates same object from different angles (azimuth/elevation). Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery text-to-image | Makes images more photorealistic and natural. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery text-to-image | Generates satellite/aerial view style images. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery image-to-image | Virtual clothing try-on (2 images: person + garment). Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Lora Gallery text-to-image | Applies sepia vintage effect to images. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Pro image-to-image | Text-to-image generation with FLUX.2 [pro] from Black Forest Labs. Optimized for maximum quality, exceptional photorealism and artistic images. | Deprecated | 13d | → |
| Flux 2 Pro text-to-image | Image editing with FLUX.2 [pro] from Black Forest Labs. Ideal for high-quality image manipulation, style transfer, and sequential editing workflows | Deprecated | 13d | → |
| Flux 2 text-to-image | Text-to-image generation with FLUX.2 [dev] from Black Forest Labs. Enhanced realism, crisper text generation, and native editing capabilities. | Deprecated | 13d | → |
| Flux 2 image-to-image | Image-to-image editing with FLUX.2 [dev] from Black Forest Labs. Precise modifications using natural language descriptions and hex color control. | Deprecated | 13d | → |
| Flux 2 text-to-image | Text-to-image generation with LoRA support for FLUX.2 [dev] from Black Forest Labs. Custom style adaptation and fine-tuned model variations. | Deprecated | 13d | → |
| Flux 2 text-to-image | Image-to-image editing with LoRA support for FLUX.2 [dev] from Black Forest Labs. Specialized style transfer and domain-specific modifications. | Deprecated | 13d | → |
| Flux 2 Flex text-to-image | Text-to-image generation with FLUX.2 [flex] from Black Forest Labs. Features adjustable inference steps and guidance scale for fine-tuned control. Enhanced typography and text rendering capabilities. Tags: stylized transform | Deprecated | 13d | → |
| Flux 2 Flex image-to-image | Image editing with FLUX.2 [flex] from Black Forest Labs. Supports multi-reference editing with customizable inference steps and enhanced text rendering. | Deprecated | 13d | → |
| Flux 2 Trainer training | Fine-tune FLUX.2 [dev] from Black Forest Labs with custom datasets. Create specialized LoRA adaptations for specific styles and domains. | Deprecated | 13d | → |
| Flux 2 Trainer training | Fine-tune FLUX.2 [dev] from Black Forest Labs with custom datasets. Create specialized LoRA adaptations for specific editing tasks. | Deprecated | 13d | → |
| Crystal Upscaler image-to-image | An advanced image enhancement tool designed specifically for facial details and portrait photography, utilizing Clarity AI's upscaling technology. Tags: image-to-image | Deprecated | 14d | → |
| Chrono Edit Lora Gallery image-to-image | Upscales and cleans up the image. Tags: upscale details | Deprecated | 17d | → |
| Chrono Edit Lora Gallery image-to-image | You can make edits simply by drawing a quick sketch on the input image. Tags: paint edit sketch | Deprecated | 17d | → |
| Chrono Edit Lora image-to-image | LoRA endpoint for the Chrono Edit model. Tags: image-to-image image-editing | Deprecated | 17d | → |
| Hunyuan Video V1.5 text-to-video | Hunyuan Video 1.5 is Tencent's latest and best video model. Tags: hunyuan-video text-to-video | Deprecated | 18d | → |
| Sam 3 vision | SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. Tags: embeddings mask real-time | Deprecated | 18d | → |
| Segment Anything Model 3 image-to-image | SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. Tags: segmentation mask real-time | Deprecated | 18d | → |
| Sam 3 video-to-video | SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. Tags: segmentation mask real-time | Deprecated | 18d | → |
| Sam 3 video-to-video | SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. Tags: segmentation mask real-time rle | Deprecated | 18d | → |
| Sam 3 image-to-image | SAM 3 is a unified foundation model for promptable segmentation in images and videos. It can detect, segment, and track objects using text or visual prompts such as points, boxes, and masks. Tags: segmentation rle real-time | Deprecated | 18d | → |
| Nano Banana Pro text-to-image | Nano Banana Pro (a.k.a Nano Banana 2) is Google's new state-of-the-art image generation and editing model. Tags: realism typography | Deprecated | 18d | → |
| Nano Banana Pro image-to-image | Nano Banana Pro (a.k.a Nano Banana 2) is Google's new state-of-the-art image generation and editing model. Tags: realism typography | Deprecated | 18d | → |
| Gemini 3 Pro Image Preview text-to-image | Nano Banana Pro (a.k.a Nano Banana 2) is Google's new state-of-the-art image generation and editing model. Tags: realism typography | Deprecated | 18d | → |
| Gemini 3 Pro Image Preview image-to-image | Nano Banana Pro (a.k.a Nano Banana 2) is Google's new state-of-the-art image generation and editing model. Tags: realism typography | Deprecated | 18d | → |
| Lynx image-to-video | Generate subject consistent videos using Lynx from ByteDance! Tags: image-to-video subject | Deprecated | 21d | → |
| Maya1 text-to-speech | Maya1 is a state-of-the-art speech model by Maya Research for expressive voice generation, built to capture real human emotion and precise voice design. Tags: text-to-speech tts | Deprecated | 24d | → |
| OpenRouter Chat Completions [OpenAI Compatible] llm | Run any LLM (Large Language Model) with fal, powered by OpenRouter. This endpoint is compatible with the OpenAI API. | Deprecated | 25d | → |
| OpenRouter llm | Run any LLM (Large Language Model) with fal, powered by OpenRouter. | Deprecated | 25d | → |
| OpenRouter [Vision] vision | Run any VLM (Vision Language Model) with fal, powered by OpenRouter. | Deprecated | 25d | → |
| OpenRouter Embeddings [OpenAI Compatible] llm | The OpenRouter Embeddings API with fal, powered by OpenRouter, provides unified access to a wide range of large language models - including GPT, Claude, Gemini, and many others through a single API interface. | Deprecated | 25d | → |
| OpenRouter Responses [OpenAI Compatible] llm | The OpenRouter Responses API with fal, powered by OpenRouter, provides unified access to a wide range of large language models - including GPT, Claude, Gemini, and many others through a single API interface. | Deprecated | 25d | → |
| Fibo Mashup image-to-image | Combine three images to create an amazing mashup image with Bria's FIBO model. Tags: bria fibo image-to-image | Deprecated | 25d | → |
| Editto video-to-video | Edit videos with instruction-based prompting using the Editto model! Tags: video-edit wan-vace | Deprecated | 27d | → |
| Qwen Image Edit Plus Lora Gallery image-to-image | Add a realistic scene behind the object with white background. Tags: stylized transform | Deprecated | 27d | → |
| Qwen Image Edit Plus Lora Gallery image-to-image | Generate full portrait from a cropped face photo. Tags: stylized transform | Deprecated | 27d | → |
| Qwen Image Edit Plus Lora Gallery image-to-image | Create group photos. Tags: stylized transform | Deprecated | 27d | → |
| Qwen Image Edit Plus Lora Gallery image-to-image | Blend products into backgrounds with automatic perspective and lighting correction. Tags: stylized transform | Deprecated | 27d | → |
| Qwen Image Edit Plus Lora Gallery image-to-image | Create cinematic transitions and scene progressions (camera movements, framing changes). Tags: stylized transform | Deprecated | 27d | → |
| Qwen Image Edit Plus Lora Gallery image-to-image | Remove unwanted elements (objects, people, text) while maintaining image consistency. Tags: stylized transform | Deprecated | 27d | → |
| Qwen Image Edit Plus Lora Gallery image-to-image | Remove existing lighting and apply soft, even illumination. Tags: stylized transform | Deprecated | 27d | → |
| Qwen Image Edit Plus Lora Gallery image-to-image | Apply designs/graphics onto people's shirts. Tags: stylized transform | Deprecated | 27d | → |
| Qwen Image Edit Plus Lora Gallery image-to-image | Precise camera position and angle control (rotation, zoom, vertical movement). Tags: stylized transform | Deprecated | 27d | → |
| Flashvsr video-to-video | Upscale your videos using FlashVSR with the fastest speeds! Tags: upscale video-to-video | Deprecated | 28d | → |
| Pixverse image-to-video | Generate high quality video clips by swapping person, objects and background using Pixverse Swap. | Deprecated | 28d | → |
| Infinity Star text-to-video | Use InfinityStar’s unified 8B spacetime autoregressive engine to turn any text prompt into crisp 720p videos, 10× faster than diffusion models. Tags: text-to-video | Deprecated | 11/7 | → |
| Sana Video text-to-video | Leverage Sana's ultra-fast processing speed to generate high-quality assets that transform your text prompts into production-ready videos. Tags: text-to-video | Deprecated | 11/7 | → |
| Crystal Upscaler image-to-image | An advanced image enhancement tool designed specifically for facial details and portrait photography, utilizing Clarity AI's upscaling technology. Tags: image-to-image | Deprecated | 11/5 | → |
| Image Outpaint image-to-image | Directional outpainting. Choose edges to expand: left, right, top, or center (uniform all sides). Only expanded areas are generated; an optional zoom-out pulls the frame back by the chosen amount. Tags: outpainting | Deprecated | 11/5 | → |
| Workflow Utilities video-to-video | Add automatic subtitles to videos. Tags: auto-subtitle captioning | Deprecated | 11/4 | → |
| Reve image-to-image | Reve’s fast edit model lets you upload an existing image and then transform it via a text prompt at lightning speed! Tags: image-to-image | Deprecated | 11/4 | → |
| Reve image-to-image | Reve’s fast remix model lets you upload reference images and then combine/transform them via a text prompt at lightning speed! Tags: image-to-image | Deprecated | 11/4 | → |
| Fashion Size Estimator vision | Fashion Size Estimator model analyzes human body images to predict clothing size recommendations and estimate key body measurements including height, bust, waist, and hip dimensions. Tags: utility editing | Deprecated | 11/3 | → |
| Bytedance Upscaler video-to-video | Upscale videos with Bytedance's video upscaler. Tags: upscaler video bytedance | Deprecated | 11/3 | → |
| Flux Vision Upscaler image-to-image | Flux Vision Upscaler for magnifying/upscaling images with high fidelity and creativity. | Deprecated | 11/2 | → |
| Emu 3.5 Image text-to-image | Generate images from text using Emu 3.5 Image | Deprecated | 11/2 | → |
| Emu 3.5 Image image-to-image | Edit images with a text prompt using Emu 3.5 Image | Deprecated | 11/2 | → |
| Sima Video Upscaler Lite video-to-video | Upscale your videos at real-time speeds with Sima Labs! Tags: upscale video-to-video | Deprecated | 10/31 | → |
| Sima Upscaler image-to-image | Upscale your images at blazingly fast speeds with Sima Labs! Tags: upscale image-to-image | Deprecated | 10/31 | → |
| Chrono Edit image-to-image | NVIDIA's Logically Consistent and Physics-Aware Image Editing Model. Tags: image-editing | Deprecated | 10/31 | → |
| Minimax Music text-to-audio | Generate music from text prompts using the MiniMax Music 2.0 model, which leverages advanced AI techniques to create high-quality, diverse musical compositions. Tags: music audio | Deprecated | 10/30 | → |
| LongCat Video Distilled text-to-video | Generate long videos in 720p/30fps from text using LongCat Video Distilled | Deprecated | 10/30 | → |
| LongCat Video Distilled image-to-video | Generate long videos in 720p/30fps from images using LongCat Video Distilled | Deprecated | 10/30 | → |
| LongCat Video text-to-video | Generate long videos from text using LongCat Video | Deprecated | 10/30 | → |
| LongCat Video image-to-video | Generate long videos from images using LongCat Video | Deprecated | 10/30 | → |
| LongCat Video image-to-video | Generate long videos in 720p/30fps from images using LongCat Video | Deprecated | 10/30 | → |
| LongCat Video text-to-video | Generate long videos in 720p/30fps from text using LongCat Video | Deprecated | 10/30 | → |
| Qwen Image Edit Trainer training | LoRA trainer for Qwen Image Edit | Deprecated | 10/30 | → |
| Qwen Image Edit Plus Trainer training | LoRA trainer for Qwen Image Edit Plus | Deprecated | 10/30 | → |
| Omnipart unknown | Image-to-3D endpoint for OmniPart, a part-aware 3D generator with semantic decoupling and structural cohesion. | Deprecated | 10/30 | → |
| Fibo json-to-image | SOTA Open source model trained on licensed data, transforming intent into structured control for precise, high-quality AI image generation in enterprise and agentic workflows. Tags: bria fibo prompt-adherence | Deprecated | 10/29 | → |
| Fibo text-to-json | Structured Prompt Generation endpoint for Fibo, Bria's SOTA Open source model. Tags: bria fibo structured-prompting | Deprecated | 10/29 | → |
| Video As Prompt video-to-video | A model for unified semantic control in video generation. It animates a static reference image using the motion and semantics from a reference video as a prompt. Tags: video-as-prompt semantic control | Deprecated | 10/29 | → |
| MiniMax Speech 2.6 [HD] text-to-speech | Generate speech from text prompts and different voices using the MiniMax Speech-2.6 HD model, which leverages advanced AI techniques to create high-quality text-to-speech. Tags: text-to-speech | Deprecated | 10/29 | → |
| MiniMax Speech 2.6 [Turbo] text-to-speech | Generate speech from text prompts and different voices using the MiniMax Speech-2.6 HD model, which leverages advanced AI techniques to create high-quality text-to-speech. Tags: text-to-speech | Deprecated | 10/29 | → |
| Bytedance image-to-3d | Image to 3D endpoint for Bytedance's high-quality Seed3D 3d model generator. Tags: seed3d.quality bytedance 3d | Deprecated | 10/29 | → |
| LongCat Video Distilled text-to-video | Generate long videos from text using LongCat Video Distilled | Deprecated | 10/29 | → |
| LongCat Video Distilled image-to-video | Generate long videos from images using LongCat Video Distilled | Deprecated | 10/29 | → |
| MiniMax Hailuo 2.3 [Pro] (Text to Video) text-to-video | MiniMax Hailuo-2.3 Text To Video API (Pro, 1080p): Advanced text-to-video generation model with 1080p resolution. Tags: text-to-video | Deprecated | 10/28 | → |
| MiniMax Hailuo 2.3 [Standard] (Text to Video) text-to-video | MiniMax Hailuo-2.3 Text To Video API (Standard, 768p): Advanced text-to-video generation model with 768p resolution. Tags: text-to-video | Deprecated | 10/28 | → |
| MiniMax Hailuo 2.3 Fast [Pro] (Image to Video) image-to-video | MiniMax Hailuo-2.3-Fast Image To Video API (Pro, 1080p): Advanced fast image-to-video generation model with 1080p resolution. Tags: image-to-video | Deprecated | 10/28 | → |
| MiniMax Hailuo 2.3 [Standard] (Image to Video) image-to-video | MiniMax Hailuo-2.3 Image To Video API (Standard, 768p): Advanced image-to-video generation model with 768p resolution. Tags: image-to-video | Deprecated | 10/28 | → |
| MiniMax Hailuo 2.3 Fast [Standard] (Image to Video) image-to-video | MiniMax Hailuo-2.3-Fast Image To Video API (Standard, 768p): Advanced fast image-to-video generation model with 768p resolution. Tags: image-to-video | Deprecated | 10/28 | → |
| MiniMax Hailuo 2.3 [Pro] (Image to Video) image-to-video | MiniMax Hailuo-2.3 Image To Video API (Pro, 1080p): Advanced image-to-video generation model with 1080p resolution. Tags: image-to-video | Deprecated | 10/28 | → |
| Demucs audio-to-audio | SOTA stemming model for voice, drums, bass, guitar and more. Tags: audio | Deprecated | 10/27 | → |
| Piflow text-to-image | Use the faster speed of piflow to generate images with the same quality as slower models. Tags: text-to-image | Deprecated | 10/27 | → |
| Birefnet video-to-video | Video background removal version of bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS). Tags: utility editing | Deprecated | 10/26 | → |
| Audio Understanding audio-to-audio | An audio understanding model to analyze audio content and answer questions about what's happening in the audio based on user prompts. Tags: utility audio | Deprecated | 10/24 | → |
| Bytedance text-to-video | Text to Video endpoint for Seedance 1.0 Pro Fast, a next-generation video model designed to deliver maximum performance at minimal cost. Tags: bytedance fast motion | Deprecated | 10/24 | → |
| Bytedance image-to-video | Image to Video endpoint for Seedance 1.0 Pro Fast, a next-generation video model designed to deliver maximum performance at minimal cost. Tags: bytedance seedance pro fast | Deprecated | 10/24 | → |
| Vidu text-to-video | Use the latest Vidu Q2 models, which offer much better quality and control over your videos. | Deprecated | 10/24 | → |
| Vidu image-to-video | Use the latest Vidu Q2 models, which offer much better quality and control over your videos. Tags: image-to-video | Deprecated | 10/24 | → |
| Vidu image-to-video | Use the latest Vidu Q2 models, which offer much better quality and control over your videos. Tags: image-to-video | Deprecated | 10/24 | → |
| Vidu video-to-video | Use the latest Vidu Q2 models, which offer much better quality and control over your videos. | Deprecated | 10/24 | → |
| LTX Video 2.0 Pro text-to-video | Create high-fidelity video with audio from text with LTX-2 Pro. | Deprecated | 10/23 | → |
| LTX Video 2.0 Pro image-to-video | Create high-fidelity video with audio from images with LTX-2 Pro | Deprecated | 10/23 | → |
| LTX Video 2.0 Fast text-to-video | Create high-fidelity video with audio from text with LTX-2 Fast | Deprecated | 10/23 | → |
| LTX Video 2.0 Fast image-to-video | Create high-fidelity video with audio from images with LTX-2 Fast | Deprecated | 10/23 | → |
| Kling Video image-to-video | Kling 2.5 Turbo Standard: Top-tier image-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision. Tags: stylized transform | Deprecated | 10/22 | → |
| GPT Image 1 Mini text-to-image | GPT Image 1 mini combines OpenAI's advanced language capabilities, powered by GPT-5, with GPT Image 1 Mini for efficient image generation. Tags: text-to-image | Deprecated | 10/22 | → |
| GPT Image 1 Mini image-to-image | GPT Image 1 mini combines OpenAI's advanced language capabilities, powered by GPT-5, with GPT Image 1 Mini for efficient image generation. Tags: image-to-image | Deprecated | 10/22 | → |
| Music Generation text-to-audio | Generate royalty-free instrumental music from electronic, hip hop, and indie rock to cinematic and classical genres. Perfect for games, films, social content, podcasts, and more. Tags: speech audio music | Deprecated | 10/21 | → |
| Sound Effect Generation text-to-audio | Create professional-grade sound effects from animal and vehicle to nature, sci-fi, and otherworldly sounds. Perfect for films, games, and digital content. Tags: speech audio effects | Deprecated | 10/21 | → |
| Krea Wan 14b- Text to Video text-to-video | Fast Text-to-Video endpoint for Krea's Wan 14b model. Tags: text to video fast | Deprecated | 10/20 | → |
| Qwen 3 Guard llm | Use Qwen 3 Guard to detect and classify text as safe or harmful, delivering precise and reliable safety categorization. Tags: filter safety utility | Deprecated | 10/20 | → |
| Meshy 5 Remesh 3d-to-3d | Meshy-5 remesh allows you to remesh and export existing 3D models into various formats. Tags: 3d-to-3d | Deprecated | 10/18 | → |
| Meshy 5 Retexture 3d-to-3d | Meshy-5 retexture applies new, high-quality textures to existing 3D models using either text prompts or reference images. It supports PBR material generation for realistic, production-ready results. Tags: 3d-to-3d | Deprecated | 10/18 | → |
| Reve image-to-image | Reve’s edit model lets you upload an existing image and then transform it via a text prompt. Tags: image-to-image | Deprecated | 10/17 | → |
| Reve text-to-image | Reve’s text-to-image model generates detailed visual output that closely follows your instructions, with strong aesthetic quality and accurate text rendering. Tags: text-to-image | Deprecated | 10/17 | → |
| Reve image-to-image | Reve’s remix model lets you upload reference images and then combine/transform them via a text prompt. Tags: image-to-image | Deprecated | 10/17 | → |
| Wan Alpha text-to-video | Generate videos with transparent backgrounds. Tags: transparent alpha | Deprecated | 10/16 | → |
| Mirelo SFX V1.5 video-to-video | Generate synced sounds for any video, and return it with its new sound track (like MMAudio). Tags: video-to-video sfx | Deprecated | 10/15 | → |
| Mirelo SFX V1.5 video-to-audio | Generate synced sounds for any video, and return the new sound track (like MMAudio). Tags: video-to-audio sfx | Deprecated | 10/15 | → |
| Image2Pixel image-to-image | Turn images into pixel-perfect retro art. Tags: post-processing pixel-art | Deprecated | 10/14 | → |
| Kandinsky5 text-to-video | Kandinsky 5.0 is a diffusion model for fast, high-quality text-to-video generation. | Deprecated | 10/13 | → |
| Kandinsky5 text-to-video | Kandinsky 5.0 Distilled is a lightweight diffusion model for fast, high-quality text-to-video generation. | Deprecated | 10/13 | → |
| DreamOmni2 image-to-image | DreamOmni2 is a unified multimodal model for text and image guided image editing. | Deprecated | 10/10 | → |
| Moondream3 Preview [Caption] vision | Moondream 3 is a vision language model that brings frontier-level visual reasoning with native object detection, pointing, and OCR capabilities to real-world applications requiring fast, inexpensive inference at scale. Tags: Vision | Deprecated | 10/10 | → |
| Moondream 3 Preview [Query] vision | Moondream 3 is a vision language model that brings frontier-level visual reasoning with native object detection, pointing, and OCR capabilities to real-world applications requiring fast, inexpensive inference at scale. Tags: Vision | Deprecated | 10/10 | → |
| Moondream3 Preview [Point] vision | Moondream 3 is a vision language model that brings frontier-level visual reasoning with native object detection, pointing, and OCR capabilities to real-world applications requiring fast, inexpensive inference at scale. Tags: Vision | Deprecated | 10/10 | → |
| Moondream3 Preview [Detect] vision | Moondream 3 is a vision language model that brings frontier-level visual reasoning with native object detection, pointing, and OCR capabilities to real-world applications requiring fast, inexpensive inference at scale. Tags: Vision | Deprecated | 10/10 | → |
| Kling Video video-to-audio | Generate audio from input videos using Kling | Deprecated | 10/9 | → |
| Sora 2 video-to-video | Video-to-video remix endpoint for Sora 2, OpenAI’s advanced model that transforms existing videos based on new text or image prompts allowing rich edits, style changes, and creative reinterpretations while preserving motion and structure. Tags: video to video audio sora | Deprecated | 10/9 | → |
| Meshy 6 Preview image-to-3d | Meshy-6-Preview is the latest model from Meshy. It generates realistic and production ready 3D models. Tags: image-to-3d | Deprecated | 10/8 | → |
| Meshy 5 Multi image-to-3d | Meshy-5 multi image generates realistic and production ready 3D models from multiple images. Tags: multi-image-to-3d | Deprecated | 10/8 | → |
| Meshy 6 Preview text-to-3d | Meshy-6-Preview is the latest model from Meshy. It generates realistic and production ready 3D models. Tags: text-to-3d | Deprecated | 10/8 | → |
| Hunyuan Part 3d-to-3d | Use the capabilities of hunyuan part to generate point clouds from your 3D files. Tags: 3D-to-3D point-cloud | Deprecated | 10/8 | → |
| Wan 2.1 VACE Long Reframe video-to-video | Reframe entire videos scene-by-scene using Wan VACE 2.1 | Deprecated | 10/8 | → |
| Index TTS 2.0 text-to-speech | Generate natural, clear speech using Index TTS 2.0 from IndexTeam. Tags: text-to-speech | Deprecated | 10/7 | → |
| Sora 2 text-to-video | Text-to-video endpoint for Sora 2 Pro, OpenAI's state-of-the-art video model capable of creating richly detailed, dynamic clips with audio from natural language or images. Tags: text-to-video audio sora-2-pro | Deprecated | 10/6 | → |
| Sora 2 image-to-video | Image-to-video endpoint for Sora 2 Pro, OpenAI's state-of-the-art video model capable of creating richly detailed, dynamic clips with audio from natural language or images. Tags: image-to-video audio sora-2-pro | Deprecated | 10/6 | → |
| Sora 2 image-to-video | Image-to-video endpoint for Sora 2, OpenAI's state-of-the-art video model capable of creating richly detailed, dynamic clips with audio from natural language or images. Tags: image-to-video audio sora | Deprecated | 10/6 | → |
| Sora 2 text-to-video | Text-to-video endpoint for Sora 2, OpenAI's state-of-the-art video model capable of creating richly detailed, dynamic clips with audio from natural language or images. Tags: text to video audio sora | Deprecated | 10/6 | → |
| Ovi Text to Video text-to-video | A unified paradigm for audio-video generation | Deprecated | 10/3 | → |
| Lucidflux image-to-image |
LucidFlux for upscaling images with very high fidelity
image-to-image |
Deprecated | 10/3 | → |
| Qwen Image Edit Plus Lora image-to-image |
LoRA endpoint for the Qwen Image Edit Plus model.
image-to-image image-editing |
Deprecated | 10/3 | → |
| Ovi image-to-video |
Ovi can generate videos with audio from image and text inputs.
image-to-audio-video image-to-video |
Deprecated | 10/3 | → |
| Fabric 1.0 Fast image-to-video | VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video | Deprecated | 10/1 | → |
| Qwen Image Edit image-to-image |
Image to Image Endpoint for Qwen's Image Editing model. Has superior text editing capabilities.
stylized transform |
Deprecated | 9/30 | → |
| Hunyuan Image text-to-image |
Leverage the state-of-the-art capabilities of Hunyuan Image 3.0 to generate visual content that effectively conveys the messaging of your written material.
text-to-image |
Deprecated | 9/28 | → |
| Hyper3d image-to-3d |
Rodin by Hyper3D generates realistic and production ready 3D models from text or images.
image-to-3d |
Deprecated | 9/27 | → |
| Lynx image-to-video |
Generate subject-consistent videos using Lynx from ByteDance!
image-to-video subject |
Deprecated | 9/26 | → |
| Wan 2.5 Text to Image text-to-image | Wan 2.5 text-to-image model. | Deprecated | 9/26 | → |
| Wan 2.5 Image to Image image-to-image | Wan 2.5 image-to-image model. | Deprecated | 9/26 | → |
| Wan 2.5 Text to Video text-to-video | Wan 2.5 text-to-video model. | Deprecated | 9/24 | → |
| Wan 2.5 Image to Video image-to-video | Wan 2.5 image-to-video model. | Deprecated | 9/24 | → |
| Bytedance OmniHuman v1.5 image-to-video |
Omnihuman v1.5 is a new and improved version of Omnihuman. It generates video using an image of a human figure paired with an audio file. It produces vivid, high-quality videos where the character’s emotions and movements maintain a strong correlation with the audio.
image-to-video lipsync |
Deprecated | 9/23 | → |
| Product Photoshoot image-to-image | Create product advertisements with an example image of the product | Deprecated | 9/23 | → |
| Kling v2.5 Text to Video text-to-video |
Kling 2.5 Turbo Pro: Top-tier text-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision.
animation stylized |
Deprecated | 9/23 | → |
| Kling Video image-to-video |
Kling 2.5 Turbo Pro: Top-tier image-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision.
stylized transform |
Deprecated | 9/23 | → |
| Qwen Image Edit Plus image-to-image |
Endpoint for Qwen's Image Editing Plus model. Has superior text editing capabilities and multi-image support.
image-editing image-to-image high-quality-text |
Deprecated | 9/23 | → |
| Infinitalk video-to-video |
Infinitalk model generates a talking avatar video from an image and audio file. The avatar lip-syncs to the provided audio with natural facial expressions.
video-to-video |
Deprecated | 9/22 | → |
| SeedVR2 image-to-image |
Use SeedVR2 to upscale your images
upscale image-to-image |
Deprecated | 9/22 | → |
| SeedVR2 video-to-video |
Upscale your videos using SeedVR2 with temporal consistency!
upscale video-to-video |
Deprecated | 9/22 | → |
| Wan VACE Video Edit video-to-video |
Edit videos using plain language and Wan VACE
video-edit wan-vace |
Deprecated | 9/22 | → |
| Wan-2.2 Animate Move video-to-video |
Wan-Animate is a video model that generates high-fidelity character videos by replicating the expressions and movements of characters from reference videos.
video to video motion |
Deprecated | 9/21 | → |
| Wan-2.2 Animate Replace video-to-video |
Wan-Animate Replace is a model that can integrate animated characters into reference videos, replacing the original character while preserving the scene’s lighting and color tone for seamless environmental integration.
video to video motion |
Deprecated | 9/21 | → |
| Fabric 1.0 image-to-video | VEED Fabric 1.0 is an image-to-video API that turns any image into a talking video | Deprecated | 9/20 | → |
| Headshot Generator image-to-image |
Generate professional headshot photos with customizable backgrounds.
headshot profile-photo |
Deprecated | 9/19 | → |
| Object Removal image-to-image |
Remove unwanted objects seamlessly from any image.
remove object-removal |
Deprecated | 9/19 | → |
| Perspective Change image-to-image |
Easily adjust the perspective of any image to different angles.
change-angle perspective |
Deprecated | 9/19 | → |
| Photography Effects image-to-image |
Apply diverse photography styles and effects to transform your images.
style-transfer photography |
Deprecated | 9/19 | → |
| Portrait Enhance image-to-image |
Enhance and refine portrait photos with improved clarity and detail.
image-edit enhancement |
Deprecated | 9/19 | → |
| Photo Restoration image-to-image |
Restore old or damaged photos by fixing colors, scratches, and resolution.
photo-restoration image-enhance |
Deprecated | 9/19 | → |
| Style Transfer image-to-image |
Apply artistic styles like impressionism, cubism, or surrealism to your images.
style-transfer |
Deprecated | 9/19 | → |
| Relighting image-to-image |
Adjust and enhance images with different lighting styles.
relighting |
Deprecated | 9/19 | → |
| Texture Transform image-to-image |
Transform objects with different surface textures like marble, wood, or fabric.
texture-transform |
Deprecated | 9/19 | → |
| Virtual Try-on image-to-image |
Try on clothes virtually by combining person and clothing images.
fashion try-on virtual-try-on |
Deprecated | 9/19 | → |
| Product Photography image-to-image |
Generate professional product photography with realistic lighting and backgrounds.
product marketing |
Deprecated | 9/19 | → |
| Product Holding image-to-image |
Place products naturally in a person’s hands for realistic marketing visuals.
product marketing |
Deprecated | 9/19 | → |
| Lucy Edit [Dev] video-to-video | Lucy Edit Dev | Deprecated | 9/18 | → |
| Lucy Edit [Pro] video-to-video | Lucy Edit Pro | Deprecated | 9/18 | → |
| Isaac 01 vision |
Isaac-01 is a multimodal vision-language model from Perceptron for various vision language tasks.
multimodal vision |
Deprecated | 9/18 | → |
| Wan 2.2 VACE Fun A14B video-to-video | VACE Fun for Wan 2.2 A14B from Alibaba-PAI | Deprecated | 9/17 | → |
| Wan 2.2 VACE Fun A14B video-to-video | VACE Fun for Wan 2.2 A14B from Alibaba-PAI | Deprecated | 9/17 | → |
| Wan 2.2 VACE Fun A14B video-to-video | VACE Fun for Wan 2.2 A14B from Alibaba-PAI | Deprecated | 9/17 | → |
| Wan 2.2 VACE Fun A14B video-to-video | VACE Fun for Wan 2.2 A14B from Alibaba-PAI | Deprecated | 9/17 | → |
| Wan 2.2 VACE Fun A14B video-to-video | VACE Fun for Wan 2.2 A14B from Alibaba-PAI | Deprecated | 9/17 | → |
| Qwen Image Edit image-to-image |
Inpainting Endpoint for the Qwen Edit Image editing model.
image-to-image inpainting qwen-image |
Deprecated | 9/17 | → |
| FLUX.1 SRPO [dev] text-to-image | FLUX.1 SRPO [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use. | Deprecated | 9/16 | → |
| FLUX.1 SRPO [dev] image-to-image | FLUX.1 SRPO [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use. | Deprecated | 9/16 | → |
| FLUX.1 SRPO [dev] text-to-image | FLUX.1 SRPO [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use. | Deprecated | 9/15 | → |
| FLUX.1 SRPO [dev] image-to-image | FLUX.1 SRPO [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use. | Deprecated | 9/15 | → |
| Pshuman image-to-3d |
Use the 6D pose estimation capabilities of PSHuman to generate 3D files from a single image.
image-to-3D |
Deprecated | 9/13 | → |
| Kling AI Avatar Pro image-to-video |
Kling AI Avatar Pro: The premium endpoint for creating avatar videos with realistic humans, animals, cartoons, or stylized characters
stylized transform |
Deprecated | 9/13 | → |
| Kling AI Avatar image-to-video |
Kling AI Avatar Standard: Endpoint for creating avatar videos with realistic humans, animals, cartoons, or stylized characters
stylized transform |
Deprecated | 9/13 | → |
| Kling TTS text-to-speech |
Generate speech from text prompts and different voices using the Kling TTS model, which leverages advanced AI techniques to create high-quality text-to-speech.
audio |
Deprecated | 9/13 | → |
| MiniMax (Hailuo AI) Music v1.5 text-to-audio |
Generate music from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality, diverse musical compositions.
music |
Deprecated | 9/12 | → |
| Decart Lucy 14b image-to-video | Lucy-14B delivers lightning fast performance that redefines what's possible with image-to-video AI | Deprecated | 9/10 | → |
| Qwen Image Edit Lora image-to-image |
LoRA inference endpoint for the Qwen Image Editing model.
image-to-image image-editing lora |
Deprecated | 9/10 | → |
| Stable Audio 2.5 audio-to-audio |
Generate high quality music and sound effects using Stable Audio 2.5 from StabilityAI
audio |
Deprecated | 9/10 | → |
| Stable Audio 2.5 text-to-audio |
Generate high quality music and sound effects using Stable Audio 2.5 from StabilityAI
audio |
Deprecated | 9/10 | → |
| Stable Audio 2.5 audio-to-audio |
Generate high quality music and sound effects using Stable Audio 2.5 from StabilityAI
audio |
Deprecated | 9/10 | → |
| Hunyuan Image text-to-image |
Use the amazing capabilities of Hunyuan Image 2.1 to generate images that express the feelings of your text.
text-to-image |
Deprecated | 9/9 | → |
| Elevenlabs text-to-audio |
Generate realistic audio dialogues using Eleven-v3 from ElevenLabs.
audio |
Deprecated | 9/9 | → |
| Vidu image-to-image |
Vidu Reference-to-Image creates images by combining reference images with a prompt.
images-to-image |
Deprecated | 9/9 | → |
| Bytedance text-to-image |
A new-generation image creation model from ByteDance, Seedream 4.0 integrates image generation and image editing capabilities into a single, unified architecture.
stylized transform |
Deprecated | 9/9 | → |
| Bytedance image-to-image |
A new-generation image creation model from ByteDance, Seedream 4.0 integrates image generation and image editing capabilities into a single, unified architecture.
stylized transform |
Deprecated | 9/9 | → |
| Hunyuan Video Foley video-to-video |
Use the capabilities of the Hunyuan Foley model to bring your videos to life by adding sound effects to them.
video-to-video add-sound |
Deprecated | 9/8 | → |
| Avatars Audio to Video audio-to-video | High-quality avatar videos that feel real, generated from your audio | Deprecated | 9/4 | → |
| Avatars Text to Video text-to-video | High-quality avatar videos that feel real, generated from your text | Deprecated | 9/4 | → |
| Chatterbox text-to-speech |
Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. Use the first TTS from Resemble AI.
text-to-speech multilingual |
Deprecated | 9/4 | → |
| Wan image-to-image |
Wan 2.2's 14B model edits high-resolution, photorealistic images with powerful prompt understanding and fine-grained visual detail
image-to-image |
Deprecated | 9/3 | → |
| Elevenlabs text-to-audio |
Generate sound effects using ElevenLabs' advanced sound effects model.
sound |
Deprecated | 9/2 | → |
| Sync Lipsync video-to-video |
Generate high-quality realistic lipsync animations from audio while preserving unique details like natural teeth and unique facial features using the state-of-the-art Sync Lipsync 2 Pro model.
animation lip sync high-quality |
Deprecated | 9/2 | → |
| Bytedance image-to-video |
Seedance lite reference-to-video allows the use of 1 to 4 images as reference to create a high-quality video.
reference-to-video image-to-video |
Deprecated | 9/1 | → |
| Uso image-to-image |
Use USO to perform subject-driven generation using a reference image.
image-to-image |
Deprecated | 8/30 | → |
| Sonauto V2 text-to-audio |
Replace sections of an existing audio with newly generated content
music text-to-music text-to-audio |
Deprecated | 8/28 | → |
| Sonauto V2 audio-to-audio |
Extend an existing song
music text-to-music text-to-audio |
Deprecated | 8/28 | → |
| Wan 2.2 Fun Control video-to-video |
Generate pose or depth controlled video using Alibaba-PAI's Wan 2.2 Fun
wan pose depth |
Deprecated | 8/28 | → |
| Decart image-to-video | Lucy-5B is a model that can create 5-second I2V videos in under 5 seconds, achieving >1x RTF end-to-end | Deprecated | 8/28 | → |
| Pixverse text-to-video | Generate high quality video clips from text and image prompts using PixVerse v5 | Deprecated | 8/27 | → |
| Pixverse v5 Image to Video image-to-video |
Generate high quality video clips from text and image prompts using PixVerse v5
stylized transform |
Deprecated | 8/27 | → |
| Pixverse image-to-video |
Create seamless transitions between images using PixVerse v5
stylized transform |
Deprecated | 8/27 | → |
| VibeVoice 1.5B text-to-speech |
Generate long, expressive multi-voice speech using Microsoft's powerful TTS
text-to-speech multi-speaker podcast |
Deprecated | 8/27 | → |
| VibeVoice 7B text-to-speech |
Generate long, expressive multi-voice speech using Microsoft's powerful TTS
text-to-speech multi-speaker podcast |
Deprecated | 8/27 | → |
| Wan-2.2 Speech-to-Video 14B audio-to-video |
Wan-S2V is a video model that generates high-quality videos from static images and audio, with realistic facial expressions, body movements, and professional camera work for film and television applications
audio-to-video talking-head |
Deprecated | 8/27 | → |
| Nano Banana text-to-image |
Google's state-of-the-art image generation and editing model
image-generation |
Deprecated | 8/26 | → |
| Nano Banana image-to-image |
Google's state-of-the-art image generation and editing model
image-editing |
Deprecated | 8/26 | → |
| Gemini 2.5 Flash Image text-to-image |
Nano Banana is Google's state-of-the-art image generation and editing model
text-to-image |
Deprecated | 8/26 | → |
| Gemini 2.5 Flash Image image-to-image |
Gemini 2.5 Flash Image is Google's state-of-the-art image generation and editing model
image-editing |
Deprecated | 8/26 | → |
| Video video-to-video |
Upscale videos up to 8K output resolution. Trained on fully licensed and commercially safe data.
video-upscaling upscale |
Deprecated | 8/26 | → |
| Qwen Image image-to-image |
Qwen-Image (Image-to-Image) transforms and edits input images with high fidelity, enabling precise style transfer, enhancement, and creative modification.
image-to-image |
Deprecated | 8/25 | → |
| Infinitalk image-to-video |
Infinitalk model generates a talking avatar video from an image and audio file. The avatar lip-syncs to the provided audio with natural facial expressions.
stylized transform |
Deprecated | 8/23 | → |
| Infinitalk text-to-video | Infinitalk model generates a talking avatar video from a text and audio file. The avatar lip-syncs to the provided audio with natural facial expressions. | Deprecated | 8/23 | → |
| Elevenlabs text-to-audio |
Generate text-to-speech audio using Eleven-v3 from ElevenLabs.
audio |
Deprecated | 8/20 | → |
| Nextstep 1 image-to-image | Endpoint for NextStep-1 Autoregressive Image Editing model. | Deprecated | 8/20 | → |
| Reimagine image-to-image |
Reimagine uses a structure reference for generating new images while preserving the structure of an input image, guided by text prompts.
Perfect for transforming sketches, illustrations, or photos into new illustrations. Trained exclusively on licensed data
bria |
Deprecated | 8/20 | → |
| Mirelo SFX video-to-video |
Generate synced sounds for any video, and return it with its new sound track
video-to-video sfx |
Deprecated | 8/19 | → |
| Mirelo SFX video-to-audio |
Generate synced sounds for any video, and return the new sound track
sfx |
Deprecated | 8/19 | → |
| Qwen Image Edit image-to-image |
Endpoint for Qwen's Image Editing model. Has superior text editing capabilities.
image-editing image-to-image high-quality-text |
Deprecated | 8/18 | → |
| Qwen Image Trainer training |
Qwen Image LoRA training
lora personalization |
Deprecated | 8/14 | → |
| Marey Realism V1.5 text-to-video | Generate a video from a text prompt with Marey, a generative video model trained exclusively on fully licensed data. | Deprecated | 8/14 | → |
| Marey Realism V1.5 image-to-video | Generate a video starting from an image as the first frame with Marey, a generative video model trained exclusively on fully licensed data. | Deprecated | 8/14 | → |
| Marey Realism V1.5 video-to-video | Pull motion from a reference video and apply it to new subjects or scenes. | Deprecated | 8/14 | → |
| Marey Realism V1.5 video-to-video | Ideal for matching human movement. Your input video determines human poses, gestures, and body movements that will appear in the generated video. | Deprecated | 8/14 | → |
| Stable Avatar audio-to-video |
Stable Avatar generates audio-driven video avatars up to five minutes long
stable-avatar talking-head audio-to-video |
Deprecated | 8/14 | → |
| ControlNet SDXL image-to-image |
Generate Images with ControlNet.
diffusion controlnet manipulation |
Deprecated | 8/13 | → |
| MusePose video-to-video | Animate a reference image with a driving video using MusePose. | Deprecated | 8/13 | → |
| Segment Anything Model image-to-image |
SAM.
segmentation mask |
Deprecated | 8/13 | → |
| LLaVA v1.5 13B vision |
Vision
multimodal vision |
Deprecated | 8/13 | → |
| LTX Video-0.9.5 image-to-video |
Generate videos from prompts and images using LTX Video-0.9.5
video image-to-video |
Deprecated | 8/13 | → |
| Hidream E1 Full image-to-image | Edit images with natural language | Deprecated | 8/13 | → |
| LTX Video-0.9.7 image-to-video |
Deprecated.
Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead.
video image-to-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 text-to-video |
Deprecated.
Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead.
video text-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 video-to-video |
Deprecated.
Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead.
video image-to-video text-to-video |
Deprecated | 8/13 | → |
| Ltx Video V097 video-to-video | Deprecated. Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead. | Deprecated | 8/13 | → |
| LTX Video-0.9.7 LoRA text-to-video |
Deprecated.
Use fal-ai/ltx-video-13b-dev or fal-ai/ltx-video-13b-distilled instead.
video ltx-video text-to-video |
Deprecated | 8/13 | → |
| Stable Diffusion with LoRAs text-to-image |
Run Any Stable Diffusion model with customizable LoRA weights.
diffusion lora customization |
Deprecated | 8/13 | → |
| Remove Background image-to-image |
Remove the background from an image.
background removal utility editing |
Deprecated | 8/13 | → |
| Upscale Images image-to-image |
Upscale images by a given factor.
upscaling high-res |
Deprecated | 8/13 | → |
| Inpainting sdxl and sd image-to-image |
Inpaint images with SD and SDXL
editing diffusion |
Deprecated | 8/13 | → |
| Animatediff SparseCtrl LCM text-to-video |
Animate Your Drawings with Latent Consistency Models!
lcm animation stylized |
Deprecated | 8/13 | → |
| Optimized Latent Consistency (SDv1.5) image-to-image |
Produce high-quality images with minimal inference steps. Optimized for 512x512 input image size.
diffusion lcm real-time |
Deprecated | 8/13 | → |
| Fooocus text-to-image |
Default parameters with automated optimizations and quality improvements.
stylized |
Deprecated | 8/13 | → |
| ControlNet SDXL image-to-image |
Generate Images with ControlNet.
diffusion controlnet editing manipulation |
Deprecated | 8/13 | → |
| ControlNet SDXL image-to-image |
Generate Images with ControlNet.
diffusion controlnet editing manipulation |
Deprecated | 8/13 | → |
| PuLID image-to-image |
Tuning-free ID customization.
editing customization personalization |
Deprecated | 8/13 | → |
| Marigold Depth Estimation image-to-image |
Create depth maps using Marigold depth estimation.
depth utility |
Deprecated | 8/13 | → |
| Stable Audio Open text-to-audio |
Open source text-to-audio model.
music |
Deprecated | 8/13 | → |
| DiffusionEdge text-to-image |
Diffusion based high quality edge detection
detection |
Deprecated | 8/13 | → |
| TripoSR image-to-3d | State of the art Image to 3D Object generation | Deprecated | 8/13 | → |
| Latent Consistency (SDXL & SDv1.5) text-to-image |
Produce high-quality images with minimal inference steps.
diffusion lcm real-time |
Deprecated | 8/13 | → |
| Clarity Upscaler image-to-image |
Clarity upscaler for upscaling images with very high fidelity.
upscaling |
Deprecated | 8/13 | → |
| AnimateDiff video-to-video |
Re-animate your videos!
animation stylized |
Deprecated | 8/13 | → |
| MiniMax (Hailuo AI) Video 01 text-to-video |
Generate video clips from your prompts using MiniMax model
motion transformation |
Deprecated | 8/13 | → |
| Fooocus Inpainting text-to-image |
Default parameters with automated optimizations and quality improvements.
stylized editing |
Deprecated | 8/13 | → |
| AnimateDiff Turbo video-to-video |
Re-animate your videos at lightning speed!
animation stylized turbo |
Deprecated | 8/13 | → |
| Midas Depth Estimation image-to-image |
Create depth maps using Midas depth estimation.
depth utility |
Deprecated | 8/13 | → |
| Stable Video Diffusion Turbo image-to-video |
Generate short video clips from your images using SVD v1.1 at Lightning Speed
turbo |
Deprecated | 8/13 | → |
| Face Retoucher image-to-image |
Automatically retouches faces to smooth skin and remove blemishes.
editing |
Deprecated | 8/13 | → |
| Fooocus Image Prompt text-to-image |
Default parameters with automated optimizations and quality improvements.
stylized |
Deprecated | 8/13 | → |
| Illusion Diffusion text-to-image |
Create illusions conditioned on image.
composition stylized |
Deprecated | 8/13 | → |
| AnimateDiff Turbo text-to-video |
Animate your ideas at lightning speed!
animation stylized turbo |
Deprecated | 8/13 | → |
| LLaVA v1.6 34B vision |
Vision
multimodal vision |
Deprecated | 8/13 | → |
| Any LLM llm |
Use any large language model from our selected catalogue (powered by OpenRouter)
chat claude gpt streaming |
Deprecated | 8/13 | → |
| Fooocus text-to-image | Fooocus extreme speed mode as a standalone app. | Deprecated | 8/13 | → |
| Latent Consistency Models (v1.5/XL) image-to-image |
Run SDXL at the speed of light
lcm diffusion turbo real-time editing |
Deprecated | 8/13 | → |
| Latent Consistency Models (v1.5/XL) text-to-image |
Run SDXL at the speed of light
lcm diffusion turbo real-time |
Deprecated | 8/13 | → |
| Latent Consistency Models (v1.5/XL) image-to-image |
Run SDXL at the speed of light
lcm diffusion turbo real-time editing |
Deprecated | 8/13 | → |
| Whisper speech-to-text |
Whisper is a model for speech transcription and translation.
transcription translation speech |
Deprecated | 8/13 | → |
| AnimateDiff text-to-video |
Animate your ideas!
animation stylized |
Deprecated | 8/13 | → |
| AMT Interpolation video-to-video |
Interpolate between video frames
interpolation editing |
Deprecated | 8/13 | → |
| Playground v2.5 image-to-image |
State-of-the-art open-source model in aesthetic quality
artistic style |
Deprecated | 8/13 | → |
| Hyper SDXL text-to-image |
Hyper-charge SDXL's performance and creativity.
diffusion real-time |
Deprecated | 8/13 | → |
| Stable Diffusion XL Lightning image-to-image |
Run SDXL at the speed of light
diffusion lightning |
Deprecated | 8/13 | → |
| Playground v2.5 image-to-image |
State-of-the-art open-source model in aesthetic quality
inpaint artistic style |
Deprecated | 8/13 | → |
| Stable Diffusion XL Lightning image-to-image |
Run SDXL at the speed of light
diffusion lightning editing |
Deprecated | 8/13 | → |
| Birefnet Background Removal image-to-image |
Bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS)
background removal segmentation high-res utility |
Deprecated | 8/13 | → |
| Creative Upscaler image-to-image |
Create creative upscaled images.
upscaling |
Deprecated | 8/13 | → |
| ControlNet SDXL text-to-image |
Generate Images with ControlNet.
diffusion controlnet manipulation |
Deprecated | 8/13 | → |
| T2V Turbo - Video Crafter text-to-video |
Generate short video clips from your prompts
turbo |
Deprecated | 8/13 | → |
| PhotoMaker image-to-image |
Customizing Realistic Human Photos via Stacked ID Embedding
editing customization realism personalization |
Deprecated | 8/13 | → |
| Face to Sticker image-to-image |
Create stickers from faces.
sticker editing |
Deprecated | 8/13 | → |
| Fooocus text-to-image |
Fooocus extreme speed mode as a standalone app.
stylized |
Deprecated | 8/13 | → |
| Moondream vision |
Answer questions about images.
multimodal vision |
Deprecated | 8/13 | → |
| NSFW Filter vision |
Predict the probability of an image being NSFW.
filter safety utility |
Deprecated | 8/13 | → |
| Wizper (Whisper v3 -- fal.ai edition) speech-to-text |
[Experimental] Whisper v3 Large -- but optimized by our inference wizards. Same WER, double the performance!
transcription speech |
Deprecated | 8/13 | → |
| Sad Talker image-to-video |
Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
animation |
Deprecated | 8/13 | → |
| AuraSR image-to-image |
Upscale your images with AuraSR.
upscaling high-res |
Deprecated | 8/13 | → |
| Stable Diffusion XL Lightning text-to-image |
Run SDXL at the speed of light
diffusion lightning real-time |
Deprecated | 8/13 | → |
| MuseTalk image-to-video |
MuseTalk is a real-time high quality audio-driven lip-syncing model. Use MuseTalk to animate a face with your own audio.
animation lip sync real-time |
Deprecated | 8/13 | → |
| Layer Diffusion XL text-to-image | SDXL with an alpha channel. | Deprecated | 8/13 | → |
| Stable Diffusion v1.5 text-to-image |
Stable Diffusion v1.5
diffusion |
Deprecated | 8/13 | → |
| Stable Diffusion XL image-to-image |
Run SDXL at the speed of light
diffusion high-res lora ip-adapter controlnet |
Deprecated | 8/13 | → |
| Stable Diffusion XL image-to-image |
Run SDXL at the speed of light
diffusion high-res lora ip-adapter controlnet |
Deprecated | 8/13 | → |
| Stable Diffusion with LoRAs image-to-image |
Run Any Stable Diffusion model with customizable LoRA weights.
diffusion lora customization fine-tuning |
Deprecated | 8/13 | → |
| Stable Diffusion with LoRAs image-to-image |
Run Any Stable Diffusion model with customizable LoRA weights.
diffusion lora customization fine-tuning |
Deprecated | 8/13 | → |
| IP Adapter Face ID image-to-image |
High quality zero-shot personalization
ip-adapter personalization customization editing |
Deprecated | 8/13 | → |
| Hyper SDXL image-to-image |
Hyper-charge SDXL's performance and creativity.
diffusion |
Deprecated | 8/13 | → |
| Dreamshaper text-to-image |
Dreamshaper model.
stylized diffusion |
Deprecated | 8/13 | → |
| Realistic Vision text-to-image |
Generate realistic images.
realism diffusion |
Deprecated | 8/13 | → |
| Hyper SDXL image-to-image |
Hyper-charge SDXL's performance and creativity.
diffusion editing |
Deprecated | 8/13 | → |
| Playground v2.5 text-to-image |
State-of-the-art open-source model in aesthetic quality
artistic style |
Deprecated | 8/13 | → |
| Lightning Models text-to-image |
Collection of SDXL Lightning models.
diffusion lightning |
Deprecated | 8/13 | → |
| Omni Zero image-to-image |
Any pose, any style, any identity
style transfer |
Deprecated | 8/13 | → |
| CCSR Upscaler image-to-image |
SOTA Image Upscaler
upscaling |
Deprecated | 8/13 | → |
| SD 1.5 Depth ControlNet image-to-image |
SD 1.5 ControlNet
diffusion editing manipulation controlnet |
Deprecated | 8/13 | → |
| DWPose Pose Prediction image-to-image |
Predict poses from images.
pose utility |
Deprecated | 8/13 | → |
| Stable Video Diffusion Turbo text-to-video |
Generate short video clips from your images using SVD v1.1 at Lightning Speed
lcm diffusion turbo |
Deprecated | 8/13 | → |
| Luma Dream Machine image-to-video |
Generate video clips from your images using Luma Dream Machine v1.5
motion transformation |
Deprecated | 8/13 | → |
| Luma Photon text-to-image | Generate images from your prompts using Luma Photon. Photon is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation. | Deprecated | 8/13 | → |
| SoteDiffusion text-to-image |
Anime finetune of Würstchen V3.
lcm stylized |
Deprecated | 8/13 | → |
| Stable Diffusion V3 image-to-image |
Stable Diffusion 3 Medium (Image to Image) is a Multimodal Diffusion Transformer (MMDiT) model that improves image quality, typography, prompt understanding, and efficiency.
diffusion editing style |
Deprecated | 8/13 | → |
| Stable Diffusion XL text-to-image |
Run SDXL at the speed of light
diffusion lora embeddings high-res style |
Deprecated | 8/13 | → |
| Florence-2 Large vision |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
captioning multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large image-to-image |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
multimodal vision segmentation |
Deprecated | 8/13 | → |
| Florence-2 Large vision |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
captioning multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large vision |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large image-to-image |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large image-to-image |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
multimodal vision segmentation |
Deprecated | 8/13 | → |
| Florence-2 Large vision |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
ocr multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large vision |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
captioning multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large image-to-image |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
ocr multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large image-to-image |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large image-to-image |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large vision |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
multimodal vision |
Deprecated | 8/13 | → |
| Florence-2 Large image-to-image |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
multimodal vision detection |
Deprecated | 8/13 | → |
| Florence-2 Large image-to-image |
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks
detection multimodal vision |
Deprecated | 8/13 | → |
| Stable Cascade text-to-image |
Stable Cascade: Image generation on a smaller & cheaper latent space.
diffusion lcm |
Deprecated | 8/13 | → |
| Era 3D image-to-image | A powerful image to novel multiview model with normals. | Deprecated | 8/13 | → |
| Live Portrait image-to-video |
Transfer expression from a video to a portrait.
expression animation |
Deprecated | 8/13 | → |
| FLUX.1 [dev] image-to-image |
FLUX.1 Image-to-Image is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
style transfer |
Deprecated | 8/13 | → |
| AMT Frame Interpolation image-to-video |
Interpolate between image frames
interpolation editing |
Deprecated | 8/13 | → |
| Kolors text-to-image |
Photorealistic Text-to-Image
realism diffusion |
Deprecated | 8/13 | → |
| SDXL ControlNet Union image-to-image |
An efficient SDXL multi-controlnet inpainting model.
diffusion controlnet composition |
Deprecated | 8/13 | → |
| SDXL ControlNet Union image-to-image |
An efficient SDXL multi-controlnet image-to-image model.
diffusion controlnet composition |
Deprecated | 8/13 | → |
| SDXL ControlNet Union text-to-image |
An efficient SDXL multi-controlnet text-to-image model.
diffusion controlnet composition |
Deprecated | 8/13 | → |
| FLUX.1 [dev] with LoRAs text-to-image |
Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
lora personalization |
Deprecated | 8/13 | → |
| PixArt-Σ text-to-image |
Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
diffusion |
Deprecated | 8/13 | → |
| Sana text-to-image | Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, with the ability to generate 4K images in less than a second. | Deprecated | 8/13 | → |
| FLUX.1 Subject text-to-image |
Super fast endpoint for the FLUX.1 [schnell] model with subject input capabilities, enabling rapid and high-quality image generation for personalization, specific styles, brand identities, and product-specific outputs.
personalization customization |
Deprecated | 8/13 | → |
| Fooocus Upscale or Vary text-to-image |
Default parameters with automated optimizations and quality improvements.
upscaling vary stylized |
Deprecated | 8/13 | → |
| FLUX.1 [dev] with LoRAs image-to-image |
FLUX LoRA Image-to-Image is a high-performance endpoint that transforms existing images using FLUX models, leveraging LoRA adaptations to enable rapid and precise image style transfer, modifications, and artistic variations.
lora style transfer |
Deprecated | 8/13 | → |
| FLUX.1 [dev] with Controlnets and Loras image-to-image |
A specialized FLUX endpoint combining differential diffusion control with LoRA, ControlNet, and IP-Adapter support, enabling precise, region-specific image transformations through customizable change maps.
lora controlnet ip-adapter |
Deprecated | 8/13 | → |
| FLUX.1 [dev] with Controlnets and Loras image-to-image |
FLUX General Inpainting is a versatile endpoint that enables precise image editing and completion, supporting multiple AI extensions including LoRA, ControlNet, and IP-Adapter for enhanced control over inpainting results and sophisticated image modifications.
lora controlnet ip-adapter |
Deprecated | 8/13 | → |
| FLUX.1 [dev] with Controlnets and Loras image-to-image |
FLUX General Image-to-Image is a versatile endpoint that transforms existing images with support for LoRA, ControlNet, and IP-Adapter extensions, enabling precise control over style transfer, modifications, and artistic variations through multiple guidance methods.
lora controlnet ip-adapter |
Deprecated | 8/13 | → |
| Segment Anything Model 2 video-to-video |
SAM 2 is a model for segmenting images and videos in real-time.
segmentation mask real-time |
Deprecated | 8/13 | → |
| Segment Anything Model 2 image-to-image |
SAM 2 is a model for segmenting images and videos in real-time.
segmentation mask real-time |
Deprecated | 8/13 | → |
| Stable Diffusion V3 text-to-image |
Stable Diffusion 3 Medium (Text to Image) is a Multimodal Diffusion Transformer (MMDiT) model that improves image quality, typography, prompt understanding, and efficiency.
diffusion style |
Deprecated | 8/13 | → |
| FLUX.1 [dev] with Controlnets and Loras text-to-image |
A versatile endpoint for the FLUX.1 [dev] model that supports multiple AI extensions including LoRA, ControlNet conditioning, and IP-Adapter integration, enabling comprehensive control over image generation through various guidance methods.
lora controlnet ip-adapter |
Deprecated | 8/13 | → |
| ControlNeXt SVD video-to-video |
Animate a reference image with a driving video using ControlNeXt.
animation stylized |
Deprecated | 8/13 | → |
| Stable Video Diffusion text-to-video | Generate short video clips from your prompts using SVD v1.1 | Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
PIDI (Pidinet) preprocessor.
detection preprocess utility controlnet |
Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
M-LSD line segment detection preprocessor.
preprocess utility controlnet |
Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
TEED (Tiny and Efficient Edge Detector) preprocessor.
preprocess detection utility controlnet |
Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
ZoeDepth preprocessor.
depth preprocess utility controlnet |
Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
Segment Anything Model (SAM) preprocessor.
segmentation preprocess utility mask controlnet |
Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
Line art preprocessor.
preprocess utility sketch controlnet |
Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
MiDaS depth estimation preprocessor.
depth preprocess utility controlnet |
Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
Depth Anything v2 preprocessor.
depth preprocess utility controlnet |
Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
Scribble preprocessor.
preprocess utility editing controlnet sketch |
Deprecated | 8/13 | → |
| Image Preprocessors image-to-image |
Holistically-Nested Edge Detection (HED) preprocessor.
preprocess detection utility controlnet |
Deprecated | 8/13 | → |
| High Quality Stable Video Diffusion image-to-video | Generate short video clips from your images using SVD v1.1 | Deprecated | 8/13 | → |
| FLUX.1 [dev] with Controlnets and Loras image-to-image |
A general purpose endpoint for the FLUX.1 [dev] model, implementing the RF-Inversion pipeline. This can be used to edit a reference image based on a prompt.
rf-inversion editing lora |
Deprecated | 8/13 | → |
| FLUX.1 [dev] Inpainting with LoRAs text-to-image |
Super fast endpoint for the FLUX.1 [dev] inpainting model with LoRA support, enabling rapid and high-quality image inpainting using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
lora personalization |
Deprecated | 8/13 | → |
| Live Portrait image-to-image |
Transfer expression from a video to a portrait.
expression animation |
Deprecated | 8/13 | → |
| FLUX.1 [pro] text-to-image | FLUX.1 [pro] new is an accelerated version of FLUX.1 [pro], maintaining professional-grade image quality while delivering significantly faster generation speeds. | Deprecated | 8/13 | → |
| LTX Video (preview) text-to-video | Generate videos from prompts using LTX Video | Deprecated | 8/13 | → |
| Kling 1.0 text-to-video |
Generate video clips from your prompts using Kling 1.0 (pro)
motion |
Deprecated | 8/13 | → |
| Kling 1.0 image-to-video |
Generate video clips from your images using Kling 1.0
motion |
Deprecated | 8/13 | → |
| Kling 1.5 image-to-video | Generate video clips from your images using Kling 1.5 (pro) | Deprecated | 8/13 | → |
| Kling 1.0 image-to-video |
Generate video clips from your images using Kling 1.0 (pro)
motion |
Deprecated | 8/13 | → |
| Any VLM vision |
Use any vision language model from our selected catalogue (powered by OpenRouter)
multimodal vision streaming |
Deprecated | 8/13 | → |
| CogVideoX-5B video-to-video |
Generate videos from videos and prompts using CogVideoX-5B
editing |
Deprecated | 8/13 | → |
| F5 TTS text-to-audio |
F5 TTS
speech |
Deprecated | 8/13 | → |
| CogVideoX-5B image-to-video | Generate videos from images and prompts using CogVideoX-5B | Deprecated | 8/13 | → |
| Hunyuan Video text-to-video |
Hunyuan Video is an open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. This endpoint generates videos from text descriptions.
motion |
Deprecated | 8/13 | → |
| Stable Diffusion 3.5 Medium text-to-image |
Stable Diffusion 3.5 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.
diffusion typography style |
Deprecated | 8/13 | → |
| Stable Diffusion 3.5 Large text-to-image |
Stable Diffusion 3.5 Large is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.
diffusion typography style |
Deprecated | 8/13 | → |
| Birefnet Background Removal image-to-image |
Bilateral reference framework (BiRefNet) for high-resolution dichotomous image segmentation (DIS)
background removal segmentation high-res utility |
Deprecated | 8/13 | → |
| PuLID Flux image-to-image |
An endpoint for personalized image generation using Flux, guided by a given description.
personalization style transfer |
Deprecated | 8/13 | → |
| MiniMax (Hailuo AI) Video 01 image-to-video |
Generate video clips from your images using MiniMax Video model
motion transformation |
Deprecated | 8/13 | → |
| FLUX.1 [dev] Differential Diffusion image-to-image |
FLUX.1 Differential Diffusion is a rapid endpoint that enables swift, granular control over image transformations through change maps, delivering fast and precise region-specific modifications while maintaining FLUX.1 [dev]'s high-quality output.
transformation |
Deprecated | 8/13 | → |
| Train Flux LoRAs For Portraits training |
FLUX LoRA training optimized for portrait generation, with bright highlights, excellent prompt following and highly detailed results.
lora personalization |
Deprecated | 8/13 | → |
| Mochi 1 text-to-video | Mochi 1 preview is an open state-of-the-art video generation model with high-fidelity motion and strong prompt adherence in preliminary evaluation. | Deprecated | 8/13 | → |
| IC-Light-v2 for Image Relighting image-to-image |
An endpoint for re-lighting photos and changing their backgrounds per a given description
relighting editing |
Deprecated | 8/13 | → |
| Kolors Image to Image image-to-image |
Photorealistic Image-to-Image
realism editing diffusion |
Deprecated | 8/13 | → |
| FLUX.1 [pro] Redux image-to-image |
FLUX.1 [pro] Redux is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
style transfer |
Deprecated | 8/13 | → |
| FLUX.1 [pro] Fill image-to-image |
FLUX.1 [pro] Fill is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
editing |
Deprecated | 8/13 | → |
| FLUX1.1 [pro] ultra Redux image-to-image |
FLUX1.1 [pro] ultra Redux is a high-performance endpoint for the FLUX1.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
style transfer high-res |
Deprecated | 8/13 | → |
| LTX Video (preview) image-to-video | Generate videos from images using LTX Video | Deprecated | 8/13 | → |
| FLUX.1 [dev] Depth with LoRAs image-to-image |
Generate high-quality images from depth maps using Flux.1 [dev] depth estimation model. The model produces accurate depth representations for scene understanding and 3D visualization.
depth lora utility composition |
Deprecated | 8/13 | → |
| FLUX.1 [dev] Redux image-to-image |
FLUX.1 [dev] Redux is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
|
Deprecated | 8/13 | → |
| FLUX1.1 [pro] Redux image-to-image |
FLUX1.1 [pro] Redux is a high-performance endpoint for the FLUX1.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
style transfer |
Deprecated | 8/13 | → |
| FLUX.1 [schnell] text-to-image | FLUX.1 [schnell] is a 12 billion parameter flow transformer that generates high-quality images from text in 1 to 4 steps, suitable for personal and commercial use. | Deprecated | 8/13 | → |
| Kling 1.5 text-to-video | Generate video clips from your prompts using Kling 1.5 (pro) | Deprecated | 8/13 | → |
| FLUX.1 [schnell] Redux image-to-image |
FLUX.1 [schnell] Redux is a high-performance endpoint for the FLUX.1 [schnell] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
style transfer |
Deprecated | 8/13 | → |
| OmniGen v1 text-to-image |
OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It can be used for various tasks such as Image Editing, Personalized Image Generation, Virtual Try-On, Multi Person Generation and more!
multimodal editing try-on |
Deprecated | 8/13 | → |
| AuraFlow text-to-image |
AuraFlow v0.3 is an open-source flow-based text-to-image generation model that achieves state-of-the-art results on GenEval. The model is currently in beta.
typography style |
Deprecated | 8/13 | → |
| Luma Photon Flash text-to-image | Generate images from your prompts using Luma Photon Flash. Photon Flash is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation. | Deprecated | 8/13 | → |
| Kling 1.0 text-to-video |
Generate video clips from your prompts using Kling 1.0
motion |
Deprecated | 8/13 | → |
| Ideogram V2 Remix image-to-image |
Reimagine existing images with Ideogram V2's remix feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance.
realism typography |
Deprecated | 8/13 | → |
| Ideogram V2 Turbo Remix image-to-image |
Rapidly create image variations with Ideogram V2 Turbo Remix. Fast and efficient reimagining of existing images while maintaining creative control through prompt guidance.
realism typography |
Deprecated | 8/13 | → |
| Ideogram V2 Turbo Edit image-to-image |
Edit images faster with Ideogram V2 Turbo. Quick modifications and adjustments while preserving the high-quality standards and realistic outputs of Ideogram.
realism typography |
Deprecated | 8/13 | → |
| Video Upscaler video-to-video |
The video upscaler endpoint uses RealESRGAN on each frame of the input video to upscale the video to a higher resolution.
video generation video to video ai video high fidelity motion |
Deprecated | 8/13 | → |
| Ideogram V2 Turbo text-to-image |
Accelerated image generation with Ideogram V2 Turbo. Create high-quality visuals, posters, and logos with enhanced speed while maintaining Ideogram's signature quality.
realism typography |
Deprecated | 8/13 | → |
| Ideogram V2 text-to-image |
Generate high-quality images, posters, and logos with Ideogram V2. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.
realism typography |
Deprecated | 8/13 | → |
| Luma Dream Machine text-to-video |
Generate video clips from your prompts using Luma Dream Machine v1.5
motion transformation |
Deprecated | 8/13 | → |
| MMAudio V2 video-to-video |
MMAudio generates synchronized audio given video and/or text inputs. It can be combined with video models to get videos with audio.
ai video fast |
Deprecated | 8/13 | → |
| Trellis image-to-3d |
Generate 3D models from your images using Trellis. A native 3D generative model enabling versatile and high-quality 3D asset creation.
stylized |
Deprecated | 8/13 | → |
| Ideogram V2 Edit image-to-image |
Transform existing images with Ideogram V2's editing capabilities. Modify, adjust, and refine images while maintaining high fidelity and realistic outputs with precise prompt control.
realism typography |
Deprecated | 8/13 | → |
| MiniMax (Hailuo AI) Video 01 Live text-to-video |
Generate video clips from your prompts using MiniMax model
motion transformation |
Deprecated | 8/13 | → |
| Hyper3D Rodin image-to-3d |
Rodin by Hyper3D generates realistic, production-ready 3D models from text or images.
stylized |
Deprecated | 8/13 | → |
| Recraft 20b text-to-image |
Recraft 20b is a new and affordable text-to-image model.
image generation vector art typography style |
Deprecated | 8/13 | → |
| MiniMax (Hailuo AI) Video 01 Live image-to-video |
Generate video clips from your images using MiniMax Video model
motion transformation |
Deprecated | 8/13 | → |
| MiniMax (Hailuo AI) Music text-to-audio |
Generate music from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality, diverse musical compositions.
music |
Deprecated | 8/13 | → |
| Leffa Virtual TryOn image-to-image |
Leffa Virtual TryOn is a high-quality, image-based try-on endpoint that can be used for commercial try-on.
try-on fashion clothing |
Deprecated | 8/13 | → |
| FLUX1.1 [pro] ultra text-to-image |
FLUX1.1 [pro] ultra is the newest version of FLUX1.1 [pro], maintaining professional-grade image quality while delivering up to 2K resolution with improved photo realism.
high-res realism |
Deprecated | 8/13 | → |
| Leffa Pose Transfer image-to-image |
Leffa Pose Transfer is an endpoint for changing the pose of an image using a reference image.
pose utility |
Deprecated | 8/13 | → |
| try-on image-to-image |
High-quality, image-based Virtual Try-On
try-on fashion clothing |
Deprecated | 8/13 | → |
| Bria RMBG 2.0 image-to-image |
Bria RMBG 2.0 enables seamless removal of backgrounds from images, ideal for professional editing tasks. Trained exclusively on licensed data for safe and risk-free commercial use. Model weights for commercial use are available here: https://share-eu1.hsforms.com/2GLpEVQqJTI2Lj7AMYwgfIwf4e04
background removal image segmentation high resolution utility rembg |
Deprecated | 8/13 | → |
| Bria Product Shot image-to-image |
Place any product in any scenery with just a prompt or reference image while maintaining high integrity of the product. Trained exclusively on licensed data for safe and risk-free commercial use and optimized for eCommerce.
product photography |
Deprecated | 8/13 | → |
| Bria Text-to-Image HD text-to-image |
Bria's Text-to-Image model for HD images. Trained exclusively on licensed data for safe and risk-free commercial use. Available also as source code and weights. For access to weights: https://bria.ai/contact-us
image generation |
Deprecated | 8/13 | → |
| FLUX.1 [dev] Fill with LoRAs image-to-image |
FLUX.1 [dev] Fill is a high-performance endpoint for the FLUX.1 [pro] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
editing lora |
Deprecated | 8/13 | → |
| Bria Eraser image-to-image |
Bria Eraser enables precise removal of unwanted objects from images while maintaining high-quality outputs. Trained exclusively on licensed data for safe and risk-free commercial use. Access the model's source code and weights: https://bria.ai/contact-us
image editing object removal |
Deprecated | 8/13 | → |
| Bria Background Replace image-to-image |
Bria Background Replace allows for efficient swapping of backgrounds in images via text prompts or reference image, delivering realistic and polished results. Trained exclusively on licensed data for safe and risk-free commercial use
image editing |
Deprecated | 8/13 | → |
| PlayAI Text-to-Speech v3 text-to-speech | Blazing-fast text-to-speech. Generate audio with improved emotional tones and extensive multilingual support. Ideal for high-volume processing and efficient workflows. | Deprecated | 8/13 | → |
| Bria GenFill image-to-image |
Bria GenFill enables high-quality object addition or visual transformation. Trained exclusively on licensed data for safe and risk-free commercial use. Access the model's source code and weights: https://bria.ai/contact-us
image editing |
Deprecated | 8/13 | → |
| Bria Text-to-Image Base text-to-image |
Bria's Text-to-Image model, trained exclusively on licensed data for safe and risk-free commercial use. Available also as source code and weights. For access to weights: https://bria.ai/contact-us
image generation |
Deprecated | 8/13 | → |
| Bria Text-to-Image Fast text-to-image |
Bria's Text-to-Image model with perfect harmony of latency and quality. Trained exclusively on licensed data for safe and risk-free commercial use. Available also as source code and weights. For access to weights: https://bria.ai/contact-us
image generation |
Deprecated | 8/13 | → |
| Bria Expand Image image-to-image |
Bria Expand expands images beyond their borders in high quality. Trained exclusively on licensed data for safe and risk-free commercial use. Access the model's source code and weights: https://bria.ai/contact-us
outpainting |
Deprecated | 8/13 | → |
| PlayAI Text-to-Speech Dialog text-to-audio |
Generate natural-sounding multi-speaker dialogues, and audio. Perfect for expressive outputs, storytelling, games, animations, and interactive media.
audio |
Deprecated | 8/13 | → |
| Dubbing video-to-video |
This endpoint delivers seamlessly localized videos by generating lip-synced dubs in multiple languages, ensuring natural and immersive multilingual experiences
animation lip sync dubbing |
Deprecated | 8/13 | → |
| Sad Talker image-to-video |
Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
animation |
Deprecated | 8/13 | → |
| MMAudio V2 Text to Audio text-to-audio |
MMAudio generates synchronized audio given text inputs. It can generate sounds described by a prompt.
audio fast |
Deprecated | 8/13 | → |
| Switti 512 text-to-image | Switti is a scale-wise transformer for fast text-to-image generation that outperforms existing T2I AR models and competes with state-of-the-art T2I diffusion models while being faster than distilled diffusion models. | Deprecated | 8/13 | → |
| Switti 1024 text-to-image | Switti is a scale-wise transformer for fast text-to-image generation that outperforms existing T2I AR models and competes with state-of-the-art T2I diffusion models while being faster than distilled diffusion models. | Deprecated | 8/13 | → |
| Train Flux LoRA training |
Train styles, people and other subjects at blazing speeds.
lora personalization |
Deprecated | 8/13 | → |
| Auto-Captioner video-to-video |
Automatically generates text captions for your videos from the audio as per text colour/font specifications
captioning video |
Deprecated | 8/13 | → |
| Kling 1.6 image-to-video | Generate video clips from your images using Kling 1.6 (std) | Deprecated | 8/13 | → |
| Kling 1.6 text-to-video | Generate video clips from your prompts using Kling 1.6 (std) | Deprecated | 8/13 | → |
| Kling 1.6 image-to-video | Generate video clips from your images using Kling 1.6 (pro) | Deprecated | 8/13 | → |
| MoonDreamNext Detection image-to-image |
MoonDreamNext Detection is a multimodal vision-language model for gaze detection, bbox detection, point detection, and more.
multimodal |
Deprecated | 8/13 | → |
| MoonDreamNext vision |
MoonDreamNext is a multimodal vision-language model for captioning, gaze detection, bbox detection, point detection, and more.
multimodal vision |
Deprecated | 8/13 | → |
| Sa2VA 8B Image vision |
Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels
multimodal vision |
Deprecated | 8/13 | → |
| Sa2VA 4B Image vision |
Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels
multimodal vision |
Deprecated | 8/13 | → |
| Sa2VA 4B Video vision |
Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels
multimodal vision |
Deprecated | 8/13 | → |
| Sa2VA 8B Video vision |
Sa2VA is an MLLM capable of question answering, visual prompt understanding, and dense object segmentation at both image and video levels
multimodal vision |
Deprecated | 8/13 | → |
| sync.so -- lipsync 1.9.0-beta video-to-video |
Generate realistic lipsync animations from audio using advanced algorithms for high-quality synchronization.
animation lip sync |
Deprecated | 8/13 | → |
| TransPixar V1 text-to-video | Transform text into stunning videos with TransPixar - an AI model that generates both RGB footage and alpha channels, enabling seamless compositing and creative video effects. | Deprecated | 8/13 | → |
| CogVideoX-5B text-to-video | Generate videos from prompts using CogVideoX-5B | Deprecated | 8/13 | → |
| Train Hunyuan LoRA training |
Train a Hunyuan Video LoRA on people, objects, characters and more!
lora personalization |
Deprecated | 8/13 | → |
| Hunyuan Video LoRA Inference text-to-video | Hunyuan Video is an open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability | Deprecated | 8/13 | → |
| FLUX.1 [pro] Depth Fine-tuned image-to-image |
Generate high-quality images from depth maps using Flux.1 [pro] depth estimation model with a fine-tuned LoRA. The model produces accurate depth representations for scene understanding and 3D visualization.
depth utility composition |
Deprecated | 8/13 | → |
| FLUX.1 [pro] Fill Fine-tuned image-to-image |
FLUX.1 [pro] Fill Fine-tuned is a high-performance endpoint for the FLUX.1 [pro] model with a fine-tuned LoRA that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities.
editing |
Deprecated | 8/13 | → |
| FLUX1.1 [pro] ultra Fine-tuned text-to-image |
FLUX1.1 [pro] ultra fine-tuned is the newest version of FLUX1.1 [pro] with a fine-tuned LoRA, maintaining professional-grade image quality while delivering up to 2K resolution with improved photo realism.
high-res realism |
Deprecated | 8/13 | → |
| FLUX1.1 [pro] text-to-image | FLUX1.1 [pro] is an enhanced version of FLUX.1 [pro] with improved image generation capabilities, delivering superior composition, detail, and artistic fidelity compared to its predecessor. | Deprecated | 8/13 | → |
| Train Flux LoRAs For Pro Models training |
FLUX LoRA for Pro endpoints.
lora personalization |
Deprecated | 8/13 | → |
| FLUX.1 [pro] Depth image-to-image |
Generate high-quality images from depth maps using Flux.1 [pro] depth estimation model. The model produces accurate depth representations for scene understanding and 3D visualization.
depth utility composition |
Deprecated | 8/13 | → |
| FLUX.1 [pro] Canny Fine-tuned image-to-image |
Utilize Flux.1 [pro] Controlnet with a fine-tuned LoRA to generate high-quality images with precise control over composition, style, and structure through advanced edge detection and guidance mechanisms.
controlnet detection editing composition |
Deprecated | 8/13 | → |
| FLUX.1 [dev] Canny with LoRAs image-to-image |
Utilize Flux.1 [dev] Controlnet to generate high-quality images with precise control over composition, style, and structure through advanced edge detection and guidance mechanisms.
controlnet detection lora editing composition |
Deprecated | 8/13 | → |
| FLUX.1 [pro] Canny image-to-image |
Utilize Flux.1 [pro] Controlnet to generate high-quality images with precise control over composition, style, and structure through advanced edge detection and guidance mechanisms.
controlnet detection editing composition |
Deprecated | 8/13 | → |
| MoonDreamNext Batch vision |
MoonDreamNext Batch is a multimodal vision-language model for batch captioning.
multimodal |
Deprecated | 8/13 | → |
| MiniMax (Hailuo AI) Video 01 Subject Reference image-to-video |
Generate video clips maintaining consistent, realistic facial features and identity across dynamic video content
subject transformation |
Deprecated | 8/13 | → |
| FFmpeg API Compose video-to-video |
Compose videos from multiple media sources using FFmpeg API.
ffmpeg |
Deprecated | 8/13 | → |
| FFmpeg API Waveform json |
Get waveform data from audio files using FFmpeg API.
ffmpeg |
Deprecated | 8/13 | → |
| FFmpeg API Metadata json |
Get encoding metadata from video and audio files using FFmpeg API.
ffmpeg |
Deprecated | 8/13 | → |
| Kling Kolors Virtual TryOn v1.5 image-to-image |
Kling Kolors Virtual TryOn v1.5 is a high-quality, image-based try-on endpoint that can be used for commercial try-on.
try-on fashion clothing |
Deprecated | 8/13 | → |
| Luma Ray 2 text-to-video |
Ray2 is a large-scale video generative model capable of creating realistic visuals with natural, coherent motion.
motion transformation |
Deprecated | 8/13 | → |
| YuE: Lyrics to Song text-to-audio |
YuE is a groundbreaking series of open-source foundation models designed for music generation, specifically for transforming lyrics into full songs.
music |
Deprecated | 8/13 | → |
| DeepSeek Janus-Pro text-to-image |
DeepSeek Janus-Pro is a novel text-to-image model that unifies multimodal understanding and generation through an autoregressive framework
stylized |
Deprecated | 8/13 | → |
| PixVerse v3.5: Image to Video image-to-video | Generate high quality video clips from text and image prompts using PixVerse v3.5 | Deprecated | 8/13 | → |
| PixVerse v3.5 Fast text-to-video | Generate high quality video clips quickly from text prompts using PixVerse v3.5 Fast | Deprecated | 8/13 | → |
| PixVerse v3.5: Image to Video Fast image-to-video | Generate high quality video clips from text and image prompts quickly using PixVerse v3.5 Fast | Deprecated | 8/13 | → |
| PixVerse v3.5 text-to-video | Generate high quality video clips from text prompts using PixVerse v3.5 | Deprecated | 8/13 | → |
| Hunyuan Video LoRA Inference (Video-to-Video) video-to-video |
Hunyuan Video is an Open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. Use this endpoint to generate videos from videos.
video to video motion lora |
Deprecated | 8/13 | → |
| Hunyuan Video (Video-to-Video) video-to-video |
Hunyuan Video is an Open video generation model with high visual quality, motion diversity, text-video alignment, and generation stability. Use this endpoint to generate videos from videos.
video to video motion |
Deprecated | 8/13 | → |
| Lumina Image 2 text-to-image |
Lumina-Image-2.0 is a 2 billion parameter flow-based diffusion transformer which features improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.
diffusion typography style |
Deprecated | 8/13 | → |
| CodeFormer image-to-image |
Fix distorted or blurred photos of people with CodeFormer.
image-restoration faces utility |
Deprecated | 8/13 | → |
| Hunyuan Video Image-to-Video LoRA Inference image-to-video |
Image to Video for the Hunyuan Video model using a custom trained LoRA.
motion |
Deprecated | 8/13 | → |
| Ideogram Upscale image-to-image |
Ideogram Upscale enhances the resolution of the reference image by up to 2X and might enhance the reference image too. Optionally refine outputs with a prompt for guided improvements.
upscaling high-res |
Deprecated | 8/13 | → |
| Imagen3 Fast text-to-image | Imagen3 Fast is a high-quality text-to-image model that generates realistic images from text prompts. | Deprecated | 8/13 | → |
| Imagen3 text-to-image | Imagen3 is a high-quality text-to-image model that generates realistic images from text prompts. | Deprecated | 8/13 | → |
| Ben-Video-Bg-Rm video-to-video |
A model for high quality and smooth background removal for videos.
segmentation background removal |
Deprecated | 8/13 | → |
| MiniMax (Hailuo AI) Video 01 Director text-to-video |
Generate video clips more accurately with respect to natural language descriptions and using camera movement instructions for shot control.
motion transformation camera-controls |
Deprecated | 8/13 | → |
| FLUX.1 [dev] Control LoRA Depth text-to-image |
FLUX Control LoRA Depth is a high-performance endpoint that uses a control image to transfer structure to the generated image, using a depth map.
lora style transfer |
Deprecated | 8/13 | → |
| FLUX.1 [dev] Control LoRA Canny text-to-image |
FLUX Control LoRA Canny is a high-performance endpoint that uses a control image to transfer structure to the generated image, using a Canny edge map.
lora style transfer |
Deprecated | 8/13 | → |
| ben-v2-image image-to-image |
A fast and high quality model for image background removal.
background removal |
Deprecated | 8/13 | → |
| FLUX.1 [dev] Control LoRA Depth image-to-image |
FLUX Control LoRA Depth is a high-performance endpoint that uses a control image using a depth map to transfer structure to the generated image and another initial image to guide color.
lora style transfer |
Deprecated | 8/13 | → |
| FLUX.1 [dev] Control LoRA Canny image-to-image |
FLUX Control LoRA Canny is a high-performance endpoint that uses a control image using a Canny edge map to transfer structure to the generated image and another initial image to guide color.
lora style transfer |
Deprecated | 8/13 | → |
| GOT OCR 2.0 vision |
GOT-OCR2 works on a wide range of tasks, including plain document OCR, scene text OCR, formatted document OCR, and even OCR for tables, charts, mathematical formulas, geometric shapes, molecular formulas and sheet music.
optical character recognition high-res utility |
Deprecated | 8/13 | → |
| Luma Ray 2 (Image to Video) image-to-video |
Ray2 is a large-scale video generative model capable of creating realistic visuals with natural, coherent motion.
motion transformation |
Deprecated | 8/13 | → |
| Kokoro TTS (Italian) text-to-audio |
A high-quality Italian text-to-speech model delivering smooth and expressive speech synthesis.
speech |
Deprecated | 8/13 | → |
| Zonos-Audio-Clone text-to-audio |
Clone any person's voice and speak anything in that voice using Zonos' voice cloning.
voice cloning |
Deprecated | 8/13 | → |
| Kokoro TTS (Japanese) text-to-audio |
A fast and natural-sounding Japanese text-to-speech model optimized for smooth pronunciation.
speech |
Deprecated | 8/13 | → |
| Kokoro TTS text-to-audio |
Kokoro is a lightweight text-to-speech model that delivers comparable quality to larger models while being significantly faster and more cost-efficient.
speech |
Deprecated | 8/13 | → |
| Kokoro TTS (British English) text-to-audio |
A high-quality British English text-to-speech model offering natural and expressive voice synthesis.
speech |
Deprecated | 8/13 | → |
| Kokoro TTS (French) text-to-audio |
An expressive and natural French text-to-speech model for both European and Canadian French.
speech |
Deprecated | 8/13 | → |
| Kokoro TTS (Spanish) text-to-audio |
A natural-sounding Spanish text-to-speech model optimized for Latin American and European Spanish.
speech |
Deprecated | 8/13 | → |
| Kokoro TTS (Brazilian Portuguese) text-to-audio |
A natural and expressive Brazilian Portuguese text-to-speech model optimized for clarity and fluency.
speech |
Deprecated | 8/13 | → |
| Kokoro TTS (Hindi) text-to-audio |
A fast and expressive Hindi text-to-speech model with clear pronunciation and accurate intonation.
speech |
Deprecated | 8/13 | → |
| Kokoro TTS (Mandarin Chinese) text-to-audio |
A highly efficient Mandarin Chinese text-to-speech model that captures natural tones and prosody.
speech |
Deprecated | 8/13 | → |
| Flow-Edit text-to-image |
The model provides high-quality image editing capabilities.
editing |
Deprecated | 8/13 | → |
| Skyreels V1 (Image-to-Video) image-to-video |
SkyReels V1 is the first and most advanced open-source human-centric video foundation model. By fine-tuning HunyuanVideo on O(10M) high-quality film and television clips
motion |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Post Processing is an endpoint that can enhance images using a variety of techniques including grain, blur, sharpen, and more.
stylized utility |
Deprecated | 8/13 | → |
| NAFNet-denoise image-to-image |
Use NAFNet to fix issues like blurriness and noise in your images. This model specializes in image restoration and can help enhance the overall quality of your photography.
image-restoration deblur denoise |
Deprecated | 8/13 | → |
| NAFNet-deblur image-to-image |
Use NAFNet to fix issues like blurriness and noise in your images. This model specializes in image restoration and can help enhance the overall quality of your photography.
image-restoration deblur denoise |
Deprecated | 8/13 | → |
| Veo 2 text-to-video |
Veo 2 creates videos with realistic motion and high quality output. Explore different styles and find your own with extensive camera controls.
motion transformation |
Deprecated | 8/13 | → |
| DRCT-Super-Resolution image-to-image |
Upscale your images with DRCT-Super-Resolution.
upscaling high-res |
Deprecated | 8/13 | → |
| MiniMax (Hailuo AI) Video 01 Director - Image to Video image-to-video |
Generate video clips more accurately with respect to initial image, natural language descriptions, and using camera movement instructions for shot control.
motion transformation camera-controls |
Deprecated | 8/13 | → |
| Segment Anything Model 2 image-to-image |
SAM 2 is a model for segmenting images automatically. It can return individual masks or a single mask for the entire image.
segmentation mask |
Deprecated | 8/13 | → |
| Video Prompt Generator llm |
Generate video prompts using a variety of techniques including camera direction, style, pacing, special effects and more.
motion transformation chat claude gpt |
Deprecated | 8/13 | → |
| Wan-2.1 Image-to-Video image-to-video |
Wan-2.1 is an image-to-video model that generates high-quality videos with high visual quality and motion diversity from images
image to video motion |
Deprecated | 8/13 | → |
| Wan-2.1 Text-to-Video text-to-video |
Wan-2.1 is a text-to-video model that generates high-quality videos with high visual quality and motion diversity from text prompts
text to video motion |
Deprecated | 8/13 | → |
| DDColor image-to-image |
Bring colors into old or new black and white photos with DDColor.
image-recolorization faces utility |
Deprecated | 8/13 | → |
| EVF-SAM2 Segmentation image-to-image |
EVF-SAM2 combines natural language understanding with advanced segmentation capabilities, allowing you to precisely mask image regions using intuitive positive and negative text prompts.
segmentation mask |
Deprecated | 8/13 | → |
| Ideogram V2A text-to-image |
Generate high-quality images, posters, and logos with Ideogram V2A. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.
realism typography |
Deprecated | 8/13 | → |
| ElevenLabs Sound Effects text-to-audio |
Generate sound effects using ElevenLabs advanced sound effects model.
sound |
Deprecated | 8/13 | → |
| ElevenLabs TTS Turbo v2.5 text-to-speech |
Generate high-speed text-to-speech audio using ElevenLabs TTS Turbo v2.5.
audio |
Deprecated | 8/13 | → |
| ElevenLabs Audio Isolation audio-to-audio |
Isolate audio tracks using ElevenLabs advanced audio isolation technology.
audio |
Deprecated | 8/13 | → |
| ElevenLabs Speech to Text speech-to-text |
Generate text from speech using ElevenLabs advanced speech-to-text model.
speech |
Deprecated | 8/13 | → |
| Ideogram V2A Turbo text-to-image |
Accelerated image generation with Ideogram V2A Turbo. Create high-quality visuals, posters, and logos with enhanced speed while maintaining Ideogram's signature quality.
realism typography |
Deprecated | 8/13 | → |
| Wan-2.1 1.3B Text-to-Video text-to-video |
Wan-2.1 1.3B is a text-to-video model that generates high-quality videos with high visual quality and motion diversity from text prompts at faster speeds.
text to video motion |
Deprecated | 8/13 | → |
| ElevenLabs TTS Multilingual v2 text-to-audio |
Generate multilingual text-to-speech audio using ElevenLabs TTS Multilingual v2.
audio |
Deprecated | 8/13 | → |
| Ideogram V2A Turbo Remix image-to-image |
Rapidly create image variations with Ideogram V2A Turbo Remix. Fast and efficient reimagining of existing images while maintaining creative control through prompt guidance.
realism typography |
Deprecated | 8/13 | → |
| Kling 1.6 text-to-video | Generate video clips from your prompts using Kling 1.6 (pro) | Deprecated | 8/13 | → |
| Ideogram V2A Remix image-to-image |
Create variations of existing images with Ideogram V2A Remix while maintaining creative control through prompt guidance.
realism typography |
Deprecated | 8/13 | → |
| SWIN2SR image-to-image |
Enhance low-resolution images with the superior quality of Swin2SR for sharper, clearer results.
image-enhancement |
Deprecated | 8/13 | → |
| DocRes image-to-image |
Enhance low-resolution, blurred, or shadowed documents with the superior quality of DocRes for sharper, clearer results.
image-enhancement |
Deprecated | 8/13 | → |
| DocRes-dewarp image-to-image |
Enhance warped or folded documents with the superior quality of DocRes for sharper, clearer results.
image-enhancement |
Deprecated | 8/13 | → |
| DiffRhythm: Lyrics to Song text-to-audio |
DiffRhythm is a blazing fast model for transforming lyrics into full songs. It boasts the capability to generate full songs in less than 30 seconds.
music |
Deprecated | 8/13 | → |
| Topaz Video Upscale video-to-video |
Professional-grade video upscaling using Topaz technology. Enhance your videos with high-quality upscaling.
upscaling high-res |
Deprecated | 8/13 | → |
| CogView text-to-image |
Generate high quality images from text prompts using CogView4. Longer text prompts will result in better quality images.
stylized |
Deprecated | 8/13 | → |
| Juggernaut Flux Base text-to-image |
Juggernaut Base Flux by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism, while instantly boosting LoRAs and LyCORIS with full compatibility.
image generation |
Deprecated | 8/13 | → |
| Juggernaut Flux Pro text-to-image |
Juggernaut Pro Flux by RunDiffusion is the flagship Juggernaut model rivaling some of the most advanced image models available, often surpassing them in realism. It combines Juggernaut Base with RunDiffusion Photo and features enhancements like reduced background blurriness.
image generation |
Deprecated | 8/13 | → |
| LTX Video-0.9.5 video-to-video |
Generate videos from prompts, images, and videos using LTX Video-0.9.5
video image-to-video text-to-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.5 text-to-video |
Generate videos from prompts using LTX Video-0.9.5
video text-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.5 video-to-video |
Generate videos from prompts and videos using LTX Video-0.9.5
video video-to-video |
Deprecated | 8/13 | → |
| Juggernaut Flux Base image-to-image |
Juggernaut Base Flux by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism, while instantly boosting LoRAs and LyCORIS with full compatibility.
image generation |
Deprecated | 8/13 | → |
| Juggernaut Flux Base LoRA text-to-image |
Juggernaut Base Flux LoRA by RunDiffusion is a drop-in replacement for Flux [Dev] that delivers sharper details, richer colors, and enhanced realism to all your LoRAs and LyCORIS with full compatibility.
image generation |
Deprecated | 8/13 | → |
| Rundiffusion Photo Flux text-to-image |
RunDiffusion Photo Flux provides insane realism. With this enhancer, textures and skin details burst to life, turning your favorite prompts into vivid, lifelike creations. Recommended to keep it at 0.65 to 0.80 weight. Supports resolutions up to 1536x1536.
image generation lora |
Deprecated | 8/13 | → |
| Juggernaut Flux Pro image-to-image |
Juggernaut Pro Flux by RunDiffusion is the flagship Juggernaut model rivaling some of the most advanced image models available, often surpassing them in realism. It combines Juggernaut Base with RunDiffusion Photo and features enhancements like reduced background blurriness.
image generation |
Deprecated | 8/13 | → |
| Juggernaut Flux Lightning text-to-image |
Juggernaut Lightning Flux by RunDiffusion provides blazing-fast, high-quality images rendered at five times the speed of Flux. Perfect for mood boards and mass ideation, this model excels in both realism and prompt adherence.
image generation |
Deprecated | 8/13 | → |
| Hunyuan Video Image-to-Video Inference image-to-video |
Image to Video for the high-quality Hunyuan Video I2V model.
motion |
Deprecated | 8/13 | → |
| Kling 1.6 text-to-video | Generate video clips from your prompts using Kling 1.6 (std) | Deprecated | 8/13 | → |
| Kling 1.6 text-to-video | Generate video clips from your prompts using Kling 1.6 (pro) | Deprecated | 8/13 | → |
| Kling 1.0 text-to-video |
Generate video clips from your prompts using Kling 1.0
motion |
Deprecated | 8/13 | → |
| Kling 1.5 text-to-video | Generate video clips from your prompts using Kling 1.5 (pro) | Deprecated | 8/13 | → |
| Wan-2.1 Image-to-Video with LoRAs image-to-video |
Add custom LoRAs to Wan-2.1, an image-to-video model that generates high-quality videos with high visual quality and motion diversity from images
image to video motion lora |
Deprecated | 8/13 | → |
| Easel AI Advanced Face Swap image-to-image |
Swap faces of one or two people at once, while preserving user and scene details!
face swap utility editing |
Deprecated | 8/13 | → |
| Veo 2 (Image to Video) image-to-video |
Veo 2 creates videos from images with realistic motion and very high quality output.
motion transformation |
Deprecated | 8/13 | → |
| Wan-2.1 Pro Text-to-Video text-to-video |
Wan-2.1 Pro is a premium text-to-video model that generates high-quality 1080p videos at 30fps with up to 6 seconds duration, delivering exceptional visual quality and motion diversity from text prompts
text to video motion |
Deprecated | 8/13 | → |
| Wan-2.1 Pro Image-to-Video image-to-video |
Wan-2.1 Pro is a premium image-to-video model that generates high-quality 1080p videos at 30fps with up to 6 seconds duration, delivering exceptional visual quality and motion diversity from images
image to video motion |
Deprecated | 8/13 | → |
| Vidu Template to Video image-to-video |
Vidu Template to Video lets you create different effects by applying motion templates to your images.
motion template |
Deprecated | 8/13 | → |
| Vidu Reference to Video image-to-video |
Vidu Reference to Video creates videos by using reference images and combining them with a prompt.
motion reference |
Deprecated | 8/13 | → |
| Vidu Start-End to Video image-to-video |
Vidu Start-End to Video generates smooth transition videos between specified start and end images.
motion transition |
Deprecated | 8/13 | → |
| Vidu Image to Video image-to-video |
Vidu Image to Video generates high-quality videos with exceptional visual quality and motion diversity from a single image
motion image to video |
Deprecated | 8/13 | → |
| Wan Effects image-to-video |
Wan Effects generates high-quality videos with popular effects from images
motion effects |
Deprecated | 8/13 | → |
| CSM-1B text-to-audio |
CSM (Conversational Speech Model) is a speech generation model from Sesame that generates RVQ audio codes from text and audio inputs.
conversational text to speech |
Deprecated | 8/13 | → |
| Pika Image to Video (v2.1) image-to-video |
Pika v2.1 creates videos from images with high quality output.
editing effects animation |
Deprecated | 8/13 | → |
| Pika Scenes (v2.2) image-to-video |
Pika Scenes v2.2 creates videos from images with high quality output.
editing effects animation |
Deprecated | 8/13 | → |
| Pika Image to Video (v2.2) image-to-video |
Pika v2.2 creates videos from images with high quality output.
editing effects animation |
Deprecated | 8/13 | → |
| Pika Text to Video Turbo (v2) text-to-video |
Pika v2 Turbo creates videos from a text prompt with high quality output.
editing effects animation |
Deprecated | 8/13 | → |
| Pika Text to Video (v2.1) text-to-video |
Pika v2.1 creates videos from a text prompt with high quality output.
editing effects animation |
Deprecated | 8/13 | → |
| Invisible Watermark image-to-image |
Invisible Watermark is a model that can add an invisible watermark to an image.
utility editing |
Deprecated | 8/13 | → |
| Pika Text to Video (v2.2) text-to-video |
Pika v2.2 creates videos from a text prompt with high quality output.
editing effects animation |
Deprecated | 8/13 | → |
| Pika Image to Video Turbo (v2) image-to-video |
Pika v2 Turbo creates videos from images with high quality output.
editing effects animation |
Deprecated | 8/13 | → |
| Pika Effects (v1.5) image-to-video |
Pika Effects are AI-powered video effects designed to modify objects, characters, and environments in a fun, engaging, and visually compelling manner.
editing effects animation |
Deprecated | 8/13 | → |
| Luma Ray 2 Flash text-to-video |
Ray2 Flash is a fast video generative model capable of creating realistic visuals with natural, coherent motion.
motion transformation |
Deprecated | 8/13 | → |
| Luma Ray 2 Flash (Image to Video) image-to-video |
Ray2 Flash is a fast video generative model capable of creating realistic visuals with natural, coherent motion.
motion transformation |
Deprecated | 8/13 | → |
| Gemini Flash Edit Multi Image image-to-image |
Gemini Flash Edit Multi Image is a model that can edit multiple images using a text prompt and a reference image.
editing |
Deprecated | 8/13 | → |
| Hunyuan3D image-to-3d |
Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.
stylized |
Deprecated | 8/13 | → |
| Hunyuan3D image-to-3d |
Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.
stylized |
Deprecated | 8/13 | → |
| Hunyuan3D image-to-3d |
Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.
stylized |
Deprecated | 8/13 | → |
| Hunyuan3D image-to-3d |
Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.
stylized |
Deprecated | 8/13 | → |
| Hunyuan3D image-to-3d |
Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.
stylized |
Deprecated | 8/13 | → |
| Gemini Flash Edit Multi Image image-to-image |
Gemini Flash Edit is a model that can edit a single image using a text prompt and a reference image.
editing |
Deprecated | 8/13 | → |
| Hunyuan3D image-to-3d |
Generate 3D models from your images using Hunyuan 3D. A native 3D generative model enabling versatile and high-quality 3D asset creation.
stylized |
Deprecated | 8/13 | → |
| MixDehazer image-to-image | An advanced dehaze model to remove atmospheric haze, restoring clarity and detail in images through intelligent neural network processing. | Deprecated | 8/13 | → |
| Thera image-to-image | Fix low-resolution images with the speed and quality of Thera. | Deprecated | 8/13 | → |
| Wan-2.1 LoRA Trainer training |
Train custom LoRAs for Wan-2.1 I2V 480P
lora training |
Deprecated | 8/13 | → |
| Wan-2.1 Text-to-Video with LoRAs text-to-video |
Add custom LoRAs to Wan-2.1, a text-to-video model that generates high-quality videos with high visual quality and motion diversity from text prompts
text to video motion lora |
Deprecated | 8/13 | → |
| LatentSync video-to-video |
LatentSync is a video-to-video model that generates lip sync animations from audio using advanced algorithms for high-quality synchronization.
animation lip sync |
Deprecated | 8/13 | → |
| Kling LipSync Audio-to-Video text-to-video |
Kling LipSync is an audio-to-video model that generates realistic lip movements from audio input.
audio to video lipsync |
Deprecated | 8/13 | → |
| Kling LipSync Text-to-Video text-to-video |
Kling LipSync is a text-to-video model that generates realistic lip movements from text input.
text to video lipsync |
Deprecated | 8/13 | → |
| music generator text-to-audio |
CassetteAI’s model generates a 30-second sample in under 2 seconds and a full 3-minute track in under 10 seconds. At 44.1 kHz stereo audio, expect a level of professional consistency with no breaks, no squeaks, and no random interruptions in your creations.
music cassetteai |
Deprecated | 8/13 | → |
| Sana Sprint text-to-image |
Sana Sprint is a text-to-image model capable of generating 4K images with exceptional speed.
text to image 4k high-speed |
Deprecated | 8/13 | → |
| Sana v1.5 4.8B text-to-image |
Sana v1.5 4.8B is a powerful text-to-image model that generates ultra-high quality 4K images with remarkable detail.
text to image 4k high-quality |
Deprecated | 8/13 | → |
| Sana v1.5 1.6B text-to-image |
Sana v1.5 1.6B is a lightweight text-to-image model that delivers 4K image generation with impressive efficiency.
text to image 4k lightweight |
Deprecated | 8/13 | → |
| Orpheus TTS text-to-speech |
Orpheus TTS is a state-of-the-art, Llama-based Speech-LLM designed for high-quality, empathetic text-to-speech generation. This model has been finetuned to deliver human-level speech synthesis, achieving exceptional clarity, expressiveness, and real-time performance.
text to speech voice synthesis high-fidelity |
Deprecated | 8/13 | → |
| Ghiblify Images image-to-image |
Reimagine and transform your ordinary photos into enchanting Studio Ghibli style artwork
stylized transform |
Deprecated | 8/13 | → |
| PixVerse v4: Text to Video Fast text-to-video | Generate high quality and fast video clips from text and image prompts using PixVerse v4 fast | Deprecated | 8/13 | → |
| PixVerse v3.5: Transition image-to-video | Create seamless transition between images using PixVerse v3.5 | Deprecated | 8/13 | → |
| PixVerse v4: Text to Video text-to-video | Generate high quality video clips from text and image prompts using PixVerse v4 | Deprecated | 8/13 | → |
| PixVerse v3.5: Effects image-to-video | Generate high quality video clips with different effects using PixVerse v3.5 | Deprecated | 8/13 | → |
| PixVerse v4: Image to Video image-to-video | Generate high quality video clips from text and image prompts using PixVerse v4 | Deprecated | 8/13 | → |
| PixVerse v4: Image to Video Fast image-to-video | Generate fast high quality video clips from text and image prompts using PixVerse v4 | Deprecated | 8/13 | → |
| StarVector image-to-image |
AI vectorization model that transforms raster images into scalable SVG graphics, preserving visual details while enabling infinite scaling and easy editing capabilities.
image-to-image |
Deprecated | 8/13 | → |
| FLUX.1 [dev] text-to-image | FLUX.1 [dev] is a 12 billion parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use. | Deprecated | 8/13 | → |
| Sync Lipsync 2.0 video-to-video |
Generate realistic lipsync animations from audio using advanced algorithms for high-quality synchronization with the Sync Lipsync 2.0 model
animation lip sync |
Deprecated | 8/13 | → |
| Sound Effects Generator text-to-audio |
Create stunningly realistic sound effects in seconds - CassetteAI's Sound Effects Model generates high-quality SFX up to 30 seconds long in just 1 second of processing time
sound sfx sound-effects cassetteai |
Deprecated | 8/13 | → |
| Speech-to-Text speech-to-text |
Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.
|
Deprecated | 8/13 | → |
| Speech-to-Text speech-to-text |
Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.
streaming |
Deprecated | 8/13 | → |
| Speech-to-Text speech-to-text |
Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription.
streaming |
Deprecated | 8/13 | → |
| Speech-to-Text speech-to-text | Leverage the rapid processing capabilities of AI models to enable accurate and efficient real-time speech-to-text transcription. | Deprecated | 8/13 | → |
| Video Sound Effects Generator video-to-video |
Add sound effects to your videos
sound-effects sfx cassetteai |
Deprecated | 8/13 | → |
| finegrain eraser image-to-image |
Finegrain Eraser removes objects—along with their shadows, reflections, and lighting artifacts—using only natural language, seamlessly filling the scene with contextually accurate content.
utility editing |
Deprecated | 8/13 | → |
| finegrain eraser image-to-image |
Finegrain Eraser removes any object selected with a bounding box—along with its shadows, reflections, and lighting artifacts—seamlessly reconstructing the scene with contextually accurate content.
utility editing |
Deprecated | 8/13 | → |
| finegrain eraser image-to-image |
Finegrain Eraser removes any object selected with a mask—along with its shadows, reflections, and lighting artifacts—seamlessly reconstructing the scene with contextually accurate content.
utility editing |
Deprecated | 8/13 | → |
| Hidream I1 Fast text-to-image |
HiDream-I1 fast is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within 16 steps.
|
Deprecated | 8/13 | → |
| Hidream I1 Dev text-to-image | HiDream-I1 dev is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds. | Deprecated | 8/13 | → |
| Hidream I1 Full text-to-image | HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds. | Deprecated | 8/13 | → |
| Vace video-to-video |
Vace is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.
video-to-video image-to-video text-to-video |
Deprecated | 8/13 | → |
| Cartoonify image-to-image |
Transform images into 3D cartoon artwork using an AI model that applies cartoon stylization while preserving the original image's composition and details.
stylized transform |
Deprecated | 8/13 | → |
| Kling 2.0 Master text-to-video | Generate video clips from your prompts using Kling 2.0 Master | Deprecated | 8/13 | → |
| Kling 2.0 Master image-to-video | Generate video clips from your images using Kling 2.0 Master | Deprecated | 8/13 | → |
| Tavus LipSync v2 video-to-video | Generate lip sync using Tavus' state-of-the-art model for high-quality synchronization. | Deprecated | 8/13 | → |
| Framepack image-to-video |
Framepack is an efficient image-to-video model that autoregressively generates videos.
image to video motion |
Deprecated | 8/13 | → |
| Turbo Flux Trainer training | A blazing fast FLUX dev LoRA trainer for subjects and styles. | Deprecated | 8/13 | → |
| Wan-2.1 First-Last-Frame-to-Video image-to-video |
Wan-2.1 flf2v generates dynamic videos by intelligently bridging a given first frame to a desired end frame through smooth, coherent motion sequences.
image to video motion |
Deprecated | 8/13 | → |
| Instant Character image-to-image |
InstantCharacter creates high-quality, consistent characters from text prompts, supporting diverse poses, styles, and appearances with strong identity control.
personalization customization |
Deprecated | 8/13 | → |
| Plushify image-to-image | Turn any image into a cute plushie! | Deprecated | 8/13 | → |
| FASHN Virtual Try-On V1.5 image-to-image |
FASHN v1.5 delivers precise virtual try-on capabilities, accurately rendering garment details like text and patterns at 576x864 resolution from both on-model and flat-lay photo references.
try-on fashion clothing |
Deprecated | 8/13 | → |
| Juggernaut Flux Lora image-to-image | Juggernaut Base Flux LoRA Inpainting by RunDiffusion is a drop-in replacement for Flux [Dev] inpainting that delivers sharper details, richer colors, and enhanced realism to all your LoRAs and LyCORIS with full compatibility. | Deprecated | 8/13 | → |
| Pipecat's Smart Turn model speech-to-text | An open source, community-driven and native audio turn detection model by Pipecat AI. | Deprecated | 8/13 | → |
| MAGI-1 (Distilled) text-to-video |
MAGI-1 distilled is a faster video generation model with exceptional understanding of physical interactions and cinematic prompts
text-to-video |
Deprecated | 8/13 | → |
| Dia text-to-speech |
Dia directly generates realistic dialogue from transcripts. Audio conditioning enables emotion control. Produces natural nonverbals like laughter and throat clearing.
text-to-speech |
Deprecated | 8/13 | → |
| Framepack image-to-video |
Framepack is an efficient image-to-video model that autoregressively generates videos.
image to video motion |
Deprecated | 8/13 | → |
| Dia Tts audio-to-audio |
Clone dialog voices from sample audio and generate dialogs from text prompts using Dia TTS, which leverages advanced AI techniques to create high-quality text-to-speech.
speech |
Deprecated | 8/13 | → |
| MAGI-1 (Distilled) image-to-video |
MAGI-1 distilled generates videos faster from images with exceptional understanding of physical interactions and prompting
image-to-video |
Deprecated | 8/13 | → |
| MAGI-1 (Distilled) video-to-video |
MAGI-1 distilled extends videos faster with an exceptional understanding of physical interactions and prompts
video-to-video video-extend |
Deprecated | 8/13 | → |
| Pixverse image-to-video |
Generate high quality video clips with different effects using PixVerse v4
image-to-video |
Deprecated | 8/13 | → |
| MAGI-1 image-to-video |
MAGI-1 generates videos from images with exceptional understanding of physical interactions and prompting
image-to-video |
Deprecated | 8/13 | → |
| MAGI-1 text-to-video |
MAGI-1 is a video generation model with exceptional understanding of physical interactions and cinematic prompts
text-to-video |
Deprecated | 8/13 | → |
| MAGI-1 video-to-video |
MAGI-1 extends videos with an exceptional understanding of physical interactions and prompts
video-to-video |
Deprecated | 8/13 | → |
| gpt-image-1 text-to-image | OpenAI's latest image generation and editing model: gpt-image-1. Currently powered with bring-your-own-key. | Deprecated | 8/13 | → |
| gpt-image-1 image-to-image | OpenAI's latest image generation and editing model: gpt-image-1. Currently powered with bring-your-own-key. | Deprecated | 8/13 | → |
| Uno image-to-image |
An AI model that transforms input images into new ones based on text prompts, blending reference visuals with your creative directions.
image-to-image |
Deprecated | 8/13 | → |
| Image2svg image-to-image |
Image2SVG transforms raster images into clean vector graphics, preserving visual quality while enabling scalable, customizable SVG outputs with precise control over detail levels.
utility editing |
Deprecated | 8/13 | → |
| Tripo3D image-to-3d |
State-of-the-art Image to 3D Object generation. Generate a 3D model from a single image!
image-to-3d stylized |
Deprecated | 8/13 | → |
| Step1X Edit image-to-image |
Step1X-Edit transforms your photos with simple instructions into stunning, professional-quality edits that rival top proprietary tools.
editing |
Deprecated | 8/13 | → |
| Moondream2 vision |
Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.
image-to-image |
Deprecated | 8/13 | → |
| Moondream2 vision |
Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.
image-to-image |
Deprecated | 8/13 | → |
| Moondream2 vision |
Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.
Vision |
Deprecated | 8/13 | → |
| Moondream2 vision |
Moondream2 is a highly efficient open-source vision language model that combines powerful image understanding capabilities with a remarkably small footprint.
Vision |
Deprecated | 8/13 | → |
| F Lite (texture mode) text-to-image | F Lite is a 10B parameter diffusion model created by Fal and Freepik, trained exclusively on copyright-safe and SFW content. This is a high texture density variant of the model. | Deprecated | 8/13 | → |
| F Lite text-to-image | F Lite is a 10B parameter diffusion model created by Fal and Freepik, trained exclusively on copyright-safe and SFW content. | Deprecated | 8/13 | → |
| Ideogram V3 Edit image-to-image |
Transform existing images with Ideogram V3's editing capabilities. Modify, adjust, and refine images while maintaining high fidelity and realistic outputs with precise prompt control.
realism typography |
Deprecated | 8/13 | → |
| Ideogram image-to-image |
Reimagine existing images with Ideogram V3's remix feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance.
realism typography |
Deprecated | 8/13 | → |
| Ideogram Replace Background image-to-image | Replace backgrounds of existing images with Ideogram V3's replace background feature. Create variations and adaptations while preserving core elements and adding new creative directions through prompt guidance. | Deprecated | 8/13 | → |
| Ideogram Text to Image text-to-image |
Generate high-quality images, posters, and logos with Ideogram V3. Features exceptional typography handling and realistic outputs optimized for commercial and creative use.
realism typography |
Deprecated | 8/13 | → |
| Ideogram image-to-image |
Extend existing images with Ideogram V3's reframe feature. Create expanded versions and adaptations while preserving the main image and adding new creative directions through prompt guidance.
realism typography |
Deprecated | 8/13 | → |
| Trellis image-to-3d |
Generate 3D models from multiple images using Trellis. A native 3D generative model enabling versatile and high-quality 3D asset creation.
stylized |
Deprecated | 8/13 | → |
| Hidream I1 Full image-to-image |
HiDream-I1 full is a new open-source image generative foundation model with 17B parameters that achieves state-of-the-art image generation quality within seconds.
image-to-image hidream |
Deprecated | 8/13 | → |
| MiniMax (Hailuo AI) Text to Image text-to-image |
Generate high quality images from text prompts using MiniMax Image-01. Longer text prompts will result in better quality images.
stylized realism |
Deprecated | 8/13 | → |
| Minimax Image Subject Reference image-to-image |
Generate images from text and a reference image using MiniMax Image-01 for consistent character appearance.
stylized transform |
Deprecated | 8/13 | → |
| MiniMax Speech-02 HD text-to-speech |
Generate speech from text prompts and different voices using the MiniMax Speech-02 HD model, which leverages advanced AI techniques to create high-quality text-to-speech.
speech |
Deprecated | 8/13 | → |
| MiniMax Speech-02 Turbo text-to-speech |
Generate fast speech from text prompts and different voices using the MiniMax Speech-02 Turbo model, which leverages advanced AI techniques to create high-quality text-to-speech.
speech |
Deprecated | 8/13 | → |
| MiniMax Voice Cloning text-to-speech |
Clone a voice from a sample audio and generate speech from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality text-to-speech.
speech |
Deprecated | 8/13 | → |
| Easel Avatar text-to-image |
Create scenes with one or two people using just selfies and a text prompt (without LoRAs)
avatars loras image-generation |
Deprecated | 8/13 | → |
| Recraft V3 text-to-image |
Recraft V3 is a text-to-image model with the ability to generate long texts, vector art, images in brand style, and much more. As of today, it is SOTA in image generation, proven by Hugging Face's industry-leading Text-to-Image Benchmark by Artificial Analysis.
vector typography style |
Deprecated | 8/13 | → |
| Recraft V3 image-to-image |
Recraft V3 is a text-to-image model with the ability to generate long texts, vector art, images in brand style, and much more. As of today, it is SOTA in image generation, proven by Hugging Face's industry-leading Text-to-Image Benchmark by Artificial Analysis.
vector typography style |
Deprecated | 8/13 | → |
| Recraft V3 Create Style training |
Recraft V3 Create Style is capable of creating unique styles for Recraft V3 based on your images.
style vector personalization |
Deprecated | 8/13 | → |
| Recraft Crisp Upscale image-to-image |
Enhances a given raster image using the 'crisp upscale' tool, boosting resolution with a focus on refining small details and faces.
upscaling |
Deprecated | 8/13 | → |
| Recraft Creative Upscale image-to-image |
Enhances a given raster image using the 'creative upscale' tool, increasing image resolution, making the image sharper and cleaner.
upscaling |
Deprecated | 8/13 | → |
| LTX Video Trainer training |
Train LTX Video 0.9.7 for custom styles and effects.
ltx-video fine-tuning |
Deprecated | 8/13 | → |
| ACE-Step text-to-audio |
Generate music with lyrics from text using ACE-Step
text-to-audio text-to-music |
Deprecated | 8/13 | → |
| Vidu Image to Video image-to-video |
Vidu Q1 Image to Video generates high-quality 1080p videos with exceptional visual quality and motion diversity from a single image
stylized transform |
Deprecated | 8/13 | → |
| Vidu Text to Video text-to-video |
Vidu Q1 Text to Video generates high-quality 1080p videos with exceptional visual quality and motion diversity
stylized transform |
Deprecated | 8/13 | → |
| Vidu Start End to Video image-to-video |
Vidu Q1 Start-End to Video generates smooth transition 1080p videos between specified start and end images.
stylized transform |
Deprecated | 8/13 | → |
| Rembg Enhance (Remove Background Enhance) image-to-image |
Rembg-enhance is optimized for 2D vector images, 3D graphics, and photos by leveraging matting technology.
background removal image editing utility segmentation high resolution rembg |
Deprecated | 8/13 | → |
| ACE-Step text-to-audio |
Generate music from a simple prompt using ACE-Step
text-to-audio text-to-music |
Deprecated | 8/13 | → |
| ACE-Step audio-to-audio |
Generate music from lyrics and example audio using ACE-Step
audio-to-audio audio-edit |
Deprecated | 8/13 | → |
| ACE-Step audio-to-audio |
Modify a portion of provided audio with lyrics and/or style using ACE-Step
audio-to-audio audio-inpaint audio-repaint |
Deprecated | 8/13 | → |
| ACE-Step audio-to-audio |
Extend the beginning or end of provided audio with lyrics and/or style using ACE-Step
audio-to-audio audio-outpaint audio-extend |
Deprecated | 8/13 | → |
| Framepack F1 image-to-video |
Framepack is an efficient Image-to-video model that autoregressively generates videos.
image to video motion |
Deprecated | 8/13 | → |
| Hunyuan Custom image-to-video |
HunyuanCustom revolutionizes video generation with unmatched identity consistency across multiple input types. Its innovative fusion modules and alignment networks outperform competitors, maintaining subject integrity while responding flexibly to text, image, audio, and video conditions.
image-to-video |
Deprecated | 8/13 | → |
| Pixverse image-to-video |
Generate high quality video clips with different effects using PixVerse v4.5
image-to-video |
Deprecated | 8/13 | → |
| Pixverse text-to-video |
Generate high quality video clips from text and image prompts using PixVerse v4.5
stylized transform |
Deprecated | 8/13 | → |
| Pixverse text-to-video |
Generate high quality video clips quickly from text and image prompts using PixVerse v4.5 fast
stylized transform |
Deprecated | 8/13 | → |
| Pixverse image-to-video |
Generate high quality video clips from text and image prompts using PixVerse v4.5
stylized transform |
Deprecated | 8/13 | → |
| Pixverse image-to-video |
Generate fast high quality video clips from text and image prompts using PixVerse v4.5
stylized transform |
Deprecated | 8/13 | → |
| Pixverse image-to-video |
Create seamless transitions between images using PixVerse v4.5
stylized transform |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 LoRA image-to-video |
Generate videos from prompts and images using LTX Video-0.9.7 and custom LoRA
video ltx-video image-to-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 LoRA video-to-video |
Generate videos from prompts, images, and videos using LTX Video-0.9.7 and custom LoRA
video ltx-video video-to-video multicondition-to-video image-to-video |
Deprecated | 8/13 | → |
| Flux Lora text-to-image |
Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
lora personalization |
Deprecated | 8/13 | → |
| Easel Gifswap image-to-image |
Swap faces on GIFs
utility editing |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 13B Distilled text-to-video |
Generate videos from prompts using LTX Video-0.9.7 13B Distilled and custom LoRA
video ltx-video text-to-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 13B text-to-video |
Generate videos from prompts using LTX Video-0.9.7 13B and custom LoRA
video ltx-video text-to-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 13B image-to-video |
Generate videos from prompts and images using LTX Video-0.9.7 13B and custom LoRA
video ltx-video image-to-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 13B video-to-video |
Extend videos using LTX Video-0.9.7 13B and custom LoRA
video ltx-video video-to-video extend-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 13B video-to-video |
Generate videos from prompts, images, and videos using LTX Video-0.9.7 13B and custom LoRA
video ltx-video video-to-video multicondition-to-video image-to-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 13B Distilled image-to-video |
Generate videos from prompts and images using LTX Video-0.9.7 13B Distilled and custom LoRA
video ltx-video image-to-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 13B Distilled video-to-video |
Generate videos from prompts, images, and videos using LTX Video-0.9.7 13B Distilled and custom LoRA
video ltx-video video-to-video multicondition-to-video image-to-video |
Deprecated | 8/13 | → |
| LTX Video-0.9.7 13B Distilled video-to-video |
Extend videos using LTX Video-0.9.7 13B Distilled and custom LoRA
video ltx-video video-to-video extend-video |
Deprecated | 8/13 | → |
| DreamO text-to-image |
DreamO is an image customization framework designed to support a wide range of tasks while facilitating seamless integration of multiple conditions.
stylized realism |
Deprecated | 8/13 | → |
| Kling 1.6 Elements image-to-video | Generate video clips from your multiple image references using Kling 1.6 (pro) | Deprecated | 8/13 | → |
| Kling 1.6 Elements image-to-video | Generate video clips from your multiple image references using Kling 1.6 (standard) | Deprecated | 8/13 | → |
| Imagen 4 text-to-image | Google’s highest quality image generation model | Deprecated | 8/13 | → |
| Imagen 4 Ultra text-to-image | Google’s highest quality image generation model | Deprecated | 8/13 | → |
| Lyria2 text-to-audio |
Lyria 2 is Google's latest music generation model; you can generate any type of music with it.
music stylized |
Deprecated | 8/13 | → |
| Bagel text-to-image |
Bagel is a 7B parameter multimodal model from Bytedance-Seed that can generate both text and images.
text-to-image multimodal |
Deprecated | 8/13 | → |
| Bagel image-to-image |
Bagel is a 7B parameter multimodal model from Bytedance-Seed that can generate both images and text.
image-to-image image-editing |
Deprecated | 8/13 | → |
| Bagel image-to-json |
Bagel is a 7B parameter multimodal model from Bytedance-Seed that can generate both text and images.
image-to-text vlm |
Deprecated | 8/13 | → |
| Wan VACE 14B video-to-video |
VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.
image-to-video video-to-video text-to-video |
Deprecated | 8/13 | → |
| Hunyuan Portrait image-to-video |
HunyuanPortrait is a diffusion-based framework for generating lifelike, temporally consistent portrait animations.
animation lip sync |
Deprecated | 8/13 | → |
| Avatars audio-to-video |
Generate high-quality videos with UGC-like avatars from audio
lipsync audio-to-video |
Deprecated | 8/13 | → |
| Avatars text-to-video |
Generate high-quality videos with UGC-like avatars from text
lipsync text-to-video |
Deprecated | 8/13 | → |
| Lipsync video-to-video |
Generate realistic lipsync from any audio using VEED's latest model
lipsync video-to-video |
Deprecated | 8/13 | → |
| FLUX.1 Kontext [pro] image-to-image | FLUX.1 Kontext [pro] handles both text and reference images as inputs, seamlessly enabling targeted, local edits and complex transformations of entire scenes. | Deprecated | 8/13 | → |
| FLUX.1 Kontext [dev] image-to-image | Frontier image editing model. | Deprecated | 8/13 | → |
| FLUX.1 Kontext [pro] text-to-image | The FLUX.1 Kontext [pro] text-to-image delivers state-of-the-art image generation results with unprecedented prompt following, photorealistic rendering, and flawless typography. | Deprecated | 8/13 | → |
| Kling 2.1 (standard) image-to-video | Kling 2.1 Standard is a cost-efficient endpoint for the Kling 2.1 model, delivering high-quality image-to-video generation | Deprecated | 8/13 | → |
| Kling 2.1 (pro) image-to-video | Kling 2.1 Pro is an advanced endpoint for the Kling 2.1 model, offering professional-grade videos with enhanced visual fidelity, precise camera movements, and dynamic motion control, perfect for cinematic storytelling. | Deprecated | 8/13 | → |
| Kling 2.1 Master image-to-video |
Kling 2.1 Master: The premium endpoint for Kling 2.1, designed for top-tier image-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision.
_marquee-video-model |
Deprecated | 8/13 | → |
| Kling 2.1 Master text-to-video | Kling 2.1 Master: The premium endpoint for Kling 2.1, designed for top-tier text-to-video generation with unparalleled motion fluidity, cinematic visuals, and exceptional prompt precision. | Deprecated | 8/13 | → |
| FLUX.1 Kontext [max] text-to-image | FLUX.1 Kontext [max] text-to-image is a new premium model that brings maximum performance across all aspects, with greatly improved prompt adherence. | Deprecated | 8/13 | → |
| FLUX.1 Kontext [max] image-to-image | FLUX.1 Kontext [max] is a model in which greatly improved prompt adherence and typography generation meet premium consistency for editing, without compromising on speed. | Deprecated | 8/13 | → |
| Hunyuan Avatar image-to-video |
HunyuanAvatar is a High-Fidelity Audio-Driven Human Animation model for Multiple Characters.
stylized transform |
Deprecated | 8/13 | → |
| FLUX.1 Kontext [max] image-to-image | Experimental version of FLUX.1 Kontext [max] with multi image handling capabilities | Deprecated | 8/13 | → |
| Image Editing image-to-image |
See how you or others might look at different ages, from younger to older, while preserving core facial features.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Replace your photo's background with any scene you desire, from beach sunsets to urban landscapes, with perfect lighting and shadows
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Transform your photos into vibrant cool cartoons with bold outlines and rich colors.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Perfect your photos with professional color grading, balanced tones, and vibrant yet natural colors
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Change facial expressions in photos to any emotion you desire, from smiles to serious looks.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Enhance facial features with professional retouching while maintaining a natural, realistic look
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Experiment with different hairstyles, from bald to any style you can imagine, while maintaining natural lighting and realistic results.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Remove unwanted objects or people from your photos while seamlessly blending the background.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Turn your casual photos into stunning professional studio portraits with perfect lighting and high-end photography style.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Place your subject in any scene you imagine, from enchanted forests to urban settings, with professional composition and lighting
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Transform your photos into artistic masterpieces inspired by famous styles like Van Gogh's Starry Night or any artistic style you choose.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Transform your photos to any time of day, from golden hour to midnight, with appropriate lighting and atmosphere.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Add realistic weather effects like snowfall, rain, or fog to your photos while maintaining the scene's mood.
stylized transform |
Deprecated | 8/13 | → |
| Chatterbox text-to-speech |
Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. Use the first TTS from Resemble AI.
text-to-speech |
Deprecated | 8/13 | → |
| Chatterbox speech-to-speech |
Whether you're working on memes, videos, games, or AI agents, Chatterbox brings your content to life. Use the first TTS from Resemble AI.
speech-to-speech |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Restore and enhance old or damaged photos by removing imperfections and adding color while preserving the original character and details of the image.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Remove all text and writing from images while preserving the background and natural appearance.
stylized transform |
Deprecated | 8/13 | → |
| PlayAI Inpaint audio-to-audio |
A novel way to perform audio editing, ensuring smooth transitions and consistent speaker characteristics for edits.
audio inpaint |
Deprecated | 8/13 | → |
| FLUX.1 [dev] text-to-image | FLUX.1 [dev] is a 12 billion parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use. | Deprecated | 8/13 | → |
| FLUX.1 [schnell] text-to-image | Fastest inference in the world for the 12 billion parameter FLUX.1 [schnell] text-to-image model. | Deprecated | 8/13 | → |
| FLUX.1 [dev] image-to-image | FLUX.1 [dev] is a 12 billion parameter flow transformer that generates high-quality images from text. It is suitable for personal and commercial use. | Deprecated | 8/13 | → |
| FLUX.1 [dev] Redux image-to-image | FLUX.1 [dev] Redux is a high-performance endpoint for the FLUX.1 [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. | Deprecated | 8/13 | → |
| FLUX.1 [schnell] Redux image-to-image | FLUX.1 [schnell] Redux is a high-performance endpoint for the FLUX.1 [schnell] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. | Deprecated | 8/13 | → |
| Chatterboxhd text-to-speech | Generate expressive, natural speech with Resemble AI's Chatterbox. Features unique emotion control, instant voice cloning from short audio, and built-in watermarking. | Deprecated | 8/13 | → |
| Chatterboxhd speech-to-speech | Transform voices using Resemble AI's Chatterbox. Convert audio to new voices or your own samples, with expressive results and built-in perceptual watermarking. | Deprecated | 8/13 | → |
| Luma Photon Reframe image-to-image |
Extend and reframe images with Luma Photon Reframe. This advanced tool intelligently expands your visuals, seamlessly blending new content to enhance creativity and adaptability, offering unmatched personalization and quality for creators at a fraction of the cost.
outpainting reframe |
Deprecated | 8/13 | → |
| Luma Photon Flash Reframe image-to-image |
This advanced tool intelligently expands your visuals, seamlessly blending new content to enhance creativity and adaptability, offering unmatched speed and quality for creators at a fraction of the cost.
flash reframe outpainting |
Deprecated | 8/13 | → |
| Luma Ray 2 Reframe video-to-video |
Adjust and enhance videos with Ray-2 Reframe. This advanced tool seamlessly reframes videos to your desired aspect ratio, intelligently inpainting missing regions to ensure realistic visuals and coherent motion, delivering exceptional quality and creative flexibility.
reframe outpaint |
Deprecated | 8/13 | → |
| Luma Ray 2 Flash Reframe video-to-video |
Adjust and enhance videos with Ray-2 Reframe. This advanced tool seamlessly reframes videos to your desired aspect ratio, intelligently inpainting missing regions to ensure realistic visuals and coherent motion, delivering exceptional quality and creative flexibility.
reframe outpaint flash |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Transform any person into their baby version, while preserving the original pose and expression with childlike features.
stylized transform |
Deprecated | 8/13 | → |
| Wan Vace 1 3b video-to-video |
VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.
video-to-video |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
The reframe endpoint intelligently adjusts an image's aspect ratio while preserving the main subject's position, composition, pose, and perspective
stylized transform |
Deprecated | 8/13 | → |
| Veo 3 text-to-video | Veo 3 by Google, the most advanced AI video generation model in the world. With sound on! | Deprecated | 8/13 | → |
| Luma Photon image-to-image |
Edit images from your prompts using Luma Photon. Photon is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation.
image-to-image |
Deprecated | 8/13 | → |
| Luma Photon image-to-image |
Edit images from your prompts using Luma Photon. Photon is the most creative, personalizable, and intelligent visual model for creatives, bringing a step-function change in the cost of high-quality image generation.
image-to-image |
Deprecated | 8/13 | → |
| Ffmpeg Api Merge Audio-Video video-to-video |
Merge videos with standalone audio files or audio from video files.
ffmpeg |
Deprecated | 8/13 | → |
| Ffmpeg Api image-to-image |
ffmpeg endpoint for first, middle and last frame extraction from videos
utility editing |
Deprecated | 8/13 | → |
| Bytedance text-to-image | Seedream 3.0 is a bilingual (Chinese and English) text-to-image model that excels at text-to-image generation. | Deprecated | 8/13 | → |
| Wan-2.1 LoRA Trainer training |
Train custom LoRAs for Wan-2.1 FLF2V 720P
lora training |
Deprecated | 8/13 | → |
| Wan-2.1 LoRA Trainer training |
Train custom LoRAs for Wan-2.1 I2V 720P
lora training |
Deprecated | 8/13 | → |
| Wan-2.1 LoRA Trainer training |
Train custom LoRAs for Wan-2.1 T2V 14B
lora training |
Deprecated | 8/13 | → |
| Wan-2.1 LoRA Trainer training |
Train custom LoRAs for Wan-2.1 T2V 1.3B
lora training |
Deprecated | 8/13 | → |
| Imagen 4 text-to-image | Google’s highest quality image generation model | Deprecated | 8/13 | → |
| Recraft image-to-image |
Converts a given raster image to SVG format using Recraft model.
stylized transform |
Deprecated | 8/13 | → |
| Seedance 1.0 Lite text-to-video | Seedance 1.0 Lite | Deprecated | 8/13 | → |
| Seedance 1.0 Lite image-to-video | Seedance 1.0 Lite | Deprecated | 8/13 | → |
| Hunyuan 3D 2.1 image-to-3d |
Hunyuan3D-2.1 is a scalable 3D asset creation system that advances state-of-the-art 3D generation through Physically-Based Rendering (PBR).
image-to-3d |
Deprecated | 8/13 | → |
| Seedance 1.0 Pro image-to-video | Seedance 1.0 Pro, a high quality video generation model developed by Bytedance. | Deprecated | 8/13 | → |
| Seedance 1.0 Pro text-to-video | Seedance 1.0 Pro, a high quality video generation model developed by Bytedance. | Deprecated | 8/13 | → |
| Object Removal image-to-image |
Removes objects and their visual effects using natural language, replacing them with contextually appropriate content
utility editing |
Deprecated | 8/13 | → |
| Object Removal image-to-image |
Removes mask-selected objects and their visual effects, seamlessly reconstructing the scene with contextually appropriate content.
utility editing |
Deprecated | 8/13 | → |
| Object Removal image-to-image |
Removes box-selected objects and their visual effects, seamlessly reconstructing the scene with contextually appropriate content.
utility editing |
Deprecated | 8/13 | → |
| Bria 3.2 Text-to-Image text-to-image |
Bria’s Text-to-Image model, trained exclusively on licensed data for safe and risk-free commercial use. Excels in Text-Rendering and Aesthetics.
image generation |
Deprecated | 8/13 | → |
| PASD image-to-image |
Pixel-Aware Diffusion Model for Realistic Image Super-Resolution and Personalized Stylization
utility editing |
Deprecated | 8/13 | → |
| MiniMax Hailuo 02 [Standard] (Text to Video) text-to-video | MiniMax Hailuo-02 Text To Video API (Standard, 768p): Advanced video generation model with 768p resolution | Deprecated | 8/13 | → |
| MiniMax Hailuo 02 [Pro] (Text to Video) text-to-video | MiniMax Hailuo-02 Text To Video API (Pro, 1080p): Advanced video generation model with 1080p resolution | Deprecated | 8/13 | → |
| MiniMax Hailuo 02 [Pro] (Image to Video) image-to-video | MiniMax Hailuo-02 Image To Video API (Pro, 1080p): Advanced image-to-video generation model with 1080p resolution | Deprecated | 8/13 | → |
| MiniMax Hailuo 02 [Standard] (Image to Video) image-to-video | MiniMax Hailuo-02 Image To Video API (Standard, 768p, 512p): Advanced image-to-video generation model with 768p and 512p resolutions | Deprecated | 8/13 | → |
| Tripo3D image-to-3d |
State of the art Multiview to 3D Object generation. Generate 3D models from multiple images!
stylized multiview |
Deprecated | 8/13 | → |
| Chain Of Zoom image-to-image | Extreme Super-Resolution via Scale Autoregression and Preference Alignment | Deprecated | 8/13 | → |
| Wan VACE 14B video-to-video |
VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.
image-to-video video-to-video text-to-video |
Deprecated | 8/13 | → |
| Wan VACE 14B video-to-video |
VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.
image-to-video video-to-video text-to-video |
Deprecated | 8/13 | → |
| Wan VACE 14B video-to-video |
VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.
image-to-video video-to-video text-to-video |
Deprecated | 8/13 | → |
| Wan VACE 14B video-to-video |
VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.
image-to-video video-to-video text-to-video |
Deprecated | 8/13 | → |
| Wan VACE 14B video-to-video |
VACE is a video generation model that uses a source image, mask, and video to create prompted videos with controllable sources.
reframe |
Deprecated | 8/13 | → |
| Video Understanding vision |
A video understanding model to analyze video content and answer questions about what's happening in the video based on user prompts.
utility vision |
Deprecated | 8/13 | → |
| Ai Avatar image-to-video |
MultiTalk model generates a multi-person conversation video from an image and audio files. Creates a realistic scene where multiple people speak in sequence.
stylized transform |
Deprecated | 8/13 | → |
| Ai Avatar image-to-video |
MultiTalk model generates a multi-person conversation video from an image and text inputs. Converts text to speech for each person, generating a realistic conversation scene.
stylized transform |
Deprecated | 8/13 | → |
| Ai Avatar image-to-video |
MultiTalk model generates a talking avatar video from an image and audio file. The avatar lip-syncs to the provided audio with natural facial expressions.
stylized transform |
Deprecated | 8/13 | → |
| Ai Avatar image-to-video |
MultiTalk model generates a talking avatar video from an image and text. Converts text to speech automatically, then generates the avatar speaking with lip-sync.
stylized transform |
Deprecated | 8/13 | → |
| FASHN Virtual Try-On V1.6 image-to-image |
FASHN v1.6 delivers precise virtual try-on capabilities, accurately rendering garment details like text and patterns at 864x1296 resolution from both on-model and flat-lay photo references.
try-on fashion clothing |
Deprecated | 8/13 | → |
| Omnigen V2 text-to-image |
OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It can be used for various tasks such as Image Editing, Personalized Image Generation, Virtual Try-On, Multi Person Generation and more!
multimodal editing try-on |
Deprecated | 8/13 | → |
| Flux Kontext Lora image-to-image |
Fast endpoint for the FLUX.1 Kontext [dev] model with LoRA support, enabling rapid and high-quality image editing using pre-trained LoRA adaptations for specific styles, brand identities, and product-specific outputs.
image-editing image-to-image |
Deprecated | 8/13 | → |
| Flux Kontext Lora text-to-image |
Super fast text-to-image endpoint for the FLUX.1 Kontext [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
text-to-image |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Transform your photos into cool plushies while keeping the original character's likeness.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Transform your photos into wojak style while keeping the original character's likeness.
stylized transform |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Transform your character's hair into broccoli style while keeping the original character's likeness.
stylized transform |
Deprecated | 8/13 | → |
| Flux Kontext Trainer training | LoRA trainer for FLUX.1 Kontext [dev] | Deprecated | 8/13 | → |
| Bytedance image-to-image |
SeedEdit 3.0 is an image editing model independently developed by ByteDance. It excels at accurately following editing instructions and preserving image content, especially when handling real images.
image-editing image-to-image |
Deprecated | 8/13 | → |
| Luma Ray 2 Modify video-to-video |
Ray2 Modify is a video generative model capable of restyling or retexturing an entire shot, from turning live-action into CG or stylized animation to changing wardrobe, props, or the overall aesthetic and swapping environments or time periods, giving you control over background, location, and even weather.
modify restyle |
Deprecated | 8/13 | → |
| Video video-to-video |
Automatically remove backgrounds from videos, perfect for creating clean, professional content without a green screen.
background-removal |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Generate YouTube thumbnails with custom text
stylized transform |
Deprecated | 8/13 | → |
| Pixverse video-to-video |
Generate realistic lipsync animations from audio with the PixVerse Lipsync model, using advanced algorithms for high-quality synchronization.
animation lip sync |
Deprecated | 8/13 | → |
| Pixverse video-to-video |
The PixVerse Extend model extends your videos using high-quality video extension techniques.
utility editing |
Deprecated | 8/13 | → |
| Pixverse video-to-video |
The PixVerse Extend model extends your videos using high-quality video extension techniques.
utility editing |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Apply Gaussian or Kuwahara blur effects with adjustable radius and sigma parameters
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Create chromatic aberration by shifting red, green, and blue channels horizontally or vertically with customizable shift amounts.
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Adjust color temperature, brightness, contrast, saturation, and gamma values for color correction.
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Apply various color tints (sepia, red, green, blue, cyan, magenta, yellow, purple, orange, warm, cool, lime, navy, vintage, rose, teal, maroon, peach, lavender, olive) with adjustable strength.
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Reduce color saturation using different methods (luminance Rec.709, luminance Rec.601, average, lightness) with adjustable factor.
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Blend two images together using smooth linear interpolation with a configurable blend factor.
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Apply dodge and burn effects with multiple modes and adjustable intensity.
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Apply film grain effect with different styles (modern, analog, kodak, fuji, cinematic, newspaper) and customizable intensity and scale
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Apply a parabolic distortion effect with configurable coefficient and vertex position.
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Apply sharpening effects with three modes: basic unsharp mask, smart sharpening with edge preservation, and Contrast Adaptive Sharpening (CAS).
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Apply solarization effect by inverting pixel values above a threshold
stylized transform |
Deprecated | 8/13 | → |
| Post Processing image-to-image |
Add a darkening vignette effect around the edges of the image with adjustable strength
stylized transform |
Deprecated | 8/13 | → |
| ThinkSound video-to-video |
Generate realistic audio for a video with an optional text prompt and combine it with the video.
audio-generation video-to-audio |
Deprecated | 8/13 | → |
| ThinkSound video-to-video |
Generate realistic audio from a video with an optional text prompt
audio-generation video-to-audio |
Deprecated | 8/13 | → |
| Image Editing image-to-image |
Add details to faces, enhance face features, remove blur.
stylized transform realism |
Deprecated | 8/13 | → |
| Pixverse video-to-video |
Add immersive sound effects and background music to your videos using PixVerse sound effects generation
audio utility |
Deprecated | 8/13 | → |
| Bria image-to-image | Structure Reference allows generating new images while preserving the structure of an input image, guided by text prompts. Perfect for transforming sketches, illustrations, or photos into new illustrations. Trained exclusively on licensed data for safe and risk-free commercial use. | Deprecated | 8/13 | → |
| Vidu image-to-video |
Generate video clips from multiple image references using Vidu Q1.
stylized transform |
Deprecated | 8/13 | → |
| Ffmpeg Api json |
Get EBU R128 loudness normalization from audio files using FFmpeg API.
ffmpeg |
Deprecated | 8/13 | → |
| Veo 3 Fast text-to-video | Faster and more cost effective version of Google's Veo 3! | Deprecated | 8/13 | → |
| Veo 3 Fast [Image to Video] image-to-video | Generate videos from your images via Veo 3 Fast | Deprecated | 8/13 | → |
| Calligrapher image-to-image |
Use Calligrapher's text- and font-retaining capabilities to modify text on books, clothes, and more.
image-to-image |
Deprecated | 8/13 | → |
| Fashion Photoshoot image-to-image |
Instant fashion photoshoot with a selfie and an outfit
image-to-image |
Deprecated | 8/13 | → |
| any-llm Enterprise llm |
Run any large language model with fal, powered by OpenRouter.
This endpoint only supports models that do not train on private data.
Read more in OpenRouter's Privacy and Logging documentation.
chat claude gpt |
Deprecated | 8/13 | → |
| LTX-Video 13B 0.9.8 Distilled video-to-video |
Generate long videos from prompts, images, and videos using LTX Video-0.9.8 13B Distilled and custom LoRA
video ltx-video video-to-video multicondition-to-video image-to-video |
Deprecated | 8/13 | → |
| LTX-Video 13B 0.9.8 Distilled text-to-video |
Generate long videos from prompts using LTX Video-0.9.8 13B Distilled and custom LoRA
video ltx-video text-to-video |
Deprecated | 8/13 | → |
| LTX-Video 13B 0.9.8 Distilled image-to-video |
Generate long videos from prompts and images using LTX Video-0.9.8 13B Distilled and custom LoRA
video ltx-video image-to-video |
Deprecated | 8/13 | → |
| Luma Ray 2 Flash Modify video-to-video |
Ray2 Flash Modify is a video generative model capable of restyling or retexturing an entire shot, from turning live-action into CG or stylized animation to changing wardrobe, props, or the overall aesthetic and swapping environments or time periods, giving you control over background, location, and even weather.
modify restyle |
Deprecated | 8/13 | → |
| Lipsync video-to-video | Realistic lipsync video - optimized for speed, quality, and consistency. | Deprecated | 8/13 | → |
| MiniMax Voice Design text-to-speech |
Design a personalized voice from a text description, and generate speech from text prompts using the MiniMax model, which leverages advanced AI techniques to create high-quality text-to-speech.
speech |
Deprecated | 8/13 | → |
| FILM image-to-image |
Interpolate images with FILM - Frame Interpolation for Large Motion
interpolation |
Deprecated | 8/13 | → |
| FILM video-to-video |
Interpolate videos with FILM - Frame Interpolation for Large Motion
interpolation |
Deprecated | 8/13 | → |
| RIFE image-to-image |
Interpolate images with RIFE - Real-Time Intermediate Flow Estimation
interpolation |
Deprecated | 8/13 | → |
| RIFE video-to-video |
Interpolate videos with RIFE - Real-Time Intermediate Flow Estimation
interpolation |
Deprecated | 8/13 | → |
| LTX-Video 13B 0.9.8 Distilled video-to-video |
Extend videos using LTX Video-0.9.8 13B Distilled and custom LoRA
ltx-video extend |
Deprecated | 8/13 | → |
| Hidream E1 1 image-to-image | Edit images with natural language | Deprecated | 8/13 | → |
| Image Editing image-to-image | Retouch photos of faces. Remove blemishes and improve the skin. | Deprecated | 8/13 | → |
| Sky Raccoon text-to-image |
Generate images from a text prompt.
text-to-image |
Deprecated | 8/13 | → |
| OmniHuman image-to-video |
OmniHuman generates video using an image of a human figure paired with an audio file. It produces vivid, high-quality videos where the character’s emotions and movements maintain a strong correlation with the audio.
image-to-video lipsync |
Deprecated | 8/13 | → |
| NSFW Checker vision |
Predict whether an image is NSFW or SFW.
filter safety utility |
Deprecated | 8/13 | → |
| Hunyuan World image-to-3d | Hunyuan World 1.0 turns a single image into a panorama or a 3D world. It creates realistic scenes from the image, allowing you to explore and view it from different angles. | Deprecated | 8/13 | → |
| Hunyuan World image-to-image | Hunyuan World 1.0 turns a single image into a panorama or a 3D world. It creates realistic scenes from the image, allowing you to explore and view it from different angles. | Deprecated | 8/13 | → |
| Wan-2.2 Text-to-Video A14B text-to-video |
Wan-2.2 text-to-video is a video model that generates videos with high visual quality and motion diversity from text prompts.
text to video motion |
Deprecated | 8/13 | → |
| Wan v2.2 5B text-to-video | Wan 2.2's 5B model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and powerful prompt understanding. | Deprecated | 8/13 | → |
| Flux Kontext Lora image-to-image |
Fast inpainting endpoint for the FLUX.1 Kontext [dev] model with LoRA support, enabling rapid and high-quality image inpainting with reference images, while using pre-trained LoRA adaptations for specific styles, brand identities, and product-specific outputs.
image-editing image-inpainting image-to-image |
Deprecated | 8/13 | → |
| Wan v2.2 5B image-to-video | Wan 2.2's 5B model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and powerful prompt understanding. | Deprecated | 8/13 | → |
| FLUX.1 Krea [dev] text-to-image | FLUX.1 Krea [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use. | Deprecated | 8/13 | → |
| FLUX.1 Krea [dev] Redux image-to-image | FLUX.1 Krea [dev] Redux is a high-performance endpoint for the FLUX.1 Krea [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. | Deprecated | 8/13 | → |
| FLUX.1 Krea [dev] image-to-image | FLUX.1 Krea [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use. | Deprecated | 8/13 | → |
| FLUX.1 Krea [dev] text-to-image | FLUX.1 Krea [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use. | Deprecated | 8/13 | → |
| FLUX.1 Krea [dev] Redux image-to-image | FLUX.1 Krea [dev] Redux is a high-performance endpoint for the FLUX.1 Krea [dev] model that enables rapid transformation of existing images, delivering high-quality style transfers and image modifications with the core FLUX capabilities. | Deprecated | 8/13 | → |
| FLUX.1 Krea [dev] image-to-image | FLUX.1 Krea [dev] is a 12 billion parameter flow transformer that generates high-quality images from text with incredible aesthetics. It is suitable for personal and commercial use. | Deprecated | 8/13 | → |
| Wan text-to-video |
Wan-2.2 Turbo text-to-video is a video model that generates videos with high visual quality and motion diversity from text prompts.
text to video motion |
Deprecated | 8/13 | → |
| Wan image-to-video | Wan-2.2 Turbo image-to-video is a video model that generates videos with high visual quality and motion diversity from images and text prompts. | Deprecated | 8/13 | → |
| Veo3 image-to-video | Veo 3 is the latest state-of-the-art video generation model from Google DeepMind. | Deprecated | 8/13 | → |
| Fashion Try On image-to-image | Instant fashion try on with a full-body pic and an outfit | Deprecated | 8/13 | → |
| FLUX.1 Krea [dev] with LoRAs image-to-image |
FLUX LoRA Image-to-Image is a high-performance endpoint that transforms existing images using FLUX models, leveraging LoRA adaptations to enable rapid and precise image style transfer, modifications, and artistic variations.
lora style transfer |
Deprecated | 8/13 | → |
| FLUX.1 Krea [dev] with LoRAs text-to-image |
Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
lora personalization |
Deprecated | 8/13 | → |
| FLUX.1 Krea [dev] Inpainting with LoRAs image-to-image |
Super fast endpoint for the FLUX.1 [dev] inpainting model with LoRA support, enabling rapid and high-quality image inpainting using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
lora personalization |
Deprecated | 8/13 | → |
| Flux Krea Lora text-to-image |
Super fast endpoint for the FLUX.1 [dev] model with LoRA support, enabling rapid and high-quality image generation using pre-trained LoRA adaptations for personalization, specific styles, brand identities, and product-specific outputs.
lora personalization |
Deprecated | 8/13 | → |
| Train Flux Krea LoRA training |
Train styles, people and other subjects at blazing speeds using the FLUX.1 Krea [dev] base model.
lora personalization |
Deprecated | 8/13 | → |
| Wan video-to-video | Wan-2.2 video-to-video is a video model that generates videos with high visual quality and motion diversity from text prompts and source videos. | Deprecated | 8/13 | → |
| Qwen Image text-to-image |
Qwen-Image is an image generation foundation model in the Qwen series that achieves significant advances in complex text rendering and precise image editing.
text-to-image |
Deprecated | 8/13 | → |
| Wan text-to-image | Wan 2.2's 14B model generates high-resolution, photorealistic images with powerful prompt understanding and fine-grained visual detail | Deprecated | 8/13 | → |
| Wan text-to-image | Wan 2.2's 5B model generates high-resolution, photorealistic images with powerful prompt understanding and fine-grained visual detail | Deprecated | 8/13 | → |
| Wan text-to-image | Wan 2.2's 14B model with LoRA support generates high-fidelity images with enhanced prompt alignment and style adaptability. | Deprecated | 8/13 | → |
| Wan text-to-video |
Wan 2.2's 5B FastVideo model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and powerful prompt understanding.
text to video motion |
Deprecated | 8/13 | → |
| Bytedance text-to-image |
Dreamina showcases superior picture effects, with significant improvements in picture aesthetics, precise and diverse styles, and rich details.
text-to-image |
Deprecated | 8/13 | → |
| Minimax image-to-video |
Create blazing fast and economical videos with MiniMax Hailuo-02 Image To Video API at 512p resolution
stylized transform |
Deprecated | 8/13 | → |
| Wan text-to-video | Wan 2.2's 5B distill model produces up to 5 seconds of 720p video at 24 FPS with fluid motion and powerful prompt understanding. | Deprecated | 8/13 | → |
| Wan v2.2 A14B Image-to-Video A14B with LoRAs image-to-video |
Wan-2.2 image-to-video is a video model that generates videos with high visual quality and motion diversity from text prompts and images. This endpoint supports LoRAs made for Wan 2.2.
image-to-video motion lora |
Deprecated | 8/13 | → |
| Wan-2.2 Text-to-Video A14B with LoRAs text-to-video | Wan-2.2 text-to-video is a video model that generates videos with high visual quality and motion diversity from text prompts. This endpoint supports LoRAs made for Wan 2.2. | Deprecated | 8/13 | → |
| Ideogram V3 Character Remix image-to-image |
Transform your consistent character into different art styles, settings, or scenarios while maintaining their distinctive appearance and identity
character-consistency |
Deprecated | 8/13 | → |
| Ideogram V3 Character image-to-image |
Generate consistent character appearances across multiple images. Maintain facial features, proportions, and distinctive traits for cohesive storytelling and branding
character-consistency |
Deprecated | 8/13 | → |
| Ideogram V3 Character Edit image-to-image |
Modify consistent characters while preserving their core identity. Edit poses, expressions, or clothing without losing recognizable character features
character-consistency |
Deprecated | 8/13 | → |
| Wan 2.2 14B Image Trainer training |
Wan 2.2 text to image LoRA trainer. Fine-tune Wan 2.2 for subjects and styles with unprecedented detail.
lora personalization |
Deprecated | 8/13 | → |
| Ffmpeg Api video-to-video | Use ffmpeg capabilities to merge 2 or more videos. | Deprecated | 8/13 | → |
| Bytedance image-to-video |
Transform your images into stylized videos using this workflow.
image-to-video effects |
Deprecated | 8/13 | → |
| EchoMimic V3 audio-to-video |
EchoMimic V3 generates a talking avatar video from a picture, audio, and a text prompt.
echomimic talking-head audio-to-video |
Deprecated | 8/13 | → |