goodguy1963/enhance-and-upscale-ai-photos | Run with an API on Replicate

Upscale and enhance AI-generated images into photorealistic masterpieces with a purely compute-driven AI workflow

Advanced AI image enhancement that eliminates artifacts, adds natural depth, improves details, and upscales with photorealistic quality for stunning, professional results.

The model can be used commercially when depth blur is disabled (depth blur relies on DepthAnythingV2 Large).

📋 Overview

This specialized workflow prioritizes quality over speed, using an intensive multi-stage process to transform AI-generated images. Rather than quick fixes, it employs a thorough sequence of professional enhancement techniques that combine multiple state-of-the-art Stable Diffusion models with powerful computer vision to:

✨ Fix AI artifacts and improve photorealism through deliberate multi-stage processing
🔍 Add natural depth with realistic depth-of-field effects and subtle background blur
🖼️ Preserve composition while enhancing details and correcting common AI flaws
📈 Upscale by 2x or 4x with AI-powered detail preservation and enhancement

🚀 Features

High-Quality Focus: Optimized for maximum image quality, not processing speed
Multi-Stage Processing Pipeline: Uses a sophisticated sequence of model applications rather than a single-pass approach
AI Image Enhancement: Specifically tuned to improve AI-generated images for more photorealistic results
Depth-Aware Processing: Uses Depth Anything V2 to add realistic depth effects often missing in AI art
Intelligent Depth Blur: Applies light, natural-looking blur based on depth map for more photographic results
Content Safety: Automatically detects and blurs NSFW content for appropriate usage
Flexible Upscaling: Choose between 2x or 4x upscaling depending on your needs
High-Resolution Output: Upscales images while maintaining quality
JPEG Output: All processed images are saved in high-quality JPEG format
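
To make the idea behind Intelligent Depth Blur concrete, here is a minimal sketch of blending a sharp image with a blurred copy according to a depth map. It is not the node this workflow uses. The function names, the `focus`, `radius`, and `strength` parameters, the box blur, and the depth convention (values normalized to 0..1, with `focus` marking the depth that stays sharp) are all assumptions made for illustration.

```python
import numpy as np


def box_blur(img: np.ndarray, radius: int) -> np.ndarray:
    """Separable box blur with edge padding; img is (H, W) or (H, W, C)."""
    out = img.astype(np.float64)
    if radius <= 0:
        return out
    k = 2 * radius + 1
    for axis in (0, 1):  # blur rows, then columns
        n = out.shape[axis]
        pad = [(0, 0)] * out.ndim
        pad[axis] = (radius, radius)
        padded = np.pad(out, pad, mode="edge")
        acc = np.zeros_like(out)
        for offset in range(k):
            acc += np.take(padded, np.arange(offset, offset + n), axis=axis)
        out = acc / k
    return out


def depth_blur(image, depth, focus=1.0, radius=2, strength=0.5):
    """Blend sharp and blurred pixels by distance from the focal depth.

    Hypothetical helper: `depth` is rescaled to 0..1 and pixels whose depth
    equals `focus` are left untouched.
    """
    img = image.astype(np.float64)
    d = depth.astype(np.float64)
    span = d.max() - d.min()
    if span == 0:
        return img  # flat depth map carries no focus information
    d = (d - d.min()) / span
    weight = np.clip(strength * np.abs(d - focus), 0.0, 1.0)
    if img.ndim == 3:
        weight = weight[..., None]  # broadcast over color channels
    return (1.0 - weight) * img + weight * box_blur(img, radius)
```

Keeping `strength` well below 1 means even the farthest pixels retain part of their original detail, which matches the "light" blur described above.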

📥 Input/Output Options

Single Image: Process one image at a time (JPEG, PNG, WebP, BMP supported)
Output Format: All images are output as high-quality JPEGs for consistency

🖌️ Models Used

This workflow combines several powerful models to achieve stunning results:

🎨 Stable Diffusion Models

cyberillustrious v3.5 by Cyberdelia - Realistic detail enhancement and excellent text handling (Illustrious-XL-v1.0 was created by OnomaAI and is based on SDXL by StabilityAI: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
cyberrealistic v5.7 by Cyberdelia - Upscale model
epicrealismNatural v4.0 by epinikion - Good hair
perfectdeliberate v5 by Desync - Good skin

🔍 ControlNets & Special Models

Depth Anything V2 - Advanced depth map generation for realistic focal effects
OpenPoseXL2 - Pose understanding for better human subjects
diffusers_xl_depth_full - Depth-based detail control and subtle background blurring
Face Detailer Node - Enhances facial features and details for more realistic portraits
Florence-2 - Used to caption images and OCR text in the images

🔧 Enhancement Tools

HandFineTuning_XL by Hustmox - Hand detail improvement
4xRealWebPhoto_v4 by Phips - High-quality photo-realistic upscaling
IP-Adapter SDXL Plus - Style and composition guidance

💡 Usage Guide

Upload an image
Wait for the enhancement process to complete (about 3 minutes)
Download your enhanced image
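
The same three steps could also be scripted with the Replicate Python client. The sketch below is only a starting point: the input keys `image`, `upscale_factor`, and `depth_blur` are hypothetical placeholders, not confirmed parameter names for this model, so check the model's API schema before use. It also assumes the `replicate` package is installed, that `REPLICATE_API_TOKEN` is set, and that the model reference may need a `:version` suffix.

```python
import os

MODEL = "goodguy1963/enhance-and-upscale-ai-photos"
# Formats listed in this README as supported inputs.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def build_options(image_path: str, upscale: int = 2, depth_blur: bool = True) -> dict:
    """Validate the request locally and return the non-file inputs.

    The returned keys are hypothetical names chosen for illustration.
    """
    ext = os.path.splitext(image_path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported image format: {ext or '(none)'}")
    if upscale not in (2, 4):
        raise ValueError("upscale must be 2 or 4")
    return {"upscale_factor": upscale, "depth_blur": depth_blur}


def run_enhancer(image_path: str, upscale: int = 2, depth_blur: bool = True):
    """Upload the image, wait for the run to finish, and return the output."""
    import replicate  # third-party client, imported lazily

    options = build_options(image_path, upscale, depth_blur)
    with open(image_path, "rb") as image_file:
        # "image" is an assumed input name; see the model's schema.
        return replicate.run(MODEL, input={"image": image_file, **options})
```

Since a run takes about 3 minutes, validating the file type and upscale factor locally first avoids waiting on a request that was never going to succeed.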

⚠️ Important Limitations

Image Ratio: Accepts all image aspect ratios. Inputs are resized so that the longest side is 1536 px and the shorter side is scaled proportionally. No cropping.
Output Resolution: Final images have the longest side at either 3072 px (2×) or 6144 px (4×), with the shorter side scaled proportionally.
Text Processing: While the tool can process text in images, results may not always be perfect.
Depth Blur: The depth-based blur effect can be disabled if preferred.
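
The resize and upscale arithmetic described above can be worked through in a few lines. This is a sketch only: the function names are made up for illustration, and rounding the shorter side to the nearest pixel is an assumption, since the exact rounding rule is not stated here.

```python
WORKING_LONGEST_SIDE = 1536  # longest side after the initial resize


def scaled_size(width: int, height: int, longest_side: int) -> tuple:
    """Scale (width, height) so the longest side equals `longest_side`."""
    scale = longest_side / max(width, height)
    # Rounding to the nearest pixel is an assumption.
    return max(1, round(width * scale)), max(1, round(height * scale))


def pipeline_sizes(width: int, height: int, upscale: int = 2) -> tuple:
    """Return (working_size, output_size) for a 2x or 4x run."""
    if upscale not in (2, 4):
        raise ValueError("upscale must be 2 or 4")
    work_w, work_h = scaled_size(width, height, WORKING_LONGEST_SIDE)
    return (work_w, work_h), (work_w * upscale, work_h * upscale)


# A 1024x768 input is resized to 1536x1152, then upscaled.
print(pipeline_sizes(1024, 768, upscale=2))  # ((1536, 1152), (3072, 2304))
print(pipeline_sizes(1024, 768, upscale=4))  # ((1536, 1152), (6144, 4608))
```

Note that inputs smaller than 1536 px on the longest side are scaled up before processing, and larger inputs are scaled down, so the output size depends only on the aspect ratio and the chosen factor.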

🙏 Acknowledgements

This workflow builds on the incredible work of many talented developers and model creators:

Model Creators

Cyberdelia for cyberillustrious v3.5
epinikion for epicrealismNatural v4.0
Desync for perfectdeliberate v5
Hustmox for HandFineTuning_XL
Phips for 4xRealWebPhoto_v4 upscaler
tencent-ailab for IP-Adapter (with IPAdapter Plus implementation from cubiq)
Depth Anything team for Depth Anything V2 depth detection model
Microsoft for the Florence-2 vision-language model
MiaoshouAI for the Florence-2 prompt generator implementation

Tools & Frameworks

ComfyUI team for the incredible workflow engine
Florence-2 for vision-language architectures
pysssss for the String Function and WD14 Tagger nodes
ComfyUI Pro Post Processing team for depth-map blur effects and focal depth control
Upgraded Hardware: Runs on NVIDIA H100 GPUs for faster and more efficient processing

Help & Bugs: Email me: bugs(at)3doffice.at

⭐ Thanks. ⭐

Model created 10 months, 2 weeks ago

Model updated 9 months, 3 weeks ago
