goodguy1963 / enhance-and-upscale-ai-photos

A pure compute-driven AI workflow that removes flaws, enriches depth, refines detail, and upscales the output 2x or 4x to photorealistic quality. Read the description. ⚠️ Expensive: ~$0.35/image (2x upscale). Runtime: ~4 minutes.

  • Public
  • 537 runs

Run time and cost

This model costs approximately $0.21 to run on Replicate, or 4 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.

This model runs on NVIDIA H100 GPU hardware. Predictions typically complete within 141 seconds.
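
For budgeting a batch, the figures on this page can be turned into a quick estimate. The sketch below is illustrative only: the function names are made up for this example, the $0.21 and ~$0.35 per-image prices and the 141-second runtime are simply the numbers quoted above, and it assumes images are processed one after another.

```python
import math


def runs_per_dollar(cost_per_run: float) -> int:
    """Whole runs that fit into $1 at the given per-run cost."""
    return math.floor(1.0 / cost_per_run)


def batch_cost(num_images: int, cost_per_run: float) -> float:
    """Estimated total cost in dollars, rounded to cents."""
    return round(num_images * cost_per_run, 2)


def batch_minutes(num_images: int, seconds_per_run: float = 141.0) -> float:
    """Estimated wall-clock minutes, assuming strictly sequential runs."""
    return round(num_images * seconds_per_run / 60.0, 1)


if __name__ == "__main__":
    # Prices quoted on this page: ~$0.21 (platform estimate), ~$0.35 (author, 2x upscale).
    for price in (0.21, 0.35):
        print(price, runs_per_dollar(price), batch_cost(100, price))
    print(batch_minutes(100), "minutes for 100 images")
```

Actual cost and runtime vary with the input, so treat the result as a rough lower or upper bound rather than a quote.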

Readme

Upscale and enhance AI-generated images into photorealistic masterpieces with a pure compute-driven AI workflow

Advanced AI image enhancement that eliminates artifacts, adds natural depth, improves details, and upscales with photorealistic quality for stunning, professional results.

📋 Overview

This specialized workflow prioritizes quality over speed, using an intensive multi-stage process to transform AI-generated images. Rather than quick fixes, it employs a thorough sequence of professional enhancement techniques that combine multiple state-of-the-art Stable Diffusion models with powerful computer vision to:

  • Fix AI artifacts and improve photorealism through deliberate multi-stage processing
  • 🔍 Add natural depth with realistic depth-of-field effects and subtle background blur
  • 🖼️ Preserve composition while enhancing details and correcting common AI flaws
  • 📈 Upscale by 2x or 4x with AI-powered detail preservation and enhancement

🚀 Features

  • High-Quality Focus: Optimized for maximum image quality, not processing speed
  • Multi-Stage Processing Pipeline: Uses a sophisticated sequence of model applications rather than a single-pass approach
  • AI Image Enhancement: Specifically tuned to improve AI-generated images for more photorealistic results
  • Depth-Aware Processing: Uses Depth Anything V2 to add realistic depth effects often missing in AI art
  • Intelligent Depth Blur: Applies light, natural-looking blur based on depth map for more photographic results
  • Content Safety: Automatically detects and blurs NSFW content for appropriate usage
  • Flexible Upscaling: Choose between 2x or 4x upscaling depending on your needs
  • High-Resolution Output: Upscales images while maintaining quality
  • JPEG Output: All processed images are saved in high-quality JPEG format

📥 Input/Output Options

  • Single Image: Process one image at a time (JPEG, PNG, WebP, BMP supported)

  • Output Format: All images are output as high-quality JPEGs for consistency

🖌️ Models Used

This workflow combines several powerful models to achieve stunning results:

🎨 Stable Diffusion Models

  • cyberillustrious v3.5 by Cyberdelia - Realistic detail enhancement and excellent text handling (derived from Illustrious-XL-v1.0 by OnomaAI, which is based on SDXL by StabilityAI: https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
  • cyberrealistic v5.7 by Cyberdelia - Upscale model
  • epicrealismNatural v4.0 by epinikion - Good hair
  • perfectdeliberate v5 by Desync - Good skin

🔍 ControlNets & Special Models

  • Depth Anything V2 - Advanced depth map generation for realistic focal effects
  • OpenPoseXL2 - Pose understanding for better human subjects
  • diffusers_xl_depth_full - Depth-based detail control and subtle background blurring
  • Face Detailer Node - Enhances facial features and details for more realistic portraits
  • Florence-2 - Captions images and reads text in them (OCR)

🔧 Enhancement Tools

  • HandFineTuning_XL by Hustmox - Hand detail improvement
  • 4xRealWebPhoto_v4 by Phips - High-quality photo-realistic upscaling
  • IP-Adapter SDXL Plus - Style and composition guidance

💡 Usage Guide

  1. Upload an image
  2. Wait for the enhancement process to complete (about 3 minutes)
  3. Download your enhanced image
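
The same three steps can also be scripted. The following is a minimal sketch using the Replicate Python client (`pip install replicate`, with `REPLICATE_API_TOKEN` set in the environment). The input names `image` and `upscale` are assumptions made for illustration and are not confirmed by this README; check the model's API schema for the real parameter names before using it. It also assumes a client version that accepts a model reference without an explicit version hash.

```python
MODEL_REF = "goodguy1963/enhance-and-upscale-ai-photos"  # owner/name from this page


def build_options(upscale: int = 2) -> dict:
    """Validate the upscale factor and return the non-file inputs.

    NOTE: the key name "upscale" is hypothetical.
    """
    if upscale not in (2, 4):
        raise ValueError("upscale must be 2 or 4")
    return {"upscale": upscale}


def enhance(image_path: str, upscale: int = 2):
    """Upload one image, wait for the prediction, and return the model output."""
    import replicate  # third-party client, imported lazily

    with open(image_path, "rb") as image_file:
        # NOTE: the key name "image" is hypothetical.
        inputs = {"image": image_file, **build_options(upscale)}
        # replicate.run blocks until the prediction has finished.
        return replicate.run(MODEL_REF, input=inputs)
```

A call such as `enhance("portrait.png", upscale=2)` would then block for roughly the runtime quoted above before returning the output.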

⚠️ Important Limitations

  • Image Ratio: Accepts all image aspect ratios. Inputs are resized so that the longest side is 1536 px and the shorter side is scaled proportionally. No cropping.
  • Output Resolution: Final images have the longest side at either 3072 px (2×) or 6144 px (4×), with the shorter side scaled proportionally.
  • Text Processing: While the tool can process text in images, results may not always be perfect.
  • Depth Blur: The depth-based blur effect can be disabled if preferred.
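
The resize rules above can be worked through numerically. This sketch computes the working size and the final output size for a given input; the function names are illustrative, and the rounding of the shorter side is an assumption (the actual workflow may round differently, for example to a multiple of 8).

```python
WORKING_LONGEST_SIDE = 1536  # longest side after the initial resize, per the limits above


def working_size(width: int, height: int) -> tuple[int, int]:
    """Scale so the longest side is 1536 px, keeping the aspect ratio (no cropping)."""
    longest = max(width, height)
    # Multiply before dividing to keep the arithmetic exact for common sizes.
    return (
        round(width * WORKING_LONGEST_SIDE / longest),
        round(height * WORKING_LONGEST_SIDE / longest),
    )


def output_size(width: int, height: int, upscale: int = 2) -> tuple[int, int]:
    """Final size: the working size multiplied by the 2x or 4x upscale factor."""
    if upscale not in (2, 4):
        raise ValueError("upscale must be 2 or 4")
    w, h = working_size(width, height)
    return (w * upscale, h * upscale)


if __name__ == "__main__":
    print(output_size(1024, 768, 2))  # landscape input, 2x
    print(output_size(768, 1024, 4))  # portrait input, 4x
```

For example, a 1024×768 input is first resized to 1536×1152 and comes out at 3072×2304 with 2× upscaling; a 768×1024 input at 4× comes out at 4608×6144.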

🙏 Acknowledgements

This workflow builds on the incredible work of many talented developers and model creators:

Model Creators

  • Cyberdelia for cyberillustrious v3.5
  • epinikion for epicrealismNatural v4.0
  • Desync for perfectdeliberate v5
  • Hustmox for HandFineTuning_XL
  • Phips for 4xRealWebPhoto_v4 upscaler
  • tencent-ailab for IP-Adapter (with IPAdapter Plus implementation from cubiq)
  • Depth Anything team for Depth Anything V2 depth detection model
  • Microsoft for the Florence-2 vision-language model
  • MiaoshouAI for the Florence-2 prompt generator implementation

Tools & Frameworks

  • ComfyUI team for the incredible workflow engine
  • Depth Anything V2 for depth estimation
  • pysssss for the String Function and WD14 Tagger nodes
  • ComfyUI Pro Post Processing team for depth-map blur effects and focal depth control
  • Upgraded Hardware: Runs on NVIDIA H100 GPUs for faster and more efficient processing

Help & Bugs: Email me: bugs(at)3doffice.at

⭐ Thanks. ⭐