Image Generation

Models

text2image:

Karlo text2image model
DeepFloyd IF by StabilityAI, an open-source text-to-image model with a high degree of photorealism and language understanding, code
Kandinsky multilingual text2image latent diffusion model
Stable Diffusion 1.5
Stable Diffusion 2.0
Stable Diffusion 2.1
Stable Diffusion XL (SDXL) base 0.9 & refiner 0.9
AnimateDiff Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
PixArt-alpha Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis, paper
Latent Consistency Models LoRAs for high quality few step image generation
OnnxStream Stable Diffusion XL 1.0 Base with 298MB of RAM
StreamDiffusion A Pipeline-Level Solution for Real-Time Interactive Generation
AnyText Code and Model for a diffusion pipeline covering a latent module and text embedding to generate and manipulate text in images

text to 3d:

OpenAI shap-E a text/image to 3D model
shap-e local run text-to-3d locally
stable-dreamfusion A PyTorch implementation of the text-to-3D model Dreamfusion using the Stable Diffusion text-to-2D model

image to 3d:

Wonder3D A cross-domain diffusion model for 3D reconstruction from a single image
DreamCraft3D Official implementation of DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

image to text (OCR):

other:

facebookresearch/segment-anything image segmentation
YOLOv8 SOTA object detection, segmentation, classification and tracking
DINOv2 1B-parameter ViT model to generate robust all-purpose visual features that outperform OpenCLIP on benchmarks at image and pixel levels
Final2x Image super-resolution through interpolation supporting multiple models like RealCUGAN, ESRGAN, Waifu2x, SRMD
text-to-room generates textured 3D room meshes from a text prompt
DragGAN Interactive Point-based Manipulation on Generative Images, demo
DragDiffusion Harnessing Diffusion Models for Interactive Point-based Image Editing
HQTrack Tracking Anything in High Quality, a framework for high-performance video object tracking and segmentation
CoTracker It is Better to Track Together. A fast transformer-based model that can track any point in a video
ZeroNVS Zero-shot 360-degree view synthesis from single images
x-stable-diffusion Real-time inference for Stable Diffusion - 0.88s latency

ComfyUI powerful and modular stable diffusion pipelines using a graph/nodes/flowchart based interface, runs SDXL 0.9, SD2.1, SD2.0, SD1.5
- ComfyUI-Manager installs missing custom nodes automatically
- SeargeSDXL Custom SDXL Node for easier SDXL usage and img2img workflow that utilizes base & refiner
- Sytan ComfyUI SDXL workflow with txt2img using base and refiner
Automatic1111/stable-diffusion-webui well known UI for Stable Diffusion
- sd-webui-cloud-inference extension via omniinfer.io
SD.Next (vladmandic/automatic) Fork of automatic1111's web UI, seemingly more actively developed than the original repo
Fooocus Midjourney-like GUI for SDXL to focus on prompting and generating
- RuinedFooocus A Fooocus fork
- Fooocus-MRE A Fooocus fork
stable-diffusion-xl-demo runs SDXL 0.9 in a basic interface
imaginAIry a Stable Diffusion UI
InvokeAI Alternative, polished Stable Diffusion UI with fewer features than automatic1111
mlc-ai/web-stable-diffusion
anapnoe/stable-diffusion-webui-ux Redesigned from automatic1111's UI, adding mobile and desktop layouts and UX improvements
refacer One-Click Deepfake Multi-Face Swap Tool
stable-diffusion.cpp CPU inference of Stable Diffusion in pure C/C++ with huge performance gains, supporting ggml, 16/32 bit float, 4/5/8 bit quantization, AVX/AVX2/AVX512, SD1.x, SD2.x, txt2img/img2img
FaceFusion Next generation face swapper and enhancer
OneFlow Backend for diffusers and ComfyUI

https://github.com/JoePenna/Dreambooth-Stable-Diffusion
fast-stable-diffusion TheLastBen's Repo for SD, SDXL fine-tuning and DreamBooth on RunPod, Paperspace, Colab and others
https://github.com/ShivamShrirao/diffusers/tree/main/examples/dreambooth
https://github.com/cloneofsimo/lora
OneTrainer all in one training for SD, SDXL and inpainting models supporting fine-tuning, LoRA, embeddings
sd-scripts by kohya-ss
- LoRA Easy Training Scripts GUI for Kohya's Scripts
- Kohya_ss Windows-focused Gradio GUI for Kohya's Stable Diffusion trainers, experimental SDXL support, reddit thread
Fine tuning concepts explained visually
text2image-gui a Stable Diffusion GUI by NMKD
sd-webui-EasyPhoto / easyphoto plugin for generating AI portraits that can be used to train digital doppelgangers with 5-10 photos and a quick LoRA fine tune, paper
StableTuner Windows GUI for Finetuning / Dreambooth Stable Diffusion models (abandoned)

Speed Is All You Need up to 50% speed increase for Latent Diffusion Models
ORCa converts glossy objects into radiance-field cameras, enabling depth estimation and novel-view synthesis, project, code
cocktail Mixing Multi-Modality Controls for Text-Conditional Image Generation, project, code
SnapFusion Fast text-to-image diffusion on mobile phones in 2 seconds
Objaverse-xl dataset of 10 million annotated high quality 3D objects, hf
LightGlue Local Feature Matching at Light Speed, a lightweight feature matcher with high accuracy and blazing fast inference. It takes as input a set of keypoints and descriptors for each image and returns the indices of corresponding points
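
To make the matcher input/output contract described for LightGlue concrete, here is a minimal sketch in plain Python. It is not LightGlue's method or API: LightGlue is a learned matcher, while this example swaps in a simple mutual-nearest-neighbour rule and ignores keypoint positions. The function name `match_descriptors` and the toy descriptors are hypothetical and exist only for illustration.

```python
import math
from typing import List, Sequence, Tuple


def match_descriptors(
    desc_a: Sequence[Sequence[float]],
    desc_b: Sequence[Sequence[float]],
) -> List[Tuple[int, int]]:
    """Return (index_in_a, index_in_b) pairs that are mutual nearest neighbours."""
    if not desc_a or not desc_b:
        return []

    def nearest(query, candidates):
        # Index of the candidate with the smallest Euclidean distance to query.
        return min(
            range(len(candidates)),
            key=lambda j: math.dist(query, candidates[j]),
        )

    a_to_b = [nearest(d, desc_b) for d in desc_a]
    b_to_a = [nearest(d, desc_a) for d in desc_b]
    # Keep a pair only when both images pick each other.
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy 2-D descriptors (hypothetical data; real descriptors are much longer).
descriptors_a = [[0.0, 1.0], [1.0, 0.0], [5.0, 5.0]]
descriptors_b = [[1.1, 0.0], [0.0, 0.9]]
matches = match_descriptors(descriptors_a, descriptors_b)
print(matches)
```

The returned index pairs point back into each image's keypoint list, which is the same shape of result the LightGlue entry describes. The third descriptor in `descriptors_a` stays unmatched because no descriptor in the other image picks it as its nearest neighbour.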