Generative AI Research — Stable Diffusion Pipelines
Text-to-sketch and sketch-to-image generation pipelines with a focus on controllability, fidelity to input constraints, and deployment feasibility on modest hardware.
Overview
A research effort exploring Stable Diffusion-based pipelines for two related tasks: generating sketches from text, and generating finished imagery from sketches. The work emphasizes controllability (the output must respect the input constraints), fidelity (outputs that are not just plausible but correct), and practicality (the pipeline runs on a single workstation GPU, not a cluster).
The Problem
Stock diffusion models are strong at producing plausible imagery but weak at following constraints. A sketch-to-image pipeline that ignores the sketch layout is worthless. Controllability mechanisms such as ControlNet conditioning, LoRA fine-tuning, and careful prompt design are what make diffusion useful for real creative and technical workflows. The research question is how best to combine these mechanisms for the sketch-driven tasks above.
My Role & Contribution
- Designed and ran the comparison studies across conditioning strategies
- Fine-tuned LoRA adapters on the domain dataset and evaluated their effect on fidelity
- Built a Gradio demo so reviewers could interact with the pipeline without touching code
Approach
- Hugging Face Diffusers as the base Stable Diffusion runtime (pipeline sketch after this list)
- ControlNet conditioning on sketch / edge / depth maps to enforce layout fidelity
- LoRA fine-tuning on a curated domain dataset to shift the style and concept distribution (second sketch below)
- Prompt engineering and negative-prompt tuning for consistent outputs
- Gradio front-end for interactive exploration and qualitative review (third sketch below)
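A minimal sketch of the base pipeline, assuming Stable Diffusion v1.5 with a scribble-trained ControlNet; the model IDs, file names, and prompt are illustrative placeholders, not the exact checkpoints used in the study:

```python
import torch
from diffusers import (
    ControlNetModel,
    StableDiffusionControlNetPipeline,
    UniPCMultistepScheduler,
)
from diffusers.utils import load_image

# Scribble-conditioned ControlNet over SD 1.5; both model IDs are illustrative.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()  # keeps peak VRAM low on a single workstation GPU

# Placeholder path; scribble ControlNets expect white-line-on-black inputs.
sketch = load_image("input_sketch.png")
image = pipe(
    "a watercolor landscape, soft morning light",
    image=sketch,
    num_inference_steps=30,
    controlnet_conditioning_scale=1.0,  # raise for stricter layout fidelity
).images[0]
image.save("controlnet_out.png")
```

`controlnet_conditioning_scale` is the main layout-fidelity knob: higher values force the output to track the sketch more strictly, at the cost of prompt freedom.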
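Layering a domain LoRA and a negative prompt on top of the same `pipe`; the adapter path and both prompts are placeholders standing in for the curated domain setup:

```python
# Builds on the `pipe` object from the previous sketch.
pipe.load_lora_weights("path/to/domain_lora")  # placeholder adapter directory

image = pipe(
    prompt="finished illustration in the domain house style",
    negative_prompt="blurry, low quality, distorted anatomy, text, watermark",
    image=sketch,
    num_inference_steps=30,
    guidance_scale=7.5,  # classifier-free guidance strength
).images[0]
image.save("lora_out.png")
```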
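And a minimal Gradio wrapper in the spirit of the reviewer demo, reusing the `pipe` above; the control names and defaults are assumptions:

```python
import gradio as gr

def generate(sketch_img, prompt, negative_prompt, control_scale):
    # One forward pass through the ControlNet pipeline defined above.
    return pipe(
        prompt,
        negative_prompt=negative_prompt,
        image=sketch_img,
        num_inference_steps=30,
        controlnet_conditioning_scale=float(control_scale),
    ).images[0]

demo = gr.Interface(
    fn=generate,
    inputs=[
        gr.Image(type="pil", label="Sketch"),
        gr.Textbox(label="Prompt"),
        gr.Textbox(label="Negative prompt"),
        gr.Slider(0.0, 2.0, value=1.0, label="ControlNet strength"),
    ],
    outputs=gr.Image(label="Generated image"),
)
demo.launch()
```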
Tech Stack
- Python
- PyTorch
- Hugging Face Diffusers
- Stable Diffusion
- ControlNet
- LoRA
- CUDA
- Gradio
Results & Impact
- Qualitative gains in sketch-following fidelity over unconditioned baselines, assessed through side-by-side review in the Gradio demo
- A reusable pipeline and interactive demo that downstream applied work can build on