AI / Research

Generative AI Research — Stable Diffusion Pipelines

Text-to-sketch and sketch-to-image generation pipelines with a focus on controllability, fidelity to input constraints, and deployment feasibility on modest hardware.

Role: Research Engineer
Domain: Generative AI
Status: Research

Overview

A research effort exploring Stable Diffusion-based pipelines for two related tasks: generating sketches from text, and generating finished imagery from sketches. The work emphasizes controllability (the output must respect the input constraints), fidelity (outputs that are correct, not merely plausible), and practicality (the pipeline runs on a single workstation GPU, not a cluster).

The Problem

Stock diffusion models produce plausible imagery but are weak at following constraints. A sketch-to-image pipeline that ignores the layout of the sketch is worthless. Controllability mechanisms such as ControlNet conditioning, LoRA fine-tuning, and careful prompt design make diffusion useful for real creative and technical workflows. The research question is how best to combine them for text-to-sketch and sketch-to-image generation.
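One way to frame the question of how to combine these mechanisms is as a grid of configurations, with an unconditioned run as the baseline. The sketch below is illustrative only: `RunConfig`, `build_grid`, the control-type names, and the seeds are hypothetical and are not taken from the actual study.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional


@dataclass(frozen=True)
class RunConfig:
    """One cell of a hypothetical comparison grid."""
    control_type: Optional[str]  # None means no ControlNet conditioning (baseline)
    use_lora: bool
    use_negative_prompt: bool
    seed: int


def build_grid(control_types, seeds):
    """Enumerate every combination of mechanisms for each seed."""
    return [
        RunConfig(control, lora, negative, seed)
        for control, lora, negative, seed in product(
            control_types, (False, True), (False, True), seeds
        )
    ]


# Illustrative values only (assumptions, not the study's real settings).
grid = build_grid(control_types=(None, "sketch", "edge", "depth"), seeds=(0, 1, 2))
baselines = [cfg for cfg in grid if cfg.control_type is None and not cfg.use_lora]
print(len(grid), "runs,", len(baselines), "of them unconditioned baselines")
```

Fixing the seeds across all cells keeps the comparison about the mechanisms rather than about sampling noise.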

My Role & Contribution

  • Designed and ran the comparison studies across conditioning strategies
  • Fine-tuned LoRA adapters on the domain dataset and evaluated their effect on fidelity
  • Built a Gradio demo so reviewers could interact with the pipeline without touching code
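A reviewer-facing demo of this kind can be quite small. The following is a minimal sketch, assuming a `generate(sketch, prompt, control_strength)` function that wraps the pipeline; that function, the widget labels, and the slider range are hypothetical and not the project's actual interface.

```python
def make_handler(generate):
    """Wrap a generation function into the callback the UI calls."""
    def handler(sketch, prompt, control_strength):
        if sketch is None:
            raise ValueError("Please provide a sketch before generating.")
        prompt = prompt.strip()
        if not prompt:
            raise ValueError("Please enter a prompt.")
        return generate(sketch, prompt, float(control_strength))
    return handler


def build_demo(generate):
    """Assemble the Gradio interface around a generation function."""
    import gradio as gr  # imported lazily so the handler can be tested without the UI

    return gr.Interface(
        fn=make_handler(generate),
        inputs=[
            gr.Image(type="pil", label="Sketch"),
            gr.Textbox(label="Prompt"),
            gr.Slider(minimum=0.0, maximum=2.0, value=1.0, step=0.05,
                      label="Control strength"),
        ],
        outputs=gr.Image(label="Result"),
        title="Sketch-to-image demo",
    )


# Usage (placeholder generator that simply echoes the sketch back):
# build_demo(lambda sketch, prompt, strength: sketch).launch()
```

Keeping the input checks in a plain function separate from the UI code means they can be exercised without starting a server.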

Approach

  • Hugging Face Diffusers as the base Stable Diffusion runtime
  • ControlNet conditioning on sketch / edge / depth to enforce layout fidelity
  • LoRA fine-tuning on a curated domain dataset to shift style and concept distribution
  • Prompt engineering and negative-prompt tuning for consistent outputs
  • Gradio front-end for interactive exploration and qualitative review
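The pieces above could be wired together roughly as follows. This is a sketch under stated assumptions, not the project's code: the model IDs and LoRA path are placeholders to be filled in, the default sampling values are illustrative rather than tuned, and `GenerationSettings`, `load_pipeline`, and `generate` are hypothetical names. It uses the Diffusers `StableDiffusionControlNetPipeline` and `ControlNetModel` classes.

```python
from dataclasses import dataclass, asdict


@dataclass
class GenerationSettings:
    """Sampling knobs; defaults are illustrative, not tuned values from this project."""
    prompt: str
    negative_prompt: str = "blurry, low quality"
    num_inference_steps: int = 30
    guidance_scale: float = 7.5
    controlnet_conditioning_scale: float = 1.0  # how strongly the sketch constrains layout


def load_pipeline(base_id, controlnet_id, lora_path=None):
    """Build a ControlNet-conditioned pipeline, optionally with LoRA weights."""
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    controlnet = ControlNetModel.from_pretrained(controlnet_id, torch_dtype=torch.float16)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_id, controlnet=controlnet, torch_dtype=torch.float16
    )
    if lora_path is not None:
        pipe.load_lora_weights(lora_path)
    # Offload idle submodules to CPU to reduce GPU memory use (requires accelerate).
    pipe.enable_model_cpu_offload()
    return pipe


def generate(pipe, control_image, settings, seed=0):
    """Run one seeded generation conditioned on a sketch / edge / depth image."""
    import torch

    generator = torch.Generator(device="cpu").manual_seed(seed)
    result = pipe(image=control_image, generator=generator, **asdict(settings))
    return result.images[0]


# Usage (all identifiers in angle brackets are placeholders):
# pipe = load_pipeline("<base-model-id>", "<controlnet-model-id>", lora_path="<lora-path>")
# image = generate(pipe, sketch_image, GenerationSettings(prompt="<prompt>"), seed=42)
```

Half precision and CPU offload are one plausible way to fit a single workstation GPU; whether they are needed depends on the card's memory.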

Tech Stack

Python, PyTorch, Hugging Face Diffusers, Stable Diffusion, ControlNet, LoRA, CUDA, Gradio

Results & Impact

  • Qualitative improvements in sketch-following fidelity vs. unconditioned baselines
  • A reusable pipeline and demo that inform downstream applied work
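The results above are qualitative. If a numeric check were wanted alongside them, one hypothetical option is the overlap between the sketch's lines and the edges detected in the generated image. The sketch below assumes both have already been reduced to same-sized binary masks; `edge_iou` and the toy masks are illustrative and are not a metric reported by this project.

```python
def edge_iou(sketch_mask, output_mask):
    """Intersection-over-union of two same-sized binary edge masks (nested lists of 0/1)."""
    if len(sketch_mask) != len(output_mask) or any(
        len(a) != len(b) for a, b in zip(sketch_mask, output_mask)
    ):
        raise ValueError("masks must have the same shape")
    intersection = union = 0
    for row_a, row_b in zip(sketch_mask, output_mask):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks agree trivially.
    return intersection / union if union else 1.0


# Toy 3x3 masks: the output keeps two of three sketch pixels and adds one stray edge.
sketch = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]
output = [[0, 1, 0],
          [0, 1, 1],
          [0, 0, 0]]
print(edge_iou(sketch, output))  # 0.5
```

A pixel-exact overlap like this is strict; in practice some tolerance for small offsets would likely be needed before the number tracks what reviewers see.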