Unlock full creative control with Stable Diffusion's open-source ecosystem, LoRAs, and ControlNet.
Stable Diffusion gives you capabilities no hosted platform can match: LoRA fine-tuning for specific styles or characters, ControlNet for precise composition control, inpainting for surgical edits, and unlimited generation at any resolution. The trade-off is a steeper learning curve and local hardware requirements (or cloud GPU costs). Mastering it means understanding not just prompts but model selection, sampling settings, and the LoRA ecosystem.
Each prompt is annotated with the reasoning behind its structure.
Positive: cinematic portrait of a young woman, soft natural window light, bokeh background, film grain, photorealistic, 8K, detailed skin texture, catchlights in eyes, shot on Sony A7III
Negative: cartoon, anime, painting, illustration, deformed, ugly, blurry, low quality, watermark, text, duplicate, bad anatomy, worst quality, low resolution, extra limbs
Stable Diffusion's negative prompt is as important as the positive prompt. The negative prompt here blocks the model's tendency toward cartoon-style outputs when using base models. Specific camera gear (Sony A7III) activates photographic training data. Catchlights and skin texture are reliable quality anchors.
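For batch work it helps to assemble prompt pairs like the one above programmatically. This is a plain-Python sketch; the group names and the `build_prompts` helper are hypothetical conveniences, not part of any Stable Diffusion tooling:

```python
# Hypothetical keyword groups, taken from the example prompt above.
SUBJECT = "cinematic portrait of a young woman"
LIGHTING = ["soft natural window light", "bokeh background"]
QUALITY_ANCHORS = ["film grain", "photorealistic", "8K",
                   "detailed skin texture", "catchlights in eyes",
                   "shot on Sony A7III"]
NEGATIVE_DEFAULTS = ["cartoon", "anime", "painting", "illustration",
                     "deformed", "ugly", "blurry", "low quality",
                     "watermark", "text", "duplicate", "bad anatomy",
                     "worst quality", "low resolution", "extra limbs"]

def build_prompts(subject, extras, negatives):
    """Join keyword groups into the comma-separated strings SD expects."""
    positive = ", ".join([subject, *extras])
    negative = ", ".join(negatives)
    return positive, negative

positive, negative = build_prompts(SUBJECT, LIGHTING + QUALITY_ANCHORS,
                                   NEGATIVE_DEFAULTS)
print(positive)
print(negative)
```

Keeping a reusable negative-defaults list like this is a common pattern: the blocklist rarely changes between portraits, so only the subject and lighting groups need editing per image.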
Prompt: [your character description] in a dynamic action pose, professional photography, dramatic lighting [add LoRA: <lora:your_character:0.8>]
ControlNet: OpenPose — import a reference pose image. The AI will replicate the exact body position while generating your character in that pose.
ControlNet OpenPose extracts skeleton data from a reference image and applies it to the generation — you control the exact body position without describing it in text. Combining with a character LoRA lets you place a consistent character in any pose. The 0.8 weight prevents the LoRA from overpowering the prompt.
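The `<lora:name:weight>` syntax in the prompt is the AUTOMATIC1111-style inline tag. As an illustration of how such tags can be pulled out of a prompt before it reaches the model, here is a small sketch (the `extract_loras` helper is hypothetical, not an existing API):

```python
import re

# Matches A1111-style "<lora:name:weight>" tags; weight defaults to 1.0
# when omitted, which mirrors common UI behavior.
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::([\d.]+))?>")

def extract_loras(prompt):
    """Return (clean_prompt, [(lora_name, weight), ...])."""
    loras = [(m.group(1), float(m.group(2) or 1.0))
             for m in LORA_TAG.finditer(prompt)]
    clean = LORA_TAG.sub("", prompt).strip()
    return clean, loras

clean, loras = extract_loras(
    "dynamic action pose, dramatic lighting <lora:your_character:0.8>")
print(loras)  # [('your_character', 0.8)]
```

The parsed weight (0.8 here) is what backends multiply into the LoRA's contribution, which is why lowering it keeps the base prompt in control.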
Positive: hyperrealistic architectural visualization, contemporary villa, glass and concrete, infinity pool overlooking Mediterranean sea, golden hour, volumetric light rays, photorealistic rendering, octane render quality, professional architectural photography
Negative: cartoonish, low detail, people, cars, distracting elements, overexposed
SDXL (Stable Diffusion XL) handles architectural detail significantly better than SD 1.5. 'Octane render quality' anchors the style in 3D rendering training data. Removing people and cars in the negative prompt is common for clean architectural renders.
Original image generated. Mask: paint over only the sky. New prompt for masked area: dramatic stormy sky with cumulonimbus clouds, lightning, dark purple and orange, photorealistic. Denoising strength: 0.75. This replaces only the sky while keeping the foreground unchanged.
Denoising strength of 0.75 is the key parameter for inpainting — too high (0.9+) and it ignores your original image; too low (below ~0.4) and it can't make significant changes. Masking only the sky ensures the rest of the image stays pixel-for-pixel unchanged.
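The rule of thumb above can be encoded as a simple validator for inpainting scripts. This is an illustrative sketch, with the 0.4 and 0.9 thresholds taken directly from the guidance above (the function itself is hypothetical):

```python
def check_denoise(strength):
    """Classify an inpainting denoising strength per the rule of thumb:
    0.9+ largely ignores the source image, below ~0.4 can't change much."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("denoising strength must be in [0, 1]")
    if strength >= 0.9:
        return "too high: largely ignores the original image"
    if strength < 0.4:
        return "too low: cannot make significant changes"
    return "ok"

print(check_denoise(0.75))  # ok
```

A guard like this is handy in batch inpainting scripts, where a mistyped strength otherwise wastes a whole run.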
Node setup: Load Checkpoint (SDXL base) -> Load LoRA (character_v2, strength 0.75) -> Load LoRA (lighting_style, strength 0.5) -> KSampler (steps: 25, CFG: 7, sampler: dpmpp_2m, scheduler: karras) -> VAE Decode -> Save Image.
Prompt: [character] in [scene], consistent lighting, same character from previous batch, seed: [fixed seed]
ComfyUI's node-based workflow lets you stack LoRAs with individual strength controls. Fixing the seed while changing the prompt produces consistent character variations. DPM++ 2M Karras at 25 steps is a dependable default for the quality/speed balance. Stacking two LoRAs (character + lighting) at reduced individual strengths keeps them from fighting each other.
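The node chain above can also be expressed in ComfyUI's API (JSON) workflow format, where each node has a class type, its inputs, and links written as [node_id, output_index]. The sketch below shows how the two LoraLoader nodes chain, with checkpoint, LoRA, and seed values as placeholder assumptions; verify the exact field names against your ComfyUI version:

```python
FIXED_SEED = 123456789  # hypothetical fixed seed for consistent variations

workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sdxl_base.safetensors"}},
    "2": {"class_type": "LoraLoader",
          "inputs": {"lora_name": "character_v2.safetensors",
                     "strength_model": 0.75, "strength_clip": 0.75,
                     "model": ["1", 0], "clip": ["1", 1]}},
    "3": {"class_type": "LoraLoader",  # stacks on node 2's output
          "inputs": {"lora_name": "lighting_style.safetensors",
                     "strength_model": 0.5, "strength_clip": 0.5,
                     "model": ["2", 0], "clip": ["2", 1]}},
    "4": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "[character] in [scene], consistent lighting",
                     "clip": ["3", 1]}},
    "5": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "low quality, blurry", "clip": ["3", 1]}},
    "6": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "7": {"class_type": "KSampler",
          "inputs": {"seed": FIXED_SEED, "steps": 25, "cfg": 7,
                     "sampler_name": "dpmpp_2m", "scheduler": "karras",
                     "denoise": 1.0, "model": ["3", 0],
                     "positive": ["4", 0], "negative": ["5", 0],
                     "latent_image": ["6", 0]}},
    "8": {"class_type": "VAEDecode",
          "inputs": {"samples": ["7", 0], "vae": ["1", 2]}},
    "9": {"class_type": "SaveImage",
          "inputs": {"images": ["8", 0], "filename_prefix": "character"}},
}
print(len(workflow), "nodes")
```

The key detail is that node 3 takes node 2's model and clip outputs rather than the checkpoint's, which is what makes the two LoRAs stack instead of replacing each other.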
Browse our full library of Stable Diffusion prompts.
Browse Stable Diffusion Prompts