What Makes a Prompt Good? The 6 Non-Negotiable Elements
After years of working with every major type of AI model (image generators, video models, LLMs), I've found that the difference between a prompt that works and one that doesn't comes down to six things. Not ten. Not twenty. Six. Miss one and the output suffers. Nail all six and you get commercial-grade results far more consistently.
These elements apply to every type of AI prompt. The specific words change depending on whether you're prompting an image model or an LLM, but the underlying structure doesn't.
The model needs to know exactly who or what it's generating. Vague subjects produce generic outputs. "A woman" gives the model infinite interpretations. "1woman, late 20s, Southeast Asian, long dark hair, angular jawline, confident expression" removes nearly all of them.
For image prompts: describe age, ethnicity, specific features, and expression. For LLM prompts: define the role precisely — "You are a senior copywriter at a luxury fashion brand targeting women aged 25–40" is a defined subject. "You are a copywriter" is not.
The subject should come first in the prompt. Many models tend to give earlier tokens more weight, so whatever leads the prompt shapes the output the most.
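Here is a minimal sketch of the subject-first idea in Python. The `Subject` class and the `subject_first` helper are hypothetical names made up for illustration, not part of any model's API; the only point is that the full subject description gets built once and always leads the prompt.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    """Hypothetical container for the details that pin down an image subject."""
    base: str                  # e.g. "1woman"
    age: str                   # e.g. "late 20s"
    ethnicity: str
    features: tuple[str, ...]  # specific physical features
    expression: str

    def describe(self) -> str:
        # Join the non-empty details into one comma-separated descriptor.
        parts = [self.base, self.age, self.ethnicity, *self.features, self.expression]
        return ", ".join(p for p in parts if p)


def subject_first(subject: Subject, *other_clauses: str) -> str:
    # The subject descriptor always leads; every other clause follows it.
    return ", ".join([subject.describe(), *other_clauses])


woman = Subject(
    base="1woman",
    age="late 20s",
    ethnicity="Southeast Asian",
    features=("long dark hair", "angular jawline"),
    expression="confident expression",
)
print(subject_first(woman, "soft Rembrandt key light from upper left"))
```

Leaving a field empty simply drops it from the descriptor, which makes it easy to see how much vaguer the subject gets as details disappear.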
Lighting is the single biggest variable in image quality — and the most consistently underprompted element. "Good lighting" means nothing to a model. "Soft Rembrandt key light from upper left, warm fill from the right, subtle rim light" means something specific.
Named lighting setups the model understands: Rembrandt, clamshell, chiaroscuro, golden hour, blue hour, window light, practical neon. Named references work even better — "lit like a David Fincher film" or "Roger Deakins natural light" gives the model a rich visual library to draw from.
If you leave lighting unspecified, the model defaults to flat, even, studio-style illumination — the visual equivalent of a passport photo.
For image and video prompts: naming a camera and lens is the single fastest way to improve output quality. "Shot on Canon EOS R5, 85mm f/1.2" tells the model: high resolution, shallow depth of field, natural bokeh, the compression of a portrait lens. It reads the camera as a shorthand for an entire visual aesthetic.
You don't need to name an exact camera model. "Shot on medium format Hasselblad" or "ARRI ALEXA anamorphic" communicates a quality level and aesthetic that the model has seen thousands of times in its training data.
For LLMs: the equivalent is specifying format. "Output as a markdown table" or "write in exactly 80 words, no more" is the technical reference that constrains the output space.
Where is this happening? What surrounds the subject? The environment frames the subject and determines how natural the output looks. A portrait without environmental context gets placed against whatever background the model decides — and it usually decides wrong.
Environment also carries mood. "Rain-soaked alley at midnight with neon reflections" communicates color palette, lighting quality, atmosphere, and emotional register in one phrase. "Outdoor setting" communicates nothing.
For LLMs: context is the situation, audience, and background. "I'm pitching to a room of 40 retail buyers who know nothing about AI" is context. "Write a pitch" is not.
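For LLM prompts, the role, context, and format pieces described so far can be stitched together in a few lines of Python. This is a rough sketch, not a standard template: the function name `build_llm_prompt` and the order of the sections are assumptions made for illustration, and the task string is an invented example.

```python
def build_llm_prompt(role: str, context: str, task: str, output_format: str) -> str:
    """Assemble an LLM prompt from its parts (hypothetical helper).

    The role goes first, mirroring the subject-first rule for image prompts.
    """
    sections = [role, context, task, output_format]
    # Skip any part left empty, and separate the rest with blank lines.
    return "\n\n".join(s.strip() for s in sections if s.strip())


prompt = build_llm_prompt(
    role="You are a senior copywriter at a luxury fashion brand targeting women aged 25-40.",
    context="I'm pitching to a room of 40 retail buyers who know nothing about AI.",
    task="Write the opening of the pitch.",
    output_format="Write in exactly 80 words, no more.",
)
print(prompt)
```

Keeping each element as its own argument makes a missing one obvious: if you can't fill in `context` or `output_format`, the prompt isn't finished yet.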
Quality tags tell the model what standard to aim for. "8K, ultra-detailed, photorealistic, commercial photography standard" front-loads quality signals that pull the entire output toward professional results. These terms appear alongside professional photography in the training data, so the model has learned to associate them with high-quality outputs.
Style anchors go further — they reference a specific aesthetic. "Vogue editorial," "A24 film," "campaign photography for a luxury brand" each invoke a rich visual reference that the model can draw on. These work because the model has processed vast amounts of content tagged with these references.
Telling the model what to avoid is as important as telling it what to include. This applies to image models (negative prompts), video models (identity lock instructions), and LLMs (explicit exclusions).
For image models: CGI, plastic skin, airbrushed face, watermark, blurry, overexposed, cartoon — these are the most common failure modes and the most reliably fixed by negative prompting.
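For image prompts, the six elements can be assembled into a prompt and negative prompt pair. The sketch below is one possible way to do it in Python, with an assumed ordering of the positive clauses; the `assemble_image_prompt` helper and the `prompt` / `negative_prompt` keys are illustrative names only, and how a negative prompt is actually passed depends on the tool you use.

```python
def assemble_image_prompt(
    subject: str,
    lighting: str,
    camera: str,
    environment: str,
    quality: str,
    negatives: list[str],
) -> dict[str, str]:
    """Combine the six elements into a positive and a negative prompt (hypothetical helper)."""
    # Subject leads; the remaining positive elements follow in a fixed order.
    positive = ", ".join([subject, lighting, camera, environment, quality])
    # Negatives are kept separate so they can go into a dedicated field.
    negative = ", ".join(negatives)
    return {"prompt": positive, "negative_prompt": negative}


result = assemble_image_prompt(
    subject="1woman, late 20s, Southeast Asian, long dark hair, confident expression",
    lighting="soft Rembrandt key light from upper left, subtle rim light",
    camera="shot on Canon EOS R5, 85mm f/1.2",
    environment="rain-soaked alley at midnight with neon reflections",
    quality="photorealistic, commercial photography standard",
    negatives=["CGI", "plastic skin", "airbrushed face", "watermark", "blurry"],
)
print(result["prompt"])
print(result["negative_prompt"])
```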
For LLMs: "No filler phrases. No exclamation marks. Don't start sentences with 'I'. Avoid corporate jargon." Explicit exclusions prevent the model from defaulting to its most common patterns, which are usually the patterns you're trying to avoid.
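Exclusions are also easy to check after the fact. The following Python sketch scans an LLM's output for a few of the exclusions listed above. The `find_violations` function and its default list of banned phrases are made up for this example, and the sentence splitting is deliberately naive, so treat it as a starting point rather than a reliable linter.

```python
import re


def find_violations(text: str, banned_phrases: tuple[str, ...] = ("synergy", "leverage")) -> list[str]:
    """Return the exclusions that the text breaks (hypothetical checker)."""
    violations = []

    if "!" in text:
        violations.append("exclamation mark")

    # Naive sentence split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Matches "I" as a whole word ("I think", "I'm") but not "It" or "In".
    if any(re.match(r"I\b", s) for s in sentences):
        violations.append("sentence starts with 'I'")

    lowered = text.lower()
    for phrase in banned_phrases:
        if phrase.lower() in lowered:
            violations.append(f"banned phrase: {phrase}")

    return violations


print(find_violations("I think this is great! We leverage synergy."))
print(find_violations("The launch is on Monday. It ships worldwide."))
```

If the list comes back non-empty, the exclusions in the prompt probably need to be stated more explicitly or moved earlier.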