Prompt Engineering & Optimization: Patterns, Anti-Patterns, and Proven Workflows
Design prompts that are reliable, steerable, and measurable. Covers structure, context packing, constraints, few-shot, tool-use, evaluation, and iterative optimization with clear good vs bad examples.
Good prompts are engineered specifications, not vibes. Treat them like product interfaces: define roles, objectives, constraints, inputs, and outputs. Then measure and iterate.
Quick answer
- Structure beats prose: role → objective → constraints → input → output schema.
- Context matters: pack only relevant facts with citations/IDs.
- Show, don’t tell: few-shot examples for tricky formatting or tone.
- Measure: add evals; iterate with diffs, not guesses.
1) Prompt anatomy (reusable template)
[Role]
You are a {role} optimizing for {objective}.
[Constraints]
- Follow {policy}. Avoid {undesired}.
- Only use provided context; do not invent facts.
- Respond in {language}; be {tone}.
[Inputs]
Query: "{user_query}"
Context (IDs + snippets):
- {doc_id_1}: {snippet_1}
- {doc_id_2}: {snippet_2}
[Output Schema]
Return JSON:
{
  "answer": string,
  "citations": [doc_id],
  "confidence": 0-1
}
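A minimal Python sketch of how these sections can be assembled from reusable parts; the build_prompt helper and its argument names are illustrative, not a library API.

```python
import json

def build_prompt(role, objective, constraints, query, context, schema):
    """Assemble the [Role]/[Constraints]/[Inputs]/[Output Schema] sections into one prompt."""
    context_lines = "\n".join(f"- {doc_id}: {snippet}" for doc_id, snippet in context)
    return (
        f"[Role]\nYou are a {role} optimizing for {objective}.\n\n"
        "[Constraints]\n" + "\n".join(f"- {c}" for c in constraints) + "\n\n"
        f'[Inputs]\nQuery: "{query}"\nContext (IDs + snippets):\n{context_lines}\n\n'
        "[Output Schema]\nReturn JSON:\n" + json.dumps(schema, indent=2)
    )

# Example usage with made-up values.
prompt = build_prompt(
    role="support assistant",
    objective="accurate, cited answers",
    constraints=[
        "Only use provided context; do not invent facts.",
        "Respond in English; be concise.",
    ],
    query="Reset SSO settings",
    context=[("DOC-12", "To reset SSO, navigate to Admin → Auth → SSO...")],
    schema={"answer": "string", "citations": ["doc_id"], "confidence": "0-1"},
)
```

Assembling the prompt in code rather than hand-editing one long string makes it easier to version each part and to diff changes during optimization.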
2) Good vs Bad prompts (side-by-side)
| Bad prompt | Why it's bad | Good prompt | Why it's good |
|---|---|---|---|
| "Summarize the following." | No goal, no audience, no length or structure. | Role + objective + constraints + length + format (bullets or JSON). | Defines purpose and output schema; repeatable. |
| "Write me marketing copy fast." | No brand voice, target segment, key messages, or guardrails. | Brand voice + audience + key messages + tone + do/don’t + examples. | Steerable and safe; aligns with requirements. |
| "Extract data from this text." | No schema; leads to inconsistent fields and formats. | Explicit JSON schema with types + field definitions + examples. | Machine-checkable; supports automated validation. |
| "Answer using the docs." | No doc IDs or citation requirement; invites fabrication. | Context with IDs + citation requirement + confidence + abstain rule. | Grounded answers with traceability and abstention behavior. |
3) Patterns
- Role priming: define agent role and success criteria.
- Constraints: language, tone, policy, abstain if unclear.
- Output schemas: JSON/protobuf keeps the output contract stable (a validation sketch follows this list).
- Few-shot: 2–5 representative examples; avoid overfitting.
- Tool-use: explicit tool descriptions; require citations/IDs.
- Self-check: add simple checks ("list assumptions", "validate schema").
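A minimal sketch combining the output-schema and self-check patterns, assuming a call_model callable that returns raw text; all names here are illustrative.

```python
import json

# Expected contract for the example schema in section 1.
REQUIRED_FIELDS = {"answer": (str, type(None)), "citations": list, "confidence": (int, float)}

def validate(raw: str):
    """Parse the reply and check it against the contract; return None on any violation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data or not isinstance(data[name], expected):
            return None
    return data

def answer_with_retry(call_model, prompt, max_attempts=3):
    """Call the model, validate the JSON contract, and retry on schema violations."""
    for _ in range(max_attempts):
        result = validate(call_model(prompt))
        if result is not None:
            return result
        prompt += "\n\nThe previous reply violated the JSON schema. Return only valid JSON."
    raise ValueError("no schema-valid output after retries")
```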
4) Anti-patterns
- Vague instructions: “make it good” without criteria.
- Hidden requirements: policies that aren’t in the prompt.
- Overlong prose: burying key instructions in paragraphs.
- Unbounded outputs: no length limits or schema.
- No grounding: missing context or citation rules.
5) How-to: RAG prompting
- Provide IDs + snippets with minimal noise.
- Require citations and allow abstain when context is insufficient.
- Define output schema and confidence.
- Optionally request a short, structured rerank rationale rather than raw chain-of-thought (a prompt-assembly sketch follows).
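A minimal sketch of a RAG prompt that applies these rules; the retrieval step, the (doc_id, snippet) format, and the helper name are assumptions.

```python
def build_rag_prompt(query, docs, max_snippet_chars=400):
    """Build a grounded prompt from retrieved (doc_id, snippet) pairs, with an abstain rule."""
    context = "\n".join(
        f"- {doc_id}: {snippet[:max_snippet_chars]}" for doc_id, snippet in docs
    )
    return (
        "Answer using only the context below and cite the IDs of every snippet you rely on.\n"
        'If the context is insufficient, return {"answer": null, "citations": [], "confidence": 0}.\n\n'
        f"Context:\n{context}\n\n"
        f'Query: "{query}"\n\n'
        'Return JSON: {"answer": string | null, "citations": [doc_id], "confidence": 0-1}'
    )
```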
6) How-to: extraction prompting
// Output schema
{
  "name": string,
  "email": string | null,
  "order_id": string,
  "items": [{ "sku": string, "qty": number }]
}
Provide one or two few-shot pairs with tricky cases (missing fields, multiple items).
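One way to wire the schema and few-shot pairs together; the example record and helper name are made up for illustration.

```python
import json

SCHEMA = (
    '{ "name": string, "email": string | null, "order_id": string, '
    '"items": [{ "sku": string, "qty": number }] }'
)

# Few-shot pair covering two tricky cases at once: missing email and multiple items.
FEW_SHOT = [
    (
        "Hi, this is Dana Reyes about order A-1182: two of SKU-77 and one SKU-03.",
        {"name": "Dana Reyes", "email": None, "order_id": "A-1182",
         "items": [{"sku": "SKU-77", "qty": 2}, {"sku": "SKU-03", "qty": 1}]},
    ),
]

def build_extraction_prompt(text: str) -> str:
    """Compose schema + few-shot examples + the new input into one extraction prompt."""
    examples = "\n\n".join(
        f"Input: {src}\nOutput: {json.dumps(out)}" for src, out in FEW_SHOT
    )
    return (
        "Extract the fields below from the text. Use null for missing fields.\n"
        f"Schema: {SCHEMA}\n\n{examples}\n\nInput: {text}\nOutput:"
    )
```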
7) Optimization loop
- Define evals: task metrics and acceptance criteria.
- Collect failures: bucket by pattern (missing citation, wrong schema).
- Patch prompts: add constraints or examples targeting failure buckets.
- Diff and re-run: track win rate and regression risk (see the eval-loop sketch below).
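A minimal eval-loop sketch, assuming a run_prompt callable, a list of eval cases, and named check functions (for example missing_citation or schema_violation); all names are illustrative.

```python
from collections import Counter

def run_evals(run_prompt, prompt_version, cases, checks):
    """Score a prompt version on eval cases and bucket failures by check name."""
    failures = Counter()
    wins = 0
    for case in cases:
        output = run_prompt(prompt_version, case["input"])
        failed = [name for name, check in checks.items() if not check(output, case)]
        if failed:
            failures.update(failed)  # e.g. {"missing_citation": 3, "schema_violation": 1}
        else:
            wins += 1
    return {"win_rate": wins / len(cases), "failure_buckets": dict(failures)}

# Diff two prompt versions on the same cases before shipping a change:
# baseline = run_evals(run_prompt, "v1", cases, checks)
# candidate = run_evals(run_prompt, "v2", cases, checks)
```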
8) Example prompt (good)
You are a technical writer optimizing for accurate, concise summaries for enterprise admins.
Constraints:
- Only use provided context; cite source IDs.
- If context is insufficient, reply: {"answer": null, "citations": [], "confidence": 0}.
Input:
Query: "Reset SSO settings"
Context:
- DOC-12: "To reset SSO, navigate to Admin → Auth → SSO..."
Output JSON:
{
  "answer": "To reset SSO, go to Admin → Auth → SSO...",
  "citations": ["DOC-12"],
  "confidence": 0.82
}
FAQ (direct answers)
Should I let the model write its own schema?
No. Provide the schema. Then validate programmatically and retry on violations.
When do I need few-shot?
When outputs require nuanced formatting or tone and pure instructions aren’t enough. Use 2–5 small, representative examples.
How do I keep prompts maintainable?
Modularize: system role, policy constraints, task template, and per-feature examples. Version prompts and track eval scores.
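One lightweight way to keep that structure explicit; the component names and version field below are an assumption, not a standard.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class PromptSpec:
    """A versioned prompt assembled from modular parts (illustrative structure)."""
    version: str
    system_role: str
    policy_constraints: list[str]
    task_template: str  # contains {query} and {context} placeholders
    examples: list[str] = field(default_factory=list)

    def render(self, query: str, context: str) -> str:
        parts = [self.system_role, *self.policy_constraints, *self.examples,
                 self.task_template.format(query=query, context=context)]
        return "\n\n".join(parts)
```

Keeping one spec per feature makes it straightforward to version prompts and attach eval scores to each version.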
Bottom line
- Prompts are contracts—make them explicit and measurable.
- Use context sparingly and require citations.
- Iterate with evals; fix specific failure modes.