Prompt Engineering & Optimization: Patterns, Anti-Patterns, and Proven Workflows
Design prompts that are reliable, steerable, and measurable. Covers structure, context packing, constraints, few-shot, tool-use, evaluation, and iterative optimization with clear good vs bad examples.
Good prompts are engineered specifications, not vibes. Treat them like product interfaces: define roles, objectives, constraints, inputs, and outputs. Then measure and iterate.
Quick answer
- Structure beats prose: role → objective → constraints → input → output schema.
- Context matters: pack only relevant facts with citations/IDs.
- Show, don’t tell: few-shot examples for tricky formatting or tone.
- Measure: add evals; iterate with diffs, not guesses.
1) Prompt anatomy (reusable template)
[Role]
You are a {role} optimizing for {objective}.
[Constraints]
- Follow {policy}. Avoid {undesired}.
- Only use provided context; do not invent facts.
- Respond in {language}; be {tone}.
[Inputs]
Query: "{user_query}"
Context (IDs + snippets):
- {doc_id_1}: {snippet_1}
- {doc_id_2}: {snippet_2}
[Output Schema]
Return JSON:
{
  "answer": string,
  "citations": [doc_id],
  "confidence": 0-1
}
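A minimal Python sketch of how these sections can be assembled from reusable parts; the build_prompt helper and its argument names are illustrative, not a library API.

```python
import json

def build_prompt(role, objective, constraints, query, context, schema):
    """Assemble the [Role]/[Constraints]/[Inputs]/[Output Schema] sections into one prompt."""
    context_lines = "\n".join(f"- {doc_id}: {snippet}" for doc_id, snippet in context)
    return (
        f"[Role]\nYou are a {role} optimizing for {objective}.\n\n"
        "[Constraints]\n" + "\n".join(f"- {c}" for c in constraints) + "\n\n"
        f'[Inputs]\nQuery: "{query}"\nContext (IDs + snippets):\n{context_lines}\n\n'
        "[Output Schema]\nReturn JSON:\n" + json.dumps(schema, indent=2)
    )

# Example usage with made-up values.
prompt = build_prompt(
    role="support assistant",
    objective="accurate, cited answers",
    constraints=[
        "Only use provided context; do not invent facts.",
        "Respond in English; be concise.",
    ],
    query="Reset SSO settings",
    context=[("DOC-12", "To reset SSO, navigate to Admin → Auth → SSO...")],
    schema={"answer": "string", "citations": ["doc_id"], "confidence": "0-1"},
)
```

Assembling the prompt in code rather than hand-editing one long string makes it easier to version each part and to diff changes during optimization.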
2) Good vs Bad prompts (side-by-side)
| Bad prompt | Why it's bad | Good prompt | Why it's good |
|---|---|---|---|
| "Summarize the following." | No goal, no audience, no length or structure. | Role + objective + constraints + length + format (bullets or JSON). | Defines purpose and output schema; repeatable. |
| "Write me marketing copy fast." | No brand voice, target segment, key messages, or guardrails. | Brand voice + audience + key messages + tone + do/don’t + examples. | Steerable and safe; aligns with requirements. |
| "Extract data from this text." | No schema; leads to inconsistent fields and formats. | Explicit JSON schema with types + field definitions + examples. | Machine-checkable; supports automated validation. |
| "Answer using the docs." | No doc IDs or citation requirement; invites fabrication. | Context with IDs + citation requirement + confidence + abstain rule. | Grounded answers with traceability and abstention behavior. |
3) Patterns
- Role priming: define agent role and success criteria.
- Constraints: language, tone, policy, abstain if unclear.
- Output schemas: JSON/protobuf keeps the output contract stable (a validation sketch follows this list).
- Few-shot: 2–5 representative examples; avoid overfitting.
- Tool-use: explicit tool descriptions; require citations/IDs.
- Self-check: add simple checks ("list assumptions", "validate schema").
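A minimal sketch combining the output-schema and self-check patterns, assuming a call_model callable that returns raw text; all names here are illustrative.

```python
import json

# Expected contract for the example schema in section 1.
REQUIRED_FIELDS = {"answer": (str, type(None)), "citations": list, "confidence": (int, float)}

def validate(raw: str):
    """Parse the reply and check it against the contract; return None on any violation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data or not isinstance(data[name], expected):
            return None
    return data

def answer_with_retry(call_model, prompt, max_attempts=3):
    """Call the model, validate the JSON contract, and retry on schema violations."""
    for _ in range(max_attempts):
        result = validate(call_model(prompt))
        if result is not None:
            return result
        prompt += "\n\nThe previous reply violated the JSON schema. Return only valid JSON."
    raise ValueError("no schema-valid output after retries")
```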
4) Anti-patterns
- Vague instructions: “make it good” without criteria.
- Hidden requirements: policies that aren’t in the prompt.
- Overlong prose: burying key instructions in paragraphs.
- Unbounded outputs: no length limits or schema.
- No grounding: missing context or citation rules.
5) How-to: RAG prompting
- Provide IDs + snippets with minimal noise.
- Require citations and allow abstain when context is insufficient.
- Define output schema and confidence.
- Optionally request a short, structured rerank rationale rather than raw chain-of-thought (a prompt-assembly sketch follows).
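A minimal sketch of a RAG prompt that applies these rules; the retrieval step, the (doc_id, snippet) format, and the helper name are assumptions.

```python
def build_rag_prompt(query, docs, max_snippet_chars=400):
    """Build a grounded prompt from retrieved (doc_id, snippet) pairs, with an abstain rule."""
    context = "\n".join(
        f"- {doc_id}: {snippet[:max_snippet_chars]}" for doc_id, snippet in docs
    )
    return (
        "Answer using only the context below and cite the IDs of every snippet you rely on.\n"
        'If the context is insufficient, return {"answer": null, "citations": [], "confidence": 0}.\n\n'
        f"Context:\n{context}\n\n"
        f'Query: "{query}"\n\n'
        'Return JSON: {"answer": string | null, "citations": [doc_id], "confidence": 0-1}'
    )
```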
6) How-to: extraction prompting
// Output schema
{
  "name": string,
  "email": string | null,
  "order_id": string,
  "items": [{ "sku": string, "qty": number }]
}
Provide one or two few-shot pairs with tricky cases (missing fields, multiple items).
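One way to wire the schema and few-shot pairs together; the example record and helper name are made up for illustration.

```python
import json

SCHEMA = (
    '{ "name": string, "email": string | null, "order_id": string, '
    '"items": [{ "sku": string, "qty": number }] }'
)

# Few-shot pair covering two tricky cases at once: missing email and multiple items.
FEW_SHOT = [
    (
        "Hi, this is Dana Reyes about order A-1182: two of SKU-77 and one SKU-03.",
        {"name": "Dana Reyes", "email": None, "order_id": "A-1182",
         "items": [{"sku": "SKU-77", "qty": 2}, {"sku": "SKU-03", "qty": 1}]},
    ),
]

def build_extraction_prompt(text: str) -> str:
    """Compose schema + few-shot examples + the new input into one extraction prompt."""
    examples = "\n\n".join(
        f"Input: {src}\nOutput: {json.dumps(out)}" for src, out in FEW_SHOT
    )
    return (
        "Extract the fields below from the text. Use null for missing fields.\n"
        f"Schema: {SCHEMA}\n\n{examples}\n\nInput: {text}\nOutput:"
    )
```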
7) Optimization loop
- Define evals: task metrics and acceptance criteria.
- Collect failures: bucket by pattern (missing citation, wrong schema).
- Patch prompts: add constraints or examples targeting failure buckets.
- Diff and re-run: track win rate and regression risk (see the eval-loop sketch below).
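A minimal eval-loop sketch, assuming a run_prompt callable, a list of eval cases, and named check functions (for example missing_citation or schema_violation); all names are illustrative.

```python
from collections import Counter

def run_evals(run_prompt, prompt_version, cases, checks):
    """Score a prompt version on eval cases and bucket failures by check name."""
    failures = Counter()
    wins = 0
    for case in cases:
        output = run_prompt(prompt_version, case["input"])
        failed = [name for name, check in checks.items() if not check(output, case)]
        if failed:
            failures.update(failed)  # e.g. {"missing_citation": 3, "schema_violation": 1}
        else:
            wins += 1
    return {"win_rate": wins / len(cases), "failure_buckets": dict(failures)}

# Diff two prompt versions on the same cases before shipping a change:
# baseline = run_evals(run_prompt, "v1", cases, checks)
# candidate = run_evals(run_prompt, "v2", cases, checks)
```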
8) Example prompt (good)
You are a technical writer optimizing for accurate, concise summaries for enterprise admins.
Constraints:
- Only use provided context; cite source IDs.
- If context is insufficient, reply: {"answer": null, "citations": [], "confidence": 0}.
Input:
Query: "Reset SSO settings"
Context:
- DOC-12: "To reset SSO, navigate to Admin → Auth → SSO..."
Output JSON:
{
  "answer": "To reset SSO, go to Admin → Auth → SSO...",
  "citations": ["DOC-12"],
  "confidence": 0.82
}
FAQ (direct answers)
Should I let the model write its own schema?
No. Provide the schema. Then validate programmatically and retry on violations.
When do I need few-shot?
When outputs require nuanced formatting or tone and pure instructions aren’t enough. Use 2–5 small, representative examples.
How do I keep prompts maintainable?
Modularize: system role, policy constraints, task template, and per-feature examples. Version prompts and track eval scores.
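One lightweight way to keep that structure explicit; the component names and version field below are an assumption, not a standard.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class PromptSpec:
    """A versioned prompt assembled from modular parts (illustrative structure)."""
    version: str
    system_role: str
    policy_constraints: list[str]
    task_template: str  # contains {query} and {context} placeholders
    examples: list[str] = field(default_factory=list)

    def render(self, query: str, context: str) -> str:
        parts = [self.system_role, *self.policy_constraints, *self.examples,
                 self.task_template.format(query=query, context=context)]
        return "\n\n".join(parts)
```

Keeping one spec per feature makes it straightforward to version prompts and attach eval scores to each version.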
Bottom line
- Prompts are contracts—make them explicit and measurable.
- Use context sparingly and require citations.
- Iterate with evals; fix specific failure modes.