What is function calling?

A structured way for models to request tool calls, with arguments that conform to an explicit JSON schema. The runtime validates the arguments, executes the tool, and returns the result to the model.

How do I prevent unsafe tool use?

Validate inputs against schemas, sanitize strings, enforce allowlists, add permission checks, and block side-effecting tools by default unless explicitly enabled for the user/session.

Do I need routing or just one tool?

If your agent spans multiple capabilities (search, DB, actions), add a router that maps intents to tools. Keep tools small and single-purpose; compose for complex tasks.

How do I evaluate tool use?

Log tool calls, arguments, and outcomes. Measure success rate, latency, error types, and preference wins (how often reviewers prefer the agent's result over a baseline) on representative tasks. Add unit tests for argument validation and integration tests for tool orchestration.

Should tools return raw or summarized data?

Prefer compact structured results with IDs and pagination metadata. Let the agent summarize for the user and keep raw data accessible via references.

Agents · 2025-12-12

AI Agent Tool Integration & Function Calling: Design, Contracts, and Safety

How to wire tools into agents safely and reliably: function schemas, argument validation, tool routing, retries, observability, and evaluation, with worked examples.

Function calling turns free-form model output into structured actions that the runtime can validate before executing. Tools are not “magic”; they are contracts: name, description, input schema, auth, and side effects. Good agents compose small tools, validate inputs, and log every call.

Quick answer

Design tools as small, single-responsibility functions with explicit JSON schemas.
Validate arguments server-side; never trust model-generated inputs.
Route intents to tools; add allowlists and permissions.
Log and evaluate calls, failures, latency, and user outcomes.

1) Tool schema (contract)

{
  "name": "search_docs",
  "description": "Search enterprise docs by keyword and tag",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "minLength": 2 },
      "tags": { "type": "array", "items": { "type": "string" }, "default": [] },
      "limit": { "type": "integer", "minimum": 1, "maximum": 50, "default": 10 }
    },
    "required": ["query"]
  },
  "auth": "user",
  "side_effects": false
}
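One way to enforce this contract at runtime is a hand-written check that mirrors the schema's constraints and fills in its defaults. The sketch below is illustrative only: `SearchDocsArgs`, `ParseResult`, and `parseSearchDocsArgs` are hypothetical names, and a production system would more likely delegate to a full JSON Schema validator.

```typescript
// Hypothetical typed view of the search_docs arguments after defaults are applied.
interface SearchDocsArgs {
  query: string;
  tags: string[];
  limit: number;
}

type ParseResult =
  | { ok: true; value: SearchDocsArgs }
  | { ok: false; error: string };

// Mirrors the schema above: query is required (minLength 2), tags defaults to [],
// limit is an integer in [1, 50] and defaults to 10.
function parseSearchDocsArgs(raw: unknown): ParseResult {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return { ok: false, error: "args must be an object" };
  }
  const input = raw as Record<string, unknown>;

  const query = input.query;
  if (typeof query !== "string" || query.length < 2) {
    return { ok: false, error: "query must be a string of length >= 2" };
  }

  const tags = input.tags === undefined ? [] : input.tags;
  if (!Array.isArray(tags) || !tags.every((t) => typeof t === "string")) {
    return { ok: false, error: "tags must be an array of strings" };
  }

  const limit = input.limit === undefined ? 10 : input.limit;
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1 || limit > 50) {
    return { ok: false, error: "limit must be an integer between 1 and 50" };
  }

  return { ok: true, value: { query, tags, limit } };
}
```

Returning a result object instead of throwing lets the runtime pass the error text back to the model so it can correct its arguments and try again.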

2) Good vs Bad tool integration

Bad	Issue	Good	Why
Free-form strings to DB tool	No schema; injection risk; brittle	JSON schema + validation + parameterized queries	Safe and maintainable
One giant tool that “does everything”	Hard to test; poor routing; hidden effects	Small tools (search, fetch, update) composed by agent	Modular and debuggable
No logs for tool calls	No visibility; hard to improve	Structured logs: tool, args, latency, outcome	Observability for evals and fixes

3) Routing and permissions

Router: map intents to tools (keywords, embeddings, simple rules).
Allowlist: per-user/session tool availability with scope.
Auth: user vs service tools; require tokens for side-effects.
Rate limits: protect expensive tools; backoff and queue.
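The rate-limit item above could be implemented in several ways; one common choice is a token bucket per tool or per user. The following is a minimal sketch under that assumption. `TokenBucket`, `capacity`, and `refillPerSecond` are hypothetical names, and the clock is injected so the behavior can be tested without waiting.

```typescript
// Hypothetical per-tool rate limiter using a token bucket.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private capacity: number;
  private refillPerSecond: number;
  private now: () => number;

  constructor(capacity: number, refillPerSecond: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now;
    this.tokens = capacity; // start full so the first calls are not delayed
    this.lastRefillMs = now();
  }

  // Returns true and consumes one token if the call may proceed.
  tryAcquire(): boolean {
    const nowMs = this.now();
    const elapsedSec = (nowMs - this.lastRefillMs) / 1000;
    // Add tokens for the time elapsed since the last check, capped at capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

When `tryAcquire()` returns false, the caller can back off and retry later or place the request on a queue, as the list above suggests.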

4) Safety and validation

Schema validation: reject invalid types and missing fields.
Sanitization: escape strings; block unsafe paths; redact secrets.
Policy checks: ensure tenant/ACL constraints on every call.
Dry-run mode: for dangerous tools, require explicit confirm.
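The sanitization and policy items above might look like the helpers below. This is a rough sketch, not a complete defense: `SECRET_KEYS`, `redactSecrets`, `isSafeRelativePath`, and `checkTenant` are hypothetical names, and the list of secret-looking keys is an assumption you would adapt to your own argument names.

```typescript
// Hypothetical list of substrings that mark an argument name as secret.
const SECRET_KEYS = ["password", "token", "api_key", "secret"];

// Returns a shallow copy of args that is safe to log: secret-looking fields are masked.
function redactSecrets(args: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(args)) {
    const isSecret = SECRET_KEYS.some((s) => key.toLowerCase().includes(s));
    out[key] = isSecret ? "[REDACTED]" : value;
  }
  return out;
}

// Rejects absolute paths (POSIX, UNC, or drive-letter) and any path with a ".." segment.
function isSafeRelativePath(path: string): boolean {
  if (path.startsWith("/") || path.startsWith("\\")) return false;
  if (/^[A-Za-z]:[\\/]/.test(path)) return false;
  return !path.split(/[\\/]/).includes("..");
}

// Hypothetical tenant check: callers may only touch resources in their own tenant.
function checkTenant(callerTenant: string, resourceTenant: string): void {
  if (callerTenant !== resourceTenant) {
    throw new Error("Cross-tenant access denied");
  }
}
```

Running `checkTenant` inside every tool call, rather than once per session, matches the "on every call" requirement in the list above.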

5) Orchestration patterns

Plan → act → observe: agent drafts plan, calls tools, reflects, and answers.
Decompose: break tasks into small steps and checkpoint.
Retry: exponential backoff; circuit-breakers on repeated failures.
Caching: cache results of read-only tools, keyed by tool name and args, for speed.
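The retry item above can be sketched as two small helpers: one that retries with exponential backoff and one that fails fast after repeated failures. `withRetry` and `CircuitBreaker` are hypothetical names, the default attempt count and delays are arbitrary assumptions, and the sleep function and clock are injectable so the logic can be exercised without real waiting.

```typescript
// Hypothetical retry helper: retries a failing async call with exponential backoff.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => { setTimeout(resolve, ms); })
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait baseDelayMs, then 2x, 4x, ... between attempts, but not after the last one.
      if (attempt < maxAttempts - 1) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}

// Hypothetical circuit breaker: after `threshold` consecutive failures,
// calls fail fast until `cooldownMs` has passed, then one trial call is allowed.
class CircuitBreaker {
  private failures = 0;
  private openedAtMs = 0;
  private threshold: number;
  private cooldownMs: number;
  private now: () => number;

  constructor(threshold: number, cooldownMs: number, now: () => number = Date.now) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    const open = this.failures >= this.threshold;
    if (open && this.now() - this.openedAtMs < this.cooldownMs) {
      throw new Error("Circuit open");
    }
    try {
      const result = await fn();
      this.failures = 0; // any success closes the circuit
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.threshold) this.openedAtMs = this.now();
      throw err;
    }
  }
}
```

Retries are only safe for tools without side effects, or for writes protected by an idempotency key; otherwise a retried call may apply the same change twice.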

6) Example: search → fetch → summarize

// Tool call 1
{
  "tool": "search_docs",
  "args": { "query": "rotate API key", "tags": ["security"], "limit": 5 }
}
// Tool result
{
  "results": [ {"id": "DOC-12", "title": "API key rotation"}, {"id": "DOC-33", "title": "SSO security"} ]
}
// Tool call 2
{
  "tool": "fetch_doc",
  "args": { "id": "DOC-12" }
}
// Final answer: summarize with citations
{"answer": "Go to Admin → Auth → Keys...", "citations": ["DOC-12"], "confidence": 0.86}

7) Observability and evaluation

Logs: tool name, args hash, latency, status, error.
Metrics: success rate, retries, p95 latency, user outcomes.
Evals: task-specific success, preference wins, regression tests.
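As a rough illustration of the log fields and metrics listed above, the sketch below defines a log record, a non-cryptographic args hash, and a summary with success rate and p95 latency. `ToolCallLog`, `hashArgs`, `percentile`, and `summarize` are hypothetical names; the hash is an FNV-1a style function chosen only to keep the example dependency-free.

```typescript
// Hypothetical shape of one structured log record per tool call.
interface ToolCallLog {
  tool: string;
  argsHash: string;
  latencyMs: number;
  status: "ok" | "error";
  error?: string;
}

// FNV-1a style hash of the JSON-encoded args, so logs can group identical calls
// without storing raw values. Not cryptographic, and sensitive to key order.
function hashArgs(args: unknown): string {
  const text = JSON.stringify(args) ?? "null";
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Nearest-rank percentile over a non-empty list of values, with p in (0, 100].
function percentile(values: number[], p: number): number {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Aggregates a batch of log records into the headline metrics.
function summarize(logs: ToolCallLog[]): { successRate: number; p95LatencyMs: number } {
  if (logs.length === 0) return { successRate: 0, p95LatencyMs: 0 };
  const okCount = logs.filter((l) => l.status === "ok").length;
  return {
    successRate: okCount / logs.length,
    p95LatencyMs: percentile(logs.map((l) => l.latencyMs), 95),
  };
}
```

Logging a hash rather than the raw arguments keeps personal data and secrets out of the log stream while still letting you spot repeated or cached calls.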

8) Try it: minimal validator + router

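// Minimal JSON schema validator (runtime): checks the top-level type,
// required fields, and the type of each declared property.
type JSONSchema = { type: string; properties?: Record<string, JSONSchema>; required?: string[] };

// JSON Schema type name for a runtime value ('array' and 'null' are special cases of typeof)
function typeName(v: unknown): string {
  if (v === null) return 'null';
  return Array.isArray(v) ? 'array' : typeof v;
}

function validate(schema: JSONSchema, data: any): { ok: boolean; error?: string } {
  if (typeName(data) !== schema.type) {
    return { ok: false, error: 'Expected ' + String(schema.type) };
  }
  if (schema.type !== 'object') return { ok: true };
  for (const key of schema.required || []) {
    if (!(key in data)) return { ok: false, error: 'Missing field: ' + String(key) };
  }
  for (const [key, prop] of Object.entries(schema.properties || {})) {
    if (key in data && typeName(data[key]) !== prop.type) {
      return { ok: false, error: 'Field ' + key + ': expected ' + prop.type };
    }
  }
  return { ok: true };
}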

// Tool registry
const tools = {
  search_docs: {
    schema: {
      type: 'object',
      required: ['query'],
      properties: {
        query: { type: 'string' },
        tags: { type: 'array' },
        limit: { type: 'number' }
      }
    },
    run: async ({ query, tags = [], limit = 5 }: { query: string; tags?: string[]; limit?: number }) => {
      // Replace with your search impl
      return { results: [{ id: 'DOC-12', title: 'API key rotation' }] };
    }
  },
  fetch_doc: {
    schema: { type: 'object', required: ['id'], properties: { id: { type: 'string' } } },
    run: async ({ id }: { id: string }) => {
      // Replace with your fetch impl
      return { id, content: 'To rotate an API key, go to Admin → Auth → Keys...' };
    }
  }
};

// Simple router stub (naive keyword rule; replace with embeddings or a classifier)
function routeIntent(userQuery: string): keyof typeof tools {
  if (/doc|fetch/i.test(userQuery)) return 'fetch_doc';
  return 'search_docs';
}

// Execute tool with validation. Results are typed loosely (any) because
// each tool returns a different shape.
async function callTool(name: keyof typeof tools, args: any): Promise<any> {
  const tool = tools[name];
  const v = validate(tool.schema as any, args);
  if (!v.ok) throw new Error('Invalid args for ' + String(name) + ': ' + String(v.error));
  return await tool.run(args);
}

// Example
async function example() {
  const userQuery = 'rotate API key';
  const toolName = routeIntent(userQuery); // 'search_docs' for this query
  const res1 = await callTool(toolName, { query: userQuery, tags: ['security'], limit: 5 });
  const doc = await callTool('fetch_doc', { id: res1.results[0].id });
  return { answer: 'Go to Admin → Auth → Keys...', citations: [doc.id], confidence: 0.86 };
}

9) Try it: side-effect tool with allowlist + dry-run

// Allowlist (per user/session)
const allowedTools: Record<string, string[]> = {
  user_123: ['search_docs', 'fetch_doc', 'update_user_email']
};

function isAllowed(userId: string, tool: string) {
  return (allowedTools[userId] || []).includes(tool);
}

// Simple sanitization helper
function sanitizeEmail(input: string) {
  const s = input.trim().toLowerCase();
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(s)) throw new Error('Invalid email');
  return s;
}

// Side-effect tool (requires confirm)
const update_user_email = {
  schema: { type: 'object', required: ['userId', 'newEmail', 'confirm'], properties: {
    userId: { type: 'string' },
    newEmail: { type: 'string' },
    confirm: { type: 'boolean' }
  }},
  run: async ({ userId, newEmail, confirm }: { userId: string; newEmail: string; confirm: boolean }) => {
    // Validate first so a dry run also rejects a malformed address
    const email = sanitizeEmail(newEmail);
    if (!confirm) return { dryRun: true, message: 'Set confirm=true to apply change.' };
    // Perform the update (replace with real DB/API)
    return { ok: true, userId, email };
  }
};

async function callSideEffectTool(userId: string, name: string, args: any) {
  if (!isAllowed(userId, name)) throw new Error('Tool not allowed for this user');
  const tool = name === 'update_user_email' ? update_user_email : null;
  if (!tool) throw new Error('Unknown tool');
  const v = validate(tool.schema as any, args);
  if (!v.ok) throw new Error('Invalid args: ' + String(v.error));
  return await tool.run(args);
}

// Example
async function exampleSideEffect() {
  const resDry = await callSideEffectTool('user_123', 'update_user_email', { userId: 'user_123', newEmail: 'Admin@Example.com ', confirm: false });
  // => { dryRun: true, message: 'Set confirm=true to apply change.' }
  const resLive = await callSideEffectTool('user_123', 'update_user_email', { userId: 'user_123', newEmail: 'admin@example.com', confirm: true });
  // => { ok: true, userId: 'user_123', email: 'admin@example.com' }
  return resLive;
}

Side-effect tools: checklist

Auth scope: verify user/session permissions and required tokens per call.
Audit logging: log who, what, when, args hash, and outcome; retain for reviews.
Idempotency: use idempotency keys on writes to avoid duplicates.
Dry-run + confirm: require explicit confirmation for risky actions.
Rollback plan: record previous state; provide reversible operations where possible.
Rate limits: protect expensive or sensitive tools with quotas and backoff.
Validation + sanitization: strict schema checks and string sanitization.
Observability: metrics for success rate, error classes, p95 latency.

FAQ (direct answers)

Can tools call other tools?

Prefer agent-level orchestration. If a tool composes others, keep boundaries clear and log subcalls for debuggability.

How do I handle long results?

Return compact structured data with IDs and pagination; the agent decides what to display and what to retrieve next.
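A compact, paginated tool result could be modeled along these lines. The sketch assumes the simplest possible cursor, a start offset encoded as a string; `Page` and `paginate` are hypothetical names, and a real tool would usually use an opaque cursor issued by its backing store.

```typescript
// Hypothetical page envelope returned by a tool instead of the full result set.
interface Page<T> {
  items: T[];
  nextCursor: string | null; // null when there are no more results
  total: number;
}

// Slices a full result list into one page. The cursor is the start offset as a string.
function paginate<T>(all: T[], cursor: string | null, pageSize: number): Page<T> {
  const start = cursor === null ? 0 : Number.parseInt(cursor, 10);
  if (!Number.isInteger(start) || start < 0) throw new Error("Invalid cursor");
  if (!Number.isInteger(pageSize) || pageSize < 1) throw new Error("Invalid page size");

  const items = all.slice(start, start + pageSize);
  const end = start + items.length;
  return {
    items,
    nextCursor: end < all.length ? String(end) : null,
    total: all.length,
  };
}
```

The agent can show the first page to the user and pass `nextCursor` back to the tool only if it needs more results.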