Agents · 2025-12-12
AI Agent Tool Integration & Function Calling: Design, Contracts, and Safety
How to wire tools into agents safely and reliably: function schemas, argument validation, tool routing, retries, observability, and evaluation—plus clear examples.
Function calling turns model output into structured actions; the safety comes from the contract around each tool. Tools are not “magic”: they are contracts that specify name, description, input schema, auth, and side effects. Good agents compose small tools, validate inputs, and log every call.
Quick answer
- Design tools as small, single-responsibility functions with explicit JSON schemas.
- Validate arguments server-side; never trust model-generated inputs.
- Route intents to tools; add allowlists and permissions.
- Log and evaluate calls, failures, latency, and user outcomes.
1) Tool schema (contract)
{
"name": "search_docs",
"description": "Search enterprise docs by keyword and tag",
"input_schema": {
"type": "object",
"properties": {
"query": { "type": "string", "minLength": 2 },
"tags": { "type": "array", "items": { "type": "string" }, "default": [] },
"limit": { "type": "integer", "minimum": 1, "maximum": 50, "default": 10 }
},
"required": ["query"]
},
"auth": "user",
"side_effects": false
}
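The contract above can be mirrored as a type so that registries and routers share one shape. This is a minimal sketch: the names `ToolContract`, `AuthMode`, and `applyDefaults` are hypothetical, and only the field names come from the JSON above.

```typescript
// Hypothetical TypeScript mirror of the JSON contract above.
type AuthMode = "user" | "service";

interface ToolContract {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, Record<string, unknown>>;
    required: string[];
  };
  auth: AuthMode;
  side_effects: boolean;
}

const searchDocs: ToolContract = {
  name: "search_docs",
  description: "Search enterprise docs by keyword and tag",
  input_schema: {
    type: "object",
    properties: {
      query: { type: "string", minLength: 2 },
      tags: { type: "array", items: { type: "string" }, default: [] },
      limit: { type: "integer", minimum: 1, maximum: 50, default: 10 },
    },
    required: ["query"],
  },
  auth: "user",
  side_effects: false,
};

// Fill in declared defaults for any optional argument the model omitted.
function applyDefaults(
  contract: ToolContract,
  args: Record<string, unknown>,
): Record<string, unknown> {
  const out: Record<string, unknown> = { ...args };
  for (const [key, prop] of Object.entries(contract.input_schema.properties)) {
    if (!(key in out) && "default" in prop) out[key] = prop["default"];
  }
  return out;
}
```

Applying defaults on the server keeps the tool body free of fallback logic and makes the logged arguments match what actually ran.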
2) Good vs Bad tool integration
| Bad | Issue | Good | Why |
|---|---|---|---|
| Free-form strings to DB tool | No schema; injection risk; brittle | JSON schema + validation + parameterized queries | Safe and maintainable |
| One giant tool that “does everything” | Hard to test; poor routing; hidden effects | Small tools (search, fetch, update) composed by agent | Modular and debuggable |
| No logs for tool calls | No visibility; hard to improve | Structured logs: tool, args, latency, outcome | Observability for evals and fixes |
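The first row of the table can be sketched as follows. The table `docs`, its columns, and the function names are hypothetical, and the `$1`/`$2` placeholder style is an assumption borrowed from PostgreSQL-style drivers (others use `?`). No database is contacted; the point is only that the SQL text stays constant while model-generated values travel separately.

```typescript
interface SearchArgs {
  query: string;
  limit: number;
}

interface ParameterizedQuery {
  text: string;
  values: Array<string | number>;
}

// Bad: model-generated text is spliced straight into the SQL string.
function unsafeQuery(args: SearchArgs): string {
  return (
    "SELECT id, title FROM docs WHERE title LIKE '%" +
    args.query +
    "%' LIMIT " +
    args.limit
  );
}

// Better: the SQL text is fixed; user-controlled values are passed as parameters.
function buildSearchQuery(args: SearchArgs): ParameterizedQuery {
  return {
    text: "SELECT id, title FROM docs WHERE title LIKE $1 LIMIT $2",
    values: ["%" + args.query + "%", args.limit],
  };
}
```

A hostile `query` can change the statement returned by `unsafeQuery`, but it can only ever appear inside `values` in the parameterized version.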
3) Routing and permissions
- Router: map intents to tools (keywords, embeddings, simple rules).
- Allowlist: per-user/session tool availability with scope.
- Auth: user vs service tools; require tokens for side-effects.
- Rate limits: protect expensive tools; backoff and queue.
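The rate-limit bullet can be illustrated with a token bucket, one common way to cap calls to an expensive tool. This is a sketch under assumptions: `createTokenBucket` and `RateLimiter` are hypothetical names, the state is in-memory only, and the clock is injectable so the behavior can be tested without waiting.

```typescript
interface RateLimiter {
  tryAcquire(): boolean;
}

// Token bucket: up to `capacity` calls may burst, then tokens refill at `refillPerSec`.
function createTokenBucket(
  capacity: number,
  refillPerSec: number,
  now: () => number = () => Date.now(),
): RateLimiter {
  let tokens = capacity;
  let last = now();
  return {
    // Returns true and consumes one token if the call may proceed.
    tryAcquire(): boolean {
      const t = now();
      tokens = Math.min(capacity, tokens + ((t - last) / 1000) * refillPerSec);
      last = t;
      if (tokens < 1) return false;
      tokens -= 1;
      return true;
    },
  };
}
```

In practice a caller would keep one bucket per user and tool, and queue or back off when `tryAcquire()` returns false.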
4) Safety and validation
- Schema validation: reject invalid types and missing fields.
- Sanitization: escape strings; block unsafe paths; redact secrets.
- Policy checks: ensure tenant/ACL constraints on every call.
- Dry-run mode: for dangerous tools, require explicit confirm.
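Three of the checks above can be sketched as small pure functions. All names here (`isSafeRelativePath`, `redactSecrets`, `assertSameTenant`, `Caller`) are hypothetical, the list of secret-looking keys is illustrative rather than complete, and reading the path bullet as "reject absolute paths and `..` segments" is an assumption.

```typescript
// Reject empty or absolute paths and any ".." segment ("/" or "\" separators).
function isSafeRelativePath(p: string): boolean {
  if (p.length === 0 || p.startsWith("/") || p.startsWith("\\")) return false;
  if (/^[A-Za-z]:/.test(p)) return false; // Windows drive prefix
  return !p.split(/[\\/]/).some((segment) => segment === "..");
}

// Mask values whose key looks secret before the args reach a log line.
const SECRET_KEY = /token|secret|password|api[_-]?key/i;

function redactSecrets(args: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(args)) {
    out[key] = SECRET_KEY.test(key) ? "[REDACTED]" : value;
  }
  return out;
}

// Policy check: the caller's tenant must match the tenant that owns the resource.
interface Caller {
  userId: string;
  tenantId: string;
}

function assertSameTenant(caller: Caller, resourceTenantId: string): void {
  if (caller.tenantId !== resourceTenantId) {
    throw new Error("Cross-tenant access denied");
  }
}
```

These run after schema validation and before the tool body, so a tool never sees a path, secret, or tenant it should not handle.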
5) Orchestration patterns
- Plan → act → observe: agent drafts plan, calls tools, reflects, and answers.
- Decompose: break tasks into small steps and checkpoint.
- Retry: exponential backoff; circuit-breakers on repeated failures.
- Caching: cache results of side-effect-free tools, keyed by tool name and args, for speed.
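The retry and circuit-breaker bullet might look like the sketch below. `withRetry`, `createBreaker`, and the default values are hypothetical; the delay simply doubles per attempt with no jitter, and the breaker only counts consecutive failures (no half-open state), which a production version would likely add.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retry `fn` up to `maxAttempts` times, doubling the delay after each failure.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 100,
  sleep: Sleep = realSleep,
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) {
        await sleep(baseDelayMs * Math.pow(2, attempt));
      }
    }
  }
  throw lastError;
}

// Circuit breaker: after `threshold` consecutive failures, fail fast until reset.
function createBreaker(threshold: number) {
  let failures = 0;
  return {
    isOpen: (): boolean => failures >= threshold,
    reset: (): void => {
      failures = 0;
    },
    async call<T>(fn: () => Promise<T>): Promise<T> {
      if (failures >= threshold) throw new Error("Circuit open");
      try {
        const result = await fn();
        failures = 0; // Any success closes the failure streak.
        return result;
      } catch (err) {
        failures += 1;
        throw err;
      }
    },
  };
}
```

Retries suit transient errors on read-only tools; for tools with side effects, retry only when the write is idempotent.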
6) Example: search → fetch → summarize
// Tool call 1
{
"tool": "search_docs",
"args": { "query": "rotate API key", "tags": ["security"], "limit": 5 }
}
// Tool result
{
"results": [ {"id": "DOC-12", "title": "API key rotation"}, {"id": "DOC-33", "title": "SSO security"} ]
}
// Tool call 2
{
"tool": "fetch_doc",
"args": { "id": "DOC-12" }
}
// Final answer: summarize with citations
{"answer": "Go to Admin → Auth → Keys...", "citations": ["DOC-12"], "confidence": 0.86}
7) Observability and evaluation
- Logs: tool name, args hash, latency, status, error.
- Metrics: success rate, retries, p95 latency, user outcomes.
- Evals: task-specific success, preference wins, regression tests.
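The log fields and metrics above could be computed roughly as follows. This is a sketch: `ToolLog`, `hashArgs`, `percentile`, and `summarize` are hypothetical names, and the hash is a simple FNV-1a-style function used as a non-cryptographic stand-in so that raw arguments stay out of logs. Key order affects the hash because the input is `JSON.stringify` output.

```typescript
interface ToolLog {
  tool: string;
  argsHash: string;
  latencyMs: number;
  status: "ok" | "error";
  error?: string;
}

// FNV-1a-style 32-bit hash over the JSON text (not cryptographic).
function hashArgs(args: unknown): string {
  const text: string = JSON.stringify(args) ?? "";
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

// Nearest-rank percentile over a list of values.
function percentile(values: number[], p: number): number {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)] ?? NaN;
}

// Aggregate success rate and p95 latency from structured logs.
function summarize(logs: ToolLog[]): { successRate: number; p95LatencyMs: number } {
  const okCount = logs.filter((l) => l.status === "ok").length;
  return {
    successRate: logs.length === 0 ? 0 : okCount / logs.length,
    p95LatencyMs: percentile(logs.map((l) => l.latencyMs), 95),
  };
}
```

Logging a hash rather than the arguments lets repeated calls be grouped without storing user content; swap in a keyed cryptographic hash if the logs must resist guessing.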
8) Try it: minimal validator + router
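// Minimal JSON schema validator (runtime): checks the top-level type, required fields, and property types.
type JSONSchema = { type: string; properties?: Record<string, JSONSchema>; required?: string[] };
// typeof reports both arrays and null as 'object', so classify them explicitly.
function typeOf(value: unknown): string {
if (value === null) return 'null';
return Array.isArray(value) ? 'array' : typeof value;
}
function validate(schema: JSONSchema, data: unknown): { ok: boolean; error?: string } {
if (typeOf(data) !== schema.type) return { ok: false, error: 'Expected ' + schema.type };
if (schema.type !== 'object') return { ok: true };
const obj = data as Record<string, unknown>;
for (const key of schema.required || []) {
if (!(key in obj)) return { ok: false, error: 'Missing field: ' + key };
}
for (const [key, prop] of Object.entries(schema.properties || {})) {
if (key in obj && typeOf(obj[key]) !== prop.type) return { ok: false, error: 'Expected ' + prop.type + ' for ' + key };
}
return { ok: true };
}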
// Tool registry
const tools = {
search_docs: {
schema: {
type: 'object',
required: ['query'],
properties: {
query: { type: 'string' },
tags: { type: 'array' },
limit: { type: 'number' }
}
},
run: async ({ query, tags = [], limit = 10 }: { query: string; tags?: string[]; limit?: number }) => {
// Replace with your search impl
return { results: [{ id: 'DOC-12', title: 'API key rotation' }] };
}
},
fetch_doc: {
schema: { type: 'object', required: ['id'], properties: { id: { type: 'string' } } },
run: async ({ id }: { id: string }) => {
// Replace with your fetch impl
return { id, content: 'To rotate an API key, go to Admin → Auth → Keys...' };
}
}
};
// Simple router stub
function routeIntent(userQuery: string) {
if (/doc|fetch/i.test(userQuery)) return 'fetch_doc';
return 'search_docs';
}
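// Execute tool with validation. Args come from the model, so they stay untyped until validated.
type Tools = typeof tools;
async function callTool<K extends keyof Tools>(name: K, args: unknown): Promise<Awaited<ReturnType<Tools[K]['run']>>> {
const tool = tools[name];
const v = validate(tool.schema as any, args);
if (!v.ok) throw new Error('Invalid args for ' + String(name) + ': ' + String(v.error));
// Cast after validation; the result type follows the tool name.
return await (tool.run as any)(args);
}
// Example
async function example() {
// The query contains neither "doc" nor "fetch", so the router picks search_docs.
const toolName = routeIntent('rotate API key');
if (toolName !== 'search_docs') throw new Error('Expected search_docs as the first step');
const res1 = await callTool(toolName, { query: 'rotate API key', tags: ['security'], limit: 5 });
const doc = await callTool('fetch_doc', { id: res1.results[0].id });
return { answer: 'Go to Admin → Auth → Keys...', citations: [doc.id], confidence: 0.86 };
}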
9) Try it: side-effect tool with allowlist + dry-run
// Allowlist (per user/session)
const allowedTools: Record<string, string[]> = {
user_123: ['search_docs', 'fetch_doc', 'update_user_email']
};
function isAllowed(userId: string, tool: string) {
return (allowedTools[userId] || []).includes(tool);
}
// Simple sanitization helper
function sanitizeEmail(input: string) {
const s = input.trim().toLowerCase();
if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(s)) throw new Error('Invalid email');
return s;
}
// Side-effect tool (requires confirm)
const update_user_email = {
schema: { type: 'object', required: ['userId', 'newEmail', 'confirm'], properties: {
userId: { type: 'string' },
newEmail: { type: 'string' },
confirm: { type: 'boolean' }
}},
run: async ({ userId, newEmail, confirm }: { userId: string; newEmail: string; confirm: boolean }) => {
if (!confirm) return { dryRun: true, message: 'Set confirm=true to apply change.' };
const email = sanitizeEmail(newEmail);
// Perform the update (replace with real DB/API)
return { ok: true, userId, email };
}
};
async function callSideEffectTool(userId: string, name: string, args: any) {
if (!isAllowed(userId, name)) throw new Error('Tool not allowed for this user');
const tool = name === 'update_user_email' ? update_user_email : null;
if (!tool) throw new Error('Unknown tool');
const v = validate(tool.schema as any, args);
if (!v.ok) throw new Error('Invalid args: ' + String(v.error));
return await tool.run(args);
}
// Example
async function exampleSideEffect() {
const resDry = await callSideEffectTool('user_123', 'update_user_email', { userId: 'user_123', newEmail: 'Admin@Example.com ', confirm: false });
// => { dryRun: true, message: 'Set confirm=true to apply change.' }
const resLive = await callSideEffectTool('user_123', 'update_user_email', { userId: 'user_123', newEmail: 'admin@example.com', confirm: true });
// => { ok: true, userId: 'user_123', email: 'admin@example.com' }
return resLive;
}
Side-effect tools: checklist
- Auth scope: verify user/session permissions and required tokens per call.
- Audit logging: log who, what, when, args hash, and outcome; retain for reviews.
- Idempotency: use idempotency keys on writes to avoid duplicates.
- Dry-run + confirm: require explicit confirmation for risky actions.
- Rollback plan: record previous state; provide reversible operations where possible.
- Rate limits: protect expensive or sensitive tools with quotas and backoff.
- Validation + sanitization: strict schema checks and string sanitization.
- Observability: metrics for success rate, error classes, p95 latency.
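The idempotency item is the easiest one to get wrong, so here is a minimal sketch. The names `updateEmailOnce`, `completedWrites`, and `writesApplied` are hypothetical, and the in-memory map stands in for a durable store shared by all workers.

```typescript
interface WriteResult {
  ok: true;
  userId: string;
  email: string;
}

// Completed writes keyed by idempotency key (use a durable store in practice).
const completedWrites = new Map<string, WriteResult>();

// Counts real writes so the effect of a replay is visible.
let writesApplied = 0;

function updateEmailOnce(
  idempotencyKey: string,
  userId: string,
  email: string,
): WriteResult {
  const prior = completedWrites.get(idempotencyKey);
  // Replay: return the recorded outcome instead of writing again.
  if (prior) return prior;

  writesApplied += 1; // Stand-in for the real DB/API write.
  const result: WriteResult = { ok: true, userId, email };
  completedWrites.set(idempotencyKey, result);
  return result;
}
```

The caller generates the key once per intended action and reuses it on every retry, so a timeout followed by a retry cannot apply the same change twice.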
FAQ (direct answers)
Can tools call other tools?
Prefer agent-level orchestration. If a tool composes others, keep boundaries clear and log subcalls for debuggability.
How do I handle long results?
Return compact structured data with IDs and pagination; the agent decides what to display and what to retrieve next.
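A paginated result might be shaped like the sketch below. `Page` and `paginate` are hypothetical names, and the cursor is simply a stringified offset, which is an assumption made for brevity; real cursors are often opaque tokens.

```typescript
interface Page<T> {
  items: T[];
  nextCursor: string | null; // null means there are no more pages
  total: number;
}

// Return one page of compact items starting at the cursor position.
function paginate<T>(all: T[], cursor: string | null, limit: number): Page<T> {
  const start = cursor === null ? 0 : parseInt(cursor, 10);
  if (!Number.isInteger(start) || start < 0) throw new Error("Invalid cursor");
  const items = all.slice(start, start + limit);
  const next = start + items.length;
  return {
    items,
    nextCursor: next < all.length ? String(next) : null,
    total: all.length,
  };
}
```

The agent can show the first page, then pass `nextCursor` back to the same tool only if the user or the plan needs more.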