📡 API Reference

Complete API documentation for training, models, and analytics

🌐 Base URL

https://finetunelab.ai

All endpoints are relative to this base URL. For local development, use http://localhost:3000.

List Available Models

GET /api/models

📘 What this does:

Retrieves a list of all available models that can be used for training and inference. No authentication required.

Request:

curl https://finetunelab.ai/api/models

✅ Response (200 OK):

{
  "models": [
    {
      "id": "llama-3-8b",
      "name": "LLaMA 3 8B",
      "status": "available",
      "parameters": "8B"
    },
    {
      "id": "mistral-7b",
      "name": "Mistral 7B",
      "status": "available",
      "parameters": "7B"
    }
  ]
}

💬 Ask Your AI Assistant:

"Get the list of available models from the API"

Start Training Job

POST /api/training/start

📘 What this does:

Starts a new fine-tuning job with the specified model, dataset, and configuration. Returns a job ID for monitoring progress.

Request Body:

model* (string)

Model ID from /api/models

dataset* (string)

Dataset ID from /api/training/dataset

config (string)

Training configuration (default: "default")

curl -X POST https://finetunelab.ai/api/training/start \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3-8b",
    "dataset": "dataset-abc123",
    "config": "default"
  }'

✅ Response (200 OK):

{
  "job_id": "job-12345-abcde",
  "status": "queued",
  "message": "Training job started successfully",
  "estimated_duration": "2-4 hours"
}

💬 Ask Your AI Assistant:

"Start a training job with model llama-3-8b and dataset dataset-abc123"

Get Training Status

GET /api/training/status/:jobId

📘 What this does:

Retrieves the current status, progress, and metrics for a specific training job.

Request:

curl https://finetunelab.ai/api/training/status/job-12345-abcde

✅ Response (200 OK):

{
  "job_id": "job-12345-abcde",
  "status": "running",
  "progress": 45,
  "current_epoch": 2,
  "total_epochs": 5,
  "current_step": 1250,
  "total_steps": 2500,
  "loss": 0.234,
  "learning_rate": 0.0001,
  "elapsed_time": "1h 23m",
  "estimated_remaining": "1h 45m"
}

Possible Status Values:

queued - Waiting to start
pending - Initializing
running - Training in progress
paused - Temporarily stopped
completed - Finished successfully
failed - Error occurred
cancelled - User cancelled

💬 Ask Your AI Assistant:

"Check the status of training job job-12345-abcde"

Pause Training Job

POST /api/training/pause/:jobId

📘 What this does:

Temporarily pauses a running training job. The job can be resumed later from the same checkpoint.

Request:

curl -X POST https://finetunelab.ai/api/training/pause/job-12345-abcde

✅ Response (200 OK):

{
  "success": true,
  "message": "Training job paused successfully",
  "job_id": "job-12345-abcde",
  "status": "paused"
}

💬 Ask Your AI Assistant:

"Pause the training job job-12345-abcde"

Resume Training Job

POST /api/training/resume/:jobId

📘 What this does:

Resumes a paused training job from its last checkpoint. Training continues where it left off.

Request:

curl -X POST https://finetunelab.ai/api/training/resume/job-12345-abcde

✅ Response (200 OK):

{
  "success": true,
  "message": "Training job resumed successfully",
  "job_id": "job-12345-abcde",
  "status": "running"
}

💬 Ask Your AI Assistant:

"Resume the training job job-12345-abcde"

Cancel Training Job

POST /api/training/cancel/:jobId

📘 What this does:

Permanently cancels a training job. This action cannot be undone. The job will be marked as cancelled.

⚠️ Warning:

This action is permanent. Cancelled jobs cannot be resumed. Consider pausing instead if you might want to continue later.

Request:

curl -X POST https://finetunelab.ai/api/training/cancel/job-12345-abcde

✅ Response (200 OK):

{
  "success": true,
  "message": "Training job cancelled successfully",
  "job_id": "job-12345-abcde",
  "status": "cancelled"
}

💬 Ask Your AI Assistant:

"Cancel the training job job-12345-abcde"

List Available Datasets

GET /api/training/dataset

📘 What this does:

Retrieves a list of all available datasets that can be used for training. Includes dataset metadata like size and format.

Request:

curl https://finetunelab.ai/api/training/dataset

✅ Response (200 OK):

{
  "datasets": [
    {
      "id": "dataset-abc123",
      "name": "Customer Support Chat",
      "samples": 10000,
      "format": "jsonl",
      "size_mb": 45.2,
      "created_at": "2025-11-01T10:30:00Z"
    },
    {
      "id": "dataset-xyz789",
      "name": "Product Reviews",
      "samples": 5000,
      "format": "csv",
      "size_mb": 12.8,
      "created_at": "2025-11-05T14:20:00Z"
    }
  ]
}

💬 Ask Your AI Assistant:

"Show me all available datasets"

Deploy Trained Model

POST /api/training/deploy

📘 What this does:

Deploys a completed training job to RunPod Serverless cloud inference. Makes the model available for real-time predictions with auto-scaling.

Request Body:

job_id* (string)

Completed training job ID

checkpoint (string)

Specific checkpoint to deploy (default: latest)

curl -X POST https://finetunelab.ai/api/training/deploy \
  -H "Content-Type: application/json" \
  -d '{
    "job_id": "job-12345-abcde",
    "checkpoint": "checkpoint-final",
    "port": 8001
  }'

✅ Response (200 OK):

{
  "success": true,
  "message": "Model deployed successfully",
  "deployment_id": "deploy-xyz123",
  "endpoint": "https://finetunelab.ai/v1/completions",
  "model_name": "llama-3-8b-finetuned"
}

💬 Ask Your AI Assistant:

"Deploy the trained model from job job-12345-abcde"

Execute Training Job

POST /api/training/execute

📘 What this does:

Directly executes a training job on the training server. This is a lower-level endpoint that bypasses the Next.js API and directly interacts with the training server. Use this for advanced scenarios or when you need direct server control.

Request Body:

curl -X POST https://finetunelab.ai/api/training/execute \
  -H "Content-Type: application/json" \
  -d '{
    "execution_id": "exec-123",
    "name": "My Training Job",
    "dataset_path": "/datasets/my-dataset.jsonl",
    "dataset_content": "<JSONL data>",
    "config": {
      "model": {
        "base_model": "gpt2",
        "model_type": "causal-lm"
      },
      "training": {
        "num_train_epochs": 3,
        "per_device_train_batch_size": 4,
        "learning_rate": 2e-5
      }
    }
  }'

Response:

{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued",
  "message": "Training job queued successfully"
}

Tell your AI assistant:

"Execute a training job directly on the training server using /api/training/execute with my config and dataset"

Get Training Metrics

GET /api/training/metrics/:jobId

📘 What this does:

Retrieves the complete metrics history for a training job. Returns all data points including loss, learning rate, and other metrics over time. Perfect for creating training progress charts.

Request:

curl https://finetunelab.ai/api/training/metrics/job-12345-abcde

Response:

{
  "job_id": "job-12345-abcde",
  "metrics": [
    {
      "epoch": 1,
      "step": 100,
      "loss": 2.45,
      "learning_rate": 0.0001,
      "grad_norm": 1.23,
      "timestamp": "2025-01-15T10:30:00Z"
    },
    {
      "epoch": 1,
      "step": 200,
      "loss": 2.12,
      "learning_rate": 0.0001,
      "grad_norm": 1.18,
      "timestamp": "2025-01-15T10:35:00Z"
    }
  ],
  "total_points": 2
}

Tell your AI assistant:

"Get the full metrics history for my training job and plot the loss curve over time"

Get Training Logs

GET /api/training/logs/:jobId

📘 What this does:

Retrieves the raw training logs for a job. Useful for debugging and monitoring the training process in real-time. Returns the most recent log entries.

Request:

curl https://finetunelab.ai/api/training/logs/job-12345-abcde

Response:

{
  "job_id": "job-12345-abcde",
  "logs": [
    "[2025-01-15 10:30:00] Loading model gpt2...",
    "[2025-01-15 10:30:05] Model loaded successfully",
    "[2025-01-15 10:30:10] Starting training...",
    "[2025-01-15 10:30:15] Epoch 1/3 - Step 100 - Loss: 2.45"
  ],
  "total_lines": 4
}

Tell your AI assistant:

"Show me the training logs for my job to debug any issues"

List Training Checkpoints

GET /api/training/checkpoints/:jobId

📘 What this does:

Lists all saved checkpoints for a training job, including metadata like eval loss, training loss, file size, and timestamps. Helps you find the best checkpoint to use for deployment.

Request:

curl https://finetunelab.ai/api/training/checkpoints/job-12345-abcde

Response:

{
  "job_id": "job-12345-abcde",
  "checkpoints": [
    {
      "path": "checkpoint-1000",
      "epoch": 1,
      "step": 1000,
      "eval_loss": 1.85,
      "train_loss": 1.92,
      "size_mb": 245.7,
      "created_at": "2025-01-15T10:40:00Z"
    },
    {
      "path": "checkpoint-2000",
      "epoch": 2,
      "step": 2000,
      "eval_loss": 1.23,
      "train_loss": 1.45,
      "size_mb": 245.7,
      "created_at": "2025-01-15T11:10:00Z"
    }
  ],
  "best_checkpoint": "checkpoint-2000",
  "total": 2
}

Tell your AI assistant:

"List all checkpoints for my training job and identify the one with the lowest eval loss"

Validate Model

POST /api/training/validate

📘 What this does:

Runs validation on a trained model using a test dataset. Computes metrics like accuracy, perplexity, BLEU score, etc. Essential for evaluating model performance before deployment.

Request Body:

curl -X POST https://finetunelab.ai/api/training/validate \
  -H "Content-Type: application/json" \
  -d '{
    "model_path": "/models/checkpoint-2000",
    "test_dataset_id": "test-dataset-001",
    "metrics_to_compute": ["perplexity", "accuracy", "bleu"]
  }'

Response:

{
  "job_id": "val-550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "metrics": {
    "perplexity": 15.23,
    "accuracy": 0.87,
    "bleu": 0.42
  },
  "message": "Validation completed successfully with 3 metrics"
}

Tell your AI assistant:

"Validate my trained model on the test dataset and compute perplexity and accuracy metrics"

Download Trained Model

GET /api/training/:jobId/download/model

📘 What this does:

Downloads a trained model or specific checkpoint as a ZIP file. Perfect for backing up your models or deploying them to production. You can specify a checkpoint name to download a specific checkpoint, or omit it to download the entire output directory.

Request (with checkpoint):

curl "https://finetunelab.ai/api/training/job-12345-abcde/download/model?checkpoint=checkpoint-2000" -O

Query Parameters:

checkpoint (optional)

Specific checkpoint directory name (e.g., "checkpoint-2000")

Response:

ZIP file stream
Content-Type: application/zip
Content-Disposition: attachment; filename=job_12345-abcde_checkpoint-2000.zip

Contains:
- adapter_config.json
- adapter_model.bin
- tokenizer.json
- config.json
- ... (all model files)

Tell your AI assistant:

"Download the best checkpoint from my training job as a ZIP file"

Download Training Logs

GET /api/training/:jobId/download/logs

📘 What this does:

Downloads all training logs as a ZIP archive. Includes training.log (full console output) and progress.json (metrics data). Essential for debugging, analysis, and record-keeping.

Request:

curl https://finetunelab.ai/api/training/job-12345-abcde/download/logs -O

Response:

ZIP file stream
Content-Type: application/zip
Content-Disposition: attachment; filename=job_12345-abcde_logs.zip

Contains:
- training.log (full console output)
- progress.json (metrics history)

Tell your AI assistant:

"Download all training logs for analysis and debugging"

Get Job Analytics

GET /api/training/:jobId/analytics

📘 What this does:

Retrieves comprehensive analytics for a specific training job. Includes performance metrics, resource utilization, checkpoints info, loss progression, and efficiency scores. Perfect for understanding how well your training job performed.

Request:

curl https://finetunelab.ai/api/training/job-12345-abcde/analytics

Response:

{
  "job_id": "job-12345-abcde",
  "duration": {
    "total_seconds": 3600,
    "formatted": "1h 0m 0s"
  },
  "performance": {
    "avg_iteration_time": 2.5,
    "throughput": 400,
    "total_iterations": 1000
  },
  "resources": {
    "gpu_utilization": 0.85,
    "memory_used_gb": 8.2,
    "peak_temperature": 72
  },
  "loss_progression": {
    "initial_loss": 3.2,
    "final_loss": 1.1,
    "improvement": 2.1
  },
  "checkpoints": {
    "total": 3,
    "best_checkpoint": "checkpoint-2000"
  },
  "efficiency_score": 0.92
}

Tell your AI assistant:

"Get detailed analytics for my training job including performance metrics and resource utilization"

Get System Analytics Summary

GET /api/training/analytics/summary

📘 What this does:

Retrieves system-wide analytics across all training jobs. Shows aggregated metrics including total job counts by status, average training duration, total training time, and resource utilization trends. Great for getting an overview of your training infrastructure.

Request:

curl https://finetunelab.ai/api/training/analytics/summary

Response:

{
  "total_jobs": 42,
  "jobs_by_status": {
    "completed": 35,
    "running": 2,
    "failed": 3,
    "cancelled": 2
  },
  "avg_training_duration_seconds": 2400,
  "total_training_time_hours": 28.0,
  "avg_throughput": 450,
  "top_performing_jobs": [
    {
      "job_id": "job-12345-abcde",
      "efficiency_score": 0.95,
      "duration_seconds": 1800
    }
  ],
  "resource_utilization": {
    "avg_gpu_usage": 0.82,
    "avg_memory_gb": 7.5
  }
}

Tell your AI assistant:

"Show me system-wide analytics for all training jobs including success rates and average performance"

Compare Training Jobs

GET /api/training/analytics/compare

📘 What this does:

Compares multiple training jobs side-by-side across key metrics like duration, throughput, loss improvement, and resource usage. Perfect for A/B testing different hyperparameters or comparing model architectures.

Request:

curl "https://finetunelab.ai/api/training/analytics/compare?job_ids=job-123,job-456,job-789"

Query Parameters:

job_ids (required)

Comma-separated list of job IDs to compare

Response:

{
  "comparison": [
    {
      "job_id": "job-123",
      "duration_seconds": 1800,
      "throughput": 450,
      "loss_improvement": 2.1,
      "gpu_utilization": 0.85,
      "efficiency_score": 0.92
    },
    {
      "job_id": "job-456",
      "duration_seconds": 2100,
      "throughput": 380,
      "loss_improvement": 1.8,
      "gpu_utilization": 0.78,
      "efficiency_score": 0.85
    }
  ],
  "winner": {
    "best_duration": "job-123",
    "best_throughput": "job-123",
    "best_efficiency": "job-123"
  }
}

Tell your AI assistant:

"Compare these training jobs to see which hyperparameters worked best"

List Training Configs

GET /api/training

📘 What this does:

Lists all training configurations saved by the current user. Includes metadata, validation status, and configuration details. Requires authentication.

Request:

curl https://finetunelab.ai/api/training -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "configs": [
    {
      "id": "config-123",
      "name": "My GPT-2 Fine-tune",
      "description": "Fine-tuning GPT-2 on custom dataset",
      "template_type": "sft",
      "is_validated": true,
      "created_at": "2025-01-15T10:00:00Z",
      "config_json": { ... }
    }
  ]
}

Tell your AI assistant:

"Show me all my saved training configurations"

Create Training Config

POST /api/training

📘 What this does:

Creates a new training configuration with validation. Automatically validates the config against schema rules. Returns the created config with validation results. Requires authentication.

Request Body:

curl -X POST https://finetunelab.ai/api/training \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{ "name": "My Config", "template_type": "sft", "config_json": {...} }'

Response (201 Created):

{
  "config": {
    "id": "config-456",
    "user_id": "user-789",
    "name": "My Training Config",
    "description": "Fine-tuning for customer support",
    "template_type": "sft",
    "is_validated": true,
    "validation_errors": null,
    "created_at": "2025-01-15T11:30:00Z",
    "config_json": { ... }
  }
}

Tell your AI assistant:

"Create a new training configuration for fine-tuning GPT-2"

Get Training Config

GET /api/training/:id

📘 What this does:

Retrieves a specific training configuration by ID. Only returns configs owned by the authenticated user.

Request:

curl https://finetunelab.ai/api/training/config-123 -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "config": {
    "id": "config-123",
    "name": "My GPT-2 Fine-tune",
    "description": "Fine-tuning GPT-2 on custom dataset",
    "template_type": "sft",
    "is_validated": true,
    "config_json": {
      "model": { ... },
      "training": { ... }
    }
  }
}

Tell your AI assistant:

"Get the details of my training configuration"

Update Training Config

PUT /api/training/:id

📘 What this does:

Updates an existing training configuration. Re-validates the config if config_json is provided. Only updates fields that are provided in the request body.

Request Body:

curl -X PUT https://finetunelab.ai/api/training/config-123 \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{ "name": "Updated Name", "config_json": {...} }'

Response:

{
  "config": {
    "id": "config-123",
    "name": "Updated Config Name",
    "description": "Updated description",
    "is_validated": true,
    "updated_at": "2025-01-15T12:00:00Z",
    "config_json": { ... }
  }
}

Tell your AI assistant:

"Update my training configuration to use 5 epochs instead of 3"

Delete Training Config

DELETE /api/training/:id

⚠️ Warning:

This permanently deletes the training configuration. This action cannot be undone. Make sure you've backed up any important configurations before deleting.

Request:

curl -X DELETE https://finetunelab.ai/api/training/config-123 -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "success": true
}

Tell your AI assistant:

"Delete this training configuration - I don't need it anymore"

Create Custom Model

POST /api/models

📘 What this does:

Creates a new custom model configuration for your account. Perfect for adding private models, custom endpoints, or models from providers not in the default list. Supports various authentication methods and streaming capabilities.

Request Body:

curl -X POST https://finetunelab.ai/api/models \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "name": "My Custom GPT-4",
    "provider": "openai",
    "base_url": "https://api.openai.com/v1",
    "model_id": "gpt-4",
    "auth_type": "bearer",
    "api_key": "sk-..."
  }'

Required Fields:

name

Display name for the model

provider

Provider name (e.g., "openai", "anthropic", "custom")

base_url

API base URL

model_id

Model identifier used in API calls

auth_type

Authentication type: "bearer", "api-key", or "none"

Response (201 Created):

{
  "success": true,
  "model": {
    "id": "model-789",
    "name": "My Custom GPT-4",
    "provider": "openai",
    "enabled": true
  },
  "message": "Model created successfully"
}

Tell your AI assistant:

"Add a custom model configuration for my private OpenAI API endpoint"

Delete Custom Model

DELETE /api/models/:id

⚠️ Warning:

This permanently deletes the model configuration. This action cannot be undone. Note: You cannot delete global/system models, only your own custom models.

Request:

curl -X DELETE https://finetunelab.ai/api/models/model-789 -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "success": true,
  "message": "Model deleted successfully"
}

Error Response (403 Forbidden):

{
  "success": false,
  "error": "Cannot delete global models"
}

Tell your AI assistant:

"Remove this custom model - I'm no longer using it"

Phase 5: GraphRAG - Knowledge Graph Integration

🧠 What is GraphRAG?

GraphRAG (Graph-based Retrieval Augmented Generation) transforms your documents and conversations into a rich knowledge graph using Neo4j and Graphiti. This enables context-aware responses, relationship discovery, and intelligent document search powered by vector embeddings + graph traversal.

Key Features:

  • Upload PDFs, DOCX, TXT, Markdown
  • Automatic entity & relationship extraction
  • Hybrid search (vector + graph)
  • Conversation enrichment

Technology Stack:

  • Neo4j for knowledge graph
  • Graphiti for entity extraction
  • OpenAI embeddings for search
  • Supabase for document storage

🚀 Inference Deployment Endpoints

Deploy trained models to production inference endpoints. Support for RunPod Serverless with auto-scaling, budget controls, and real-time cost tracking.

Key Features:

  • RunPod Serverless with auto-scaling (A4000 to H100 GPUs)
  • Budget limits with real-time cost tracking
  • Automatic checkpoint selection
  • Budget alerts at 50%, 80%, 100%

Supported Deployments:

  • LoRA adapters (optimized)
  • Merged models
  • Quantized models (4-bit, 8-bit)
  • GGUF format

Deploy to Production Inference

POST /api/inference/deploy

📘 What this does:

Deploys a trained model to RunPod Serverless for production inference. Creates an auto-scaling endpoint with budget controls and cost tracking. Returns deployment ID and endpoint URL.

Request Body:

provider (string, required) - Provider type: "runpod-serverless"
deployment_name (string, required) - User-friendly deployment name
training_job_id (string, required) - Training job ID with checkpoints
checkpoint_selection (string, optional) - "best", "final", or "step-N" (default: "best")
gpu_type (string, optional) - GPU type: "NVIDIA RTX A4000" to "NVIDIA H100" (default: A4000)
min_workers (number, optional) - Minimum workers, 0 = scale to zero (default: 0)
max_workers (number, optional) - Maximum workers for auto-scaling (default: 3)
budget_limit (number, required) - Budget limit in USD (minimum: $1.00)
auto_stop_on_budget (boolean, optional) - Auto-stop when budget exceeded (default: true)

curl -X POST https://finetunelab.ai/api/inference/deploy \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN" \
  -d '{
    "provider": "runpod-serverless",
    "deployment_name": "my-model-prod",
    "training_job_id": "job-abc123",
    "checkpoint_selection": "best",
    "gpu_type": "NVIDIA RTX A4000",
    "min_workers": 0,
    "max_workers": 3,
    "budget_limit": 10.0,
    "auto_stop_on_budget": true
  }'

✅ Response (200 OK):

{
  "success": true,
  "deployment_id": "dep-xyz789",
  "endpoint_id": "runpod-ep-abc123",
  "endpoint_url": "https://api.runpod.ai/v2/abc123/runsync",
  "status": "deploying",
  "gpu_type": "NVIDIA RTX A4000",
  "cost_per_request": 0.0004,
  "budget_limit": 10.0,
  "estimated_requests": 25000,
  "created_at": "2025-11-12T10:30:00Z"
}

❌ Error Responses:

400 Bad Request - Invalid configuration

{
  "success": false,
  "error": {
    "code": "INVALID_CONFIG",
    "message": "budget_limit must be at least $1.00"
  }
}

401 Unauthorized - Missing or invalid API key

{
  "success": false,
  "error": {
    "code": "NO_RUNPOD_KEY",
    "message": "No RunPod API key found. Please add your RunPod API key in Settings > Secrets"
  }
}

404 Not Found - Training job not found

{
  "success": false,
  "error": {
    "code": "JOB_NOT_FOUND",
    "message": "Training job not found or no checkpoints available"
  }
}

Available GPU Types & Pricing:

• NVIDIA RTX A4000 - $0.0004/request
• NVIDIA RTX A5000 - $0.0006/request
• NVIDIA RTX A6000 - $0.0008/request
• NVIDIA A40 - $0.0010/request
• NVIDIA A100 40GB - $0.0020/request
• NVIDIA A100 80GB - $0.0025/request
• NVIDIA H100 - $0.0035/request

💬 Ask Your AI Assistant:

"Deploy my completed training job to RunPod Serverless with a $10 budget limit and auto-scaling on A4000 GPUs"

Get Deployment Status

GET /api/inference/deployments/:id/status

📘 What this does:

Retrieves real-time status, cost tracking, and metrics for an inference deployment. Includes request count, current spend, budget utilization, and performance metrics.

Request:

curl https://finetunelab.ai/api/inference/deployments/dep-xyz789/status \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN"

✅ Response (200 OK):

{
  "success": true,
  "deployment": {
    "id": "dep-xyz789",
    "deployment_name": "my-model-prod",
    "provider": "runpod-serverless",
    "endpoint_id": "runpod-ep-abc123",
    "endpoint_url": "https://api.runpod.ai/v2/abc123/runsync",
    "status": "active",

    "config": {
      "gpu_type": "NVIDIA RTX A4000",
      "min_workers": 0,
      "max_workers": 3,
      "auto_stop_enabled": true
    },

    "cost_per_request": 0.0004,
    "budget_limit": 10.0,
    "current_spend": 2.47,
    "request_count": 6175,

    "budget_utilization_percent": 24.7,
    "budget_alert": null,
    "estimated_requests_remaining": 18813,

    "metrics": {
      "total_requests": 6175,
      "successful_requests": 6150,
      "failed_requests": 25,
      "avg_latency_ms": 145,
      "p95_latency_ms": 320,
      "requests_per_minute": 42,
      "gpu_utilization_percent": 67,
      "last_request_at": "2025-11-12T11:45:30Z",
      "uptime_seconds": 7200
    },

    "created_at": "2025-11-12T10:30:00Z",
    "deployed_at": "2025-11-12T10:32:15Z",
    "last_updated": "2025-11-12T11:45:45Z"
  }
}

Status Values:

  • deploying - Initial deployment in progress
  • active - Running and accepting requests
  • scaling - Auto-scaling workers
  • stopped - Manually stopped or budget exceeded
  • failed - Deployment failed
  • error - Runtime error state

💬 Ask Your AI Assistant:

"Check the status and cost tracking for my inference deployment"

Stop Deployment

DELETE /api/inference/deployments/:id/stop

📘 What this does:

Gracefully stops an active inference deployment. Finalizes cost tracking, updates status to "stopped", and terminates the RunPod Serverless endpoint. Cost will be calculated up to the stop time.

Request:

curl -X DELETE https://finetunelab.ai/api/inference/deployments/dep-xyz789/stop \
  -H "Authorization: Bearer YOUR_ACCESS_TOKEN"

✅ Response (200 OK):

{
  "success": true,
  "message": "Deployment stopped successfully",
  "deployment_id": "dep-xyz789",
  "final_status": {
    "status": "stopped",
    "total_requests": 6175,
    "final_spend": 2.47,
    "budget_limit": 10.0,
    "stopped_at": "2025-11-12T12:00:00Z",
    "uptime_hours": 1.5
  }
}

⚠️ Important:

  • Stopping a deployment is permanent - you cannot restart it
  • Final costs will be calculated and recorded
  • The endpoint URL will no longer accept requests
  • To redeploy, use the deploy endpoint again
  • Auto-stop will trigger automatically if budget limit is reached

💬 Ask Your AI Assistant:

"Stop my inference deployment and show me the final cost breakdown"

Upload Document

POST /api/graphrag/upload

📘 What this does:

Uploads a document, parses it, and processes it through Graphiti to extract entities and relationships. Supported formats: PDF, DOCX, TXT, MD. Automatically chunks large documents and creates knowledge graph nodes.

Request (multipart/form-data):

curl -X POST https://finetunelab.ai/api/graphrag/upload \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -F "file=@document.pdf" \
  -F 'metadata={"title":"My Document","source":"research"}'

Response:

{
  "success": true,
  "document": {
    "id": "doc-abc-123",
    "filename": "document.pdf",
    "fileType": "pdf",
    "processed": false,
    "uploadPath": "documents/user-123/document.pdf",
    "metadata": {
      "title": "My Document",
      "source": "research"
    },
    "createdAt": "2025-01-15T10:30:00Z"
  },
  "message": "Document uploaded successfully, processing started"
}

Tell your AI assistant:

"Upload this PDF to the knowledge graph"

List Documents

GET /api/graphrag/documents

Request:

curl -X GET https://finetunelab.ai/api/graphrag/documents -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "documents": [
    {
      "id": "doc-abc-123",
      "filename": "research.pdf",
      "fileType": "pdf",
      "processed": true,
      "neo4jEpisodeIds": ["ep-1", "ep-2"],
      "uploadPath": "documents/user-123/research.pdf",
      "metadata": { "title": "Research Paper" },
      "createdAt": "2025-01-15T10:30:00Z"
    }
  ],
  "total": 1
}

Tell your AI assistant:

"Show me all my uploaded documents"

Search Knowledge Graph

POST /api/graphrag/search

🔍 Hybrid Search:

Combines vector similarity (OpenAI embeddings) with graph traversal to find the most relevant context from your knowledge base. Perfect for enriching chat conversations with document context.

Request:

curl -X POST https://finetunelab.ai/api/graphrag/search \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query":"quantum computing applications","limit":5}'

Response:

{
  "context": "Quantum computing has applications in...",
  "sources": [
    {
      "documentId": "doc-abc-123",
      "filename": "quantum-research.pdf",
      "relevanceScore": 0.92,
      "excerpt": "Quantum algorithms can solve..."
    }
  ],
  "metadata": {
    "searchMethod": "hybrid",
    "resultsFound": 5
  },
  "query": "quantum computing applications",
  "limit": 5
}

Tell your AI assistant:

"Search my knowledge base for information about quantum computing"

Delete Document

DELETE /api/graphrag/delete/:id

⚠️ Complete Cleanup:

Deletes the document from Supabase storage AND removes all associated nodes/relationships from the Neo4j knowledge graph. This action is permanent.

Request:

curl -X DELETE https://finetunelab.ai/api/graphrag/delete/doc-abc-123 -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "success": true,
  "id": "doc-abc-123",
  "message": "Document and knowledge graph entries deleted successfully"
}

Tell your AI assistant:

"Remove this document from my knowledge base"

Phase 6: Analytics & Metrics - Data-Driven Insights

Fine Tune Lab provides comprehensive analytics and metrics collection to help you understand training performance, conversation quality, costs, and user behavior. Track token usage, sentiment, cohorts, and get AI-powered insights.

🎯 Key Capabilities:

  • Real-time Metrics: Track training progress, loss, GPU usage, and throughput
  • Cost Analysis: Monitor token usage and estimated costs across conversations
  • Quality Tracking: Measure success rates, ratings, and evaluation metrics
  • AI Analytics Assistant: Ask questions about your sessions with natural language
  • Sentiment Analysis: Detect sentiment trends and anomalies in conversations
  • User Cohorts: Segment users for targeted analysis and experiments

Get Analytics Data

GET /api/analytics/data

Fetch aggregated analytics data across 6 metric categories: token usage, quality, tools, conversations, errors, and latency. Supports flexible date ranges, granularity options, and selective metric filtering for efficient data retrieval.

Query Parameters:

startDate: ISO 8601 date (required) - Start of date range
endDate: ISO 8601 date (required) - End of date range
metrics: String (optional) - Comma-separated list or "all"
  Values: tokens, quality, tools, conversations, errors, latency
granularity: String (optional) - Data aggregation period
  Values: hour, day (default), week, month, all

Example Request:

curl -X GET 'https://finetunelab.ai/api/analytics/data?startDate=2025-11-01T00:00:00Z&endDate=2025-11-12T23:59:59Z&metrics=tokens,quality,tools&granularity=day' \
  -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "success": true,
  "data": {
    "userId": "user-123",
    "timeRange": {
      "start": "2025-11-01T00:00:00.000Z",
      "end": "2025-11-12T23:59:59.000Z",
      "period": "day"
    },
    "metrics": {
      "tokenUsage": [
        {
          "timestamp": "2025-11-01T12:00:00.000Z",
          "messageId": "msg-abc",
          "conversationId": "conv-123",
          "modelId": "gpt-4o",
          "inputTokens": 1250,
          "outputTokens": 890,
          "totalTokens": 2140,
          "estimatedCost": 0.00321
        }
      ],
      "quality": [
        {
          "timestamp": "2025-11-01T12:05:00.000Z",
          "messageId": "msg-abc",
          "rating": 4,
          "successStatus": "success",
          "evaluationType": "user_feedback"
        }
      ],
      "tools": [
        {
          "timestamp": "2025-11-01T12:02:00.000Z",
          "toolName": "calculator",
          "executionTimeMs": 45,
          "success": true
        }
      ]
    },
    "aggregations": {
      "totals": {
        "messages": 156,
        "conversations": 23,
        "tokens": 334210,
        "cost": 0.501,
        "evaluations": 89,
        "errors": 3
      },
      "averages": {
        "tokensPerMessage": 2142,
        "costPerMessage": 0.00321,
        "rating": 4.2,
        "successRate": 94.3,
        "errorRate": 1.9,
        "latencyMs": 1834
      }
    }
  }
}
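The `averages` block is derived from `totals`. A quick Python check of that arithmetic, using the totals from the example response (the rounding conventions shown are an assumption based on the example values):

```python
# Totals from the example response above
totals = {"messages": 156, "tokens": 334210, "cost": 0.501, "errors": 3}

# Derived averages: tokens and cost per message, error rate as a percentage
tokens_per_message = round(totals["tokens"] / totals["messages"])
cost_per_message = round(totals["cost"] / totals["messages"], 5)
error_rate = round(totals["errors"] / totals["messages"] * 100, 1)
```

These reproduce the `tokensPerMessage` (2142), `costPerMessage` (0.00321), and `errorRate` (1.9) values shown in the example.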

Tell your AI assistant:

"Get my analytics data for the past week showing token usage and quality metrics"

AI-Powered Analytics Chat

POST/api/analytics/chat

Chat with an AI analytics assistant powered by GPT-4o that can analyze your training sessions, compute metrics, detect patterns, and provide insights. Equipped with 8 specialized tools, including a calculator, evaluation metrics (13 operations), datetime utilities, system monitoring, session data retrieval, and training control.

Available Tools:

1. calculator - Exact mathematical calculations
2. evaluation_metrics - 13 operations:
   • get_metrics, quality_trends, success_analysis
   • compare_periods, model_comparison, tool_impact_analysis
   • error_analysis, temporal_analysis, textual_feedback_analysis
   • benchmark_analysis, advanced_sentiment_analysis
   • predictive_quality_modeling, anomaly_detection
3. datetime - Date/time operations and timezone conversions
4. system_monitor - System health and resource monitoring
5. get_session_evaluations - Ratings and feedback data
6. get_session_metrics - Token usage, costs, response times
7. get_session_conversations - Full message history
8. training_control - List configs, attach datasets, start training

Request Body:

{
  "messages": [
    {
      "role": "user",
      "content": "What's my success rate for this session? How does it compare to typical benchmarks?"
    }
  ],
  "sessionId": "session-abc-123",
  "experimentName": "Production Evaluation Q4",
  "conversationIds": ["conv-1", "conv-2", "conv-3"]
}

Example Request:

curl -X POST https://finetunelab.ai/api/analytics/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "messages": [{"role": "user", "content": "Analyze my session costs"}],
    "sessionId": "session-abc-123",
    "experimentName": "Q4 Eval",
    "conversationIds": ["conv-1", "conv-2"]
  }'

Streaming Response:

data: {"type": "token_usage", "input_tokens": 1250, "output_tokens": 340}

data: {"type": "tools_metadata", "tools_called": [
  {"name": "get_session_metrics", "success": true},
  {"name": "calculator", "success": true}
]}

data: {"content": "I analyzed your session and found:"}
data: {"content": " The total cost was $0.0123"}
data: {"content": " across 3 conversations."}
data: {"content": " Average cost per conversation: $0.0041"}

data: [DONE]
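A client consumes this stream by reading `data:` frames until `[DONE]`, splitting metadata events from incremental answer text. A minimal Python sketch (it assumes each JSON frame arrives on a single `data:` line; the sample frames are from the example above):

```python
import json

def parse_stream(lines):
    """Split an SSE-style stream into answer text and metadata events."""
    text, events = [], []
    for line in lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        event = json.loads(payload)
        if "content" in event:
            text.append(event["content"])  # incremental answer chunk
        else:
            events.append(event)  # token_usage / tools_metadata frame
    return "".join(text), events

sample = [
    'data: {"type": "token_usage", "input_tokens": 1250, "output_tokens": 340}',
    'data: {"content": "I analyzed your session and found:"}',
    'data: {"content": " The total cost was $0.0123"}',
    "data: [DONE]",
]
answer, meta = parse_stream(sample)
```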

Tell your AI assistant:

"Chat with the analytics assistant about my training session performance"

Get Training Job Status & Metrics

GET/api/training/local/:jobId/status

Get comprehensive real-time status and current metrics for a local training job. Returns progress tracking, latest loss values, GPU utilization, throughput stats, perplexity, learning rate, and trend analysis. Includes stale job detection and time estimates.

Response Fields:

Progress Tracking:
• current_step, current_epoch, total_steps, total_epochs, progress (%)
• elapsed_seconds, remaining_seconds

Loss Metrics:
• loss (current train loss), eval_loss (current eval loss)
• best_eval_loss, best_epoch, best_step
• loss_trend: "improving" | "degrading" | "stable"
• epochs_without_improvement (early stopping detection)
• train_perplexity, eval_perplexity (exp of loss)

GPU Metrics:
• gpu_memory_allocated_gb, gpu_memory_reserved_gb
• gpu_utilization_percent

Training Parameters:
• learning_rate (current), grad_norm (gradient norm)
• samples_per_second, tokens_per_second (throughput)

Status & Warnings:
• status: "pending" | "running" | "completed" | "failed" | "cancelled"
• warning: Stale job detection (no updates >5 minutes)
• error: Error message if failed
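As noted above, the perplexity fields are simply the exponential of the corresponding loss. A one-line check in Python, using the loss values from the example response:

```python
import math

# Perplexity = exp(loss); losses taken from the example response
train_loss, eval_loss = 1.89, 1.82
train_perplexity = round(math.exp(train_loss), 2)
eval_perplexity = round(math.exp(eval_loss), 2)
```

This reproduces the `train_perplexity` (6.62) and `eval_perplexity` (6.17) values in the example.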

Example Request:

curl -X GET https://finetunelab.ai/api/training/local/job-abc-123/status \
  -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "job_id": "job-abc-123",
  "status": "running",
  "model_name": "meta-llama/Llama-3.2-1B",
  
  "current_step": 450,
  "current_epoch": 2,
  "total_steps": 1000,
  "total_epochs": 3,
  "total_samples": 5000,
  "progress": 45,
  
  "loss": 1.89,
  "eval_loss": 1.82,
  "best_eval_loss": 1.75,
  "best_epoch": 1,
  "best_step": 350,
  "loss_trend": "improving",
  "epochs_without_improvement": 1,
  
  "train_perplexity": 6.62,
  "eval_perplexity": 6.17,
  
  "learning_rate": 0.00008,
  "grad_norm": 0.72,
  
  "gpu_memory_allocated_gb": 15.8,
  "gpu_memory_reserved_gb": 18.0,
  "gpu_utilization_percent": 92.3,
  
  "samples_per_second": 13.2,
  "tokens_per_second": 1689,
  
  "started_at": "2025-11-12T09:00:00.000Z",
  "elapsed_seconds": 3420,
  "remaining_seconds": 4180
}
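The time estimates are consistent with a simple linear extrapolation from progress, which you can reproduce client-side if you only have `elapsed_seconds` and `progress`. A sketch (the server's own estimator may be more sophisticated; this approximation happens to match the example values):

```python
def estimate_remaining(elapsed_seconds, progress_percent):
    """Linear ETA: remaining = elapsed * (100 - progress) / progress."""
    if progress_percent <= 0:
        return None  # no progress yet; cannot extrapolate
    return round(elapsed_seconds * (100 - progress_percent) / progress_percent)

# Values from the example response: 3420s elapsed at 45% progress
remaining = estimate_remaining(3420, 45)
```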

Stale Job Warning Example:

{
  "job_id": "job-xyz-789",
  "status": "running",
  "current_step": 200,
  "progress": 20,
  "warning": "⚠️ No updates received in 8 minute(s). The training process may have terminated unexpectedly.",
  ...
}

Tell your AI assistant:

"Check my training job status and tell me the current loss, GPU usage, and time remaining"

Get Training Metrics History

GET/api/training/local/:jobId/metrics

Retrieve complete metrics history from the database for charting and analysis. Returns all tracked metrics including loss values, GPU stats, learning rate, throughput, and perplexity ordered by training step.

Metrics Tracked:

• Loss Metrics: train_loss, eval_loss, train_perplexity, eval_perplexity
• Training Progress: step, epoch, learning_rate, grad_norm
• GPU Metrics: gpu_memory_allocated_gb, gpu_memory_reserved_gb, 
                gpu_utilization_percent
• Throughput: samples_per_second, tokens_per_second
• Timestamps: created_at (when metric was recorded)

Example Request:

curl -X GET https://finetunelab.ai/api/training/local/job-abc-123/metrics \
  -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "job_id": "job-abc-123",
  "metrics": [
    {
      "id": "metric-001",
      "job_id": "job-abc-123",
      "step": 100,
      "epoch": 1,
      "train_loss": 2.45,
      "eval_loss": 2.38,
      "train_perplexity": 11.59,
      "perplexity": 10.80,
      "learning_rate": 0.0001,
      "grad_norm": 0.85,
      "gpu_memory_allocated_gb": 14.2,
      "gpu_memory_reserved_gb": 16.0,
      "gpu_utilization_percent": 89.5,
      "samples_per_second": 12.4,
      "tokens_per_second": 1586,
      "created_at": "2025-11-12T10:15:23.000Z"
    },
    {
      "id": "metric-002",
      "job_id": "job-abc-123",
      "step": 200,
      "epoch": 1,
      "train_loss": 2.12,
      "eval_loss": 2.05,
      "learning_rate": 0.00009,
      "samples_per_second": 12.8,
      "created_at": "2025-11-12T10:20:45.000Z"
    }
  ]
}
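For charting, extract `(step, loss)` pairs from the `metrics` array, skipping records where a given metric wasn't logged (as in the second record above, which omits some fields). A minimal Python sketch using the example data:

```python
# Metrics records as returned in the example response (trimmed to relevant fields)
metrics = [
    {"step": 100, "epoch": 1, "train_loss": 2.45, "eval_loss": 2.38},
    {"step": 200, "epoch": 1, "train_loss": 2.12, "eval_loss": 2.05},
]

# (step, loss) series for a loss curve; tolerate records missing a value
train_curve = [(m["step"], m["train_loss"]) for m in metrics if m.get("train_loss") is not None]
eval_curve = [(m["step"], m["eval_loss"]) for m in metrics if m.get("eval_loss") is not None]
```

The resulting pairs can be fed directly to any plotting library.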

Tell your AI assistant:

"Show me the complete metrics history for my training job so I can chart the loss curve"

Get Sentiment Insights

GET/api/analytics/sentiment/insights

Detect sentiment-related insights, anomalies, and patterns across your conversations. Analyzes evaluation feedback, detects sudden sentiment shifts, identifies trending issues, and highlights conversations requiring attention.

Query Parameters:

lookback_days: Number (optional, default: 30)
  - Number of days to analyze for insights
  - Example: lookback_days=7 for weekly insights

Example Request:

curl -X GET 'https://finetunelab.ai/api/analytics/sentiment/insights?lookback_days=14' \
  -H "Authorization: Bearer YOUR_TOKEN"

Response:

{
  "insights": [
    {
      "type": "anomaly",
      "severity": "high",
      "title": "Sudden Sentiment Drop Detected",
      "description": "Average sentiment decreased by 35% in the past 3 days",
      "affectedConversations": 12,
      "detectedAt": "2025-11-12T08:30:00.000Z",
      "recommendation": "Review recent model changes or data quality issues"
    },
    {
      "type": "trend",
      "severity": "medium",
      "title": "Increasing Negative Feedback",
      "description": "Negative ratings increased by 15% week-over-week",
      "affectedConversations": 8,
      "detectedAt": "2025-11-12T08:30:00.000Z",
      "recommendation": "Analyze error patterns in recent conversations"
    },
    {
      "type": "positive",
      "severity": "low",
      "title": "Quality Improvement Detected",
      "description": "Success rate improved from 82% to 94% this week",
      "affectedConversations": 45,
      "detectedAt": "2025-11-12T08:30:00.000Z",
      "recommendation": "Document successful strategies for future reference"
    }
  ]
}
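A typical client triages this response by filtering out `positive` entries and ordering the rest by severity. A small Python sketch over the example insights:

```python
# Insight records from the example response (trimmed to relevant fields)
insights = [
    {"type": "anomaly", "severity": "high", "title": "Sudden Sentiment Drop Detected"},
    {"type": "trend", "severity": "medium", "title": "Increasing Negative Feedback"},
    {"type": "positive", "severity": "low", "title": "Quality Improvement Detected"},
]

SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}

# Actionable items first: drop positive findings, sort the rest by severity
actionable = sorted(
    (i for i in insights if i["type"] != "positive"),
    key=lambda i: SEVERITY_ORDER[i["severity"]],
)
```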

Tell your AI assistant:

"Check for any sentiment anomalies or trends in my conversations from the past 2 weeks"

User Cohorts Management

GET | POST | PATCH | DELETE /api/analytics/cohorts

Create and manage user cohorts for segmentation, targeted analysis, and A/B testing. Supports 5 cohort types: static (manual lists), dynamic (auto-updating rules), behavioral (action-based), subscription (tier-based), and custom (advanced criteria).

Cohort Types:

• static: Fixed list of users (manual membership)
• dynamic: Auto-updates based on criteria (e.g., "active in last 7 days")
• behavioral: Action-based segmentation (e.g., "used calculator tool")
• subscription: Tier-based grouping (free, pro, enterprise)
• custom: Advanced custom criteria and rules

GET - List Cohorts:

curl -X GET 'https://finetunelab.ai/api/analytics/cohorts?cohort_type=dynamic&is_active=true&limit=20' \
  -H "Authorization: Bearer YOUR_TOKEN"

POST - Create Cohort:

curl -X POST https://finetunelab.ai/api/analytics/cohorts \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "name": "Power Users",
    "description": "Users with >50 conversations in last month",
    "cohort_type": "dynamic",
    "criteria": {
      "min_conversations": 50,
      "lookback_days": 30,
      "active_status": true
    }
  }'
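Building the create payload programmatically keeps the criteria consistent across scripts. A hypothetical helper (the field names follow the POST example above; `dynamic_cohort` is not part of the API, just an illustration):

```python
def dynamic_cohort(name, description, min_conversations, lookback_days):
    """Build a POST /api/analytics/cohorts body for a dynamic cohort."""
    return {
        "name": name,
        "description": description,
        "cohort_type": "dynamic",
        "criteria": {
            "min_conversations": min_conversations,
            "lookback_days": lookback_days,
            "active_status": True,
        },
    }

body = dynamic_cohort("Power Users", "Users with >50 conversations in last month", 50, 30)
```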

PATCH - Update Cohort:

curl -X PATCH https://finetunelab.ai/api/analytics/cohorts \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{
    "cohort_id": "cohort-abc-123",
    "is_active": false
  }'

DELETE - Remove Cohort:

curl -X DELETE 'https://finetunelab.ai/api/analytics/cohorts?cohort_id=cohort-abc-123' \
  -H "Authorization: Bearer YOUR_TOKEN"

Response Example (Create):

{
  "cohort": {
    "id": "cohort-abc-123",
    "name": "Power Users",
    "description": "Users with >50 conversations in last month",
    "cohort_type": "dynamic",
    "criteria": {
      "min_conversations": 50,
      "lookback_days": 30,
      "active_status": true
    },
    "member_count": 0,
    "is_active": true,
    "created_at": "2025-11-12T10:30:00.000Z",
    "updated_at": "2025-11-12T10:30:00.000Z"
  }
}

Tell your AI assistant:

"Create a cohort for power users with more than 50 conversations in the last month"

🎉 You're All Set!

You now have complete API documentation for 35 endpoints covering training, analytics, configuration management, custom model integration, GraphRAG knowledge graph, and comprehensive analytics & metrics. Ready to build something awesome!