
Teaching Models What NOT to Do: Adversarial & Negative Examples

How adding "wrong answer" examples and edge cases dramatically improved our model's accuracy on ambiguous questions.

FineTune Lab Team
2025-12-01

Why Models Fail at "No"

Most datasets are full of "Yes, here is how you do X." Very few contain "No, you can't do X, but you can do Y."

When a model only sees positive examples, it becomes a "yes-man." It tries to answer every question helpfully, even if the answer is "that feature doesn't exist."

The Solution: Negative Examples

We explicitly train our models on what NOT to do.

Example 1: Feature Hallucination

User: "How do I export my model to ONNX?"
Bad Model: "To export to ONNX, go to settings and click Export..." (Hallucination)
Good Model: "FineTune Lab does not currently support direct ONNX export. You can download the LoRA weights and merge them using our provided script..."

Example 2: Out-of-Scope Questions

User: "Write me a poem about cats."
Bad Model: "Here is a poem about cats..."
Good Model: "I am a specialized assistant for FineTune Lab. I can help you with training models, but I don't write poems."
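To train on them, the "Good Model" responses go into the dataset as ordinary chat records alongside the positive examples. The snippet below is a minimal sketch assuming an OpenAI-style "messages" JSONL schema; the field names and file name are illustrative assumptions, not FineTune Lab's documented format.

```python
import json

# Negative examples encoded as chat-style training records. The field names
# ("messages", "role", "content") follow the common OpenAI-style schema and
# are an assumption here, not necessarily FineTune Lab's native format.
negative_examples = [
    {
        "messages": [
            {"role": "user", "content": "How do I export my model to ONNX?"},
            {"role": "assistant", "content": (
                "FineTune Lab does not currently support direct ONNX export. "
                "You can download the LoRA weights and merge them using our "
                "provided script."
            )},
        ]
    },
    {
        "messages": [
            {"role": "user", "content": "Write me a poem about cats."},
            {"role": "assistant", "content": (
                "I am a specialized assistant for FineTune Lab. I can help you "
                "with training models, but I don't write poems."
            )},
        ]
    },
]

# Mix these into the regular training set as plain JSONL, one record per line.
with open("train_negatives.jsonl", "w", encoding="utf-8") as f:
    for record in negative_examples:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```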

Adversarial Examples

These are trickier. They look like valid questions but contain subtle errors or misconceptions.

User: "Where do I set the learning rate for my inference endpoint?"
Model: "Inference endpoints don't have learning rates. Learning rate is a training hyperparameter. Did you mean to ask about setting the learning rate for a training job?"
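One simple way to scale this up is to cross real training hyperparameters with contexts where they don't apply. The sketch below is illustrative: the parameter list, context list, and response template are assumptions for demonstration, not our production generation pipeline.

```python
from itertools import product

# Sketch: generate adversarial training pairs by pairing real training
# hyperparameters with contexts where they don't belong. The lists and the
# response template are illustrative assumptions, not a full taxonomy.
training_params = ["learning rate", "batch size", "LoRA rank"]
wrong_contexts = ["inference endpoint", "dataset upload", "billing dashboard"]

adversarial_records = []
for param, context in product(training_params, wrong_contexts):
    question = f"Where do I set the {param} for my {context}?"
    answer = (
        f"{context.capitalize()}s don't have a {param}; that's a training "
        f"hyperparameter. Did you mean to ask about setting the {param} "
        f"for a training job?"
    )
    adversarial_records.append({
        "messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    })

print(adversarial_records[0]["messages"][0]["content"])
# -> Where do I set the learning rate for my inference endpoint?
```

Each generated pair uses the same chat schema as the negative examples above, so the two sets can simply be concatenated into one training file.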

Impact

Adding just 50-100 of these negative/adversarial examples reduced our hallucination rate by over 40%.


Want to try these techniques?

Start fine-tuning your own model on FineTune Lab. All experiments in this article were done on our platform.