Lena thought fine-tuning would be her silver bullet. As a PM at a fast-growing legaltech startup, she was tired of the base model ignoring their clause library. “Just fine-tune it on our 5,000 approved contracts,” she told engineering. Six weeks and $42K in labeling and GPU time later, the model went live.
First week: brand voice finally perfect. Second week: it confidently invented clauses that never existed. Legal almost had a heart attack. The model hadn’t “learned” new facts — it overfit to patterns and filled gaps with high-confidence nonsense.
Fine-tuning takes a pre-trained model and continues training on a small, high-quality dataset of your input–output pairs. You’re nudging the probability distribution so outputs look more like yours.
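In practice, that nudging is an ordinary training loop over your pairs. Here is a minimal sketch using the Hugging Face transformers library, assuming a JSONL file of prompt/completion pairs; the checkpoint name, file path, and hyperparameters are placeholders, not a recommendation:

```python
# Minimal supervised fine-tuning sketch. Assumptions: "contracts.jsonl"
# holds {"prompt": ..., "completion": ...} records, and "gpt2" stands in
# for whatever causal LM checkpoint you actually use.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "gpt2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

dataset = load_dataset("json", data_files="contracts.jsonl")["train"]

def tokenize(example):
    # Concatenate prompt and completion into one training string.
    text = example["prompt"] + "\n" + example["completion"]
    return tokenizer(text, truncation=True, max_length=512)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="ft-out", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    # Causal LM collator: copies input_ids into labels for next-token loss.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```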
What fine-tuning does well:
▸ Tone, style, and voice consistency.
▸ Format adherence (JSON, templates).
▸ Domain adaptation (legal and medical jargon).
▸ Efficiency on narrow tasks.
What it won’t do:
▸ Reliably add new factual knowledge (use RAG; see the retrieval sketch below).
▸ Fix reasoning weaknesses.
▸ Make a mediocre model brilliant.
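For contrast, here is what “use RAG” means in practice: retrieve the relevant clause at inference time and hand it to the model, instead of trying to bake facts into the weights. A minimal sketch with the sentence-transformers library; the clause texts, question, and model name are illustrative assumptions:

```python
# Minimal retrieve-then-prompt sketch. `clause_library` is a toy stand-in
# for a real clause store; production RAG would use a vector database.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model

clause_library = [
    "Either party may terminate with 30 days written notice.",
    "Liability is capped at fees paid in the prior 12 months.",
]
clause_vectors = embedder.encode(clause_library, convert_to_tensor=True)

question = "What is our standard termination notice period?"
query_vector = embedder.encode(question, convert_to_tensor=True)
hits = util.semantic_search(query_vector, clause_vectors, top_k=1)[0]

# Ground the model in retrieved text instead of hoping it memorized it.
context = clause_library[hits[0]["corpus_id"]]
prompt = f"Answer using only this clause:\n{context}\n\nQ: {question}"
print(prompt)  # send to your LLM of choice
```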
“99% of problems don’t require fine-tuning… Fine-tuning should be your last resort, not the first step.” — Santiago (@svpino), June 2025
When it is the right move, production wins usually come from LoRA or QLoRA: tiny adapter layers trained at roughly 1/100th the cost of full fine-tuning. Elliot Arledge’s comparison makes the point: instead of $10K to fine-tune a 32B-parameter model, run multiple rollouts plus introspection for $18.
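For the curious, here is roughly what a LoRA setup looks like with the peft library. The base checkpoint, rank, and target modules below are illustrative assumptions; the point is how little of the model actually trains:

```python
# LoRA sketch: freeze the base model, train small low-rank adapters.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder checkpoint

lora = LoraConfig(
    r=8,                        # adapter rank: the "tiny" in tiny adapters
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2 attention projection; varies by model
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of all weights
```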
Trinity Impact
▸ Data: Where most teams die. You need hundreds to thousands of clean, consistent examples. Garbage in = very expensive garbage out.
▸ Models: You trade some general capability for narrow reliability. Expect to re-tune every quarter as your product evolves.
▸ UX: More consistent, on-brand outputs = faster trust. But failures now feel more confident, which hurts worse.
Kai at the DTC apparel brand tried fine-tuning for product descriptions. It worked, but only after three months of cleaning historical data. His exact words: “Fine-tuning didn’t save us time. It finally forced us to fix our data mess. That was the real win.”
PM Monday Checklist
Before you greenlight fine-tuning, answer all four honestly.
1. Have you maxed out prompting + RAG? If not, start there. Most teams skip this step and regret it.
2. Do you have 500+ gold-standard examples? Clean, consistent, representative of production. Not “we can scrape some.”
3. Is the use case high-volume and repetitive? Fine-tuning pays off on narrow, repeated tasks, not broad, creative ones.
4. Will you monitor for drift? Catastrophic forgetting and distribution shift are real. Budget for quarterly re-tuning (a minimal drift check is sketched below).
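A drift check does not have to be fancy to be better than nothing. Here is a minimal sketch using a two-sample KS test on a crude proxy (output length in words); the two lists are placeholders for whatever your logging pipeline actually stores, and real monitoring would compare embeddings or task-specific quality metrics:

```python
from scipy.stats import ks_2samp

# Placeholders: pull these from your output logs in practice.
launch_week_outputs = ["short clause summary", "another brief answer",
                       "a third launch-week output"]
recent_outputs = ["noticeably longer rambling recent model output text",
                  "another long recent output with extra words",
                  "and one more verbose recent response"]

baseline = [len(o.split()) for o in launch_week_outputs]
recent = [len(o.split()) for o in recent_outputs]

stat, p_value = ks_2samp(baseline, recent)
if p_value < 0.01:  # alert threshold is a judgment call; tune it
    print("Output length distribution shifted: investigate before users do.")
```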
Ask Your DS Team
1. “What’s the realistic effort to collect and clean 800 high-quality examples — and how will we keep them fresh?”
2. “LoRA or full? What’s the inference cost delta at our projected volume?”
3. “How do we catch distribution shift post-deployment before users do?”
Fine-tuning is tailoring an off-the-rack suit, not building a new person. Do it at the right moment and you get the crisp, on-brand experience your users actually trust.