Minimal Theory Insert · After Chapter 13

What Fine-Tuning Actually Does

Fine-tuning is not magic education. It’s behavioral steering. Treat it like tailoring an off-the-rack suit, not building a new person.

Lena’s $42K lesson: Perfect brand voice, then confidently invented clauses that never existed. The model overfit to patterns and filled gaps with nonsense.

Lena thought fine-tuning would be her silver bullet. As PM at a fast-growing legaltech startup, she was tired of the base model ignoring their clause library. “Just fine-tune it on our 5,000 approved contracts,” she told engineering. Six weeks, $42K in labeling + GPU time later, the model went live.

First week: brand voice finally perfect. Second week: it confidently invented clauses that never existed. Legal almost had a heart attack. The model hadn’t “learned” new facts — it overfit to patterns and filled gaps with high-confidence nonsense.

Fine-tuning takes a pre-trained model and continues training on a small, high-quality dataset of your input–output pairs. You’re nudging the probability distribution so outputs look more like yours.
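To make "nudging the probability distribution" concrete, here is a toy sketch, not how a real fine-tuning run is implemented. It assumes a hypothetical three-option vocabulary of email sign-offs and made-up starting logits, then applies plain gradient descent on cross-entropy loss for a single preferred output. Real fine-tuning updates millions of weights across many examples, but the direction of the effect is the same: probability mass shifts toward outputs that look like yours, and nothing new is added to the vocabulary.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fine_tune_step(logits, target_index, lr=0.5):
    """One gradient-descent step on cross-entropy loss for one example.

    The gradient of -log p[target] with respect to the logits is
    (probs - one_hot(target)), so the target's score rises and the
    other scores fall.
    """
    probs = softmax(logits)
    return [
        logit - lr * (p - (1.0 if i == target_index else 0.0))
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


# Hypothetical "pre-trained" preferences over three sign-offs.
vocab = ["Regards,", "Cheers,", "Sincerely,"]
logits = [2.0, 1.0, 0.0]

# Our (made-up) house style always signs off with "Sincerely,".
target = vocab.index("Sincerely,")

before = softmax(logits)
for _ in range(20):
    logits = fine_tune_step(logits, target)
after = softmax(logits)

for word, p_before, p_after in zip(vocab, before, after):
    print(f"{word:<12} before={p_before:.2f}  after={p_after:.2f}")
```

The model ends up strongly preferring the house sign-off, yet it can still only choose among options it already had. That is the steering-versus-education distinction in miniature.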

Does well

Tone, style, voice consistency. Format adherence (JSON, templates). Domain adaptation (legal, medical, jargon). Efficiency on narrow tasks.

Does not do

Reliably add new factual knowledge (use RAG). Fix reasoning weaknesses. Make a mediocre model brilliant.

Santiago’s rule: Try prompting, in-context examples, RAG, guardrails, and chaining first. Fine-tuning is the last resort.
“99% of problems don’t require fine-tuning… Fine-tuning should be your last resort, not the first step.” - Santiago @svpino, June 2025

When it is the right move, production wins come from LoRA or QLoRA: tiny adapter layers at roughly 1/100th the cost of full fine-tuning. Elliot Arledge: instead of spending $10K to fine-tune a 32B model, run multiple rollouts + introspection for $18.
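A rough sketch of why adapter layers are "tiny": LoRA freezes the original weight matrix W and trains two small matrices, B and A, whose product is added to W's output. The layer size (4096 x 4096) and rank (8) below are illustrative assumptions, not figures from this chapter, and the helper names are hypothetical. The trainable-parameter ratio is also not the same thing as the cost ratio, which depends on hardware, data, and training time as well.

```python
def full_params(d_out, d_in):
    """Trainable weights when fine-tuning the whole matrix W (d_out x d_in)."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable weights for a LoRA adapter: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Plain-Python matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """y = W x + scale * B (A x). W stays frozen; only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


# Illustrative sizes (assumed): one 4096 x 4096 projection, adapter rank 8.
d_out, d_in, rank = 4096, 4096, 8
full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: 1/{full // lora}")

# Tiny rank-1 demo: with B initialised to zeros the adapter changes nothing,
# so training starts from exactly the base model's behaviour.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, -0.5]]
B_zero = [[0.0], [0.0]]
B_trained = [[1.0], [2.0]]  # made-up values standing in for a trained adapter
x = [1.0, 2.0]

y_base = lora_forward(W, A, B_zero, x)
y_adapted = lora_forward(W, A, B_trained, x)
print(y_base, y_adapted)
```

Because the base weights never change, the adapter can be stored and swapped as a small separate file, which is a large part of the practical appeal on narrow tasks.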

LoRA economics: 1/100th the cost of full fine-tuning. Tiny adapter layers that punch above their weight on narrow tasks.
[Figure: a LoRA adapter connects the base model (wide highway, general knowledge) to your domain (narrow road, brand voice and format). Before: swerving generic path. After: laser-straight branded path. ⚠ May forget side roads.]
Fine-Tuning as Steering. It narrows and specializes behavior — it does not expand intelligence. Warning: catastrophic forgetting.