
Definition

Fine-Tuning

Fine-tuning is the process of further training a pre-trained large language model on a smaller, task-specific dataset to adapt its behavior for a particular use case. The model's weights are updated to specialize in a domain—such as a specific programming language, codebase, or output format—while retaining its general capabilities from pre-training.

How fine-tuning works

Fine-tuning starts with a pre-trained model that already understands language and code broadly. You provide a dataset of input-output pairs that demonstrate the behavior you want. The model trains on this data for a few epochs, adjusting its weights to produce outputs that match your examples. The result is a model that retains general capabilities but performs better on your specific task—following your code style, using your APIs correctly, or generating output in your preferred format.
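The input-output pairs described above are typically supplied as JSONL, one example per line. A minimal sketch in Python; the field names ("prompt", "completion") and the example tasks are illustrative, since each fine-tuning API defines its own schema:

```python
import json

# A toy fine-tuning dataset: input-output pairs that demonstrate the
# target behavior (here, a particular coding style). Field names are
# illustrative, not a specific provider's schema.
examples = [
    {
        "prompt": "Write a function that validates an email address.",
        "completion": (
            "def validate_email(address: str) -> bool:\n"
            "    return '@' in address and '.' in address.split('@')[-1]\n"
        ),
    },
    {
        "prompt": "Write a function that slugifies a title.",
        "completion": (
            "def slugify(title: str) -> str:\n"
            "    return '-'.join(title.lower().split())\n"
        ),
    },
]

def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(r) for r in records)

jsonl = to_jsonl(examples)
```

Every example should show the exact output format you want back; the model learns the pattern across the whole set, not from any single pair.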

Fine-tuning vs. prompting vs. RAG

These three approaches solve different problems. Prompting (including CLAUDE.md) gives the model instructions at runtime: no training required and instant to change, but limited by the context window. RAG retrieves relevant information and includes it in the prompt, which makes it a good fit for factual grounding over data that changes. Fine-tuning changes the model itself: slower and more expensive to set up, but it embeds knowledge directly into the model's weights. Most teams should try prompting and RAG first, and fine-tune only when those prove insufficient.
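The contrast can be sketched in a few lines of Python. `STYLE_GUIDE` and `DOCS` are made-up stand-ins for a real system prompt and retrieval index:

```python
# Prompting: instructions travel with every request; nothing is trained.
STYLE_GUIDE = "Use snake_case and type hints."

def build_prompt(question: str) -> str:
    return f"{STYLE_GUIDE}\n\nTask: {question}"

# RAG: look up relevant facts at query time, then include them in the
# prompt. A dict stands in for a real vector index.
DOCS = {
    "auth": "Call auth.login(token) before any API request.",
    "billing": "Invoices are generated nightly at 02:00 UTC.",
}

def build_rag_prompt(question: str) -> str:
    context = "\n".join(v for k, v in DOCS.items() if k in question.lower())
    return f"Context:\n{context}\n\nTask: {question}"

# Fine-tuning has no per-request analogue: instead of packing guidance
# into the prompt, you train the model on example pairs so the same
# behavior comes out of a shorter, unmodified prompt.
```

Note the cost trade-off the sketch makes visible: prompting and RAG spend context-window tokens on every request, while fine-tuning pays a one-time training cost instead.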

When fine-tuning makes sense for code

  • Teaching the model a proprietary DSL or internal framework
  • Enforcing a specific code style across all generations
  • Reducing latency by baking in context that would otherwise need retrieval
  • Specializing a smaller model to match a larger model's performance on a narrow task
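As a toy illustration, the checklist above can be expressed as a small decision helper. The signal names and the two-signal threshold are invented for the sketch, not a real policy:

```python
# Illustrative signals that favor fine-tuning, mirroring the list above.
SIGNALS = (
    "proprietary_dsl",        # model must learn an internal language/framework
    "strict_style",           # every generation must follow one code style
    "latency_sensitive",      # per-request retrieval adds too much latency
    "distill_smaller_model",  # narrow task, want a cheaper specialized model
)

def fine_tuning_worthwhile(active_signals) -> bool:
    """Toy heuristic: suggest fine-tuning only when at least two known
    signals apply; otherwise prefer prompting or RAG first."""
    known = [s for s in active_signals if s in SIGNALS]
    return len(known) >= 2
```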

Fine-tuning is expensive and can degrade general performance if done incorrectly. For most development teams, well-crafted prompts and CLAUDE.md files deliver most of what fine-tuning would, at a fraction of the cost and complexity.

Can I fine-tune Claude?

Anthropic offers fine-tuning for enterprise customers on select models. For most users, the combination of system prompts, CLAUDE.md, and prompt engineering provides sufficient customization without the cost and complexity of fine-tuning.

How much data do I need for fine-tuning?

It depends on the task. Simple formatting changes may need only 50-100 examples. Teaching a model a new domain or coding style typically requires 500-5,000 high-quality examples. More data generally helps, but quality matters more than quantity: noisy or inconsistent training data produces unreliable models.

Does fine-tuning replace the need for RAG?

Not usually. Fine-tuning embeds knowledge into model weights, which works for stable knowledge. For data that changes frequently (like your codebase), RAG is better because it retrieves the latest information at query time. Many production systems use both: a fine-tuned model with RAG for current data.
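The quality-over-quantity point from the data-sizing answer can be made concrete with a small audit script. The 50-example floor echoes the rule of thumb for simple formatting tasks; the function and field names are illustrative:

```python
import json

def audit_training_data(jsonl_text: str, minimum: int = 50) -> dict:
    """Toy audit of a JSONL fine-tuning dataset: count examples,
    flag exact duplicates, and check against a minimum-size floor."""
    records = [
        json.loads(line) for line in jsonl_text.splitlines() if line.strip()
    ]
    # Canonicalize each record so duplicates compare equal regardless
    # of key order, then deduplicate via a set.
    unique = {json.dumps(r, sort_keys=True) for r in records}
    return {
        "total": len(records),
        "unique": len(unique),
        "duplicates": len(records) - len(unique),
        "enough": len(unique) >= minimum,
    }
```

A real audit would also check for inconsistent outputs across near-identical inputs, since those contradictions are exactly the noise that produces unreliable models.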

Related terms

  • Large Language Model (LLM)
  • Retrieval-Augmented Generation (RAG)
  • System Prompt
  • Embeddings
