Types of LLMs

Large language models come in several architectural and access pattern variants.

By Access Pattern

Proprietary API Models

Hosted by companies like OpenAI, Anthropic, and Google. You call them via REST API, pay per token, and never see the weights. Examples: GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro.

Pros: Highest capability, no infrastructure, always up to date.
Cons: Data leaves your systems, per-token cost, rate limits.

Open-Weight Models

Weights are publicly released. You download them and run inference yourself. Examples: Llama 3.1, Mistral 7B, Qwen 2.5.

Pros: Data stays local, no per-token cost, customizable.
Cons: Requires GPU/CPU infrastructure, operational overhead.

Fine-Tuned Models

A base model further trained on domain-specific data. Can be proprietary or open-weight. Examples: Code Llama (code), BioMedLM (biomedical).

By Size Class

ClassParametersTypical Use Case
Small1B–7BEdge inference, classification
Medium8B–30BChat, summarization, RAG
Large70B–200BComplex reasoning, agentic tasks
Frontier>200BResearch, hardest tasks

Choosing the Right Model

For most production applications, start with a medium-class API model (GPT-4o-mini, Claude 3.5 Haiku) and measure accuracy on your specific task. Only upgrade to a larger model if the benchmark results justify the cost difference.