# Types of LLMs
Large language models vary along two practical axes: how you access them (hosted API vs. self-hosted weights) and how large they are.
## By Access Pattern

### Proprietary API Models
Hosted by companies like OpenAI, Anthropic, and Google. You call them via a REST API, pay per token, and never see the weights. Examples: GPT-4o, Claude 3.5 Sonnet, Gemini 1.5 Pro.
Pros: Highest capability, no infrastructure to manage, updated by the provider without any redeployment on your side.
Cons: Data leaves your systems, per-token cost, rate limits.
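To make the per-token pricing concrete, here is a minimal cost estimator. The `estimate_cost` helper and its default prices are illustrative placeholders, not any vendor's actual rates:

```python
# Hypothetical per-token cost model for a hosted API.
# The default prices are illustrative, not real vendor pricing.
def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  input_price_per_1m: float = 0.15,
                  output_price_per_1m: float = 0.60) -> float:
    """Estimated USD cost of one request, given prices per 1M tokens."""
    return (prompt_tokens * input_price_per_1m
            + completion_tokens * output_price_per_1m) / 1_000_000

# 2,000 prompt tokens + 500 completion tokens at the placeholder rates
cost = estimate_cost(2_000, 500)
```

Output tokens typically cost several times more than input tokens, so generation-heavy workloads (long completions) are disproportionately expensive compared to prompt-heavy ones.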
### Open-Weight Models
Weights are publicly released. You download them and run inference yourself. Examples: Llama 3.1, Mistral 7B, Qwen 2.5.
Pros: Data stays local, no per-token cost, customizable.
Cons: Requires GPU/CPU infrastructure, operational overhead.
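The infrastructure requirement is dominated by weight memory, which you can estimate from the parameter count. The helper below is a rough sketch that counts weights only, ignoring KV cache, activations, and framework overhead:

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float = 2.0) -> float:
    """Approximate memory needed for the weights alone.
    2.0 bytes/param corresponds to fp16/bf16; ~0.5 for 4-bit quantization.
    Excludes KV cache, activations, and framework overhead."""
    return params_billion * bytes_per_param

# An 8B model in bf16 needs roughly 16 GB just for weights;
# 4-bit quantization brings that down to roughly 4 GB.
mem_bf16 = weight_memory_gb(8)      # 16.0
mem_q4 = weight_memory_gb(8, 0.5)   # 4.0
```

This is why 7B–8B models fit on a single consumer GPU while 70B-class models generally require multiple GPUs or aggressive quantization.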
### Fine-Tuned Models
A base model further trained on domain-specific data. Can be proprietary or open-weight. Examples: Code Llama (code), BioMedLM (biomedical).
## By Size Class
| Class | Parameters | Typical Use Case |
|---|---|---|
| Small | 1B–7B | Edge inference, classification |
| Medium | 8B–30B | Chat, summarization, RAG |
| Large | 70B–200B | Complex reasoning, agentic tasks |
| Frontier | >200B | Research, hardest tasks |
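The classes in the table can be expressed as a simple lookup. The `size_class` name and the cutoffs chosen inside the gaps between published ranges (e.g., 30B–70B) are assumptions for illustration:

```python
def size_class(params_billion: float) -> str:
    """Bucket a parameter count into the size classes from the table above.
    Boundaries inside the gaps between ranges are arbitrary choices."""
    if params_billion < 8:
        return "small"
    if params_billion <= 30:
        return "medium"
    if params_billion <= 200:
        return "large"
    return "frontier"
```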
## Choosing the Right Model
For most production applications, start with a cost-efficient mid-tier API model (e.g., GPT-4o-mini, Claude 3.5 Haiku) and measure accuracy on your specific task. Upgrade to a larger model only if the measured gain justifies the cost difference.
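The upgrade decision above can be sketched as a gain-per-cost check. The function name, the threshold, and all numbers below are hypothetical:

```python
def should_upgrade(small_acc: float, large_acc: float,
                   cost_multiplier: float,
                   min_gain_per_unit_cost: float = 0.01) -> bool:
    """Upgrade only if accuracy gain per unit of extra cost clears a bar.
    At the default threshold, a 5x more expensive model must add at
    least 5 accuracy points to be worth it."""
    return (large_acc - small_acc) / cost_multiplier >= min_gain_per_unit_cost

# An 8-point gain at 5x cost clears the bar; a 1-point gain at 10x does not.
should_upgrade(0.82, 0.90, 5)    # True
should_upgrade(0.88, 0.89, 10)   # False
```

The threshold is a business decision, not a technical one: latency-sensitive or high-volume applications will tolerate a much smaller cost multiplier than low-volume, accuracy-critical ones.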