Common terms in the AI/LLM world, alphabetically ordered and updated regularly.
Context Window
The maximum number of tokens a model can process in one request — input + output combined.
Example: Claude 3.5 has a 200K-token window — roughly 150 pages of a book.
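Because input and output share one window, a request must budget them together. A minimal sketch of that check (the 200K figure matches the example above; the helper name is illustrative):

```python
CONTEXT_WINDOW = 200_000  # total token budget, input + output combined

def fits(input_tokens: int, max_output_tokens: int) -> bool:
    # Input and the reserved output budget draw from the same window.
    return input_tokens + max_output_tokens <= CONTEXT_WINDOW

print(fits(150_000, 4_096))  # True: 154,096 <= 200,000
print(fits(199_000, 4_096))  # False: no room left for the output
```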
Embedding
Converting text to a numeric vector to measure semantic similarity.
Used in: RAG, semantic search, clustering.
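"Semantic similarity" between embeddings is usually measured with cosine similarity. A toy sketch with hand-made 3-dimensional vectors (real embedding models produce hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 = same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" — illustrative values, not from a real model.
cat = [0.9, 0.1, 0.0]
dog = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, dog) > cosine_similarity(cat, car))  # True
```

Semantically related words end up pointing in similar directions, so "cat" scores closer to "dog" than to "car".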
Fine-tuning
Training a base model further on your own data to adjust style, tone, or domain knowledge.
Best for: tasks requiring a specific voice or specialized knowledge not in pre-training.
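Fine-tuning data is typically prepared as prompt/response pairs. One common shape is chat-style JSONL — one JSON record per line; the exact schema varies by provider, and the record below is illustrative:

```python
import json

# One training record pairing a prompt with the desired response.
example = {
    "messages": [
        {"role": "system", "content": "You are a concise legal assistant."},
        {"role": "user", "content": "What is a force majeure clause?"},
        {"role": "assistant", "content": "A clause excusing performance during events beyond a party's control."},
    ]
}

line = json.dumps(example)  # one record = one line of the JSONL file
print(json.loads(line)["messages"][2]["role"])  # assistant
```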
Hallucination
When a model confidently generates factually incorrect information.
Mitigated by: RAG, grounding with real sources, prompting to say “I don’t know.”
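The last two mitigations can be combined in the prompt itself: supply sources and give the model explicit permission to decline. A minimal sketch — the wording is illustrative, not a canonical template:

```python
def grounded_prompt(question: str, sources: list[str]) -> str:
    # Restrict the model to the provided sources and allow "I don't know".
    source_block = "\n".join(f"- {s}" for s in sources)
    return (
        "Answer using ONLY the sources below. "
        'If the answer is not in the sources, say "I don\'t know."\n\n'
        f"Sources:\n{source_block}\n\nQuestion: {question}"
    )

print(grounded_prompt("When was the policy updated?",
                      ["Policy v2, updated 2023-04-01."]))
```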
Prompt Engineering
Designing inputs that elicit better outputs without fine-tuning.
Techniques: few-shot examples, chain-of-thought, role assignment, structured output.
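The first technique, few-shot examples, means showing the model a couple of worked input/output pairs so it infers the task format. A sketch with made-up sentiment examples:

```python
# Illustrative labeled examples the model sees before the real input.
examples = [
    ("The movie was fantastic!", "positive"),
    ("Terrible service, never again.", "negative"),
]

def few_shot_prompt(text: str) -> str:
    # Worked examples first, then the real input with the answer left blank.
    shots = "\n".join(f"Review: {r}\nSentiment: {s}" for r, s in examples)
    return f"{shots}\nReview: {text}\nSentiment:"

print(few_shot_prompt("Pretty good overall."))
```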
RAG (Retrieval-Augmented Generation)
Fetching relevant documents from a knowledge base and inserting them into the prompt before generation.
Why it works: the model doesn’t need to memorize everything — it just reads and summarizes what’s retrieved.
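A toy end-to-end sketch of the idea: score documents against the query, then paste the best match into the prompt. Real systems retrieve with embeddings and a vector index; the word-overlap scoring here is a stand-in:

```python
# A tiny "knowledge base" — illustrative documents.
docs = [
    "Refunds are processed within 5 business days.",
    "Shipping is free on orders over $50.",
]

def retrieve(query: str) -> str:
    # Stand-in for embedding search: pick the doc sharing the most words.
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query: str) -> str:
    # Insert the retrieved document into the prompt before generation.
    return f"Context: {retrieve(query)}\n\nQuestion: {query}"

print(build_prompt("How long do refunds take?"))
```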
Temperature
Controls output randomness.
- 0.0 = deterministic, same answer every time
- 1.0 = creative, varied
- Code/fact tasks: low | Creative tasks: higher
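Under the hood, temperature divides the model's logits before sampling: a small value sharpens the distribution toward one answer, a large value flattens it. A self-contained sketch of that rescaling:

```python
import math

def softmax_with_temperature(logits, t):
    # Divide logits by the temperature, then normalize to probabilities.
    scaled = [x / t for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]               # illustrative next-token scores
cold = softmax_with_temperature(logits, 0.1)  # near-deterministic
hot = softmax_with_temperature(logits, 2.0)   # much flatter
print(round(cold[0], 3), round(hot[0], 3))
```

At t = 0.1 nearly all probability lands on the top token; at t = 2.0 the alternatives stay live, which is where the "creative, varied" behavior comes from.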
Token
The smallest unit a model processes — not a character, not a word.
Rough rule: 1 English word ≈ 1–2 tokens | 1 Thai word ≈ 2–4 tokens.
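For quick estimates before calling a real tokenizer, a common rule of thumb for English is about 4 characters per token. A sketch of that heuristic — exact counts require the model's own BPE tokenizer:

```python
def estimate_tokens_english(text: str) -> int:
    # Rough heuristic only: English averages ~4 characters per token.
    return max(1, round(len(text) / 4))

print(estimate_tokens_english("Hello, world!"))  # 13 chars -> ~3 tokens
```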