AI & LLM

AI Glossary: Plain Language Definitions for Developers

RAG, fine-tuning, embedding, context window, hallucination — short, direct definitions with no fluff.

Nat
#glossary #llm #rag #fine-tuning #embedding

Common terms in the AI/LLM world, alphabetically ordered and updated regularly.


Context Window

The maximum number of tokens a model can process in one request — input + output combined.
Example: Claude 3.5 has a 200K-token window — roughly 150 pages of a book.
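A minimal sketch of token budgeting — the limit and the ~4-characters-per-token estimate are rough illustrative assumptions, not exact tokenizer figures:

```python
# Rough token budgeting: input + output must fit inside the context window.
# Assumes a 200K-token window and ~4 characters per token (a common rough
# estimate for English; real tokenizers vary).

CONTEXT_WINDOW = 200_000

def estimate_tokens(text: str) -> int:
    """Very rough token count: ~4 characters per token."""
    return max(1, len(text) // 4)

def fits(prompt: str, max_output_tokens: int) -> bool:
    """Does the prompt plus the reserved output budget fit the window?"""
    return estimate_tokens(prompt) + max_output_tokens <= CONTEXT_WINDOW

print(fits("Summarize this article in three bullet points.", 4_000))  # True
```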

Embedding

Converting text to a numeric vector to measure semantic similarity.
Used in: RAG, semantic search, clustering.
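"Semantic similarity" between embedding vectors is usually measured with cosine similarity. A sketch with made-up 3-dimensional vectors — real embeddings have hundreds to thousands of dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 = very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

cat = [0.9, 0.1, 0.0]     # toy "embedding" of "cat"
kitten = [0.8, 0.2, 0.1]  # close to "cat"
car = [0.1, 0.0, 0.9]     # far from "cat"

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, car))  # True
```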

Fine-tuning

Training a base model further on your own data to adjust style, tone, or domain knowledge.
Best for: tasks requiring a specific voice or specialized knowledge not in pre-training.
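Fine-tuning data is commonly supplied as chat-style JSONL. A sketch — field names vary by provider, and the example conversation is invented:

```python
# Writing fine-tuning examples in the common chat-style JSONL format:
# one JSON object per line, each a full example conversation.
import json

examples = [
    {"messages": [
        {"role": "system", "content": "You answer in pirate speak."},
        {"role": "user", "content": "What's the weather like?"},
        {"role": "assistant", "content": "Arr, sunny skies ahead, matey!"},
    ]},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

Dozens to thousands of such examples teach the model the target voice; check your provider's docs for the exact schema.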

Hallucination

When a model confidently generates factually incorrect information.
Mitigated by: RAG, grounding with real sources, prompting to say “I don’t know.”
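A sketch of the last two mitigations combined — give the model sources and explicit permission to say "I don't know." The template wording is illustrative, not a proven formula:

```python
def grounded_prompt(question: str, sources: list[str]) -> str:
    """Build a prompt that restricts answers to the given sources."""
    context = "\n".join(f"- {s}" for s in sources)
    return (
        "Answer using ONLY the sources below. "
        'If the answer is not in the sources, say "I don\'t know."\n\n'
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )

print(grounded_prompt(
    "When was v1.0 released?",
    ["Changelog: v1.0 released 2021-03-02"],  # made-up source
))
```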

Prompt Engineering

Designing inputs that elicit better outputs without fine-tuning.
Techniques: few-shot examples, chain-of-thought, role assignment, structured output.
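The first technique, few-shot examples, is just showing the input→output pattern before the real query. A sketch with made-up labels and reviews:

```python
# Few-shot prompt: two worked examples teach the model the format,
# then the real input is appended in the same shape.
FEW_SHOT = """Classify the sentiment as positive or negative.

Review: "Absolutely loved it, would buy again."
Sentiment: positive

Review: "Broke after two days."
Sentiment: negative

Review: "{review}"
Sentiment:"""

prompt = FEW_SHOT.format(review="Exceeded my expectations!")
print(prompt)
```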

RAG (Retrieval-Augmented Generation)

Fetching relevant documents from a knowledge base and inserting them into the prompt before generation.
Why it works: the model doesn’t need to memorize everything — it just reads and summarizes what’s retrieved.
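A minimal RAG sketch. Naive word-overlap scoring stands in for real embedding similarity, and the documents are invented — the shape (retrieve, then paste into the prompt) is the point:

```python
def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k docs sharing the most words with the query.
    (Real systems rank by embedding similarity instead.)"""
    q_words = set(query.lower().split())
    return sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Insert the retrieved context ahead of the question."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is open Monday to Friday.",
]
print(build_prompt("How long do refunds take?", docs))
```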

Temperature

Controls output randomness.

  • 0.0 = deterministic, same answer every time
  • 1.0 = creative, varied
  • Code/fact tasks: low | Creative tasks: higher
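Under the hood, temperature divides the logits before softmax: low T sharpens the distribution toward the top token, high T flattens it. A sketch with toy logits:

```python
import math

def softmax_with_temperature(logits: list[float], t: float) -> list[float]:
    """Scale logits by 1/t, then softmax. Lower t = sharper distribution."""
    scaled = [x / t for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.1)  # top token dominates
hot = softmax_with_temperature(logits, 2.0)   # probabilities flatten
print(round(cold[0], 3), round(hot[0], 3))
```

Note that T = 0 is handled as greedy argmax in practice (always pick the top token), not literal division by zero.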

Token

The smallest unit a model processes — not a character, not a word.
Rough rule: 1 English word ≈ 1–2 tokens | 1 Thai word ≈ 2–4 tokens.
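Why "not a character, not a word": tokenizers split text into subwords from a learned vocabulary. A toy greedy longest-match tokenizer over an invented vocabulary — real tokenizers (BPE, SentencePiece) learn theirs from data:

```python
def tokenize(word: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match subword split, falling back to single chars."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest piece first
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown: emit one character
            i += 1
    return tokens

print(tokenize("unbreakable", {"un", "break", "able"}))  # ['un', 'break', 'able']
```

One English word becoming two or three tokens like this is exactly why the word-to-token ratio above is only a rough rule.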