AI Reference

AI Glossary

A comprehensive reference of 68 essential AI terms, concepts, and acronyms. From AGI to Zero-Shot Learning.

A
10 terms

Activation Function

Foundations

A mathematical function applied to the weighted sum of a neural network node's inputs to produce its output. Common activation functions include ReLU, sigmoid, and tanh. They introduce non-linearity, allowing neural networks to learn complex patterns.
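The three functions named above can be sketched directly; this is an illustrative implementation, not tied to any particular framework:

```python
import math

def relu(x: float) -> float:
    # ReLU passes positive values through unchanged and zeroes out negatives.
    return max(0.0, x)

def sigmoid(x: float) -> float:
    # Sigmoid squashes any input into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x: float) -> float:
    # Tanh squashes inputs into (-1, 1), centered at zero.
    return math.tanh(x)

# The same input produces very different shapes of output:
sample = [relu(-2.0), sigmoid(0.0), tanh(0.0)]
```

Without such non-linear functions, a stack of layers would collapse into a single linear transformation, which is why they matter for learning complex patterns.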

AGI (Artificial General Intelligence)

Emerging

A hypothetical form of AI that possesses the ability to understand, learn, and apply knowledge across any intellectual task that a human can perform. Unlike narrow AI, AGI would exhibit flexible, general-purpose reasoning. It remains a theoretical concept and a subject of significant debate in the AI research community.

AI Alignment

Ethics

The research field focused on ensuring that AI systems behave in accordance with human values, intentions, and goals. Alignment is considered one of the most critical challenges in AI safety, especially as systems become more capable and autonomous.

AI Ethics

Ethics

The branch of ethics that examines the moral implications of developing and deploying artificial intelligence systems. Key concerns include bias, fairness, transparency, accountability, privacy, and the societal impact of automation.

AI Hallucination

LLMs

A phenomenon where an AI model generates information that sounds plausible but is factually incorrect, fabricated, or nonsensical. Hallucinations are a known limitation of large language models and highlight the importance of human verification of AI outputs.

AI Literacy

Foundations

The ability to understand, use, evaluate, and communicate about artificial intelligence technologies. AI literacy encompasses knowing what AI can and cannot do, recognizing AI-generated content, and understanding the ethical implications of AI deployment.

Anthropic

Tools

An AI safety company founded in 2021 by former OpenAI researchers, known for developing the Claude family of AI assistants. Anthropic focuses on building reliable, interpretable, and steerable AI systems with a strong emphasis on safety research.

API (Application Programming Interface)

Tools

A set of protocols and tools that allows different software applications to communicate with each other. In the AI context, APIs enable developers to integrate AI capabilities (like text generation, image recognition, or speech-to-text) into their own applications without building models from scratch.

Attention Mechanism

LLMs

A component in neural network architectures that allows the model to focus on different parts of the input when producing output. The self-attention mechanism in Transformers enables the model to weigh the relevance of each word in a sentence relative to every other word, which is fundamental to how modern LLMs process language.
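The "weigh the relevance of each word against every other word" step can be shown with a toy scaled dot-product self-attention computation. The vectors here are random stand-ins; real models learn separate query/key/value projections:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))   # one 4-dim embedding per "word", 3 words total

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

q, k, v = x, x, x                          # self-attention: all three come from the same input
scores = q @ k.T / np.sqrt(k.shape[-1])    # relevance of every word to every other word
weights = softmax(scores)                  # each row is a probability distribution over words
output = weights @ v                       # each word becomes a weighted mix of all words
```

Each row of `weights` sums to 1, so every output position is a convex combination of all input positions — the mechanism that lets the model attend to the whole sentence at once.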

Autonomous Agent

Emerging

An AI system that can independently perceive its environment, make decisions, and take actions to achieve specified goals without continuous human intervention. Examples include self-driving cars, robotic process automation bots, and AI agents that can browse the web and complete tasks.

B
2 terms

Backpropagation

Foundations

The primary algorithm used to train neural networks. It calculates the gradient of the loss function with respect to each weight in the network, then adjusts weights to minimize error. This process of propagating errors backward through the network is what enables deep learning models to improve over time.
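The core loop — compute the gradient of the loss with respect to a weight, then adjust the weight to reduce the error — can be illustrated with a single-weight model, y = w·x with squared-error loss. The numbers are arbitrary:

```python
x, target = 2.0, 10.0   # one training example
w, lr = 0.0, 0.05       # initial weight and learning rate

for _ in range(100):
    y = w * x                       # forward pass
    loss = (y - target) ** 2        # how wrong the prediction is
    grad = 2 * (y - target) * x     # dloss/dw via the chain rule (the "backward" step)
    w -= lr * grad                  # gradient descent update

# w converges toward target / x = 5.0
```

Real backpropagation applies exactly this chain-rule computation layer by layer through millions of weights, but the update rule is the same.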

Bias (in AI)

Ethics

Systematic errors in AI outputs that arise from prejudiced assumptions in training data, algorithm design, or deployment context. AI bias can lead to unfair outcomes that disproportionately affect certain groups. Addressing bias requires careful data curation, model auditing, and ongoing monitoring.

C
6 terms

Chain-of-Thought Prompting

Prompting

A prompt engineering technique that instructs an AI model to break down complex reasoning into intermediate steps before arriving at a final answer. By explicitly asking the model to 'think step by step,' this approach significantly improves performance on math, logic, and multi-step reasoning tasks.
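A minimal sketch of what the technique looks like in practice — the question is a classic reasoning puzzle, and only the prompt text matters here (no model or API is assumed):

```python
direct_prompt = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

# The chain-of-thought variant appends an instruction to reason in the open.
cot_prompt = (
    direct_prompt
    + "\n\nLet's think step by step, showing each intermediate "
      "calculation before giving the final answer."
)
```

Models prompted the direct way often blurt out the intuitive wrong answer ($0.10); the step-by-step instruction encourages the intermediate algebra that leads to the correct one ($0.05).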

ChatGPT

Tools

A conversational AI product developed by OpenAI, built on the GPT (Generative Pre-trained Transformer) architecture. Launched in November 2022, ChatGPT popularized the use of large language models for general-purpose conversation, writing, coding, analysis, and creative tasks.

Claude

Tools

A family of AI assistants developed by Anthropic. Claude models are designed with a focus on being helpful, harmless, and honest. They are known for strong performance in analysis, writing, coding, and following nuanced instructions.

Computer Vision

Foundations

A field of AI that enables machines to interpret and understand visual information from the world, including images and videos. Applications include facial recognition, object detection, medical image analysis, autonomous driving, and quality control in manufacturing.

Constitutional AI

Ethics

An approach to AI alignment developed by Anthropic where AI systems are trained to follow a set of principles (a 'constitution') that guide their behavior. The model critiques and revises its own outputs based on these principles, reducing the need for human feedback on every interaction.

Context Window

LLMs

The maximum amount of text (measured in tokens) that a language model can process in a single interaction. A larger context window allows the model to consider more information when generating responses. Modern models range from 4K to over 1 million tokens in context window size.

D
4 terms

DALL-E

Tools

An AI image generation model created by OpenAI that produces images from text descriptions. DALL-E demonstrates the ability of AI to understand and visually represent complex concepts, compositions, and styles described in natural language.

Data Labeling

Foundations

The process of annotating raw data (images, text, audio) with meaningful tags or categories so it can be used to train supervised machine learning models. High-quality labeled data is essential for model accuracy and is often the most time-consuming part of AI development.

Deep Learning

Foundations

A subset of machine learning that uses artificial neural networks with multiple layers (hence 'deep') to learn representations of data at increasing levels of abstraction. Deep learning has driven breakthroughs in image recognition, natural language processing, speech recognition, and generative AI.

Diffusion Model

Foundations

A type of generative AI model that creates data (typically images) by learning to reverse a gradual noising process. Starting from random noise, the model iteratively denoises to produce coherent outputs. Stable Diffusion and DALL-E 3 are prominent examples.

E
3 terms

Embedding

LLMs

A numerical representation of data (words, sentences, images) as vectors in a high-dimensional space. Embeddings capture semantic meaning — similar concepts are placed closer together in the vector space. They are fundamental to how AI models understand relationships between concepts.
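The "similar concepts are placed closer together" property is usually measured with cosine similarity. The tiny 3-dimensional vectors below are invented purely to illustrate the geometry (real embeddings have hundreds or thousands of dimensions):

```python
import numpy as np

emb = {
    "cat":    np.array([0.90, 0.80, 0.10]),
    "kitten": np.array([0.85, 0.75, 0.15]),
    "car":    np.array([0.10, 0.20, 0.95]),
}

def cosine_similarity(a, b):
    # 1.0 means identical direction; values near 0 mean unrelated.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_related = cosine_similarity(emb["cat"], emb["kitten"])
sim_unrelated = cosine_similarity(emb["cat"], emb["car"])
```

"cat" and "kitten" point in nearly the same direction while "car" points elsewhere, which is exactly the relationship semantic search and RAG systems exploit.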

Emergent Behavior

Emerging

Capabilities or behaviors that appear in AI models at scale but were not explicitly programmed or anticipated. As language models grow larger, they sometimes develop unexpected abilities like in-context learning, chain-of-thought reasoning, or multilingual translation without specific training for those tasks.

EU AI Act

Ethics

The European Union's comprehensive regulatory framework for artificial intelligence, adopted in 2024. It classifies AI systems by risk level (unacceptable, high, limited, minimal) and imposes requirements accordingly, including transparency obligations, human oversight mandates, and prohibitions on certain AI practices.

F
3 terms

Few-Shot Learning

Prompting

A machine learning approach where a model learns to perform a task from only a small number of examples. In the context of LLMs, few-shot prompting involves providing a few examples of the desired input-output pattern within the prompt to guide the model's behavior.
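A sketch of how a few-shot prompt is assembled — a handful of invented input/output pairs followed by the new input the model should complete:

```python
examples = [
    ("The movie was a masterpiece.", "positive"),
    ("I want my money back.", "negative"),
    ("It was fine, nothing special.", "neutral"),
]

prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"

# The final entry is left open for the model to fill in.
prompt += "Review: The acting was wooden and the plot made no sense.\nSentiment:"
```

The model infers the input-output pattern from the three completed examples and continues it, with no change to its weights.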

Fine-Tuning

LLMs

The process of further training a pre-trained AI model on a specific, smaller dataset to adapt it for a particular task or domain. Fine-tuning allows organizations to customize general-purpose models for specialized applications like medical diagnosis, legal analysis, or customer service.

Foundation Model

LLMs

A large AI model trained on broad data at scale that can be adapted to a wide range of downstream tasks. Examples include GPT-4, Claude, Llama, and Gemini. Foundation models serve as the base upon which specialized applications are built through fine-tuning or prompting.

G
5 terms

Gemini

Tools

Google DeepMind's family of multimodal AI models capable of processing and generating text, images, audio, and video. Gemini represents Google's flagship AI offering and is integrated across Google products including Search, Workspace, and Android.

Generative AI

Foundations

AI systems capable of creating new content — text, images, audio, video, code, or other media — based on patterns learned from training data. Generative AI represents a shift from AI that classifies or predicts to AI that creates, and includes technologies like ChatGPT, Midjourney, and Suno.

GPT (Generative Pre-trained Transformer)

LLMs

A family of large language models developed by OpenAI. GPT models are trained using unsupervised learning on vast text corpora, then fine-tuned for specific tasks. The architecture is based on the Transformer, using self-attention mechanisms to generate coherent, contextually relevant text.

Grounding

Prompting

The technique of connecting AI model outputs to verified, factual sources of information. Grounding helps reduce hallucinations by ensuring the model's responses are anchored in real data, documents, or databases rather than relying solely on patterns learned during training.

Guardrails

Ethics

Safety mechanisms and constraints built into AI systems to prevent harmful, biased, or undesirable outputs. Guardrails can include content filters, output validation rules, topic restrictions, and behavioral guidelines that keep AI systems operating within acceptable boundaries.

H
1 term

Hugging Face

Tools

An open-source platform and community for machine learning, hosting thousands of pre-trained models, datasets, and tools. Hugging Face has become the de facto hub for sharing and discovering AI models, particularly in natural language processing.

I
2 terms

In-Context Learning

LLMs

The ability of large language models to learn and adapt their behavior based on examples or instructions provided within the prompt, without any changes to the model's weights. This emergent capability allows LLMs to perform new tasks simply by being shown what to do in the conversation context.

Inference

Foundations

The process of using a trained AI model to make predictions or generate outputs on new, unseen data. Inference is the 'production' phase of AI — after a model is trained, inference is how it's actually used in applications. Inference speed and cost are key considerations for deployment.

J
1 term

Jailbreaking

Ethics

Techniques used to bypass the safety guardrails and content restrictions of AI models, causing them to produce outputs they were designed to refuse. Jailbreaking highlights the ongoing challenge of making AI systems robust against adversarial manipulation.

K
1 term

Knowledge Cutoff

LLMs

The date after which a language model has no training data, meaning it lacks awareness of events, developments, or information that occurred after that date. Understanding a model's knowledge cutoff is important for evaluating the currency and reliability of its outputs.

L
3 terms

LangChain

Tools

An open-source framework for building applications powered by language models. LangChain provides tools for chaining together multiple LLM calls, integrating external data sources, managing memory, and building AI agents that can use tools and make decisions.

Large Language Model (LLM)

LLMs

An AI model trained on massive amounts of text data that can understand, generate, and manipulate human language. LLMs like GPT-4, Claude, and Llama use billions of parameters to capture patterns in language, enabling them to perform tasks ranging from conversation to code generation to analysis.

LoRA (Low-Rank Adaptation)

LLMs

An efficient fine-tuning technique that adapts large language models by training only a small number of additional parameters rather than modifying the entire model. LoRA dramatically reduces the computational cost and memory requirements of fine-tuning, making model customization more accessible.
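A back-of-the-envelope sketch of why this is cheap: instead of updating a full d_out × d_in weight matrix W, LoRA trains two small matrices B (d_out × r) and A (r × d_in) and uses W + BA as the adapted weight. The dimensions below are arbitrary stand-ins:

```python
import numpy as np

d_out, d_in, r = 512, 512, 8

full_params = d_out * d_in           # parameters ordinary fine-tuning would train
lora_params = d_out * r + r * d_in   # parameters LoRA trains instead

# The adapted forward pass, with random stand-in values:
rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))   # frozen pre-trained weights
B = np.zeros((d_out, r))             # B starts at zero, so training begins from W unchanged
A = rng.normal(size=(r, d_in))
x = rng.normal(size=d_in)
y = (W + B @ A) @ x
```

Here LoRA trains 8,192 parameters instead of 262,144 — a 32× reduction — and the gap grows with the rank-to-dimension ratio on real model sizes.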

M
4 terms

Machine Learning

Foundations

A subset of artificial intelligence where systems learn patterns from data and improve their performance on tasks without being explicitly programmed. The three main types are supervised learning (labeled data), unsupervised learning (unlabeled data), and reinforcement learning (reward-based).

Midjourney

Tools

An AI image generation platform that creates images from text descriptions (prompts). Known for producing highly artistic and aesthetically refined outputs, Midjourney operates primarily through a Discord bot interface and has become one of the most popular tools for AI-generated art.

Mixture of Experts (MoE)

LLMs

A neural network architecture where multiple specialized sub-networks (experts) are combined, with a gating mechanism that routes each input to the most relevant experts. MoE allows models to be very large in total parameters while only activating a fraction for each input, improving efficiency.
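The routing idea can be sketched with toy numbers: a gate scores every expert for a given input, and only the top-k experts actually run. All weights here are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d, top_k = 8, 16, 2

experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # one weight matrix per expert
gate_w = rng.normal(size=(n_experts, d))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = rng.normal(size=d)
gate_scores = softmax(gate_w @ x)
chosen = np.argsort(gate_scores)[-top_k:]                   # indices of the top-2 experts
weights = gate_scores[chosen] / gate_scores[chosen].sum()   # renormalize over chosen experts

# The output blends only the selected experts; the other 6 never run for this input.
y = sum(w * (experts[i] @ x) for w, i in zip(weights, chosen))
```

Only 2 of 8 expert matrices are multiplied per input, which is the efficiency gain the definition describes: total capacity scales with all experts, compute scales with the active few.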

Multimodal AI

Emerging

AI systems that can process, understand, and generate multiple types of data — such as text, images, audio, and video — within a single model. GPT-4o and Gemini are examples of multimodal models that can seamlessly work across different data modalities.

N
3 terms

Natural Language Processing (NLP)

Foundations

The field of AI focused on enabling computers to understand, interpret, and generate human language. NLP encompasses tasks like sentiment analysis, translation, summarization, question answering, and text generation. Modern NLP is dominated by Transformer-based language models.

Neural Network

Foundations

A computing system inspired by the biological neural networks in the human brain. It consists of interconnected nodes (neurons) organized in layers that process information. Neural networks are the foundation of deep learning and power most modern AI systems.

NIST AI Risk Management Framework

Ethics

A voluntary framework published by the U.S. National Institute of Standards and Technology that provides guidance for managing risks associated with AI systems. It covers governance, risk mapping, measurement, and management across the AI lifecycle, and is widely referenced in AI policy discussions.

O
3 terms

Open Source AI

Tools

AI models and tools whose source code, weights, and/or training data are made publicly available for anyone to use, modify, and distribute. Open source AI promotes transparency, collaboration, and accessibility. Notable examples include Meta's Llama, Stability AI's Stable Diffusion, and Mistral's models.

OpenAI

Tools

An AI research and deployment company founded in 2015, known for developing the GPT series of language models, DALL-E image generation, Whisper speech recognition, and the ChatGPT conversational AI product. OpenAI has been instrumental in bringing large language models to mainstream adoption.

Overfitting

Foundations

A problem in machine learning where a model learns the training data too well — including its noise and outliers — resulting in poor performance on new, unseen data. Overfitting means the model has memorized rather than generalized, and is addressed through techniques like regularization, dropout, and cross-validation.

P
3 terms

Parameter

Foundations

A variable within a machine learning model that is learned from training data. In neural networks, parameters are the weights and biases that the model adjusts during training. The number of parameters is often used as a rough measure of model size — GPT-4 is estimated to have over 1 trillion parameters.

Perplexity

LLMs

A metric used to evaluate language models, measuring how well the model predicts a sample of text. Lower perplexity indicates better prediction. Also the name of an AI-powered search engine that provides cited, conversational answers to questions.
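Concretely, perplexity is the exponential of the average negative log-likelihood per token. The probabilities below are made-up values a model might assign to each actual next token:

```python
import math

token_probs = [0.5, 0.25, 0.8, 0.1]   # P(actual next token) at each position

nll = [-math.log(p) for p in token_probs]          # negative log-likelihood per token
perplexity = math.exp(sum(nll) / len(nll))         # exp of the average NLL
```

A model that assigned probability 1.0 everywhere would score the minimum, 1.0; here the model is, on average, about as uncertain as if it were choosing uniformly among ~3.2 tokens at each step.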

Prompt Engineering

Prompting

The practice of crafting effective inputs (prompts) to guide AI models toward producing desired outputs. Prompt engineering encompasses techniques like few-shot examples, chain-of-thought reasoning, role assignment, and structured formatting to maximize the quality and relevance of AI responses.

R
4 terms

RAG (Retrieval-Augmented Generation)

LLMs

A technique that enhances language model outputs by first retrieving relevant information from external knowledge sources, then using that information to generate more accurate and grounded responses. RAG reduces hallucinations and allows models to access up-to-date or proprietary information.
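The retrieve-then-generate flow can be sketched end to end. A toy word-overlap retriever stands in for real embedding similarity, and the document snippets and query are invented for illustration:

```python
documents = [
    "The company's refund window is 30 days from delivery.",
    "Support is available Monday through Friday, 9am to 5pm.",
    "Shipping is free on orders over $50.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Score each document by word overlap with the query (a crude stand-in
    # for vector similarity) and return the top-k matches.
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

query = "How many days do I have for a refund"
context = retrieve(query, documents)

# The retrieved text is prepended to the prompt so the model answers from it.
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {query}"
```

Because the answer is supplied in the prompt rather than recalled from training data, the model can cite current or proprietary facts and has less room to hallucinate.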

Reasoning (AI)

Emerging

The ability of AI models to perform logical deduction, multi-step problem solving, and complex analysis. Recent models like OpenAI's o1 and o3 series demonstrate improved reasoning capabilities through techniques like chain-of-thought processing and extended 'thinking' time before responding.

Reinforcement Learning from Human Feedback (RLHF)

LLMs

A training technique where human evaluators rank model outputs by quality, and this feedback is used to train a reward model that guides further optimization. RLHF is a key technique used to align language models with human preferences and make them more helpful, harmless, and honest.

Responsible AI

Ethics

An approach to developing and deploying AI systems that prioritizes ethical considerations, fairness, transparency, accountability, and societal benefit. Responsible AI frameworks guide organizations in building AI that respects human rights, minimizes harm, and operates within legal and ethical boundaries.

S
3 terms

Stable Diffusion

Tools

An open-source AI image generation model that creates images from text descriptions using a latent diffusion architecture. Its open-source nature has enabled a large ecosystem of tools, extensions, and fine-tuned variants for specialized image generation tasks.

Synthetic Data

Foundations

Artificially generated data that mimics the statistical properties of real-world data. Synthetic data is used to train AI models when real data is scarce, sensitive, or expensive to collect. It can help address privacy concerns and data imbalance issues.

System Prompt

Prompting

Instructions provided to an AI model that define its behavior, personality, capabilities, and constraints for a given interaction. System prompts are set by developers and are typically not visible to end users. They are a fundamental tool for customizing AI assistant behavior.

T
5 terms

Temperature

Prompting

A parameter that controls the randomness of an AI model's outputs. Lower temperature (e.g., 0.1) produces more deterministic, focused responses, while higher temperature (e.g., 1.0) produces more creative, varied outputs. Adjusting temperature is a key technique in prompt engineering.
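Mechanically, the model's raw scores (logits) are divided by the temperature before the softmax, so low values sharpen the distribution and high values flatten it. A sketch with made-up logits for three candidate tokens:

```python
import numpy as np

logits = np.array([2.0, 1.0, 0.5])   # raw scores for three candidate tokens

def softmax_with_temperature(logits, temperature):
    z = logits / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

cold = softmax_with_temperature(logits, 0.1)   # nearly deterministic: top token dominates
hot = softmax_with_temperature(logits, 1.0)    # probability spread more evenly
```

At temperature 0.1 the top token takes essentially all the probability mass; at 1.0 the alternatives remain live options, which is what makes sampled outputs more varied.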

Token

LLMs

The basic unit of text that language models process. A token can be a word, part of a word, or a punctuation mark. For English text, one token is roughly 3/4 of a word. Tokenization — breaking text into tokens — is the first step in how LLMs process language. Model pricing and context windows are measured in tokens.
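The rough ratios above can be turned into a back-of-the-envelope estimator — this is only the common "about 4 characters per English token" heuristic, not a real tokenizer:

```python
def estimate_tokens(text: str) -> int:
    # Rule of thumb: ~4 characters of English text per token
    # (consistent with a token being roughly 3/4 of a word).
    return max(1, round(len(text) / 4))

sentence = "Large language models process text as tokens."
words = len(sentence.split())
approx_tokens = estimate_tokens(sentence)
```

Real tokenizers split on learned subword units, so actual counts vary by model; the heuristic is mainly useful for ballpark cost and context-window estimates.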

Transfer Learning

Foundations

A machine learning technique where a model trained on one task is repurposed for a different but related task. Transfer learning is the principle behind foundation models — a model pre-trained on general text can be adapted for specific tasks like sentiment analysis, translation, or code generation.

Transformer

LLMs

A neural network architecture introduced in the 2017 paper 'Attention Is All You Need' that revolutionized natural language processing. Transformers use self-attention mechanisms to process all parts of an input simultaneously (rather than sequentially), enabling much faster training and better performance. Nearly all modern LLMs are based on the Transformer architecture.

Turing Test

Foundations

A test proposed by Alan Turing in 1950 to evaluate a machine's ability to exhibit intelligent behavior indistinguishable from a human. If a human evaluator cannot reliably distinguish between the machine and a human in conversation, the machine is said to have passed the test.

V
1 term

Vector Database

Tools

A specialized database designed to store, index, and query high-dimensional vector embeddings efficiently. Vector databases are essential for RAG systems, semantic search, and recommendation engines, enabling fast similarity searches across millions of embedded documents or data points.
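The core query a vector database answers — "which stored vectors are most similar to this one?" — can be shown as a brute-force search. Production systems use approximate indexes (e.g. HNSW) to answer the same question over millions of vectors quickly; the data here is random:

```python
import numpy as np

rng = np.random.default_rng(1)
stored = rng.normal(size=(1000, 64))                      # 1,000 stored embeddings
stored /= np.linalg.norm(stored, axis=1, keepdims=True)   # normalize once at insert time

query = stored[42] + 0.01 * rng.normal(size=64)           # a vector close to item 42
query /= np.linalg.norm(query)

scores = stored @ query                 # cosine similarity against every stored item at once
top3 = np.argsort(scores)[::-1][:3]     # indices of the 3 nearest neighbors
```

Because the stored vectors are pre-normalized, a single matrix-vector product yields all cosine similarities; the index structures in real vector databases exist to avoid even that linear scan at scale.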

Z
1 term

Zero-Shot Learning

Prompting

The ability of an AI model to perform a task it has never been explicitly trained on, without any examples. In the context of LLMs, zero-shot prompting means giving the model a task description without any examples and relying on its general knowledge to produce the correct output.