Artificial General Intelligence (AGI): AI that mimics human reasoning across any domain. Also known as “general AI” or “strong AI,” it is a key concept for understanding the future of AI.
AI Agent: A self-governing program that performs tasks and learns over time. Think of it as an autonomous decision-making bot or intelligent agent for automation.
AI Assistant: A conversational AI tool that helps with scheduling, research, or customer support. Also called virtual assistants or digital helpers, perfect for boosting productivity with AI-driven assistance.
AI Ethics: Guidelines ensuring AI is fair, transparent, and respects privacy. Covers topics like bias mitigation, responsible AI development, and ethical machine learning practices.
AI Plugin: An add-on integrating AI features like language understanding into existing software. Also referred to as AI extensions or AI integrations for platforms like Slack or WordPress.
API (Application Programming Interface): A set of protocols that enable different software systems to communicate. Key for AI developers using RESTful APIs, GraphQL, or machine learning APIs.
Accountability: Ensuring AI creators and systems are responsible for outcomes. Includes accountability frameworks, auditing AI models, and governance policies.
Activation Function: A core mathematical rule in neural networks that transforms inputs into outputs. Popular functions include ReLU, sigmoid, and tanh for deep learning activation.
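A minimal NumPy sketch of these three functions, evaluated on a toy input (the array values are illustrative only):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)        # zero out negatives, pass positives through

def sigmoid(x):
    return 1 / (1 + np.exp(-x))    # squash inputs into (0, 1)

def tanh(x):
    return np.tanh(x)              # squash inputs into (-1, 1)

x = np.array([-2.0, 0.0, 2.0])
print(relu(x), sigmoid(x), tanh(x))
```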
Anthropic: A company building reliable, safety-focused AI models. Known for Claude, its conversational AI assistant, and for its roots in ethical AI research.
Artificial Intelligence (AI): Technology that enables machines to perform human-like tasks such as learning and decision-making. Also called machine intelligence or AI technology for applications in healthcare, finance, and more.
Bias: When AI produces skewed or unfair results due to imbalanced data. Also known as algorithmic bias or data bias, addressed by fairness and bias detection methods.
ChatGPT: OpenAI’s large language model for conversational text generation. Used for AI chatbots, content creation, and prompting with GPT-3.5 or GPT-4.
Chatbot: An AI-driven program that simulates human conversation via text or voice. Also referred to as conversational agents or virtual chat assistants.
Claude: Anthropic’s safety-first AI assistant designed for natural dialogue. Used in customer service bots, content summarization, and AI research collaborations.
Closed-Source Model: An AI model whose code and training data are proprietary and not publicly shared. Also called proprietary AI or commercial AI models.
Cloud AI: AI services hosted on remote servers and accessed over the internet. Includes cloud machine learning platforms like AWS SageMaker, Google AI Platform, and Azure ML.
Context Window: The span of text an AI model can consider when generating responses. Also called attention window or token window size in transformer models.
Dataset: A curated collection of data (text, images, or audio) used to train AI models. Examples include ImageNet, COCO, and custom training datasets.
Deep Learning: A subset of machine learning using multi-layered neural networks to learn from large datasets. Includes techniques like convolutional neural networks and recurrent neural networks.
Edge AI: AI that runs locally on devices like smartphones or IoT hardware for faster responses. Also known as on-device AI or edge machine learning.
Epoch: One full iteration through the entire training dataset during model training. Epochs help measure and control overfitting and underfitting in deep learning.
Explainability: The ability to interpret and understand AI model decisions. Techniques include SHAP values, LIME, and model interpretability tools.
Fairness: Ensuring AI treats all users and groups equally without discrimination. Fairness metrics include demographic parity, equal opportunity, and fairness-aware learning.
Feature: An individual measurable property or characteristic used as input for AI models. Examples include pixel values in images or word embeddings in text.
Few-Shot Learning: Training AI to perform tasks with only a handful of examples. Also called low-shot learning, useful for rapid prototyping and new domain adaptation.
Fine-Tuning: Refining a pre-trained AI model on specific data to boost performance. Common in transfer learning workflows for NLP and computer vision.
GPU (Graphics Processing Unit): A specialized processor for parallel operations that accelerates AI training. Widely used in deep learning for faster matrix computations.
Gemini: Google’s next-generation AI model family for multimodal understanding. Handles text, images, and code in a unified transformer-based architecture.
Google DeepMind: A research lab under Alphabet pioneering AI breakthroughs like AlphaGo and AlphaFold. Focuses on reinforcement learning, neuroscience-inspired AI, and ethical AI research.
Hallucination: When AI confidently generates incorrect or fabricated information. Common in large language models without retrieval-augmented safeguards.
Inference: Using a trained AI model to make predictions on new data. Inference workflows include batch inference, real-time inference, and edge deployment.
Intent Recognition: AI’s capability to understand user goals from natural language inputs. Key for chatbots, virtual assistants, and voice AI commands.
Jaccard Index: A metric that compares the similarity between two sets by dividing the size of their intersection by the size of their union. Commonly used in clustering evaluation and search retrieval tasks.
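In code the metric is a one-liner over Python sets; the example sets below are illustrative:

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 1.0  # convention: two empty sets are identical
    return len(a & b) / len(a | b)

print(jaccard({"ai", "ml", "nlp"}, {"ml", "nlp", "cv"}))  # 2 / 4 = 0.5
```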
Joint Embedding: A representation technique that projects multimodal data (e.g., images and text) into a shared vector space to enable cross-modal similarity searches.
Jupyter Notebook: An interactive web application for creating and sharing documents containing live code, equations, visualizations, and narrative text. Ideal for data exploration and model prototyping.
JIT (Just-In-Time) Compilation: A performance optimization that compiles code to optimized machine instructions at runtime, used in frameworks like PyTorch’s TorchScript to accelerate inference.
Keras: A high-level neural networks API written in Python that runs on top of TensorFlow, enabling fast experimentation with deep learning models.
K-Means Clustering: An unsupervised learning algorithm that partitions data into k clusters by minimizing within-cluster variance, often used for customer segmentation and anomaly detection.
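A short scikit-learn sketch on six toy 2-D points forming two obvious blobs (the data and `random_state` are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X)
print(kmeans.labels_)           # cluster assignment for each point
print(kmeans.cluster_centers_)  # the two learned centroids
```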
Kernel Trick: A technique in support vector machines and other algorithms that implicitly maps input data into higher-dimensional feature spaces to allow non-linear decision boundaries.
Knowledge Graph: A structured representation of entities and their relationships, powering semantic search, recommendation engines, and question-answering systems.
Kullback–Leibler Divergence: A measure of how one probability distribution diverges from a reference distribution, central to variational inference and generative modeling.
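A minimal NumPy sketch of the formula D_KL(P ∥ Q) = Σ p_i log(p_i / q_i), assuming both toy distributions are strictly positive:

```python
import numpy as np

def kl_divergence(p, q):
    # D_KL(P || Q) = sum_i p_i * log(p_i / q_i), measured in nats.
    # Assumes p and q are proper distributions with no zero entries.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = [0.5, 0.3, 0.2]          # "true" distribution P
q = [0.4, 0.4, 0.2]          # approximating distribution Q
print(kl_divergence(p, q))   # small positive value; 0 only when P == Q
```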
Label: A tag or annotation that specifies the target output for training data. Used in supervised learning for classification and regression tasks.
Large Language Model (LLM): AI trained on massive text corpora to understand and generate human-like language. Includes GPT, BERT, and T5 architectures for NLP applications.
Latency: The time delay between sending a request and receiving an AI response. Low latency is critical for real-time AI services and voice assistants.
Machine Learning: A field of AI where algorithms improve performance by learning from data. Includes supervised, unsupervised, and reinforcement learning approaches.
Model Deployment: The process of putting a trained AI model into production for end users. Deployment options include cloud APIs, containerized services, and edge devices.
Model Weights: The learned numerical parameters in a neural network determining its predictions. Weights are saved during training and loaded for inference.
Neural Network: A computing system inspired by the human brain’s interconnected neurons. Includes layers of perceptrons that transform inputs into outputs.
Noise: Irrelevant or random data that can obscure true patterns in datasets. Noise reduction techniques include data cleaning and regularization.
Open-Source Model: An AI model whose architecture and code are publicly available. Encourages community contributions and transparency in AI research.
OpenAI: A research organization developing cutting-edge AI like GPT-3, ChatGPT, and DALL·E. Focuses on safe, scalable AI and open research collaborations.
Overfitting: When a model learns training data too closely and fails on new inputs. Mitigated by techniques like cross-validation, regularization, and dropout.
Perplexity: A metric for how well a language model predicts a sample of text. Lower perplexity indicates better language modeling performance.
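A sketch of the standard computation, perplexity = exp(mean negative log-likelihood per token); the token probabilities below are made up for illustration:

```python
import numpy as np

# Probabilities the model assigned to each actual next token in a sample.
token_probs = np.array([0.25, 0.10, 0.50, 0.05])   # toy values

nll = -np.log(token_probs).mean()   # average negative log-likelihood
perplexity = np.exp(nll)
print(perplexity)                   # lower is better; 1.0 = perfect prediction
```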
Pre-Training: The initial training phase on a broad dataset before specialized fine-tuning. Sets a strong foundation for downstream NLP or vision tasks.
Prompt: Input text or instructions guiding a language model’s response. Effective prompts lead to accurate and relevant AI-generated content.
Prompt Engineering: Crafting and refining prompts to optimize AI outputs. Involves techniques like few-shot examples and context setting.
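A hypothetical few-shot prompt: two labeled examples establish the pattern the model is expected to continue. The task and reviews are invented for illustration:

```python
prompt = """Classify the sentiment of each review as positive or negative.

Review: "The battery lasts all day." Sentiment: positive
Review: "It broke after one week." Sentiment: negative
Review: "Setup was effortless and fast." Sentiment:"""

print(prompt)  # sent as-is to a language model, which completes the pattern
```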
Q-Learning: A model-free reinforcement learning algorithm that learns the expected utility of state-action pairs to maximize cumulative reward.
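A single tabular Q-learning update, sketched in NumPy; the transition, learning rate, and discount factor are toy values:

```python
import numpy as np

# Q[s, a] moves toward the observed reward plus the discounted
# best value of the next state.
n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.9                # learning rate and discount factor

s, a, reward, s_next = 0, 1, 1.0, 2    # one observed transition
td_target = reward + gamma * Q[s_next].max()
Q[s, a] += alpha * (td_target - Q[s, a])
print(Q[s, a])                         # 0.1 after this first update
```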
Quantile Regression: A statistical technique that estimates conditional quantiles (e.g., the median) of a response variable, useful for understanding distributional effects in predictive modeling.
Query Embedding: Converting search queries into dense vector representations that capture semantic meaning, enabling neural search engines to retrieve more relevant documents.
Quantum Machine Learning: An emerging discipline combining quantum computing with machine learning algorithms to potentially accelerate training and inference tasks.
Retrieval-Augmented Generation (RAG): Combines AI text generation with real-time data retrieval for accuracy. Ideal for building chatbots that cite up-to-date information.
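A runnable toy sketch of the retrieve-then-generate flow. Here `retrieve` and `generate` are stub stand-ins (a word-overlap ranker and a placeholder string) for a real vector store and a real LLM API:

```python
DOCS = [
    "Claude is a conversational AI assistant built by Anthropic.",
    "GPT-4 is a large language model built by OpenAI.",
]

def retrieve(question, top_k=1):
    # Stub retrieval: rank documents by word overlap with the question.
    words = set(question.lower().split())
    score = lambda d: len(words & set(d.lower().split()))
    return sorted(DOCS, key=score, reverse=True)[:top_k]

def generate(prompt):
    # Stub generation: a real system would call a language model here.
    return f"[model answer grounded in prompt]\n{prompt}"

def answer(question):
    context = "\n".join(retrieve(question))
    return generate(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("Who built Claude"))
```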
Reinforcement Learning: An AI training method using rewards and penalties to shape behavior. Used in game AI, robotics, and recommendation systems.
Semantic Search: Search that understands user intent and context, not just keywords. Powerful in enterprise search engines and AI-driven discovery.
Supervised Learning: Training models on labeled data to map inputs to known outputs. Includes algorithms like regression, decision trees, and neural nets.
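A minimal scikit-learn example of the fit-then-predict workflow; the hours-studied dataset is invented for illustration:

```python
from sklearn.linear_model import LogisticRegression

# Labeled data: hours studied -> pass (1) / fail (0).
X = [[1], [2], [3], [8], [9], [10]]
y = [0, 0, 0, 1, 1, 1]

model = LogisticRegression().fit(X, y)   # learn the input -> label mapping
print(model.predict([[2], [9]]))         # -> [0 1] on unseen inputs
```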
Text Classification: Automatically sorting text into predefined categories. Common in sentiment analysis, spam detection, and topic labeling.
Text Generation: Creating new text based on input prompts using AI. Used for content creation, storytelling, and code writing.
Token: A unit of text, such as a word or subword, processed by language models. Tokenization methods impact model performance and vocabulary size.
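A toy greedy longest-match tokenizer, a simplified sketch of what BPE/WordPiece-style subword tokenizers do; the vocabulary is invented:

```python
VOCAB = ["token", "iza", "tion", "un", "related"]   # toy subword vocabulary

def tokenize(word):
    # Greedily match the longest vocabulary piece at each position.
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])   # fall back to a single character
            i += 1
    return tokens

print(tokenize("tokenization"))   # ['token', 'iza', 'tion']
```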
Training Data: The dataset used to teach AI models patterns and relationships. High-quality training data leads to more reliable AI systems.
Transformer: A neural network architecture excelling at sequence tasks like translation. Foundation for models such as BERT, GPT, and T5.
Transparency: Openness in AI model design, data sources, and decision processes. Vital for trust, regulatory compliance, and user understanding.
Tuning: Adjusting hyperparameters or model weights to enhance AI performance. Techniques include grid search, Bayesian optimization, and manual tuning.
Underfitting: When a model is too simple and cannot capture data complexity. Detected by high training and validation errors.
Unsupervised Learning: Algorithms that discover patterns in data without labels. Includes clustering, dimensionality reduction, and anomaly detection.
Vector Embeddings: Numerical representations of words or items capturing semantic meaning. Used in recommendation systems, similarity search, and NLP.
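A NumPy sketch comparing toy 3-D embeddings with cosine similarity (real embeddings have hundreds or thousands of dimensions, and the vectors here are made up):

```python
import numpy as np

cat    = np.array([0.9, 0.1, 0.0])
kitten = np.array([0.8, 0.2, 0.1])
car    = np.array([0.0, 0.1, 0.9])

def cosine(a, b):
    # Cosine similarity: dot product of the normalized vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(cat, kitten))   # high: semantically close
print(cosine(cat, car))      # low: semantically distant
```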
Vision AI: AI systems focused on analyzing and interpreting images and videos. Applications include object detection, facial recognition, and medical imaging.
Voice AI: Technology that processes and generates human speech. Enables voice assistants, speech-to-text, and text-to-speech services.
Weight Decay: A regularization technique, also known as L2 regularization, that adds a penalty proportional to the squared magnitude of the model weights to the loss function to prevent overfitting.
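A NumPy sketch of an L2-regularized loss: the data term plus `lam` times the sum of squared weights; all values are illustrative:

```python
import numpy as np

def loss(w, X, y, lam=0.01):
    pred = X @ w
    mse = np.mean((pred - y) ** 2)    # fit-the-data term
    penalty = lam * np.sum(w ** 2)    # weight-decay (L2) penalty
    return mse + penalty

w = np.array([0.5, -1.2])
X = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([1.0, 2.0])
print(loss(w, X, y))
```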
Whisper: An automatic speech recognition (ASR) system by OpenAI that transcribes audio into text with high accuracy across multiple languages.
XAI (Explainable AI): A suite of methods and tools (e.g., LIME, SHAP) designed to make AI system decisions transparent and interpretable to human stakeholders.
XGBoost: An optimized gradient boosting library for decision-tree ensembles, renowned for its speed and performance in structured data competitions.
XLNet: A generalized autoregressive pretraining method for language understanding that outperforms BERT on various NLP benchmarks.
YAML: A human-friendly data-serialization format frequently used for configuration files in AI projects due to its readability and support for complex structures.
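A minimal example, assuming the PyYAML package is installed: a hypothetical training config parsed into a Python dict with `yaml.safe_load`:

```python
import yaml   # pip install pyyaml

config_text = """
model:
  name: demo-classifier    # hypothetical model name
  layers: [128, 64]
training:
  epochs: 10
  learning_rate: 0.001
"""

config = yaml.safe_load(config_text)
print(config["training"]["learning_rate"])   # 0.001
```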
Yottabyte: A unit of digital information equal to 10^24 bytes (one septillion bytes), reflecting the massive data scales encountered in cloud and AI infrastructures.
Zero-Shot Learning: AI’s ability to perform tasks with no prior examples by generalizing knowledge. Useful for rapid deployment in new domains and unseen tasks.