Glossary

AI Bombing

Also known as model manipulation, model poisoning, AI data poisoning

A coordinated effort to manipulate AI models by intentionally seeding specific information during interactions or training.

Related Writing

Artificial General Intelligence (AGI)

Also known as strong AI, full AI

A hypothetical type of intelligent agent that can understand, learn, and apply knowledge across a wide range of tasks at a human-like level

Related Writing

Artificial Intelligence (AI)

Also known as machine intelligence, artificial intelligence

A branch of computer science focused on creating intelligent machines that can simulate human-like cognitive functions

Related Writing

Audio Augmented Reality (AAR)

Also known as audio AR, contextual audio

A technology that uses audio cues, spatial sound, and contextual information to provide relevant information to users through headphones or smart audio devices.

Related Writing

The Rise of Audio AR

Augmented Reality (AR)

Also known as mixed reality, enhanced reality

A technology that overlays digital information onto the real world, enhancing the user's perception of their environment through digital elements.

Related Writing

The Rise of Audio AR

Authentic Data

Data that is collected from real-world events, interactions, or observations. Authentic data is *not* artificially generated or manipulated, by LLMs or other automated mechanisms.

Related Writing

Automation Era

The emerging phase of the internet where platforms use AI and intelligent systems to manage content and connections, focusing on saving users' time and attention.

Related Writing

Transitioning from the Attention Era to the Automation Era

Chain-of-Thought Training

Also known as reasoning training, step-by-step reasoning

A training approach where AI models are taught to break down complex problems into sequential, logical steps, mimicking human reasoning processes.

Related Writing

On Test-Time Compute: The New Game in Town

Common Crawl

Also known as C4

A large, open-source web crawling and data compilation project that provides web-scraped data used in training machine learning models

Related Writing

Copilots (AI)

Also known as AI assistants, AI coding assistants

AI-powered tools that observe user input and a project's current state to generate helpful suggestions, most often used by developers writing code.

Related Writing

Data Distillation

The process of condensing and refining large datasets to extract the most valuable and relevant information while reducing overall data volume.

Related Writing

Data Masking

Also known as anonymization

A technique used to protect sensitive information by obscuring or generalizing specific details

Related Writing

On Synthetic Data: How It's Improving & Shaping LLMs

Deep Learning (DL)

Also known as neural network learning

A subset of machine learning that uses neural networks with multiple layers to progressively extract higher-level features from raw input.

Related Writing

Déjà connu

A feeling of recognizing something as already discovered or invented just as you're about to create it, particularly in the context of AI-generated suggestions

Related Writing

Déjà Connu: AI Copilots Reveal Well-Traveled Paths

Edge Computing

Also known as local computing

A distributed computing paradigm that brings computation and data storage closer to the location where it is needed, improving response times and saving bandwidth.

Related Writing

Embeddings

Also known as semantic vectors, vector representations

Numerical representations of data (like text or images) in a high-dimensional space where similar items are closer together.

Related Writing

Foundation Models

Also known as base AI models

Large-scale AI models trained on broad datasets that can be adapted or fine-tuned for specific tasks or domains.

Related Writing

General Data Protection Regulation (GDPR)

A comprehensive data protection law in the European Union that regulates the processing of personal data and protects individual privacy rights.

Related Writing

My CCPA Dialog With OpenAI

Generated Data

Data that is artificially created, often by AI models, and used to augment authentic datasets or simulate real-world scenarios.

Related Writing

Graphics Processing Unit (GPU)

Also known as Graphics Card

A specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate computer graphics and machine learning computations.

Related Writing

Hallucination

Also known as AI fabrication, model confabulation

An error in AI language models where the system generates confident but factually incorrect or entirely fabricated information.

Related Writing

Honeypot

Also known as Decoy System, Trap Website

A computer system or website intentionally designed to attract cyberattacks or, in this context, to be scraped for training data with specific manipulative intentions.

Related Writing

Manipulating AI: Prompt Injection, Trap Tokens, & Post-Training Suggestion

Human-in-the-Loop

Also known as human oversight, human-AI collaboration

A model of interaction where human judgment and decision-making are integrated into an automated process to provide oversight, validation, or correction.

Related Writing

Practical AI For App Makers

Hyped AI

Artificial intelligence technologies or concepts that are surrounded by excessive excitement, publicity, or exaggerated claims

Related Writing

Large Language Model (LLM)

Also known as transformer model, GenAI, generative AI, language model

AI models trained on massive text datasets that can generate human-like text by predicting statistically likely word sequences based on contextual relationships.

Related Writing

Machine Learning (ML)

A branch of artificial intelligence and computer science that uses data and algorithms to imitate the way that humans learn, gradually improving its accuracy.

Related Writing

Multimodal LLM

An LLM capable of processing and understanding multiple types of input, such as text, images, and potentially other media types.

Related Writing

Generating Descriptive Weather Reports with LLMs

Neural Processing Unit (NPU)

Also known as AI accelerator

A specialized processor designed to accelerate machine learning and artificial intelligence computations more efficiently than traditional CPUs or GPUs.

Related Writing

Personally Identifiable Information (PII)

Also known as personal data, identifying information

Data that can be used to identify a specific individual, such as email addresses, phone numbers, or home addresses

Related Writing

Post-Training Suggestion

Also known as Latent model injection

A manipulation technique involving intentionally planting specific input in training datasets to enable later, post-training manipulation of AI models.

Related Writing

Manipulating AI: Prompt Injection, Trap Tokens, & Post-Training Suggestion

Prompt Engineering

Also known as AI prompting, instruction design

The practice of designing and refining input instructions to AI systems to achieve more accurate, relevant, and desired outputs.

Related Writing

Prompt Injection

Also known as AI input attack

A security vulnerability where malicious input can manipulate an AI system's intended behavior by altering its original instructions.

Related Writing

Prompt Optimization

Also known as automated prompt engineering, prompt refinement

The process of automatically improving prompts to enhance the performance of language models by using techniques like metric-based evaluation and iterative prompt generation.

Related Writing

Pipelines & Prompt Optimization with DSPy

Reasoning Models

AI models designed to break down complex problems into step-by-step logical processes, to arrive at more accurate answers and advanced capabilities.

Related Writing

On Synthetic Data: How It's Improving & Shaping LLMs

Reciprocal Data Application (RDA)

A type of application or service that improves itself by learning from user interactions, creating a network effect where more usage leads to better performance.

Related Writing

Reinforcement Learning from Human Feedback (RLHF)

Also known as alignment training, human-guided AI training

A training technique where human contractors provide feedback to improve AI model outputs, correcting problematic responses and guiding the model's behavior.

Related Writing

Retrieval Augmented Generation (RAG)

An AI technique that enhances language model responses by retrieving relevant information from a knowledge base, often a vector database, before generating an answer.

Related Writing

Sober AI is the Norm

Section 230

Also known as safe harbor law

A US safe harbor law that provides immunity to online platforms for content posted by users.

Related Writing

The Platypus In The Room

Sober AI

A pragmatic approach to artificial intelligence that focuses on practical, incremental improvements and real-world, often mundane, applications.

Related Writing

Sober AI is the Norm

Synthetic Data

Artificially generated data created using authentic seed data to establish dataset qualities and guide data creation, as opposed to data captured from real-world events.

Related Writing

Teacher Model

A larger, more advanced AI model used to generate synthetic data or provide feedback, for training smaller models.

Related Writing

On Synthetic Data: How It's Improving & Shaping LLMs

Test-Time Compute

Also known as inference computation, reasoning computation

The computational resources and time spent by an AI model during inference to reason through and solve a specific task.

Related Writing

On Test-Time Compute: The New Game in Town

Text-to-Speech (TTS)

Also known as voice generation, speech synthesis

Technology that converts written text into spoken audio using synthetic voices and speech synthesis techniques.

Related Writing

Textbox Tyranny

Also known as the blank page problem

The psychological barrier created by an empty text input field that intimidates users from initiating interaction.

Related Writing

The Tyranny of the Blank Textbox

Tokenization

Also known as text encoding

The process of converting text into numerical tokens that can be processed by machine learning models

Related Writing

AI Lies, Privacy, & OpenAI

Trap Tokens

Unique, intentionally crafted phrases with low likelihood of occurring in training data, designed to help content owners prove their work was used in AI model training.

Related Writing

Manipulating AI: Prompt Injection, Trap Tokens, & Post-Training Suggestion

User Experience (UX)

User Experience, the overall experience a user has when interacting with a product, service, or interface.

Related Writing

AI Search Apps Succeed Because of Their UI, Not their LLMs

An AI Glossary

AI Bombing

Related Writing

Artificial General Intelligence (AGI)

Related Writing

Artificial Intelligence (AI)

Related Writing

Audio Augmented Reality (AAR)

Related Writing

Augmented Reality (AR)

Related Writing

Authentic Data

Related Writing

Automation Era

Related Writing

Chain-of-Thought Training

Related Writing

Common Crawl

Related Writing

Copilots (AI)

Related Writing

Data Distillation

Related Writing

Data Masking

Related Writing

Deep Learning (DL)

Related Writing

Déjà connu

Related Writing

Edge Computing

Related Writing

Embeddings

Related Writing

Foundation Models

Related Writing

General Data Protection Regulation (GDPR)

Related Writing

Generated Data

Related Writing

Graphics Processing Unit (GPU)

Related Writing

Hallucination

Related Writing

Honeypot

Related Writing

Human-in-the-Loop

Related Writing

Hyped AI

Related Writing

Large Language Model (LLM)

Related Writing

Machine Learning (ML)

Related Writing

Multimodal LLM

Related Writing

Neural Processing Unit (NPU)

Related Writing

Personally Identifiable Information (PII)

Related Writing

Post-Training Suggestion

Related Writing

Prompt Engineering

Related Writing

Prompt Injection

Related Writing

Prompt Optimization

Related Writing

Reasoning Models

Related Writing

Reciprocal Data Application (RDA)

Related Writing

Reinforcement Learning from Human Feedback (RLHF)

Related Writing

Retrieval Augmented Generation (RAG)

Related Writing

Section 230

Related Writing

Sober AI

Related Writing

Synthetic Data