An AI Glossary
"If you want to know where the future is being made, look for where langauge is being invented and lawyers are congregating." – Stewart Brand
AI Bombing
Also known as model manipulation, model poisoning, AI data poisoning
A coordinated effort to manipulate AI models by intentionally seeding specific information during interactions or training.
Artificial General Intelligence (AGI)
Also known as strong AI, full AI
A hypothetical type of intelligent agent that can understand, learn, and apply knowledge across a wide range of tasks at a human-like level
Artificial Intelligence (AI)
Also known as machine intelligence, artificial intelligence
A branch of computer science focused on creating intelligent machines that can simulate human-like cognitive functions
Audio Augmented Reality (AAR)
Also known as audio AR, contextual audio
A technology that uses audio cues, spatial sound, and contextual information to provide relevant information to users through headphones or smart audio devices.
Related Writing
Augmented Reality (AR)
Also known as mixed reality, enhanced reality
A technology that overlays digital information onto the real world, enhancing the user's perception of their environment through digital elements.
Related Writing
Automation Era
The emerging phase of the internet where platforms use AI and intelligent systems to manage content and connections, focusing on saving users' time and attention.
Related Writing
Chain-of-Thought Training
Also known as reasoning training, step-by-step reasoning
A training approach where AI models are taught to break down complex problems into sequential, logical steps, mimicking human reasoning processes.
Related Writing
Common Crawl
Also known as C4
A large, open-source web crawling and data compilation project that provides web-scraped data used in training machine learning models
Copilots (AI)
Also known as AI assistants, AI coding assistants
AI-powered tools that observe user input and a project's current state to generate helpful suggestions, most often used by developers writing code.
Data Distillation
The process of condensing and refining large datasets to extract the most valuable and relevant information while reducing overall data volume.
Data Masking
Also known as anonymization
A technique used to protect sensitive information by obscuring or generalizing specific details
Related Writing
Deep Learning (DL)
Also known as neural network learning
A subset of machine learning that uses neural networks with multiple layers to progressively extract higher-level features from raw input.
Déjà connu
A feeling of recognizing something as already discovered or invented just as you're about to create it, particularly in the context of AI-generated suggestions
Related Writing
Edge Computing
Also known as local computing
A distributed computing paradigm that brings computation and data storage closer to the location where it is needed, improving response times and saving bandwidth.
Embeddings
Also known as semantic vectors, vector representations
Numerical representations of data (like text or images) in a high-dimensional space where similar items are closer together.
Foundation Models
Also known as base AI models
Large-scale AI models trained on broad datasets that can be adapted or fine-tuned for specific tasks or domains.
General Data Protection Regulation (GDPR)
A comprehensive data protection law in the European Union that regulates the processing of personal data and protects individual privacy rights.
Related Writing
Graphics Processing Unit (GPU)
Also known as Graphics Card
A specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate computer graphics and machine learning computations.
Hallucination
Also known as AI fabrication, model confabulation
An error in AI language models where the system generates confident but factually incorrect or entirely fabricated information.
Honeypot
Also known as Decoy System, Trap Website
A computer system or website intentionally designed to attract cyberattacks or, in this context, to be scraped for training data with specific manipulative intentions.
Human-in-the-Loop
Also known as human oversight, human-AI collaboration
A model of interaction where human judgment and decision-making are integrated into an automated process to provide oversight, validation, or correction.
Related Writing
Hyped AI
Artificial intelligence technologies or concepts that are surrounded by excessive excitement, publicity, or exaggerated claims
Related Writing
Large Language Model (LLM)
Also known as transformer model, GenAI, generative AI, language model
AI models trained on massive text datasets that can generate human-like text by predicting statistically likely word sequences based on contextual relationships.
Related Writing
- AI Lies, Privacy, & OpenAI
- Practical AI For App Makers
- The Platypus In The Room
- Considering AI Privacy Scenarios
- Finding Bathroom Faucets with Embeddings
- Extreme Compression with AI: Fitting a 45 Minute Podcast into 40kbs
- A Big Year For Small Models
- A Foundation Model For Every Culture
- My CCPA Dialog With OpenAI
- The Rise of Audio AR
- A Plea for Sober AI
- Sober AI is the Norm
- Be Better, Not Smaller
- Smaller, Cheaper, Faster, Sober
- How to Build a Bigger Bubble
- The Right to Confront Your AI Accuser
- Conflating Overture Places Using DuckDB, Ollama, Embeddings, and More
- We Need Help With Content Discovery More Than Content Generation
- The 3 AI Use Cases: Gods, Interns, and Cogs
- Why LLM Advancements Have Slowed: The Low-Hanging Fruit Has Been Eaten
- On Synthetic Data: How It's Improving & Shaping LLMs
- The New Game in Town
Machine Learning (ML)
A branch of artificial intelligence and computer science that uses data and algorithms to imitate the way that humans learn, gradually improving its accuracy.
Multimodal LLM
An LLM capable of processing and understanding multiple types of input, such as text, images, and potentially other media types.
Related Writing
Neural Processing Unit (NPU)
Also known as AI accelerator
A specialized processor designed to accelerate machine learning and artificial intelligence computations more efficiently than traditional CPUs or GPUs.
Related Writing
Personally Identifiable Information (PII)
Also known as personal data, identifying information
Data that can be used to identify a specific individual, such as email addresses, phone numbers, or home addresses
Related Writing
Post-Training Suggestion
Also known as Latent model injection
A manipulation technique involving intentionally planting specific input in training datasets to enable later, post-training manipulation of AI models.
Prompt Engineering
Also known as AI prompting, instruction design
The practice of designing and refining input instructions to AI systems to achieve more accurate, relevant, and desired outputs.
Related Writing
Prompt Injection
Also known as AI input attack
A security vulnerability where malicious input can manipulate an AI system's intended behavior by altering its original instructions.
Prompt Optimization
Also known as automated prompt engineering, prompt refinement
The process of automatically improving prompts to enhance the performance of language models by using techniques like metric-based evaluation and iterative prompt generation.
Related Writing
Reasoning Models
AI models designed to break down complex problems into step-by-step logical processes, to arrive at more accurate answers and advanced capabilities.
Related Writing
Reciprocal Data Application (RDA)
A type of application or service that improves itself by learning from user interactions, creating a network effect where more usage leads to better performance.
Reinforcement Learning from Human Feedback (RLHF)
Also known as alignment training, human-guided AI training
A training technique where human contractors provide feedback to improve AI model outputs, correcting problematic responses and guiding the model's behavior.
Retrieval Augmented Generation (RAG)
An AI technique that enhances language model responses by retrieving relevant information from a knowledge base, often a vector database, before generating an answer.
Related Writing
Section 230
Also known as safe harbor law
A US safe harbor law that provides immunity to online platforms for content posted by users.
Related Writing
Sober AI
A pragmatic approach to artificial intelligence that focuses on practical, incremental improvements and real-world, often mundane, applications.
Related Writing
Synthetic Data
Artificially generated data created using authentic seed data to establish dataset qualities and guide data creation, as opposed to data captured from real-world events.
Teacher Model
A larger, more advanced AI model used to generate synthetic data or provide feedback, for training smaller models.
Related Writing
Test-Time Compute
Also known as inference computation, reasoning computation
The computational resources and time spent by an AI model during inference to reason through and solve a specific task.
Related Writing
Text-to-Speech (TTS)
Also known as voice generation, speech synthesis
Technology that converts written text into spoken audio using synthetic voices and speech synthesis techniques.
Textbox Tyranny
Also known as the blank page problem
The psychological barrier created by an empty text input field that intimidates users from initiating interaction.
Related Writing
Tokenization
Also known as text encoding
The process of converting text into numerical tokens that can be processed by machine learning models
Related Writing
Trap Tokens
Unique, intentionally crafted phrases with low likelihood of occurring in training data, designed to help content owners prove their work was used in AI model training.
User Experience (UX)
User Experience, the overall experience a user has when interacting with a product, service, or interface.