The artificial intelligence glossary

A guide to the key terms you need to know and understand when discussing AI.

Artificial intelligence is dominating conversations across all industries, including legal. To help legal professionals navigate this fast-evolving space, we define below many of the key terms and concepts relating to AI. This glossary will be continually updated as new developments emerge.

Algorithm: In AI, a set of instructions or programming that tells a computer what to do so that the machine can learn to operate on its own to solve a specific problem or perform a specific task.

Artificial Intelligence (AI): The branch of computer science focused on the theory, development and design of computer systems that have the ability to mimic human intelligence and thought or perform tasks that normally require human intelligence.

Bard: A chatbot tool released by Google in February 2023, based on the LaMDA large language model.

Chatbot: A computer program that “converses” with its user. Rule- or flow-based chatbots deliver pre-written answers in response to questions and cannot deviate from this content. AI-based chatbots are more dynamic, can pull from larger databases of information, and can learn more over time. These are built on top of conversational AI.

ChatGPT: A commercially available chatbot from OpenAI, released on November 30, 2022. It was based originally on the GPT-3.5 large language model (also known as text-davinci-003) and now runs on GPT-4.

Continuous Active Learning (CAL): An application of AI in which the system learns to correct itself, without ongoing human supervision, because it has learned via supervised learning to discern between varying degrees of responsive and nonresponsive documents or concepts. In e-discovery, TAR 2.0 is an example of continuous active learning.

Conversational AI: Technologies that use large volumes of data, machine learning and natural language processing to allow users to “talk to” the technology, by imitating human interaction through recognizing text and speech inputs. Conversational AI serves as the synthetic “brain” behind some chatbots.

Deep Learning: A type of machine learning that utilizes neural networks to mimic the human brain, using three or more layers of training to enable the AI to cluster data and make predictions.

Foundational Model: A large AI model trained on massive quantities of unlabeled data, usually through self-supervised learning, that can be used to accurately perform a wide range of tasks with minimal fine-tuning. Such tasks include: natural language processing, image classification, question answering and more.

Garbage In, Garbage Out: An expression meaning that an AI system is only as good as the data on which it is trained. If an AI system is trained on inaccurate, biased or outdated data, its outputs will reflect those shortcomings. 

Generative AI: A category of AI systems, including large language models, that can independently create unique, novel content, in the form of text, images, audio and more, based on the data they have previously been trained on. Unlike traditional AI systems, generative AI algorithms go beyond recognizing patterns and making predictions. Some advanced generative AI systems are not limited to their training datasets, and can learn to respond to questions or prompts containing information on which they were not previously trained. This is known as zero-shot learning.

GPT: Generative Pre-trained Transformer; the prefix to various generations of large language models from the company OpenAI. For example, GPT-3 is the third generation of GPT models. GPT-1 was released in June 2018. GPT-2 was released in February 2019. GPT-3 was released in June 2020. GPT-3.5 was released in March 2022, with underlying models rolled out over the year, and text-davinci-003 receiving significant attention in late 2022. GPT-4 was released on March 14, 2023.

Graphics Processing Unit (GPU): A specialized processor originally designed to render graphics on a computer screen. Because GPUs can perform many calculations in parallel, they are critical to training AI systems and large language models that require significant processing power.

Hallucination: An instance in which an AI system, in response to a question or prompt, confidently provides a false or fabricated yet convincing answer.

LaMDA: Language Model for Dialogue Applications, a large language model released by Google in May 2021.

Large Language Model (LLM): A type of deep learning algorithm or machine learning model that can perform a variety of natural language processing tasks. These include reading, summarizing and classifying text; predicting and generating words or sentences; answering questions or responding to prompts in a conversational manner; and translating text from one language to another. It performs these tasks based on knowledge gained from massive datasets and supervised and reinforcement learning. LLMs are one kind of foundational model.

LLaMA: Large Language Model Meta AI, a large language model released by Meta in February 2023.

Machine Learning: A broad branch of AI concerned with “teaching” AI systems to perform tasks, understand concepts or solve problems in a way that imitates intelligent human behavior, gradually becoming more accurate as it is trained on more data.

Model: An AI tool or algorithm based on a defined dataset that makes decisions a human expert would make given the same information, but without human intervention in the decision-making process. GPT-3, for example, is an AI model.

Multimodal AI: An AI system that is capable of processing multiple types of data, such as images, video or sound, in addition to text, in order to generate output.

Natural Language Processing (NLP): A branch of AI and computer science that refers to the ability of computers or software to read and understand written and spoken language, in the form of text and voice data, including intent and sentiment.

Neural Network: A machine learning method that mimics the human brain, in which multiple layers of training can occur simultaneously. Neural networks are made up of many interconnected processing nodes and are central to deep learning.
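For readers who want a concrete sense of what a "processing node" does, here is a minimal, illustrative Python sketch (not from the glossary) of a single artificial node, a perceptron, learning the logical AND rule by adjusting its weights when it makes a mistake. Real neural networks stack many layers of such nodes; the function and variable names here are hypothetical.

```python
# Toy sketch: one artificial "node" (perceptron) learning logical AND.
# Real neural networks chain many layers of nodes like this one.

def train_perceptron(samples, epochs=20, lr=0.1):
    w0, w1, bias = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x0, x1), target in samples:
            output = 1 if (w0 * x0 + w1 * x1 + bias) > 0 else 0
            error = target - output
            # Nudge each weight in the direction that reduces the error.
            w0 += lr * error * x0
            w1 += lr * error * x1
            bias += lr * error
    return w0, w1, bias

# Labeled examples of the AND rule: output is 1 only when both inputs are 1.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w0, w1, b = train_perceptron(samples)

def predict(x0, x1):
    return 1 if (w0 * x0 + w1 * x1 + b) > 0 else 0
```

After training, the node reproduces the AND rule it was shown, which is the same weight-adjustment idea that, at vastly larger scale, underlies deep learning.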

Parameters: Bits of knowledge or variables, which can be thought of as connections between concepts, that an AI model learns throughout its training process. Parameters are adjusted during training to achieve desired outputs from specific inputs. Generally speaking, the more parameters, the greater the model's ability to understand and connect complex concepts together. Therefore, the more parameters, the more advanced the AI model.

Prompt: The instruction given to an AI model or machine learning algorithm in order to generate a specific output.

Prompt Engineering: Identifying and using the right prompts to produce the most useful or desirable outcomes from an AI tool.

Reinforcement Learning: A machine learning technique used to train an AI model in which the AI system interactively learns by trial and error, incorporating feedback from its own actions and outputs.

Robotic Process Automation (RPA): A form of business process automation, also known as software robotics, that allows humans to use intelligent automation technology to define a set of instructions for performing high-volume, repetitive human tasks quickly and with minimal error. While RPA technology shares similarities with AI and is often included in the same discussions, it is not a form of AI.

Self-Supervised Learning: A form of machine learning in which a model is given unstructured data and automatically generates its own data labels; essentially, the model trains itself to differentiate between different parts of the input. Also known as predictive or pretext learning.

Semi-Supervised Learning: A form of machine learning in which some of the input data is labeled. Semi-supervised learning is a mix of supervised and unsupervised learning.

Supervised Learning: A form of machine learning in which a model is taught how to identify a certain concept or topic—for example, a specific type of document—via a person manually correcting the machine during the training process. In e-discovery, TAR 1.0 is an example of supervised learning.

Token: In natural language processing, a sequence of characters that form a semantic unit or certain role in a written language. The process of breaking a stream of language into meaningful elements such as words or sentences is called tokenization.
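To make tokenization concrete, here is a minimal, illustrative Python sketch (not from the glossary) that splits a sentence into word and punctuation tokens. Note that production LLM tokenizers typically use subword schemes such as byte-pair encoding, so real token boundaries differ from this naive split.

```python
import re

# Naive tokenization sketch: split text into word and punctuation
# tokens. Real LLM tokenizers use subword schemes (e.g. byte-pair
# encoding), so actual token boundaries will differ.
def tokenize(text):
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("AI is transforming e-discovery.")
# e.g. ['AI', 'is', 'transforming', 'e', '-', 'discovery', '.']
```

Each item in the resulting list is one token, the basic unit a language model counts and processes.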

Unsupervised Learning: A form of machine learning in which a model employs deep learning techniques to detect patterns in data without explicit training on labeled data.

Web Scraping: Extracting data from websites, usually a large number of them, and using that extracted data to train AI models. The extracted data becomes the basis of learning that informs outputs later generated by AI and generative AI tools.
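The extraction step can be sketched in a few lines of Python. This illustrative example (not from the glossary) pulls the visible text out of an HTML page using the standard library's `html.parser`; a real scraper would first fetch many pages over HTTP, but here the HTML is a fixed sample so the sketch is self-contained.

```python
from html.parser import HTMLParser

# Illustrative sketch: extract visible text from an HTML page.
# A real scraper would fetch many pages over HTTP first; here the
# HTML is a fixed sample so the example is self-contained.
class TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:  # keep only non-empty text between tags
            self.chunks.append(text)

html = "<html><body><h1>AI Glossary</h1><p>Token: a unit of text.</p></body></html>"
parser = TextExtractor()
parser.feed(html)
extracted = " ".join(parser.chunks)
# extracted == "AI Glossary Token: a unit of text."
```

Text gathered this way, at massive scale and across many sites, is what typically feeds the training datasets described elsewhere in this glossary.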

Zero-Shot Learning: The ability for an AI system to learn how to respond to questions or prompts, create new content or classify data on which it was not previously trained.
