AI Glossary: The definitive guide to essential terms in artificial intelligence

Artificial intelligence (AI) is the simulation of human intelligence in machines. Today, AI systems can learn and adapt to new information and can perform tasks that would normally require human intelligence. Machine learning is already having an impact in diagnosing diseases, developing new drugs, designing products, and automating tasks in a wide range of industries. It is also being used to create new forms of entertainment and education. As one of the most transformative technologies in history, advanced AI holds the potential to change our way of life and power society in ways previously unimaginable.

With the current pace of change, navigating the field of AI and machine intelligence can be daunting. To understand AI and its implications, it is important to have a basic understanding of the key terms and concepts. AI education and training can give you and your organization an edge. To this end, our team at Entefy has written this AI glossary to provide you with a comprehensive overview of practical AI terms. This glossary is intended for a broad audience, including students, professionals, and tech enthusiasts who are interested in the rapidly evolving world of machine intelligence.

We encourage you to bookmark this page for quick reference in the future.


Activation function. A mathematical function in a neural network that defines the output of a node given one or more inputs from the previous layer. Also see weight.
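
As a minimal sketch, two widely used activation functions, the sigmoid and the rectified linear unit (ReLU), can be written as below. The function names and the sample inputs, weights, and bias are illustrative assumptions, not taken from any particular library.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x: float) -> float:
    """Pass positive values through unchanged; clamp negatives to zero."""
    return max(0.0, x)


def node_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical node with two inputs; the weighted sum z is 0 here.
print(node_output([1.0, 2.0], [0.5, -0.25], 0.0, sigmoid))  # 0.5
print(node_output([1.0, 2.0], [0.5, -0.25], 0.0, relu))     # 0.0
```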

Algorithm. A procedure or formula, often mathematical, that defines a sequence of operations to solve a problem or class of problems.

Agent (also, software agent). A piece of software that can autonomously perform tasks for a user or other program(s).

AIOps. A set of practices and tools that use artificial intelligence capabilities to automate and improve IT operations tasks.

Annotation. In ML, the process of adding labels, descriptions, or other metadata information to raw data to make it more informative and useful for training machine learning models. Annotations can be performed manually or automatically. Also see labeling and pseudo-labeling.

Anomaly detection. The process of identifying instances of an observation that are unusual or deviate significantly from the general trend of data. Also see outlier detection.

Artificial general intelligence (AGI) (also, strong AI). The term used to describe a machine’s intelligence functionality that matches human cognitive capabilities across multiple domains. Often characterized by self-improvement mechanisms and generalization rather than specific training to perform in narrow domains.

Artificial intelligence (AI). The umbrella term for computer systems that can interpret, analyze, and learn from data in ways similar to human cognition.

Artificial neural network (ANN) (also, neural network). A specific machine learning technique that is inspired by the neural connections of the human brain. The intelligence comes from the ability to analyze countless data inputs to discover context and meaning.

Artificial superintelligence (ASI). The term used to describe a machine’s intelligence that is well beyond human intelligence and ability, in virtually every aspect.

Attention mechanism. A mechanism simulating cognitive attention to allow a neural network to focus dynamically on specific parts of the input in order to improve performance.

Autoencoder. An unsupervised learning technique for artificial neural networks, designed to learn a compressed representation (encoding) for a set of unlabeled data, typically for the purpose of dimensionality reduction.

AutoML. The process of automating certain machine learning steps within a pipeline such as model selection, training, and tuning.


Backpropagation. A method of optimizing multilayer neural networks whereby the output of each node is calculated and the partial derivative of the error with respect to each parameter is computed in a backward pass through the graph. Also see model training.

Bagging. In ML, an ensemble technique that trains multiple learners on different random (bootstrap) samples of the training data and aggregates their predictions, with a focus on improving stability and accuracy by reducing variance.

Bias. In ML, the phenomenon that occurs when certain elements of a dataset are more heavily weighted than others so as to skew results and model performance in a given direction.

Bigram. An n-gram containing a sequence of 2 words. Also see n-gram.

Black box AI. A type of artificial intelligence system that is so complex that its decision-making or internal processes cannot be easily explained by humans, thus making it challenging to assess how the outputs were created. Also see explainable AI (XAI).

Boosting. In ML, an ensemble technique that trains multiple weak learners sequentially, each one focusing on the errors of its predecessors, and combines them into a strong learner, with a focus on reducing bias and variance.


Cardinality. In mathematics, a measure of the number of elements present in a set.

Categorical variable. A feature representing a discrete set of possible values, typically classes, groups, or nominal categories based on some qualitative property. Also see structured data.

Centroid model. A type of classifier that computes the center of mass of each class and uses a distance metric to assign samples to classes during inference.
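
The following is a minimal sketch of a centroid model, assuming Euclidean distance as the distance metric and two-dimensional sample points. The function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(samples, labels):
    """Compute the mean point (center of mass) of each class."""
    grouped = defaultdict(list)
    for point, label in zip(samples, labels):
        grouped[label].append(point)
    return {
        label: tuple(sum(dim) / len(points) for dim in zip(*points))
        for label, points in grouped.items()
    }


def nearest_centroid_predict(centroids, point):
    """Assign the point to the class whose centroid is closest."""
    return min(centroids, key=lambda label: math.dist(centroids[label], point))


X = [(1.0, 1.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
y = ["a", "a", "b", "b"]
centroids = fit_centroids(X, y)
print(centroids)                                        # {'a': (1.5, 1.0), 'b': (8.5, 8.5)}
print(nearest_centroid_predict(centroids, (2.0, 2.0)))  # a
```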

Chain of thought (CoT). In ML, this term refers to a series of reasoning steps that guides an AI model’s thinking process when creating high quality, complex output. Chain of thought prompting is a way to help large language models solve complex problems by breaking them down into smaller steps, guiding the LLM through the reasoning process.

Chatbot. A computer program (often designed as an AI-powered virtual agent) that provides information or takes actions in response to the user’s voice or text commands or both. Current chatbots are often deployed to provide customer service or support functions.

Class. A category of data indicated by the label of a target attribute.

Class imbalance. The quality of having a non-uniform distribution of samples grouped by target class.

Classification. The process of using a classifier to categorize data into a predicted class.

Classifier. An instance of a machine learning model trained to predict a class.

Clustering. An unsupervised machine learning process for grouping related items into subsets where objects in the same subset are more similar to one another than to those in other subsets.

Cognitive computing. A term that describes advanced AI systems that mimic the functioning of the human brain to improve decision-making and perform complex tasks.

Computer vision (CV). An artificial intelligence field focused on classifying and contextualizing the content of digital video and images. 

Convergence. In ML, a state in which a model’s performance is unlikely to improve with further training. This can be measured by tracking the model’s loss function, which is a measure of the model’s performance on the training data.   

Convolutional neural network (CNN). A class of neural network that combines convolutional layers, which apply learned filters to local regions of the input data, with fully connected layers (as in a multilayer perceptron), where each neuron in one layer is connected to all neurons in the next layer. CNNs are most commonly applied to computer vision.

Corpus. A collection of text data used for linguistic research or other purposes, including training of language models or text mining.

Central processing unit (CPU). As the brain of a computer, the CPU is the essential processor responsible for interpreting and executing a majority of a computer’s instructions and data processing. Also see graphics processing unit (GPU).

Cross-validation. In ML, a technique for evaluating the generalizability of a machine learning model by testing the model against one or more validation datasets.
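
One common variant is k-fold cross-validation, in which the data is split into k parts and each part serves once as the validation set. The sketch below only produces the index splits; the function name is an assumption, and model training and scoring are left out.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


for train, validation in k_fold_indices(6, 3):
    print(train, validation)
# [2, 3, 4, 5] [0, 1]
# [0, 1, 4, 5] [2, 3]
# [0, 1, 2, 3] [4, 5]
```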


Data augmentation. A technique to artificially increase the size and diversity of a training dataset by creating new data points from existing data. This can be done by applying various transformations to the existing data.

Data cleaning. The process of improving the quality of a dataset in preparation for analytical operations by correcting, replacing, or removing dirty data (inaccurate, incomplete, corrupt, or irrelevant data).

Data preprocessing. The process of transforming or encoding raw data in preparation for analytical operations, often through re-shaping, manipulating, or dropping data.

Data curation. The process of collecting and managing data, including verification, annotation, and transformation. Also see training and dataset.

Data mining. The process of targeted discovery of information, patterns, or context within one or more data repositories.

DataOps. Management, optimization, and monitoring of data retrieval, storage, transformation, and distribution throughout the data life cycle including preparation, pipelines, and reporting.

Deepfake. Fabricated media content (such as image, video, or recording) that has been convincingly manipulated or generated using deep learning to make it appear or sound as if someone is doing or saying something they never actually did.    

Deep learning. A subfield of machine learning that uses neural networks with two or more hidden layers to train a computer to process data, recognize patterns, and make predictions.

Derived feature. A feature that is created and the value of which is set as a result of observations on a given dataset, generally as a result of classification, automated preprocessing, or sequenced model output.

Descriptive analytics. The process of examining historical data or content, typically for the purpose of reporting, explaining data, and generating new models for current or historical events. Also see predictive analytics and prescriptive analytics.

Dimensionality reduction. A data preprocessing technique to reduce the number of input features in a dataset by transforming high-dimensional data to a low-dimensional representation.

Discriminative model. A class of models most often used for classification or regression that predict labels from a set of features. Typically trained using supervised learning. Also see generative model.

Double descent. In machine learning, a phenomenon in which a model’s performance initially improves with increasing data size, model complexity, and training time, then degrades before improving again.


Ensembling. A powerful technique whereby two or more algorithms, models, or neural networks are combined in order to generate more accurate predictions.

Embedding. In ML, a mathematical structure representing discrete categorical variables as a continuous vector. Also see vectorization.

Embedding space. An n-dimensional space where features from one higher-dimensional space are mapped to a lower dimensional space in order to simplify complex data into a structure that can be used for mathematical operations. Also see dimensionality reduction.

Emergence. In ML, the phenomenon where a model develops new abilities or behaviors that are not explicitly programmed into it. Emergence can occur when a model is trained on a large and complex dataset, and the model is able to learn patterns and relationships that the programmers did not anticipate.

Enterprise AI. An umbrella term referring to artificial intelligence technologies designed to improve business processes and outcomes, typically for large organizations.

Expert system. A computer program that uses a knowledge base and an inference engine to emulate the decision-making ability of a human expert in a specific domain.

Explainable AI (XAI). A set of tools and techniques that helps people understand and trust the output of machine learning algorithms.

Extreme Gradient Boosting (XGBoost). A popular machine learning library based on gradient boosting and parallelization to combine the predictions from multiple decision trees. XGBoost can be used for a variety of tasks, including classification, regression, and ranking.


F1 Score. A measure of a test’s accuracy calculated as the harmonic mean of precision and recall.
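
As a small worked example, precision, recall, and the F1 score can be computed from true and predicted labels as sketched below. The function name, the `positive` parameter, and the sample labels are illustrative assumptions.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667
```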

Feature. In ML, a specific variable or measurable value that is used as input to an algorithm.

Feature engineering. The process of designing, selecting, and transforming features extracted from raw input to improve the performance of machine learning models. 

Feature vector (also, vector). In ML, a one-dimensional array of numerical values mathematically representing data points, features, or attributes in various algorithms and models.

Federated learning. A machine learning technique where the training for a model is distributed amongst multiple decentralized servers or edge devices, without the need to share training data.

Few-shot learning. A machine learning technique that allows a model to perform a task after seeing only a few examples of that task. Also see one-shot learning and zero-shot learning.

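Fine-tuning. In ML, the process by which the parameters of a pre-trained model are further adjusted, usually by additional training on a smaller, task-specific dataset, to improve performance against a given dataset or target objective. Also see hyperparameter tuning.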

Foundation model. A large, sophisticated deep learning model pre-trained on a massive dataset (typically unlabeled), capable of performing a number of diverse tasks. Instead of training a single model for a single task, which would be difficult to scale across countless tasks, a foundation model can be trained on a broad dataset once and then used as the “foundation” or basis for training with minimal fine-tuning to create multiple task-specific models. Also see large language model.


Generative adversarial network (GAN). A class of AI algorithms whereby two neural networks compete against each other to improve capabilities and become stronger.

Generative AI (GenAI). A subset of machine learning with deep learning models that can create new, high-quality content, such as text, images, music, videos, and code. Generative AI models are trained on large datasets of existing content and learn to generate new content that is similar to the training data.

Generative model. A model capable of generating new data based on a given set of training data. Also see discriminative model.

Generative Pre-trained Transformer (GPT). A special family of models based on the transformer architecture—a type of neural network that is well-suited for processing sequential data, such as text. GPT models are pre-trained on massive datasets of unlabeled text, allowing them to learn the statistical relationships between words and phrases, and to generate text that is similar to the training data.

Graphics processing unit (GPU). A specialized microprocessor that accelerates graphics rendering and other computationally intensive tasks, such as training and running complex, large deep learning models. Also see central processing unit (CPU).

Gradient boosting. An ML technique where an ensemble of weak prediction models, such as decision trees, are trained iteratively in order to improve or output a stronger prediction model. Also see Extreme Gradient Boosting (XGBoost).

Gradient descent. An optimization algorithm that iteratively adjusts the model’s parameters to minimize the loss function by following the negative gradient (slope) of the function. Gradient descent keeps adjusting the model’s settings until the error is very small, which means that the model has learned to predict the training data accurately.
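
A minimal sketch of gradient descent on a single parameter is shown below. It assumes a toy loss of (w - 3)^2, whose gradient is 2(w - 3); the function name, learning rate, and step count are illustrative choices.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


# Toy loss (w - 3)^2 has its minimum at w = 3.
w = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(round(w, 4))  # 3.0
```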

Ground truth. Information that is known (or considered) to be true, correct, real, or empirical, usually for the purpose of training models and evaluating model performance.


Hallucination. In AI, a phenomenon wherein a model generates inaccurate or nonsensical output that is not supported by the data it was trained on.

Hidden layer. A construct within a neural network, located between the input and output layers, that performs a given function, such as an activation function, for model training. Also see deep learning.

Hyperparameter. In ML, a parameter whose value is set prior to the learning process as opposed to other values derived by virtue of training.

Hyperparameter tuning. The process of optimizing a machine learning model’s performance by adjusting its hyperparameters.

Hyperplane. In ML, a decision boundary that helps classify data points from a single space into subspaces where each side of the boundary may be attributed to a different class, such as positive and negative classes. Also see support vector machine.


Inference. In ML, the process of applying a trained model to data in order to generate a model output such as a score, prediction, or classification. Also see training.

Input layer. The first layer in a neural network, acting as the beginning of a model workflow, responsible for receiving data and passing it to subsequent layers. Also see hidden layer and output layer.

Intelligent process automation (IPA). A collection of technologies, including robotic process automation (RPA) and AI, to help automate certain digital processes. Also see robotic process automation (RPA).


Jaccard index. A metric used to measure the similarity between two sets of data. It is defined as the size of the intersection of the two sets divided by the size of the union of the two sets. Jaccard index is also known as the Jaccard similarity coefficient.
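
The definition translates directly into set operations, as in this small sketch. Returning 1.0 for two empty sets is a convention assumed here, since the ratio is otherwise undefined.

```python
def jaccard_index(a: set, b: set) -> float:
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 1.0  # assumed convention for two empty sets
    return len(a & b) / len(a | b)


print(jaccard_index({"cat", "dog", "bird"}, {"dog", "bird", "fish"}))  # 0.5
```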

Jacobian matrix. The first-order partial derivatives of a multivariable function represented as a matrix, providing critical information for optimization algorithms and sensitivity analysis.

Joins. In AI, methods to combine data from two or more data tables based on a common attribute or key. The most common types of joins include inner join, left join, right join, and full outer join.
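
To illustrate the difference between an inner join and a left join, the sketch below joins two small tables represented as lists of dictionaries. The table contents, key names, and function names are hypothetical; in practice a database or data-frame library would normally do this.

```python
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
orders = [{"customer_id": 1, "item": "laptop"}, {"customer_id": 3, "item": "phone"}]


def inner_join(left, right, left_key, right_key):
    """Keep only the rows that have a match in both tables."""
    return [
        {**lrow, **rrow}
        for lrow in left
        for rrow in right
        if lrow[left_key] == rrow[right_key]
    ]


def left_join(left, right, left_key, right_key):
    """Keep every left row, merged with its matches when there are any."""
    rows = []
    for lrow in left:
        matches = [rrow for rrow in right if lrow[left_key] == rrow[right_key]]
        if matches:
            rows.extend({**lrow, **rrow} for rrow in matches)
        else:
            rows.append(dict(lrow))  # unmatched left row is kept as-is
    return rows


print(inner_join(customers, orders, "id", "customer_id"))
print(left_join(customers, orders, "id", "customer_id"))
```

Here the inner join returns only Ada's order, while the left join also keeps Grace, who has no matching order.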


K-means clustering. An unsupervised learning method used to cluster n observations into k clusters such that each of the n observations belongs to the nearest of the k clusters.
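
A minimal sketch of the usual iterative procedure (assign each point to its nearest centroid, then move each centroid to the mean of its points) is given below. The starting centroids are fixed by hand here to keep the example deterministic; the function name and data are assumptions.

```python
import math


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
final_centroids, final_clusters = k_means(points, [(0.0, 0.0), (5.0, 5.0)])
print(final_centroids)  # [(1.0, 1.5), (9.5, 9.0)]
```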

K-nearest neighbors (KNN). A supervised learning method for classification and regression used to estimate the likelihood that a data point is a member of a group, where the model input is defined as the k closest training examples in a data set and the output is either a class assignment (classification) or a property value (regression).
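
For the classification case, the idea can be sketched as a majority vote among the k closest training examples. The sketch assumes Euclidean distance; the function name and sample data are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Return the most common label among the k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


train_points = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8)]
train_labels = ["a", "a", "a", "b", "b"]
print(knn_classify(train_points, train_labels, (2, 2)))  # a
```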

Knowledge distillation. In ML, a technique used to transfer the knowledge of a complex model, usually a deep neural network, to a simpler model with a smaller computational cost.


Labeling. In ML, the process of identifying and annotating raw data (images, text, audio, videos) with informative labels. Labels are the target variables that a supervised machine learning model is trying to predict. Also see annotation and pseudo-labeling.

Language model. An AI model which is trained to represent, understand, and generate or predict natural human language.

Large language model (LLM). A type of general-purpose language model pre-trained on massive datasets to learn the patterns of language. This training process often requires significant computational resources and optimization of billions of parameters. Once trained, LLMs can be used to perform a variety of tasks, such as generating text, translating languages, and answering questions.

Layer. In ML, a collection of neurons within a neural network that performs a specific computational function, such as an activation function, on a set of input features. Also see hidden layer, input layer, and output layer.

Logistic regression. A type of classifier that measures the relationship between one variable and one or more variables using a logistic function.

Long short-term memory (LSTM). A recurrent neural network (RNN) that maintains history in an internal memory state, utilizing feedback connections (as opposed to standard feedforward connections) to analyze and learn from entire sequences of data, not only individual data points.

Loss function. A function that measures model performance on a given task, comparing a model’s predictions to the ground truth. The loss function is typically minimized during the training process, meaning that the goal is to find the values for the model’s parameters that produce accurate predictions as represented by the lowest possible value for the loss function.
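
Mean squared error is one commonly used loss function for regression tasks. The sketch below is an illustration with made-up values, not the only possible choice of loss.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Errors are 1 and -2, so the loss is (1 + 4) / 2.
print(mean_squared_error([2.0, 4.0], [1.0, 6.0]))  # 2.5
```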


Machine learning (ML). A subset of artificial intelligence that gives machines the ability to analyze a set of data, draw conclusions about the data, and then make predictions when presented with new data without being explicitly programmed to do so.

Metadata. Information that describes or explains source data. Metadata can be used to organize, search, and manage data. Common examples include data type, format, description, name, source, size, or other automatically generated or manually entered labels. Also see annotation, labeling, and pseudo-labeling.

Meta-learning. A subfield of machine learning focused on models and methods designed to learn how to learn.

Mimi. The term used to refer to Entefy’s multimodal AI engine and technology.

MLOps. A set of practices to help streamline the process of managing, monitoring, deploying, and maintaining machine learning models.

Model training. The process of providing a dataset to a machine learning model for the purpose of improving the precision or effectiveness of the model. Also see supervised learning and unsupervised learning.

Multi-head attention. A process whereby a neural network runs multiple attention mechanisms in parallel to capture different aspects of input data.

Multimodal AI. Machine learning models that analyze and relate data processed using multiple modes or formats of learning.

Multimodal sentiment analysis. A type of sentiment analysis that considers multiple modalities, such as text, audio, and video, to predict the sentiment of a piece of content. This is in contrast to traditional sentiment analysis which only considers text data. Also see visual sentiment analysis.


N-gram. A token, often a string, containing a contiguous sequence of n words from a given data sample.
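
A minimal sketch of extracting word-level n-grams is shown below, assuming words are separated by whitespace. With n = 2 the result is the list of bigrams.

```python
def ngrams(text: str, n: int) -> list:
    """Return every contiguous sequence of n whitespace-separated words."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


print(ngrams("the quick brown fox", 2))
# [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
```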

N-gram model. In NLP, a model that counts the frequency of all contiguous sequences of [1, n] tokens. Also see tokenization.

Naïve Bayes. A probabilistic classifier based on applying Bayes’ rule, which makes simplifying (naive) assumptions about the independence of features.

Named entity recognition (NER). An NLP model that locates and classifies elements in text into pre-defined categories.

Natural language processing (NLP). A field of computer science and artificial intelligence focused on processing and analyzing natural human language or text data.

Natural language generation (NLG). A subfield of NLP focused on generating human language text.

Natural language understanding (NLU). A specialty area within NLP focused on advanced analysis of text to extract meaning and context. 

Neural network (NN) (also, artificial neural network). A specific machine learning technique that is inspired by the neural connections of the human brain. The intelligence comes from the ability to analyze countless data inputs to discover context and meaning.

Neurosymbolic AI. A type of artificial intelligence that combines the strengths of both neural and symbolic approaches to AI to create more powerful and versatile AI systems. Neurosymbolic AI systems are typically designed to work in two stages. In the first stage, a neural network is used to learn from data and extract features from the data. In the second stage, a symbolic AI system is used to reason about the features and make decisions.


Obfuscation. A technique that involves intentional obscuring of code or data to prevent reverse engineering, tampering, or violation of intellectual property. Also see privacy-preserving machine learning.

One-shot learning. A machine learning technique that allows a model to perform a task after seeing only one example of that task. Also see few-shot learning and zero-shot learning.

Ontology. A data model that represents relationships between concepts, events, entities, or other categories. In the AI context, ontologies are often used by AI systems to analyze, share, or reuse knowledge.

Outlier detection. The process of detecting a datapoint that is unusually distant from the average expected norms within a dataset. Also see anomaly detection.
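
One simple approach, assumed here for illustration, flags values whose distance from the mean exceeds a chosen number of standard deviations (a z-score threshold). The function name, threshold, and data are hypothetical.

```python
import statistics


def find_outliers(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]


print(find_outliers([10, 11, 9, 10, 12, 10, 50]))  # [50]
```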

Output layer. The last layer in a neural network, acting as the end of a model workflow, responsible for delivering the final result or answer such as a score, class label, or prediction. Also see hidden layer and input layer.

Overfitting. In ML, a condition where a trained model over-conforms to training data and does not perform well on new, unseen data. Also see underfitting.


Parameter. In ML, parameters are the internal variables the model learns during the training process. In a neural network, the weights and biases are parameters. Once the model is trained, the parameters are fixed, and the model can then be used to make predictions on new data by using the parameters to compute the output of the model. The number of parameters in a machine learning model can vary depending on the type of model and the complexity of the problem being solved. For example, a simple linear regression model may only have a few parameters, while a complex deep learning model may have billions of parameters.

Parameter-Efficient Tuning Methods (PETM). Techniques used to improve the performance of a machine learning model on a task by training only a small number of parameters (e.g. reducing the number of parameters that must be updated). PETM reduces computational cost and can improve generalization and interpretability.

Perceptron. One of the simplest artificial neurons in neural networks, acting as a binary classifier based on a linear threshold function.
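
The sketch below shows a perceptron with a linear threshold and the classic error-driven update rule, trained on the logical AND function. The function names, learning rate, and epoch count are illustrative assumptions.

```python
def perceptron_predict(weights, bias, x):
    """Output 1 if the weighted sum exceeds the threshold of 0, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, labels, learning_rate=1.0, epochs=20):
    """Nudge weights and bias toward the target whenever a prediction is wrong."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, target in zip(samples, labels):
            error = target - perceptron_predict(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]  # logical AND
weights, bias = train_perceptron(samples, labels)
print([perceptron_predict(weights, bias, x) for x in samples])  # [0, 0, 0, 1]
```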

Perplexity. In AI, a common metric used to evaluate language models, indicating how well the model predicts a given sample.
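
Perplexity is commonly computed as the exponential of the average negative log-probability the model assigns to each token; lower values mean the sample was less surprising to the model. The sketch below assumes the per-token probabilities are already available, and the function name is hypothetical.

```python
import math


def perplexity(token_probabilities):
    """Exponential of the mean negative log-probability of the tokens."""
    n = len(token_probabilities)
    return math.exp(-sum(math.log(p) for p in token_probabilities) / n)


# A model that gives every token probability 0.25 is as uncertain as a
# uniform choice among four options, so its perplexity is about 4.
print(perplexity([0.25, 0.25, 0.25, 0.25]))
```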

Precision. In ML, a measure of model accuracy computing the ratio of true positives against all true and false positives in a given class.

Predictive analytics. The process of learning from historical patterns and trends in data to generate predictions, insights, recommendations, or otherwise assess the likelihood of future outcomes. Also see descriptive analytics and prescriptive analytics.

Prescriptive analytics. The process of using data to determine potential actions or strategies based on predicted future outcomes. Also see descriptive analytics and predictive analytics.

Primary feature. A feature, the value of which is present in or derived from a dataset directly. 

Privacy preserving machine learning (PPML). A collection of techniques that allow machine learning models to be trained and used without revealing the sensitive, private data that they were trained on. Also see obfuscation.

Prompt. A piece of text, code, or other input that is used to instruct or guide an AI model to perform a specific task, such as writing text, translating languages, generating creative content, or answering questions in informative ways. Also see large language model (LLM), generative AI, and foundation model.

Prompt design. The specialized practice of crafting optimal prompts to efficiently elicit the desired response from language models, especially LLMs. Prompt design and prompt engineering are two closely related concepts in natural language processing (NLP).

Prompt engineering. The broader process of developing and evaluating prompts that elicit the desired response from language models, especially LLMs. Prompt design and prompt engineering are two closely related concepts in natural language processing (NLP).

Prompt tuning. An efficient technique to improve the output of a pre-trained foundation model or large language model by programmatically adjusting the prompts to perform specific tasks, without the need to retrain the model or update its parameters.

Pseudo-labeling. A semi-supervised learning technique that uses model-generated labeled data to improve the performance of a machine learning model. It works by training a model on a small set of labeled data, and then using the trained model to predict labels for the unlabeled data. The predicted labels are then used to train the model again, and this process is repeated until the model converges. Also see annotation and labeling.


Q-learning. A model-free approach to reinforcement learning that enables a model to iteratively learn and improve over time by taking the correct action. It does this by iteratively updating a Q-table (the “Q” stands for quality), which is a map of states and actions to rewards.
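
A single Q-table update can be sketched as below, using the standard Q-learning update rule. The learning rate `alpha`, discount factor `gamma`, state names, and action names are illustrative assumptions.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Move Q(state, action) toward reward + discounted best future value."""
    best_next = max(q_table[next_state].values())
    target = reward + gamma * best_next
    q_table[state][action] += alpha * (target - q_table[state][action])


# Hypothetical two-state table mapping state -> action -> estimated value.
q_table = {
    "s0": {"left": 0.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 1.0},
}
q_update(q_table, "s0", "right", reward=0.0, next_state="s1")
print(q_table["s0"]["right"])  # 0.45
```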


Random forest. An ensemble machine learning method that blends the output of multiple decision trees in order to produce improved results.

Recall. In ML, a measure of model accuracy computing the ratio of true positives guessed against all actual positives in a given class.

Recurrent neural network (RNN). A class of neural networks that is popularly used to analyze temporal data such as time series, video and speech data.

Regression. In AI, a mathematical technique to estimate the relationship between one variable and one or more other variables. Also see classification.

Regularization. In ML, a technique used to prevent overfitting in models. Regularization works by adding a penalty to the loss function of the model, which discourages the model from learning overly complex patterns, thereby making it more likely to generalize to new data.
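
As an illustration, L2 regularization adds the sum of squared weights, scaled by a strength factor, to the loss. The sketch below assumes mean squared error as the base loss; the function name and the `lam` parameter are hypothetical.

```python
def l2_regularized_loss(y_true, y_pred, weights, lam=0.5):
    """Mean squared error plus an L2 penalty on the model's weights."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    penalty = lam * sum(w ** 2 for w in weights)
    return mse + penalty


# Predictions are perfect, so the whole loss comes from the penalty:
# 0.5 * (3^2 + 4^2) = 12.5.
print(l2_regularized_loss([1.0, 2.0], [1.0, 2.0], weights=[3.0, 4.0]))  # 12.5
```

Larger weights raise the loss even when predictions are accurate, which is what pushes training toward simpler models.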

Reinforcement learning (RL). A machine learning technique where an agent learns independently the rules of a system via trial-and-error sequences.

Robotic process automation (RPA). Business process automation that uses virtual software robots (not physical) to observe the user’s low-level or monotonous tasks performed using an application’s user interface in order to automate those tasks. Also see intelligent process automation (IPA).


Self-supervised learning. A form of autonomous supervised learning whereby a system identifies and extracts naturally available signals from unlabeled data through processes of self-selection.

Semi-supervised learning. A machine learning technique that fits between supervised learning (in which data used for training is labeled) and unsupervised learning (in which data used for training is unlabeled).

Sentiment analysis. In NLP, the process of identifying and extracting human opinions and attitudes from text. The same can be applied to images using visual sentiment analysis. Also see multimodal sentiment analysis.

Singularity. In AI, technological singularity is a hypothetical point in time when artificial intelligence surpasses human intelligence, leading to the rapid but uncontrollable increase in technological development.

Software agent (also, agent). A piece of software that can autonomously perform tasks for a user or other software program(s).

Strong AI. The term used to describe artificial general intelligence or a machine’s intelligence functionality that matches human cognitive capabilities across multiple domains. Often characterized by self-improvement mechanisms and generalization rather than specific training to perform in narrow domains. Also see weak AI.

Structured data. Data that has been organized using a predetermined model, often in the form of a table with values and linked relationships. Also see unstructured data.

Supervised learning. A machine learning technique that infers from training performed on labeled data. Also see unsupervised learning.

Support vector machine (SVM). A type of supervised learning model that separates data into one of two classes using various hyperplanes. 

Symbolic AI. A branch of artificial intelligence that focuses on the use of explicit symbols and rules to represent knowledge and perform reasoning. In symbolic AI, also known as Good Old-Fashioned AI (GOFAI), problems are broken down into discrete, logical components, and algorithms are designed to manipulate these symbols to solve problems. Also see neurosymbolic AI.

Synthetic data. Artificially generated data that is designed to resemble real-world data. It can be used to train machine learning models, test software, or protect privacy. Also see data augmentation.


Taxonomy. A hierarchically structured list of terms that illustrates the relationships between those terms. Also see ontology.

Teacher-student model. A type of machine learning model where a teacher model is used to generate labels for a student model. The student model then tries to learn from these labels and improve its performance. This type of model is often used in semi-supervised learning, where a large amount of unlabeled data is available but labeling it is expensive.
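
A minimal sketch of the labeling flow described above, assuming a toy one-dimensional threshold classifier in place of real teacher and student models. All data values and function names are hypothetical.

```python
def fit_threshold(points):
    """Toy 1-D classifier: the midpoint between the two class means."""
    lows = [x for x, label in points if label == 0]
    highs = [x for x, label in points if label == 1]
    return (sum(lows) / len(lows) + sum(highs) / len(highs)) / 2


def classify(threshold, x):
    """Label a value 1 if it lies above the threshold, otherwise 0."""
    return int(x > threshold)


# A small labeled set (expensive to obtain) and a larger unlabeled set.
labeled = [(1.0, 0), (9.0, 1)]
unlabeled = [0.5, 2.0, 3.0, 4.0, 8.0, 9.5]

# The teacher is trained on the labeled data only.
teacher = fit_threshold(labeled)

# The teacher generates labels for the unlabeled data.
pseudo_labeled = [(x, classify(teacher, x)) for x in unlabeled]

# The student learns from the labeled data plus the teacher's labels.
student = fit_threshold(labeled + pseudo_labeled)

print(teacher, student)
print(classify(student, 9.0), classify(student, 1.0))  # 1 0
```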

Text-to-3D model. A machine learning model that can generate 3D models from text input.

Text-to-image model. A machine learning model that can generate images from text input.

Text-to-task model. A machine learning model that can convert natural language descriptions of tasks into executable instructions, such as automating workflows, generating code, or organizing data.

Text-to-text model. A machine learning model that can generate text output from text input.

Text-to-video model. A machine learning model that can generate videos from text input.

Time series. A sequence of data points ordered in time, often recorded at evenly spaced intervals.

TinyML. A branch of machine learning that deals with creating models that can run on very limited resources, such as embedded IoT devices.

Tokenization. In ML, a method of separating a piece of text into smaller units called tokens, which may represent words, characters, or subwords. Contiguous sequences of n tokens are known as n-grams.
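
The sketch below illustrates word-level and character-level tokens, and n-grams in the usual sense of contiguous runs of n tokens. It uses a simple regular expression as an assumed splitting rule. Production tokenizers, especially subword tokenizers, are learned from data and behave differently.

```python
import re


def word_tokens(text):
    """Split text into lowercase word tokens using a simple regex rule."""
    return re.findall(r"[a-z0-9']+", text.lower())


def char_tokens(text):
    """Split text into individual characters, skipping whitespace."""
    return [ch for ch in text if not ch.isspace()]


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


words = word_tokens("AI systems learn")
print(words)              # ['ai', 'systems', 'learn']
print(char_tokens("AI"))  # ['A', 'I']
print(ngrams(words, 2))   # [('ai', 'systems'), ('systems', 'learn')]
```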

Training data. The set of data (often labeled) used to train a machine learning model.

Transfer learning. A machine learning technique where the knowledge derived from solving one problem is applied to a different (typically related) problem.

Transformer. In ML, a type of deep learning model for handling sequential data, such as natural language text, without needing to process the data in sequential order.

Tuning. The process of optimizing the hyperparameters of an AI algorithm to improve its precision or effectiveness. Also see algorithm.

Turing test. A test introduced by Alan Turing in his 1950 paper “Computing Machinery and Intelligence” to determine whether a machine’s ability to think and communicate can match that of a human. Turing originally called the test the imitation game.


Underfitting. In ML, a condition where a trained model is too simple to learn the underlying structure of a more complex dataset. Also see overfitting.
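
To make the idea concrete, the sketch below fits a straight line to data that follows a curve and compares its error with a model that is allowed to use the squared input. The data and helper names are hypothetical. The least-squares formulas are the standard ones for a single input feature.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between the model's predictions and the targets."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]  # the underlying structure is a curve

# Too simple: a straight line in x cannot follow the curve.
line = fit_line(xs, ys)
line_error = mean_squared_error(xs, ys, *line)

# More expressive: a line in x squared matches the structure of the data.
squared = [x ** 2 for x in xs]
curve = fit_line(squared, ys)
curve_error = mean_squared_error(squared, ys, *curve)

print(line_error)   # 12.0
print(curve_error)  # 0.0
```

The straight line keeps a large error even on the data it was fitted to, which is the usual symptom of underfitting.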

Unstructured data. Data that has not been organized with a predetermined order or structure, often making it difficult for computer systems to process and analyze.

Unsupervised learning. A machine learning technique in which a model learns patterns or structure from unlabeled training data. Also see supervised learning.


Validation. In ML, the process by which the performance of a trained model is evaluated against a specific testing dataset which contains samples that were not included in the training dataset. Also see training.
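
A minimal sketch of evaluating a model on samples it was not trained on, assuming a simple random hold-out split and a toy threshold classifier. The split fraction, seed, data, and function names are hypothetical.

```python
import random


def hold_out_split(samples, held_out_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and set aside a fraction for evaluation."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - held_out_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train):
    """Toy model: the midpoint between the mean feature value of each class."""
    zeros = [x for x, label in train if label == 0]
    ones = [x for x, label in train if label == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def accuracy(threshold, data):
    """Fraction of samples whose predicted class matches the label."""
    correct = sum((x > threshold) == bool(label) for x, label in data)
    return correct / len(data)


# Hypothetical data: values below 10 belong to class 0, the rest to class 1.
samples = [(x, int(x >= 10)) for x in range(20)]

train, held_out = hold_out_split(samples)
threshold = fit_threshold(train)       # trained on the training split only
print(accuracy(threshold, held_out))   # measured on unseen samples only
```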

Vector (also, feature vector). In ML, a one-dimensional array of numerical values mathematically representing data points, features, or attributes in various algorithms and models.

Vector database. A type of database that stores information as vectors or embeddings for efficient search and retrieval.
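
The sketch below shows the core operation in a few lines: storing vectors and returning the entries most similar to a query. It uses brute-force cosine similarity. The class name, keys, and vector values are hypothetical, and real vector databases typically rely on approximate indexes to scale beyond small collections.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """In-memory store that compares the query against every stored vector."""

    def __init__(self):
        self._items = []

    def add(self, key, vector):
        self._items.append((key, vector))

    def search(self, query, k=1):
        """Return the keys of the k stored vectors most similar to the query."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query, item[1]),
            reverse=True,
        )
        return [key for key, _ in ranked[:k]]


store = TinyVectorStore()
store.add("cat", [0.9, 0.1, 0.0])
store.add("dog", [0.8, 0.2, 0.1])
store.add("car", [0.0, 0.1, 0.9])

print(store.search([1.0, 0.0, 0.0], k=2))  # ['cat', 'dog']
```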

Vectorization. The process of transforming data into vectors.
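
One simple way to turn text into vectors is a bag-of-words count, sketched below. It is only one of many vectorization schemes, and the function names and sample sentences are hypothetical.

```python
def build_vocabulary(documents):
    """Map each distinct word to a fixed position, in sorted order."""
    words = sorted({word for doc in documents for word in doc.lower().split()})
    return {word: index for index, word in enumerate(words)}


def vectorize(document, vocabulary):
    """Count how often each vocabulary word appears in the document."""
    vector = [0] * len(vocabulary)
    for word in document.lower().split():
        if word in vocabulary:
            vector[vocabulary[word]] += 1
    return vector


docs = ["the cat sat", "the dog sat down"]
vocab = build_vocabulary(docs)

print(vocab)                            # {'cat': 0, 'dog': 1, 'down': 2, 'sat': 3, 'the': 4}
print(vectorize("the cat sat", vocab))  # [1, 0, 0, 1, 1]
print(vectorize("the the dog", vocab))  # [0, 1, 0, 0, 2]
```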

Visual sentiment analysis. Analysis algorithms that typically use a combination of image-extracted features to predict the sentiment of visual content. Also see multimodal sentiment analysis and sentiment analysis.


Weak AI. The term used to describe a narrow AI built and trained for a specific task. Also see strong AI.

Weight. In ML, a learnable parameter in nodes of a neural network, representing the importance value of a given feature, where input data is transformed (through multiplication) and the resulting value is either passed to the next layer or used as the model output.
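
A minimal sketch of how weights act inside a single node: each input is multiplied by its weight, the products are summed, and the result passes through an activation function. The sigmoid activation and the bias term are assumptions added for this example, and the input and weight values are hypothetical.

```python
import math


def neuron_output(inputs, weights, bias=0.0):
    """Weighted sum of the inputs plus a bias, squashed by a sigmoid."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-weighted_sum))


inputs = [1.0, 2.0]

# With all weights at zero, neither input influences the output.
print(neuron_output(inputs, [0.0, 0.0]))  # 0.5

# A larger weight on the first input lets that feature push the output up.
print(neuron_output(inputs, [3.0, 0.0]))
```

During training, the weights are the values that get adjusted so that the node's outputs move closer to the desired ones.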

Word embedding. In NLP, the vectorization of words and phrases, typically for the purpose of representing language in a low-dimensional space.


XAI (explainable AI). A set of tools and techniques that helps people understand and trust the output of machine learning algorithms.

XGBoost (Extreme Gradient Boosting). A popular machine learning library based on gradient boosting and parallelization to combine the predictions from multiple decision trees. XGBoost can be used for a variety of tasks, including classification, regression, and ranking.

X-risk. In AI, a hypothetical existential threat to humanity posed by highly advanced artificial intelligence such as artificial general intelligence or artificial superintelligence.


YOLO (You Only Look Once). A real-time object detection algorithm that uses a single forward pass in a neural network to detect and localize objects in images.


Zero-shot learning. A machine learning technique that allows a model to perform a task without being explicitly trained on a dataset for that task. Also see few-shot learning and one-shot learning.


Entefy is an advanced AI software and process automation company, serving SME and large enterprise customers across diverse industries including financial services, health care, retail, and manufacturing.