AI Terminology: ML, DL and LLM explained

apuntiturull
Oct 1, 2024

In recent years, the rise of Generative AI has brought a flood of buzzwords like AI, Machine Learning, Deep Learning, and Large Language Models. While these terms are frequently used, they are often misunderstood and used interchangeably, making it difficult to grasp their distinct meanings. This article aims to clear up that confusion by explaining the key differences between these technologies and why it’s important to understand them.

What is Artificial Intelligence?

Let’s begin with the key question: what is artificial intelligence? AI is a broad concept covering the techniques designed to replicate human logic, processes, and intelligence in machines, algorithms, or “artificial” brains. These techniques can be viewed as four nested areas, each a more specialized subset of the previous one: artificial intelligence (AI), machine learning (ML), deep learning (DL), and large language models (LLMs).

In this article, we will explore each of these fields in chronological order and with increasing levels of complexity. To replicate human logic, AI can use approaches that range from rule-based systems, where we explicitly program the rules into the machine, to more advanced methods that enable the system to learn these rules and logic by analyzing data with implicit, complex patterns.

The journey of AI begins with the simplest approach—rule-based AI and algorithms—which serves as the foundation for more sophisticated techniques.

Rule-Based AI and Algorithms

The most traditional form of AI involves creating rule-based algorithms. These algorithms follow predefined rules to replicate human reasoning. For instance, if we wanted to identify whether an image contains a dog, we could create a list of predefined facial characteristics. These might include features like “pointed ears”, “snout”, “wet nose”, and “whiskers.” We would then analyze the image to check for these specific facial features. If the image matches most of the predefined facial characteristics, we could classify it as containing a dog. If the image lacks these key features, we would classify it as not containing a dog.
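
The dog example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names come from the text, while the `contains_dog` function, the `min_matches` threshold, and the idea that some earlier image-analysis step has already produced a set of detected features are all hypothetical.

```python
# Predefined characteristics, taken from the example in the text.
DOG_FEATURES = ("pointed ears", "snout", "wet nose", "whiskers")


def contains_dog(detected_features, min_matches=3):
    """Rule-based check: say "dog" when most predefined features are present.

    `detected_features` is assumed to come from an image-analysis step
    that is not shown here; `min_matches` is an illustrative threshold.
    """
    matches = sum(1 for feature in DOG_FEATURES if feature in detected_features)
    return matches >= min_matches


print(contains_dog({"pointed ears", "snout", "wet nose"}))  # True: 3 of 4 rules match
print(contains_dog({"whiskers", "tail"}))                   # False: only 1 rule matches
```

Every decision here can be traced back to a rule a person wrote, and changing the behavior means editing `DOG_FEATURES` or `min_matches` by hand.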

In rule-based AI, we have full control over the algorithm’s logic. We know exactly how it processes information and why it makes specific decisions. This type of AI has been used widely, from regulating thermostats to guiding robots, and even powering Deep Blue, the chess program that famously defeated world champion Garry Kasparov in 1997.

These algorithms can range from simple logic to deep decision processes that include highly complex mathematical and statistical models. However, rule-based algorithms are limited in their ability to learn or adapt. If we need the system to behave differently, we must manually modify or add rules. This contrasts sharply with more advanced AI systems, like those powered by machine learning, which can adapt and learn from data.

Machine Learning (ML)

Machine Learning (ML) is a subset of AI that enables computers to learn from data without the need for explicit programming for every task. Instead of providing predefined rules, we supply examples, such as images labeled “contains a dog” or “does not contain a dog”, and the computer learns patterns from the data to make its own decisions. In this process, we control how the AI learns, but not exactly what it learns.

ML developers set up the learning framework, but the system itself discovers the underlying logic. To modify its behavior, we introduce new data rather than rewriting the rules. While we have control over the learning process and the input data, we lose the ability to directly govern the specific decisions the algorithm makes. Machine learning includes a broad range of models, each suited to learning different patterns in data. When referring to traditional ML, as opposed to deep learning, we typically mean models such as regression, decision trees, or XGBoost.

Tree-based models such as decision trees and XGBoost build their decision trees from the provided data, making them much more adaptable than rule-based systems. Rather than manually defining the tree’s structure, as we would in a rule-based approach, the model autonomously learns the most effective “decisions” based on a set of inputs and the desired output. During training, it refines these decisions to better align the results with the desired outcome, learning through a diverse set of input-output examples.

For instance, consider a machine learning model designed to classify objects based on their characteristics, such as diameter size and color. First, we extract relevant features from the dataset (e.g., object diameter and color). The model then uses these features to split the objects into different categories. For example, it might first split based on diameter size, and then further refine the classification using color. By training the model on labeled examples, it learns to accurately classify new objects based on these features.
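
A minimal sketch of this idea, in plain Python rather than a real ML library, might look like the following. The fruit dataset, the function names, and the two-level structure (a learned diameter threshold followed by a color rule on each side) are all assumptions made for illustration; production models such as XGBoost are far more elaborate.

```python
from collections import Counter, defaultdict

# Hypothetical labeled examples: (diameter in cm, color, label).
TRAINING_DATA = [
    (1.8, "green", "grape"), (2.0, "red", "cherry"),
    (2.2, "green", "grape"), (2.5, "red", "cherry"),
    (7.0, "red", "apple"), (7.5, "orange", "orange"),
    (7.8, "red", "apple"), (8.0, "orange", "orange"),
]


def fit_color_rules(rows):
    """For each color, remember the most common label among the given rows."""
    labels_by_color = defaultdict(Counter)
    for _, color, label in rows:
        labels_by_color[color][label] += 1
    return {color: counts.most_common(1)[0][0]
            for color, counts in labels_by_color.items()}


def count_errors(rows, rules):
    """Count rows whose label disagrees with the color rule."""
    return sum(1 for _, color, label in rows if rules[color] != label)


def train(rows):
    """Try every diameter threshold and keep the one with the fewest errors."""
    diameters = sorted({diameter for diameter, _, _ in rows})
    best = None
    for low, high in zip(diameters, diameters[1:]):
        threshold = (low + high) / 2
        small = [row for row in rows if row[0] <= threshold]
        large = [row for row in rows if row[0] > threshold]
        small_rules, large_rules = fit_color_rules(small), fit_color_rules(large)
        errors = count_errors(small, small_rules) + count_errors(large, large_rules)
        if best is None or errors < best[0]:
            best = (errors, threshold, small_rules, large_rules)
    return best[1:]


def predict(model, diameter, color):
    """First split on diameter, then refine using color."""
    threshold, small_rules, large_rules = model
    rules = small_rules if diameter <= threshold else large_rules
    return rules.get(color, "unknown")


model = train(TRAINING_DATA)
print(model[0])                    # 4.75: the learned diameter threshold
print(predict(model, 2.1, "red"))  # cherry
print(predict(model, 7.6, "red"))  # apple
```

Nobody wrote the 4.75 cm threshold or the color rules by hand. They were derived from the labeled examples, so changing the model’s behavior means changing the data.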

ML models are widely used in applications like product recommendations, energy demand forecasting, and credit risk assessment. They leverage historical data and patterns of similar behavior to predict future outcomes. However, as data becomes more complex, it becomes increasingly difficult to fully comprehend how these systems arrive at their conclusions. There are certain tasks, often requiring intuition or common sense, that even highly sophisticated decision trees struggle to handle. This is where Deep Learning excels, addressing limitations of traditional ML by uncovering more intricate patterns in the data.

Deep Learning (DL)

Deep learning, a subset of ML, takes things a step further by using artificial neural networks inspired by how biological neurons work. These networks can learn complex patterns that simpler models struggle to capture. For example, a deep learning model tasked with analyzing emotions in text doesn’t just count words; it can capture the relationships between them, their context, and even nuances like sarcasm and humor.
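
To make the idea of layered artificial neurons more concrete, here is a toy sketch of a two-layer network in plain Python. The weights are hand-picked for illustration, not learned, and the names (`dense_layer`, `forward`) are hypothetical. It computes XOR (“one input or the other, but not both”), a pattern that no single weighted sum of the two inputs can capture but that two stacked layers can.

```python
def relu(value):
    """A common activation function: pass positive values, clip negatives to 0."""
    return max(0.0, value)


def identity(value):
    return value


def dense_layer(inputs, weights, biases, activation):
    """One layer: each neuron takes a weighted sum of the inputs, adds a bias,
    and applies the activation function."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(total))
    return outputs


# Hand-picked weights (an assumption for this sketch; a real network
# would learn them from data during training).
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]


def forward(x1, x2):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = dense_layer([x1, x2], HIDDEN_WEIGHTS, HIDDEN_BIASES, relu)
    return dense_layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES, identity)[0]


for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, forward(a, b))  # outputs 0.0, 1.0, 1.0, 0.0 respectively
```

Even in this tiny case, the hidden-layer values have no obvious human meaning on their own. Real networks contain vastly more weights than these six, which is a large part of why their reasoning is hard to inspect.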

The way deep learning models operate is highly complex. Each layer in the neural network processes data differently, making it difficult to understand exactly how decisions are made. These models are often referred to as "black boxes" because, while we can see the inputs and outputs, the inner workings remain opaque. This opacity is particularly concerning in high-risk sectors like healthcare, finance, and legal systems. Relying on decisions made by AI without clear reasoning—especially when those decisions can deeply impact someone’s life—poses significant risks. In these sectors where explainability is critical, there is growing attention to developing tools and techniques to provide more insight into how deep learning models reach their conclusions. However, achieving full transparency in these systems remains a challenge.

Despite their complexity, deep learning models are extremely powerful and are widely used in applications such as voice recognition, image analysis, and text classification. Real-world examples include self-driving cars, automatic quality inspection on production lines, spam email filtering, and traffic monitoring through smart cameras. One of the most notable applications of deep learning is in natural language processing, where large language models (LLMs) such as those behind ChatGPT have transformed how we process text.

Large Language Models (LLMs)

LLMs are specialized AI systems designed to understand and generate human language. They can perform tasks like classifying, summarizing, and even creating text. But how does an AI model learn the intricacies of language?

LLMs are trained on massive amounts of text, learning to predict the next word in a sentence, much like the predictive text function on a smartphone. By continuously predicting word after word, these models can generate coherent and contextually relevant text.
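
The next-word idea can be illustrated with a deliberately tiny stand-in. The sketch below is not how an LLM is built: it simply counts which word follows which in a made-up corpus and always picks the most frequent follower, whereas real LLMs use large neural networks that look at far more context. The corpus and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Tiny made-up "training text" (an assumption for this sketch).
CORPUS = "the dog chased the cat . the dog ate the bone . the dog chased the ball ."


def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    words = text.split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = model.get(word)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(model, start, length):
    """Generate text by repeatedly predicting the next word."""
    words = [start]
    for _ in range(length):
        next_word = predict_next(model, words[-1])
        if next_word is None:
            break
        words.append(next_word)
    return " ".join(words)


model = train_bigram_model(CORPUS)
print(predict_next(model, "the"))  # dog ("dog" follows "the" three times)
print(generate(model, "the", 3))   # the dog chased the
```

The output looks like language only because it mirrors patterns in the training text. The model has no notion of what a dog is.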

However, LLMs don’t actually "understand" the content. They generate text based on patterns found in their training data, earning them the nickname "stochastic parrots": they repeat patterns rather than grasp meaning. The decision-making process of these models is highly complex, involving billions of parameters that are tuned during training, which makes it extremely difficult to explain exactly why they choose one word over another.

This complexity presents challenges in controlling the ethical use of AI. Companies are investing heavily in methods to guide AI behavior, but we are still far from having full control.

Conclusion

AI has the potential to transform our daily lives and the way the world functions. However, it’s important to recognize that AI isn’t a one-size-fits-all solution. It encompasses a wide range of technologies, each with its own strengths and applications. Understanding the distinctions between these technologies is key to choosing the right approach for each specific challenge. Different AI methods require different volumes and types of data, as well as distinct processes for training and optimizing models for particular tasks. The first step in harnessing the power of AI effectively is to gain a clear understanding of the various fields within this broad and rapidly evolving discipline.