
What do we mean when we talk about AI?
There’s a lot of confusion about artificial intelligence. Different people with different aims and levels of knowledge mean different things when they talk about AI. The average layperson usually means a large language model, while a data analytics company might be talking about a neural network. The technology industry throws around terms like ‘deep learning’ and ‘tokenisation’ on a whim, while CEOs and directors are eager to integrate the technology into their businesses come hell or high water, with varying degrees of advisability and success.
Various types of data-driven algorithms have entered the mainstream over the last 15 years, each a powerful tool with its own use case, and each with its own connections to, and overlaps with, AI.
What is going on beneath the jargon?

What is a large language model?
Large language models (LLMs) are easily the most recognised entity in current artificial intelligence discourse, having effectively become synonymous with AI. Millions of people now interact with LLMs like ChatGPT, Grok, or Claude on a daily basis, and many have come to think of LLMs as being AI in its totality. While understandable, this isn’t strictly accurate.
An LLM processes and generates human-like text. It learns patterns, context, and grammar by ingesting vast amounts of textual data, then uses those patterns to predict the next word in a given sequence based on context.
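To make that concrete, here is a deliberately crude sketch of next-word prediction. It uses raw word counts rather than a neural network – nothing like a real LLM’s machinery, but it shows the same underlying idea of learning which words tend to follow which:

```python
from collections import Counter, defaultdict

# A toy corpus; a real LLM ingests billions of documents.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count how often each word follows each other word (a 'bigram' model).
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(predict_next("sat"))  # 'on'
print(predict_next("on"))   # 'the'
```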
To do this, LLMs rely on ‘transformer architecture’. Transformers are a complicated, somewhat abstract, area of this emergent technology, but as a practical baseline they can be thought of as stacks of token-processing layers. These individual layers handle things like the position of tokens in a sequence and the contextual relationships between words and sentences.
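As a rough illustration, the core operation inside each transformer layer is ‘attention’, which scores how relevant every token is to every other token and mixes their vectors accordingly. Here is a minimal NumPy sketch, with toy dimensions – real models use learned weight matrices and stack many such layers:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core transformer operation: each token's query (Q) is scored
    against every token's key (K); the scores then weight the values (V)."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # relevance score for every token pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax -> attention weights
    return weights @ V  # context-aware representation of each token

# Three tokens, each represented by a 4-dimensional vector (toy numbers).
tokens = np.random.rand(3, 4)
out = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention
print(out.shape)  # (3, 4): one context-mixed vector per token
```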
Tokens are effectively the ‘units’ that an AI model processes. In ChatGPT, for instance, these units include symbols, characters, affixes and subwords, and whole words. This segmentation is what people mean when they talk about ‘tokenisation’. Each token is given a numerical ID and mapped to a ‘vector’ (also called an ‘embedding’). These vectors are the means by which the transformer architecture turns tokens into contextually meaningful data for the LLM to communicate.
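A rough sketch of that pipeline, using a made-up four-token vocabulary and random vectors standing in for the embeddings a real model would learn:

```python
import numpy as np

# Hypothetical vocabulary: token string -> numerical ID.
vocab = {"un": 0, "believ": 1, "able": 2, "!": 3}

def tokenise(pieces):
    """Map pre-split text pieces (subwords, symbols) to token IDs."""
    return [vocab[p] for p in pieces]

# Embedding table: one vector per token ID (random here; learned in practice).
embedding_dim = 4
embeddings = np.random.rand(len(vocab), embedding_dim)

ids = tokenise(["un", "believ", "able", "!"])  # "unbelievable!" as subwords
vectors = embeddings[ids]  # shape (4, 4): one vector per token
print(ids)      # [0, 1, 2, 3]
print(vectors)  # the vectors the transformer layers actually operate on
```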
LLMs are usually just a single part of a larger AI system. You can think of them as analogous to the communications department in a business: they take information, contextualise it, and attempt to communicate it in a meaningful way. An LLM, however, is not directly responsible for tasks like knowledge retrieval or data analysis. Those tasks would be handled by other, better-suited parts of the larger system.
However, it should be noted that the core difference between an LLM and a communications department is the human element. Unlike thinking human professionals, an LLM brings no insight or intentionality to its output. At a fundamental level, LLMs do not understand the information they are communicating, nor the context in which that information is situated. They also cannot recall past contexts, learn from them, or adapt that knowledge to novel projects. They simply translate data, within a series of linguistic constraints, into intelligible information.
What is an artificial neural network?
Often shortened to ‘neural nets’, an artificial neural network (ANN) is a computational model designed to parallel the functioning of the human brain. It is composed of many nodes, arranged in organised layers, that in theory mimic the neurons in the brain.
ANNs can learn to recognise patterns and relationships between data points, adjusting the weights of the connections between nodes as they are trained – in a similar fashion to how synaptic connections in the brain are strengthened by repeatedly thinking about a given subject, one part of a broader phenomenon known as ‘neuroplasticity’.
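A minimal sketch of that weight adjustment in action: a single artificial ‘neuron’ learns the logical AND of two inputs by repeatedly nudging its connection weights in whichever direction reduces its error. Real networks do this across millions or billions of weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: learn the logical AND of two binary inputs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

weights = rng.normal(size=2)  # connection weights, randomly initialised
bias = 0.0
lr = 0.5  # learning rate: how far to nudge the weights each step

for _ in range(5000):
    pred = 1 / (1 + np.exp(-(X @ weights + bias)))  # sigmoid activation
    error = pred - y
    # Strengthen or weaken each connection in proportion to its share
    # of the error - loosely analogous to synaptic strengthening.
    weights -= lr * (X.T @ error) / len(X)
    bias -= lr * error.mean()

print(np.round(pred))  # [0. 0. 0. 1.] - the neuron has learned AND
```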
ANNs have many subtypes – Feedforward Neural Networks (FNNs), Convolutional Neural Networks (CNNs), and Recurrent Neural Networks (RNNs), to name a few – all of which have different designs that make them more suitable for different applications.
However, ANNs have several drawbacks. They require large datasets to train and significant resources to run. Once they have been trained, their parameters can be difficult to adjust, and over time they can become black boxes whose workings even their own creators no longer completely understand. One of the most famous admissions of this came in April 2023, when Google’s CEO, Sundar Pichai, confessed that Google themselves weren’t always clear on how their Bard AI arrived at some of its answers: “You don’t fully understand. And you can’t quite tell why it said this, or why it got it wrong.”
To put that in context, when Pichai said that, Bard was running on the LaMDA system – a model with 137 billion parameters. This was replaced in the same year by PaLM 2, the successor to the 540-billion-parameter PaLM. Google has since moved on to Gemini, whose full parameter counts have not been publicly disclosed and which, owing to its multimodal architecture, is difficult to measure in terms of sheer parameter volume.
Technically, an LLM is a type of ANN, which makes the line between the two blurry. ANNs can be thought of as a broad class of systems, of which LLMs are one subtype – analogous to the way a genus contains multiple species on a clade diagram. LLMs are the species; ANNs are the genus.
While it isn’t strictly accurate to call Gemini an ANN, the model is built on neural network architecture at its core. Gemini in particular incorporates both ANNs and LLMs, reflecting the increasing complexity of AI models as a whole – and, in turn, the misconceptions around them.
What is a machine learning algorithm?
‘Machine learning algorithm’ (MLA) is a broad umbrella term for a range of algorithms that enable pattern recognition, prediction, and decision-making without those functions being directly programmed into the code.
MLAs are broadly typed under the following categories:
- Supervised learning – In which a model is trained on labelled datasets to facilitate learning an input-to-output mapping (a sketch below illustrates this);
- Unsupervised learning – In which a model is trained on unlabelled datasets to discern structures and patterns;
- Reinforcement learning – In which agents are trained, via reward and punishment systems, to make sequential decisions through environmental interaction. One of the most well-known categories, this kind of algorithm is commonly found in evolutionary simulations and autonomous robotics;
- Ensemble learning – In which a model combines the cumulative output of multiple other models to improve the accuracy of its predictions, often outperforming any single model alone;
- Deep learning – In which a model uses multiple layered neural nets, often referred to as ‘deep networks’, to tackle large data sets and discover complex patterns therein. This is possibly the most well-known term owing to its use in technology-based marketing and public relations collateral.
Each of these main types also has subtypes for various use cases.
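To ground the first of those categories, here is a toy supervised-learning sketch: a one-nearest-neighbour classifier that learns an input-to-output mapping purely from labelled examples. The data is invented for illustration; real datasets are far larger:

```python
import numpy as np

# Labelled training data: each point is (height_cm, weight_kg) -> species label.
X_train = np.array([[30, 4], [35, 5], [80, 30], [90, 35]], dtype=float)
y_train = ["cat", "cat", "dog", "dog"]

def predict(x):
    """Label a new point with the label of its nearest training example."""
    distances = np.linalg.norm(X_train - x, axis=1)
    return y_train[int(np.argmin(distances))]

print(predict(np.array([33, 4.5])))  # 'cat'
print(predict(np.array([85, 32])))   # 'dog'
```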
As with the previous categories, it should be clear that the line between MLAs and ANNs isn’t strict. Both can be applied to the same tasks, and share the broad goal of finding patterns in data. They often function alongside one another in more complex hybrid models to bolster accuracy and performance.
To return to the clade diagram example from the explanation of ANNs, MLAs can be thought of as the larger category to which ANNs belong, just as a genus is part of a family in biological taxonomy. All ANNs are MLAs, but not all MLAs are ANNs.
So what exactly is AI?
AI is effectively an umbrella term. Any AI system comprises a number of subsystems, depending on its intended use case. AI can be broadly thought of as the attempt to displace the role of human cognition across a wide range of tasks for which simpler algorithms were unsuitable, in both execution and adaptability.
Machine learning algorithms, artificial neural networks, and large language models are all systems within the field of artificial intelligence. They are built with distinct goals in mind, but often serve overlapping purposes and have become increasingly interconnected in practice as the field develops.
The loose use of all of these terms, and the foggy knowledge thereof, is understandable: the mass of industries attempting to sell AI-based products, educate the public, or find novel use cases are frequently guilty of using the terms interchangeably, or of failing to explain them at all.
As we move further into the second quarter of the 21st century, we can expect that confusion to deepen, as a technological field still wet behind the ears transitions at breakneck pace from LLMs and neural nets into agentic and multimodal AI, and as advances in semiconductors and materials drive significant leaps in processor technology.
It is important, therefore, that we have a basic understanding of this emergent technology. Doing so should lead to better decision making and use in both personal and professional spheres. As the field develops and grows more complex, this will only become more pertinent. The good news is that the very technology that we need to understand has made research easier than ever before. Just remember to verify.