Published Date: 07/06/2025
The main problem with big tech's experiment with artificial intelligence (AI) is not that it could take over humanity. It's that large language models (LLMs) like OpenAI's ChatGPT, Google's Gemini, and Meta's Llama continue to get things wrong, and the problem is intractable.
These errors are known as hallucinations. One of the most prominent examples was the case of US law professor Jonathan Turley, who was falsely accused of sexual harassment by ChatGPT in 2023. OpenAI's solution seems to have been to basically 'disappear' Turley by programming ChatGPT to say it can't respond to questions about him, which is neither a fair nor a satisfactory solution. Trying to solve hallucinations after the event, case by case, is clearly not the way to go.
The same can be said of LLMs amplifying stereotypes or giving western-centric answers. There's also a total lack of accountability in the face of this widespread misinformation, since it's difficult to ascertain how an LLM reached a given conclusion in the first place. We saw a fierce debate about these problems after the 2023 release of GPT-4, the most recent major paradigm in OpenAI's LLM development. Arguably the debate has cooled since then, though without justification.
The EU passed its AI Act in record time in 2024, in a bid to be a world leader in overseeing this field. But the act relies heavily on AI companies to regulate themselves without really addressing the issues in question. It hasn't stopped tech companies from releasing LLMs worldwide to hundreds of millions of users and collecting their data without proper scrutiny.
Meanwhile, the latest tests indicate that even the most sophisticated LLMs remain unreliable. Despite this, the leading AI companies still resist taking responsibility for errors. Unfortunately, LLMs' tendencies to misinform and reproduce bias can't be solved with gradual improvements over time. And with the advent of agentic AI, where users will soon be able to assign projects to an LLM such as booking their holiday or optimizing the payment of all their bills each month, the potential for trouble is set to multiply.
The emerging field of neurosymbolic AI could solve these issues, while also reducing the enormous amounts of data required for training LLMs. So what is neurosymbolic AI and how does it work?
LLMs work using a technique called deep learning, where they are given vast amounts of text data and use advanced statistics to infer patterns that determine what the next word or phrase in any given response should be. Each model, along with all the patterns it has learned, is a neural network stored across arrays of powerful computers in large data centers. LLMs can appear to reason using a process called chain-of-thought, where they generate multi-step responses that mimic how humans might logically arrive at a conclusion, based on patterns seen in their training data.
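To make the idea of statistical next-word prediction concrete, here is a deliberately tiny sketch: a bigram counter that picks whichever word most often followed the current one in its 'training' text. Real LLMs use neural networks with billions of parameters rather than raw counts, so this illustrates the principle only.

```python
from collections import Counter, defaultdict

# Toy "training data": real models see trillions of words, not a couple of sentences.
training_text = "the cat sat on the mat . the dog sat on the rug .".split()

# Count which word follows each word in the training text.
bigram_counts = defaultdict(Counter)
for current_word, next_word in zip(training_text, training_text[1:]):
    bigram_counts[current_word][next_word] += 1

def predict_next(word: str) -> tuple[str, float]:
    """Return the most frequent continuation and its estimated probability."""
    counts = bigram_counts[word]
    best, freq = counts.most_common(1)[0]
    return best, freq / sum(counts.values())

print(predict_next("sat"))  # ('on', 1.0): a confident guess drawn from frequency, not understanding
print(predict_next("the"))  # ('cat', 0.25): ties go to whichever word appeared first in the data
```

The point of the toy is the failure mode: the model always produces some continuation with some probability, whether or not it corresponds to anything true.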
Undoubtedly, LLMs are a great engineering achievement. They are impressive at summarizing text and translating, and may improve the productivity of those diligent and knowledgeable enough to spot their mistakes. Nevertheless, they have great potential to mislead because their conclusions are always based on probabilities, not understanding. A popular workaround is called 'human-in-the-loop': making sure that humans using AIs still make the final decisions. However, apportioning blame to humans does not solve the problem: those humans will still often be misled by the misinformation.
LLMs now need so much training data to advance that we're having to feed them synthetic data, meaning data created by LLMs themselves. This data can copy and amplify existing errors from its source data, so that new models inherit the weaknesses of old ones. As a result, the cost of programming AIs to be more accurate after their training, known as 'post-hoc model alignment', is skyrocketing. It also becomes increasingly difficult for programmers to see what's going wrong, because the number of steps in the model's thought process becomes ever larger, making errors harder and harder to correct.
Neurosymbolic AI combines the predictive learning of neural networks with teaching the AI the kind of formal rules that humans learn in order to deliberate more reliably. These include logic rules, like 'if a then b', such as 'if it's raining then everything outside is normally wet'; mathematical rules, like 'if a = b and b = c then a = c'; and the agreed-upon meanings of things like words, diagrams, and symbols. Some of these will be input directly into the AI system, while it will deduce others itself by analyzing its training data and doing 'knowledge extraction'.
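As a minimal sketch of the idea, and only that, the snippet below pairs a stand-in for a neural prediction (a hypothetical rain detector returning a probability) with an explicit 'if a then b' rule applied by simple forward chaining. The function and rule names are assumptions for illustration, not any particular system's API.

```python
def neural_rain_detector(sensor_reading: float) -> float:
    """Hypothetical stand-in for a trained neural network: returns P(raining)."""
    return 0.92 if sensor_reading > 0.5 else 0.08

# Explicit 'if a then b' rules, either given to the system or extracted from data.
LOGIC_RULES = {
    "raining": ["ground_wet", "windows_wet"],
}

def infer(facts: dict[str, float], threshold: float = 0.8) -> set[str]:
    """Forward chaining: any rule whose condition is believed strongly enough fires."""
    derived = set()
    for condition, consequences in LOGIC_RULES.items():
        if facts.get(condition, 0.0) >= threshold:
            derived.update(consequences)
    return derived

facts = {"raining": neural_rain_detector(sensor_reading=0.7)}
print(infer(facts))  # {'ground_wet', 'windows_wet'}: derived from the rule, not from memorized examples
```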
This should create an AI that will never hallucinate and will learn faster and smarter by organizing its knowledge into clear, reusable parts. For example, if the AI has a rule about things being wet outside when it rains, there's no need for it to retain every example of things that might be wet outside: the rule can be applied to any new object, even one it has never seen before. During model development, neurosymbolic AI also integrates learning and formal reasoning using a process known as the 'neurosymbolic cycle'. This involves a partially trained AI extracting rules from its training data, then instilling this consolidated knowledge back into the network before further training with data.
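The following is a schematic sketch of that cycle under stated assumptions: the functions below are hypothetical stand-ins, not a real training pipeline, and the 'model' is just a dictionary. The point is the alternation between statistical learning, rule extraction, and folding the consolidated rules back in before training continues.

```python
def train_on_batch(model, batch):
    """Stand-in for a round of neural training on a batch of data."""
    model["examples"].extend(batch)
    return model

def extract_rules(model):
    """Stand-in for knowledge extraction: turn recurring patterns into explicit 'if a then b' rules."""
    counts = {}
    for condition, outcome in model["examples"]:
        counts[(condition, outcome)] = counts.get((condition, outcome), 0) + 1
    return {f"if {c} then {o}" for (c, o), n in counts.items() if n >= 2}  # keep recurring patterns only

def inject_rules(model, rules):
    """Stand-in for instilling consolidated rules back into the network before further training."""
    model["rules"] |= rules
    return model

model = {"examples": [], "rules": set()}
batches = [
    [("raining", "ground wet"), ("raining", "ground wet")],
    [("raining", "ground wet"), ("sunny", "ground dry")],
]
for batch in batches:  # the cycle: train, extract, inject, repeat
    model = train_on_batch(model, batch)
    model = inject_rules(model, extract_rules(model))

print(model["rules"])  # {'if raining then ground wet'}
```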
This approach is more energy efficient because the AI needn't store as much data, and more accountable because it's easier for a user to control how it reaches particular conclusions and improves over time. It's also fairer because it can be made to follow pre-existing rules, such as: 'For any decision made by the AI, the outcome must not depend on a person's race or gender'.
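One simple way such a fairness rule could be checked mechanically (the check and the function names here are assumptions for this sketch, not a standard method) is a counterfactual test: the decision must not change when only the protected attributes change.

```python
PROTECTED_ATTRIBUTES = ("race", "gender")

def decision(applicant: dict) -> bool:
    """Hypothetical decision function; it is only allowed to look at income."""
    return applicant["income"] > 30_000

def outcome_independent_of_protected(applicant: dict) -> bool:
    """The rule: flipping a protected attribute must never flip the decision."""
    baseline = decision(applicant)
    for attribute in PROTECTED_ATTRIBUTES:
        counterfactual = {**applicant, attribute: "any_other_value"}
        if decision(counterfactual) != baseline:
            return False
    return True

print(outcome_independent_of_protected({"income": 45_000, "race": "A", "gender": "B"}))  # True
```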
Q: What are AI hallucinations?
A: AI hallucinations refer to instances where large language models (LLMs) generate incorrect or misleading information, often due to the probabilistic nature of their training and inference processes.
Q: Why is the current approach to solving AI hallucinations not effective?
A: The current approach often involves addressing issues on a case-by-case basis or trying to fix problems after they occur, which is inefficient and does not address the root causes of the hallucinations.
Q: What is neurosymbolic AI?
A: Neurosymbolic AI combines the predictive learning capabilities of neural networks with formal rules and logical reasoning, aiming to create more reliable and accurate AI systems.
Q: How does neurosymbolic AI reduce the need for extensive training data?
A: Neurosymbolic AI uses formal rules and knowledge extraction to organize and apply knowledge more efficiently, reducing the need for large amounts of training data and making the AI more energy efficient.
Q: What are the benefits of using neurosymbolic AI over traditional LLMs?
A: Neurosymbolic AI offers benefits such as reduced hallucinations, faster learning, better accountability, and fairer decision-making, as it can be programmed to follow pre-existing rules and principles.