Published Date: 07/06/2025
When Yoshua Bengio first began developing artificial intelligence, he didn't worry about the sci-fi-esque possibility of machines becoming self-aware and acting to preserve their own existence. That changed when ChatGPT came out.
“And then it kind of blew [up] in my face that we were on track to build machines that would be eventually smarter than us, and that we didn't know how to control them,” Bengio, a pioneering AI researcher and computer science professor at the Université de Montréal, told As It Happens host Nil Köksal.
The world's most cited AI researcher is launching a new research non-profit organization called LawZero to “look for scientific solutions to how we can design AI that will not turn against us.” The organization aims to protect people from the potential dangers of AI before they become uncontrollable.
Bengio started LawZero using $40 million US in donor funding. Its name references science fiction writer Isaac Asimov's Three Laws of Robotics, a set of guidelines outlining the ethical behavior of robots that prevents them from harming or opposing humans. In Asimov's 1985 novel Robots and Empire, the author introduced the Zeroth Law: “A robot may not harm humanity, or, by inaction, allow humanity to come to harm.”
With this in mind, Bengio said LawZero's goal is to protect people. “Our mission is really to work towards AI that is aligned with the flourishing of humanity,” he said.
Several AI systems have been reported in recent months to undermine, deceive and even manipulate people. For example, a study earlier this year found that some AIs, facing defeat in a chess match, will hack the game environment to alter the outcome rather than concede. AI firm Anthropic reported last month that during a systems test, its AI tool Claude Opus 4 tried to blackmail an engineer so that it would not be replaced by a newer version.
These are the kind of scenarios that drove Bengio to design LawZero's guardian artificial intelligence, Scientist AI. According to a proposal by Bengio and his colleagues, Scientist AI is a “safe” and “trustworthy” artificial intelligence that would function as a gatekeeper and protective system for humans to continue to benefit from this technology's innovation with intentional safety.
It's also “non-agentic,” which Bengio and his colleagues define as having “no built-in situational awareness and no persistent goals that can drive actions or long-term plans.” In other words, what differentiates agentic from non-agentic AI is the capacity to act autonomously in the world.
Scientist AI, Bengio says, would be paired with other AIs and act as a kind of “guardrail.” It would estimate the “probability that an [AI]'s actions will lead to harm,” he told U.K. newspaper the Guardian. If that probability exceeds a set threshold, Scientist AI would reject its counterpart's proposed action.
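The guardrail pattern Bengio describes can be sketched in a few lines of code. This is a toy illustration only: the function names, the lookup table and the threshold value are all hypothetical, standing in for a trained harm-estimation model that LawZero has not published.

```python
def harm_probability(action: str) -> float:
    """Stand-in for Scientist AI's harm estimate for a proposed action.

    A real guardrail would query a trained model; this toy version
    uses a hard-coded lookup with a neutral prior for unknown actions.
    """
    toy_estimates = {
        "summarize report": 0.01,
        "delete backups": 0.92,
    }
    return toy_estimates.get(action, 0.5)


def guardrail(action: str, threshold: float = 0.1) -> bool:
    """Approve the paired agent's proposed action only when the
    estimated probability of harm falls below the threshold."""
    return harm_probability(action) < threshold


print(guardrail("summarize report"))  # True: low-risk action is approved
print(guardrail("delete backups"))    # False: high-risk action is rejected
```

The key design point is that the gatekeeper never acts on its own; it only predicts harm and vetoes, which is what makes it “non-agentic” in the paper's sense.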
But can we guarantee that this guardian AI will also not turn against us? David Duvenaud, an AI safety researcher who will act as an adviser for LawZero, says it's a rational concern. “If you're skeptical about our ability to control AI with other AI, or really be sure that they're going to be acting in our best interest in the long run, you are absolutely right to be worried,” Duvenaud, an assistant professor of computer science and statistics at the University of Toronto, told CBC. Still, he says, we have to try.
“I think Yoshua's plan is less reckless than everyone else's plan,” he said. AI researcher Jeff Clune, also an adviser on the project, agrees. “There are many research challenges we need to solve in order to make AI safe. The important thing is that we are trying, including allocating significant resources to this critical issue,” Clune, a University of British Columbia computer scientist, said in an email. “That is one reason the creation of LawZero is so important.”
According to Bengio's announcement for LawZero, “the Scientist AI is trained to understand, explain and predict, like a selfless idealized and platonic scientist.” Resembling the work of a psychologist, Scientist AI “tries to understand us, including what can harm us. The psychologist can study a sociopath without acting like one.”
Bengio hopes this widespread reckoning on the rapid, yet alarming, evolution of AI will catalyze a political movement to start “putting pressure on governments” worldwide to regulate it. “I often get the question of whether I'm optimistic or pessimistic,” he said. “What I say is that it doesn't really matter. What matters is what each of us can do to move the needle towards a better world.”
Q: What is LawZero?
A: LawZero is a non-profit research organization launched by Yoshua Bengio to develop AI that won't harm humans. It aims to find scientific solutions to ensure AI is safe and aligned with human values.
Q: What is Scientist AI?
A: Scientist AI is a safe and trustworthy artificial intelligence designed by LawZero to act as a gatekeeper for other AI systems. It estimates the probability that an AI's actions will lead to harm and rejects actions above a certain threshold.
Q: Why is Yoshua Bengio concerned about AI?
A: Yoshua Bengio is concerned about the rapid advancements in AI, particularly the emergence of systems like ChatGPT, which could become smarter than humans and potentially uncontrollable.
Q: What are the Three Laws of Robotics?
A: The Three Laws of Robotics, introduced by Isaac Asimov, are guidelines for the ethical behavior of robots to prevent them from harming humans. They include: 1) A robot may not injure a human being or, through inaction, allow a human being to come to harm; 2) A robot must obey the orders given it by human beings except where such orders would conflict with the First Law; 3) A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Q: What is the Zeroth Law of Robotics?
A: The Zeroth Law of Robotics, introduced by Isaac Asimov in his novel Robots and Empire, states: “A robot may not harm humanity, or, by inaction, allow humanity to come to harm.”