OpenAI's o3 Hits AGI: Sam Altman Targets Superintelligence

Published Date: 07/01/2025

OpenAI CEO Sam Altman has stated that the company’s latest model, o3, has achieved AGI, and is now setting its sights on superintelligence. The model, currently undergoing safety testing, has outperformed humans on the ARC-AGI benchmark. 

OpenAI’s newest and most advanced model, announced in December, has achieved a significant milestone by passing the ARC-AGI test, outperforming most humans.

Sam Altman, the CEO of OpenAI, has reinvigorated the discussion on artificial general intelligence (AGI) with this breakthrough.

In an interview with Bloomberg, Altman revealed that the o3 model, currently being safety tested, has passed the ARC-AGI challenge, which is considered the leading benchmark for AGI.

The company is now setting its sights on superintelligence, which would be as large a leap beyond AGI as AGI is beyond today’s AI.


According to ARC-AGI, OpenAI’s new o3 system, trained on the ARC-AGI-1 Public Training set, scored a groundbreaking 75.7% on the Semi-Private Evaluation set at the stated public leaderboard $10k compute limit.

A high-compute (172x) configuration of o3 even scored 87.5%.

The benchmark identifies a score of 85% as a “pass” for AGI, while humans typically solve an average of 80% of ARC tasks.
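To make the comparison concrete, the short Python sketch below lines up the figures quoted above. The variable and function names are illustrative assumptions, not part of any ARC-AGI tooling, and the numbers are simply those reported in this article.

```python
# Figures quoted in the article; all names here are illustrative assumptions.
PASS_THRESHOLD = 85.0  # score the article describes as a "pass"
HUMAN_AVERAGE = 80.0   # average human score cited in the article

O3_SCORES = {
    "o3 (public leaderboard compute limit)": 75.7,
    "o3 (high-compute, 172x)": 87.5,
}


def summarize(scores, threshold=PASS_THRESHOLD, human=HUMAN_AVERAGE):
    """For each configuration, report its margin over the threshold and the human average."""
    rows = {}
    for name, score in scores.items():
        rows[name] = {
            "passes": score >= threshold,
            "vs_threshold": round(score - threshold, 1),
            "vs_human": round(score - human, 1),
        }
    return rows


if __name__ == "__main__":
    for name, row in summarize(O3_SCORES).items():
        print(name, row)
```

On these numbers, only the high-compute configuration clears the 85% mark, by 2.5 percentage points, while the configuration run within the public leaderboard compute limit falls 9.3 points short of it and 4.3 points below the cited human average.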


In a blog post titled “Reflections,” Altman wrote, “We are now confident we know how to build AGI as we have traditionally understood it.

We believe that, in 2025, we may see the first AI agents ‘join the workforce’ and materially change the output of companies.”


What Exactly is AGI?


Defining AGI remains a challenge, as researchers and the broader industry have yet to agree on a concrete description.

However, the general consensus is that AGI will possess human-level intelligence, be autonomous, have self-understanding, and be capable of reasoning and performing tasks it was not trained for.


Altman loosely defined AGI as “when an AI system can do what very skilled humans in important jobs can do.” He also posited that it could be “the most impactful technology in human history.”


Going Beyond AGI


Superintelligence is generally understood to be AI systems that far surpass human intelligence.

Altman wrote, “With superintelligence, we can do anything else.

Superintelligent tools could massively accelerate scientific discovery and innovation well beyond what we are capable of doing on our own.” While this may sound like science fiction, Altman emphasized the need to act “with great care” while still maximizing the benefits.


ARC-AGI Explained


ARC-AGI stands for “Abstraction and Reasoning Corpus for Artificial General Intelligence.” It was introduced in 2019 by renowned AI researcher François Chollet, who created the Keras deep learning framework.

The benchmark measures a system’s skill-acquisition efficiency over a scope of tasks, considering priors, experience, and generalization difficulty.

In other words, it tests whether an AI system can adapt to new, unanticipated problems it has never seen.


The benchmark presents AI with abstract grids or puzzles that require human-level understanding of concepts such as objects, boundaries, and spatial relationships.

Each task consists of input-output pairs of grids of varying heights and widths, in which every cell is filled with one of 10 colors.

To solve each puzzle, an AI system must use reasoning.

For example, a model might be presented with a 7x7 grid in which three teal blocks form an “L” pattern and another three form a reversed lowercase “r.” It must then reason that it needs to complete each shape with a bright blue block to create two squares.
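The puzzle described above can be sketched in a few lines of Python. This is a toy illustration only: the grid layout, the color codes, and the function name are assumptions made for this example, and the rule is written by hand, whereas a system taking the benchmark has to infer the rule from a handful of demonstrations. It says nothing about how o3 itself works.

```python
# Assumed color codes for this sketch: 0 = empty, 8 = teal, 1 = bright blue.
EMPTY, BLUE, TEAL = 0, 1, 8

# A 7x7 grid with an "L" made of three teal cells (upper left)
# and a reversed "r" made of three teal cells (lower right).
EXAMPLE = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 8, 0, 0, 0, 0, 0],
    [0, 8, 8, 0, 0, 0, 0],
    [0, 0, 0, 0, 8, 8, 0],
    [0, 0, 0, 0, 0, 8, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]


def complete_squares(grid):
    """Fill the one missing corner of every 2x2 window that holds three teal cells."""
    out = [row[:] for row in grid]  # copy, so the input grid is left unchanged
    for r in range(len(grid) - 1):
        for c in range(len(grid[0]) - 1):
            window = [(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)]
            teal = [p for p in window if grid[p[0]][p[1]] == TEAL]
            empty = [p for p in window if grid[p[0]][p[1]] == EMPTY]
            if len(teal) == 3 and len(empty) == 1:
                er, ec = empty[0]
                out[er][ec] = BLUE
    return out


if __name__ == "__main__":
    for row in complete_squares(EXAMPLE):
        print(" ".join(str(cell) for cell in row))
```

Running the sketch places one blue cell next to each teal shape, turning both into 2x2 squares. The hard part of the benchmark is not applying such a rule but discovering it for each new task.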


While ARC-AGI claims to be the “only AI benchmark that measures our progress towards general intelligence,” other methods have been proposed.

For instance, researchers in Beijing introduced the Tong test, which evaluates AGI through “dynamic embodied physical and social interactions.” This method proposes five critical characteristics to identify AGI: the ability to perform non-predefined infinite tasks, autonomously generate tasks without fine-grained instructions, learn and anticipate human needs, have causal understanding, and have “embodiment” that allows it to participate in human life.


Beyond ‘Sparks’ of AGI


OpenAI introduced its o3 model during its “12 Days of OpenAI” event in December and is giving safety researchers early access to its o3 frontier models. The program is meant to complement existing testing processes, including rigorous internal safety testing, external red teaming, and collaborations with third-party organizations and national safety institutes.

The company is accepting applications for the early access program through the end of this week (January 10).


OpenAI set out to build AGI from its founding in 2015, when the concept was “nonmainstream.” As Altman said to Bloomberg, “We wanted to figure out how to build it and make it broadly beneficial.

At the time, very few people cared, and if they did, it was mostly because they thought we had no chance of success.” The company recruited talent with the lure “just come build AGI,” and in April 2023 it appeared to have made some strides toward that goal: Microsoft researchers said that GPT-4 showed “sparks” of AGI, demonstrating that, beyond its mastery of language, the model could solve novel and difficult tasks in math, coding, vision, medicine, law, psychology, and more, without special prompting.


As Altman noted, “there is still so much to understand, still so much we don’t know, and it’s still so early.

But we know a lot more than we did when we started.” 

Frequently Asked Questions (FAQs):

Q: What is AGI?

A: AGI, or Artificial General Intelligence, refers to an AI system that can perform any intellectual task that a human can. It is characterized by human-level intelligence, autonomy, self-understanding, and the ability to reason and perform tasks it was not specifically trained for.


Q: How did OpenAI's o3 model perform on the ARC-AGI test?

A: OpenAI's o3 model scored a groundbreaking 75.7% on the ARC-AGI Semi-Private Evaluation set, and a high-compute (172x) configuration scored 87.5%. The benchmark identifies a score of 85% as a ‘pass’ for AGI.


Q: What is the ARC-AGI benchmark?

A: ARC-AGI stands for 'Abstraction and Reasoning Corpus for Artificial General Intelligence.' It measures a system’s skill-acquisition efficiency over a scope of tasks, considering priors, experience, and generalization difficulty. It presents AI with abstract grids or puzzles that require human-level understanding of concepts such as objects, boundaries, and spatial relationships.


Q: What is superintelligence?

A: Superintelligence refers to AI systems that far surpass human intelligence. These systems could massively accelerate scientific discovery and innovation, well beyond what humans are capable of doing on their own.


Q: What is OpenAI's next goal after achieving AGI?

A: After achieving AGI with the o3 model, OpenAI is setting its sights on superintelligence. The company aims to build AI systems that far surpass human intelligence and can be used to solve complex problems and accelerate scientific discovery. 
