Published Date: June 9, 2025
Microsoft is planning to rank artificial intelligence (AI) models based on safety. The move is part of the tech giant’s effort to foster trust among its cloud customers as it sells them AI products from companies like OpenAI and xAI, the Financial Times (FT) reported Sunday (June 8).
Sarah Bird, Microsoft’s head of Responsible AI, told the FT that the company would soon add a “safety” category to its “model leaderboard,” a feature it created recently to help developers rank AI models from providers including China’s DeepSeek and France’s Mistral.
The leaderboard, which is accessible to clients using the Azure Foundry developer platform, is expected to influence which AI models and applications customers purchase through Microsoft. It currently ranks models on three metrics: quality, cost, and throughput, meaning how fast a model can generate output. Bird said the new safety ranking would ensure that “people can just directly shop and understand” AI models’ capabilities as they decide which to purchase.
The decision to offer safety benchmarks comes as Microsoft’s customers wrestle with the risks new AI models pose to data and privacy protections, especially when deployed as autonomous “agents” that can function without human supervision. The rankings give users objective metrics for choosing among a catalog of more than 1,900 AI models, helping them make an informed choice about which to use.
Safety leaderboards can help businesses cut through the noise and narrow down options, according to Cassie Kozyrkov, a consultant and former chief decision scientist at Google. “The real challenge is understanding the trade-offs: higher performance at what cost? Lower cost at what risk?” Kozyrkov noted.
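To picture the trade-off Kozyrkov describes, consider a minimal sketch of weighted multi-metric ranking. Everything in it (the catalog entries, metric values, and weights) is an illustrative assumption, not Azure Foundry data or its actual API.

```python
# Minimal sketch: ranking a model catalog on the leaderboard's four
# metrics (quality, cost, throughput, safety). All names and numbers
# below are invented for illustration.

catalog = [
    # quality, throughput, and safety as 0-1 scores; cost in dollars per 1M tokens
    {"model": "model-a", "quality": 0.92, "cost": 15.0, "throughput": 0.70, "safety": 0.88},
    {"model": "model-b", "quality": 0.85, "cost": 3.0, "throughput": 0.95, "safety": 0.91},
    {"model": "model-c", "quality": 0.78, "cost": 0.5, "throughput": 0.99, "safety": 0.74},
]

MAX_COST = max(entry["cost"] for entry in catalog)

def score(entry, weights):
    # Invert cost so cheaper models score higher, then take a weighted sum.
    cost_score = 1.0 - entry["cost"] / MAX_COST
    return (weights["quality"] * entry["quality"]
            + weights["cost"] * cost_score
            + weights["throughput"] * entry["throughput"]
            + weights["safety"] * entry["safety"])

# A safety-conscious buyer might weight safety heavily; the weights are
# a policy choice, which is exactly the trade-off Kozyrkov describes.
weights = {"quality": 0.3, "cost": 0.2, "throughput": 0.1, "safety": 0.4}
for entry in sorted(catalog, key=lambda e: score(e, weights), reverse=True):
    print(f"{entry['model']}: {score(entry, weights):.3f}")
```

Shifting weight toward safety reorders the list, which is the point: a ranking is only as neutral as the weights behind it.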
In other AI news, PYMNTS recently examined the use of AI agents in banking compliance in an interview with Greenlite AI CEO Will Lawrence. He told PYMNTS CEO Karen Webster that while the 2000s were defined by rule-based systems and the 2010s ushered in machine learning, the 2020s are “the agentic era of compliance.” The shift has a range of implications: trust is key for regulated financial institutions, where mistakes lead to declined transactions and also create regulatory exposure.
“Right now, banks are getting more risk signals than they can investigate,” Lawrence said. “Digital accounts are growing. Backlogs are growing. Detection isn’t the problem anymore — it’s what to do next.”

“AI is only scary until you understand how it works,” Lawrence added. “Then it’s just a tool — like a calculator. We’re helping banks understand how to use it safely.”
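Lawrence’s point that detection is no longer the bottleneck is, at bottom, a triage problem. The sketch below shows one way to prioritize a backlog of risk signals; the field names, scores, and threshold are invented for illustration and do not describe Greenlite AI’s product or any bank’s actual workflow.

```python
# Minimal sketch of triaging a compliance alert backlog by risk score.
from dataclasses import dataclass

@dataclass
class RiskSignal:
    account_id: str
    score: float   # 0-1 model-estimated risk (illustrative)
    reason: str

backlog = [
    RiskSignal("acct-1042", 0.91, "velocity spike on new digital account"),
    RiskSignal("acct-2088", 0.35, "address mismatch"),
    RiskSignal("acct-3511", 0.67, "sanctions-list near match"),
]

AUTO_REVIEW_THRESHOLD = 0.6  # signals above this go to a human investigator

# Rank the backlog so investigators see the riskiest signals first;
# lower-scoring signals get an automated first-pass review instead.
for signal in sorted(backlog, key=lambda s: s.score, reverse=True):
    queue = "human review" if signal.score >= AUTO_REVIEW_THRESHOLD else "agent first pass"
    print(f"{signal.account_id} ({signal.score:.2f}): {signal.reason} -> {queue}")
```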
Q: What is Microsoft’s new initiative regarding AI models?
A: Microsoft is planning to rank AI models based on safety to enhance trust and transparency among its cloud customers. This will help businesses make informed decisions when choosing AI products from providers like OpenAI and xAI.
Q: Who is Sarah Bird, and what is her role at Microsoft?
A: Sarah Bird is Microsoft’s head of Responsible AI. She is leading the initiative to add a “safety” category to Microsoft’s AI model leaderboard.
Q: What are the current metrics used by Microsoft to rank AI models?
A: Microsoft currently ranks AI models based on three metrics: quality, cost, and throughput (how fast a model can generate an output).
Q: Why is safety ranking important for AI models?
A: Safety ranking is important because it helps businesses understand the risks and trade-offs associated with different AI models, especially when they are deployed as autonomous agents with no human supervision.
Q: What does the agentic era of compliance mean for the banking industry?
A: The agentic era of compliance in banking means that AI agents are being used to handle compliance tasks, which can lead to more efficient and accurate risk management. However, it also requires a deep understanding of how AI works to ensure safe and compliant operations.