Close Menu
Luminari | Learn Docker, Kubernetes, AI, Tech & Interview PrepLuminari | Learn Docker, Kubernetes, AI, Tech & Interview Prep
  • Home
  • Technology
    • Docker
    • Kubernetes
    • AI
    • Cybersecurity
    • Blockchain
    • Linux
    • Python
    • Tech Update
    • Interview Preparation
    • Internet
  • Entertainment
    • Movies
    • TV Shows
    • Anime
    • Cricket
What's Hot

VTuber Usada Pekora Teased in Death Stranding 2 Game – Interest

June 6, 2025

11 Fresh Options for Meeting in the Western U.S.

June 6, 2025

Microsoft Helps CBI Dismantle Indian Call Centers Behind Japanese Tech Support Scam

June 6, 2025
Facebook X (Twitter) Instagram
Facebook X (Twitter) Instagram
Luminari | Learn Docker, Kubernetes, AI, Tech & Interview Prep
  • Home
  • Technology
    • Docker
    • Kubernetes
    • AI
    • Cybersecurity
    • Blockchain
    • Linux
    • Python
    • Tech Update
    • Interview Preparation
    • Internet
  • Entertainment
    • Movies
    • TV Shows
    • Anime
    • Cricket
Luminari | Learn Docker, Kubernetes, AI, Tech & Interview PrepLuminari | Learn Docker, Kubernetes, AI, Tech & Interview Prep
Home » Anthropic CEO wants to open the black box of AI models by 2027
AI

Anthropic CEO wants to open the black box of AI models by 2027

HarishBy HarishApril 24, 2025No Comments4 Mins Read
Facebook Twitter Pinterest LinkedIn Reddit WhatsApp Email
Share
Facebook Twitter Pinterest Reddit WhatsApp Email


Anthropic CEO Dario Amodei published an essay Thursday highlighting how little researchers understand about the inner workings of the world’s leading AI models. To address that, Amodei set an ambitious goal for Anthropic to reliably detect most AI model problems by 2027.

Amodei acknowledges the challenge ahead. In “The Urgency of Interpretability,” the CEO says Anthropic has made early breakthroughs in tracing how models arrive at their answers — but emphasizes that far more research is needed to decode these systems as they grow more powerful.

“I am very concerned about deploying such systems without a better handle on interpretability,” Amodei wrote in the essay. “These systems will be absolutely central to the economy, technology, and national security, and will be capable of so much autonomy that I consider it basically unacceptable for humanity to be totally ignorant of how they work.”

Anthropic is one of the pioneering companies in mechanistic interpretability, a field that aims to open the black box of AI models and understand why they make the decisions they do. Despite the rapid performance improvements of the tech industry’s AI models, we still have relatively little idea how these systems arrive at decisions.

For example, OpenAI recently launched new reasoning AI models, o3 and o4-mini, that perform better on some tasks, but also hallucinate more than its other models. The company doesn’t know why it’s happening.

“When a generative AI system does something, like summarize a financial document, we have no idea, at a specific or precise level, why it makes the choices it does — why it chooses certain words over others, or why it occasionally makes a mistake despite usually being accurate,” Amodei wrote in the essay.

In the essay, Amodei notes that Anthropic co-founder Chris Olah says that AI models are “grown more than they are built.” In other words, AI researchers have found ways to improve AI model intelligence, but they don’t quite know why.

In the essay, Amodei says it could be dangerous to reach AGI — or as he calls it, “a country of geniuses in a data center” — without understanding how these models work. In a previous essay, Amodei claimed the tech industry could reach such a milestone by 2026 or 2027, but believes we’re much further out from fully understanding these AI models.

In the long term, Amodei says Anthropic would like to, essentially, conduct “brain scans” or “MRIs” of state-of-the-art AI models. These checkups would help identify a wide range of issues in AI models, including their tendencies to lie or seek power, or other weakness, he says. This could take five to 10 years to achieve, but these measures will be necessary to test and deploy Anthropic’s future AI models, he added.

Anthropic has made a few research breakthroughs that have allowed it to better understand how its AI models work. For example, the company recently found ways to trace an AI model’s thinking pathways through, what the company call, circuits. Anthropic identified one circuit that helps AI models understand which U.S. cities are located in which U.S. states. The company has only found a few of these circuits but estimates there are millions within AI models.

Anthropic has been investing in interpretability research itself and recently made its first investment in a startup working on interpretability. While interpretability is largely seen as a field of safety research today, Amodei notes that, eventually, explaining how AI models arrive at their answers could present a commercial advantage.

In the essay, Amodei called on OpenAI and Google DeepMind to increase their research efforts in the field. Beyond the friendly nudge, Anthropic’s CEO asked for governments to impose “light-touch” regulations to encourage interpretability research, such as requirements for companies to disclose their safety and security practices. In the essay, Amodei also says the U.S. should put export controls on chips to China, in order to limit the likelihood of an out-of-control, global AI race.

Anthropic has always stood out from OpenAI and Google for its focus on safety. While other tech companies pushed back on California’s controversial AI safety bill, SB 1047, Anthropic issued modest support and recommendations for the bill, which would have set safety reporting standards for frontier AI model developers.

In this case, Anthropic seems to be pushing for an industry-wide effort to better understand AI models, not just increasing their capabilities.



Source link

Share. Facebook Twitter Pinterest LinkedIn WhatsApp Reddit Email
Previous ArticleUma Musume’s Cygames Partners With 151st Kentucky Derby – Interest
Next Article Ben Affleck Doesn’t Think Kevin Costner Remembers Him and Matt Damon
Harish
  • Website
  • X (Twitter)

Related Posts

Cursor’s Anysphere nabs $9.9B valuation, soars past $500M ARR

June 5, 2025

Perplexity received 780 million queries last month, CEO says

June 5, 2025

Anthropic co-founder on cutting access to Windsurf: ‘It would be odd for us to sell Claude to OpenAI’

June 5, 2025

Google says its updated Gemini 2.5 Pro AI model is better at coding

June 5, 2025

Anthropic unveils custom AI models for US national security customers

June 5, 2025

The founder of DeviantArt is making a $22,000 display for digital art

June 5, 2025
Add A Comment
Leave A Reply Cancel Reply

Our Picks

VTuber Usada Pekora Teased in Death Stranding 2 Game – Interest

June 6, 2025

11 Fresh Options for Meeting in the Western U.S.

June 6, 2025

Microsoft Helps CBI Dismantle Indian Call Centers Behind Japanese Tech Support Scam

June 6, 2025

Japan’s Video Game Rankings, May 19-25 – News [2025-05-31]

June 6, 2025
Don't Miss
Blockchain

Saylor’s Strategy upsized stock offering to $1B for Bitcoin purchases

June 6, 20252 Mins Read

Strategy, the world’s largest corporate Bitcoin holder, plans to raise nearly $1 billion through a…

DeFi, not MiCA II at Forefront

June 6, 2025

Crypto lobby wants software dev protections added to crypto bill

June 6, 2025

Kraken warns ‘be careful who you trust’ at crypto events

June 6, 2025

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

About Us
About Us

Welcome to Luminari, your go-to hub for mastering modern tech and staying ahead in the digital world.

At Luminari, we’re passionate about breaking down complex technologies and delivering insights that matter. Whether you’re a developer, tech enthusiast, job seeker, or lifelong learner, our mission is to equip you with the tools and knowledge you need to thrive in today’s fast-moving tech landscape.

Our Picks

Cursor’s Anysphere nabs $9.9B valuation, soars past $500M ARR

June 5, 2025

Perplexity received 780 million queries last month, CEO says

June 5, 2025

Anthropic co-founder on cutting access to Windsurf: ‘It would be odd for us to sell Claude to OpenAI’

June 5, 2025

Subscribe to Updates

Subscribe to our newsletter and never miss our latest news

Subscribe my Newsletter for New Posts & tips Let's stay updated!

Facebook X (Twitter) Instagram Pinterest
  • Home
  • About Us
  • Advertise With Us
  • Contact Us
  • DMCA Policy
  • Privacy Policy
  • Terms & Conditions
© 2025 luminari. Designed by luminari.

Type above and press Enter to search. Press Esc to cancel.