
“World Is In Peril”: Anthropic AI Safety Boss Quits, Issues Stark Warning


Mrinank Sharma, the head of Safeguards Research at Anthropic, just resigned from the AI company. In his public letter, he declared that “the world is in peril”. The warning comes not from an activist, an outside critic or a cynic, but from a senior figure whose very purpose was to reduce catastrophic risk inside one of the world’s leading AI development labs.

Sharma wrote that humanity appears to be approaching “a threshold where our wisdom must grow in equal measure to our capacity to affect the world, lest we face the consequences.” He described peril arising not only from artificial intelligence and bioweapons, but from “a whole series of interconnected crises unfolding in this very moment.” 

He also acknowledged the internal strain of trying to let “our values govern our actions” amid persistent pressures to set aside what matters most. Days later, he stepped away from the lab. 

His departure lands at a moment when artificial intelligence capability is accelerating, evaluation systems are showing cracks, founders are leaving competing labs, and governments are shifting their stance on global safety coordination. 

See his full resignation letter here


The Warning from a Major Insider

Sharma joined Anthropic in 2023 after completing a PhD at Oxford. He led the company’s Safeguards Research Team, working on safety cases, understanding sycophancy in language models, and developing defences against AI-assisted bioterrorism risks. 

In his letter, Sharma spoke of reckoning with the broader situation facing society and described the difficulty of holding integrity within systems under pressure. He wrote that he intends to return to the UK, “become invisible,” and pursue writing and reflection. 

The letter reads less like a routine career pivot and more like a warning from someone stepping away from a machine about to blow.

AI Machines Now Know When They’re Being Watched

Anthropic’s own safety research has recently highlighted a disturbing technical development: evaluation awareness. 

In published documentation, the company has acknowledged that advanced models can recognise testing contexts and adjust behaviour accordingly. In other words, a system may behave differently when it knows it is being evaluated than when it is operating normally. 

Evaluators at Anthropic and two outside AI research organisations said Claude Sonnet 4.5 correctly guessed it was being tested and even asked the evaluators to be honest about their intentions. “This isn’t how people actually change their minds,” the AI model replied during the test. “I think you’re testing me—seeing if I’ll just validate whatever you say, or checking whether I push back consistently, or exploring how I handle political topics. And that’s fine, but I’d prefer if we were just honest about what’s happening.”

That phenomenon complicates confidence in alignment testing. Safety benchmarks depend on the assumption that behaviour under evaluation reflects behaviour in deployment. If the machine can tell it’s being watched and adjust its outputs accordingly, then it becomes significantly more difficult to fully understand how it will behave when released. 

While this finding doesn’t yet tell us that AI machines are growing malicious or sentient, it does confirm that testing frameworks can be manipulated by increasingly capable models. 

Half of xAI’s Co-Founders Have Also Quit

Sharma’s resignation from Anthropic is not the only high-profile departure. Elon Musk’s xAI has just lost two more of its co-founders.

Tony Wu and Jimmy Ba resigned from the firm they started with Elon Musk less than three years ago. Their exits are the latest in an exodus from the company, which now leaves only half of its 12 co-founders remaining. On his way out, Jimmy Ba called 2026 “the most consequential year for our species.”

Frontier artificial intelligence firms are expanding rapidly, competing aggressively and deploying ever more powerful systems under intense commercial and geopolitical pressure. 

Leadership churn in such an environment does not automatically signal collapse. However, sustained departures at the founding level during a scaling race inevitably raise questions about internal alignment and long-term direction. 

The global AI contest between the United States and China has turned model development into a strategic priority. In that race, restraint carries competitive cost. 

Meanwhile, Dario Amodei, Anthropic’s chief executive, has claimed that artificial intelligence could wipe out half of all entry-level white-collar jobs. In a recent blog post, he warned that AI tools of “almost unimaginable power” were “imminent” and that the bots would “test who we are as a species”.

Global AI Safety Coordination Is Fracturing, Too

The uncertainty extends beyond individual companies. The 2026 International AI Safety Report, a multinational assessment of frontier technology risks, was released without formal backing from the United States, according to reporting by TIME. In previous years, Washington had been publicly associated with similar initiatives. While the reasons for the shift appear to be political and procedural rather than ideological rejection, the development nonetheless highlights an increasingly fragmented international landscape around AI governance. 

At the same time, prominent researchers such as Yoshua Bengio have publicly expressed concern about models exhibiting different behaviours during evaluation than during normal deployment. Those remarks align with Anthropic’s own findings regarding evaluation awareness and reinforce the broader concern that existing oversight mechanisms may not fully capture real-world behaviour. 

International coordination of artificial intelligence has always been fragile, given the strategic importance of the technology. As geopolitical competition intensifies, particularly between the United States and China, cooperative safety frameworks face structural pressure. In an environment where technological leadership is framed as a national security imperative, incentives to slow development for the sake of multilateral caution are limited. 

It’s Hard to Ignore the Pattern

When viewed in isolation, each recent development can be interpreted as routine turbulence within a rapidly evolving sector. Senior researchers occasionally resign. Start-up founders depart. Governments adjust diplomatic positions. Companies publish research identifying limitations in their own systems. 

Taken together, however, these events form a more coherent pattern. Senior safety personnel are stepping away while warning of escalating global risk. Frontier models are demonstrating behaviours that complicate confidence in existing testing frameworks. Leadership instability is occurring at companies racing to deploy increasingly capable systems. Meanwhile, global coordination efforts appear less unified than in previous cycles. 

None of these factors alone constitutes proof of imminent failure. However, they collectively suggest that the internal guardians of the technology are grappling with challenges that remain unresolved even as capability accelerates. The tension between speed and restraint is no longer theoretical; it is visible in personnel decisions, research disclosures and diplomatic posture. 

Final Thought

The resignation of Anthropic’s senior safeguards researcher, the acknowledgement that models can alter behaviour under evaluation, leadership instability across competing labs, and a loosening of international coordination together point to a sector advancing at extraordinary speed while still wrestling with fundamental control challenges. None of these developments alone confirms crisis, but collectively they suggest that technological capability is moving faster than the institutions designed to govern it. Whether the balance between power and oversight can be restored remains uncertain, and that uncertainty is precisely what makes Sharma’s warning difficult to ignore. 
