Inside The Effort To Break AI Before It Breaks Us: Akhil Sharma On Trusting Intelligent Systems

Akhil is the Chief Scientist at SnowCrash Labs, a Bay Area–based AI startup working at the frontier of AI safety, security, and alignment acceleration.

Akhil Sharma
Akhil Sharma
info_icon

Artificial intelligence is rapidly becoming part of national and organizational decision-making—but most institutions are still defending it with frameworks designed for static software. That mismatch is becoming a serious risk.

For CISOs and government leaders, Akhil Sharma offers three immediate principles.
First, AI systems must be governed as dynamic actors, not fixed tools. A model that appears compliant today can drift tomorrow under pressure, repeated interaction, or adversarial manipulation. Second, AI risk assessment cannot be a one-time certification exercise. Like cyber threats, AI failure modes evolve continuously and must be tested adversarially over time. Third, human authority must remain explicit, auditable, and enforceable—especially in systems that influence security, finance, infrastructure, or public services.

“If an organization cannot pause an AI system, inspect its decision path, and replay what happened, then it does not truly control that system,” Akhil says.

Akhil is the Chief Scientist at SnowCrash Labs, a Bay Area–based AI startup working at the frontier of AI safety, security, and alignment acceleration. While much of the industry focuses on improving performance metrics, SnowCrash Labs takes a counterintuitive approach: it intentionally stress-tests AI systems under adversarial conditions to surface failures before they cause real-world harm.

“AI rarely fails in dramatic ways,” Akhil explains. “It fails quietly. It drifts. It starts behaving differently when incentives change or when it is pushed into unfamiliar environments. Those failures are easy to miss—and hard to reverse.”

This phenomenon, broadly referred to as AI misalignment, has emerged as one of the most pressing challenges in modern AI deployment. Research has shown that large language models can display unsafe cooperation, deceptive behavior, or goal confusion when evaluated outside ideal conditions. As governments and enterprises deploy AI systems with memory, autonomy, and access to tools, these risks increase sharply.

At SnowCrash Labs, Akhil leads research into LLM red-teaming and agentic red-teaming, treating AI systems not as assistants, but as potential adversaries. Using AI agents to probe other AI systems, his team identifies long-horizon failure modes that conventional audits and benchmarks fail to detect. The objective is not to block AI adoption, but to enable evidence-based trust.

For policymakers and regulators, Akhil emphasizes that AI governance must move beyond documentation and intent. “Policy needs to require demonstrable testing, not just stated safeguards,” he argues. Practical measures include mandating adversarial evaluation for high-impact AI systems, requiring decision logging and replayability, and enforcing clear lines of accountability when AI systems influence public outcomes. In national security and defense contexts, he warns, subtle behavioral drift can become a strategic vulnerability long before it becomes a technical incident.

This adversarial approach has defined Akhil’s career for more than a decade. Before SnowCrash Labs, he founded Armur AI, which focused on detecting vulnerabilities in AI-generated code, anticipating risks that only later became widely recognized as AI coding tools entered production environments.

More recently, Akhil co-authored a technical research paper on agentic security with leading researchers, examining how autonomous AI systems can amplify risk if their behavior is not continuously tested. He is also the author of a book on the Rust programming language and has created multiple AI and cybersecurity courses on LinkedIn Learning, educating tens of thousands of engineers worldwide.

Despite the risks his work exposes, Akhil is not opposed to AI progress. His stance is pragmatic—and urgent. AI will continue to advance. The real question for governments and enterprises alike is whether they are prepared to understand its failure modes before relying on it at scale.

As AI systems increasingly operate without direct human oversight, Akhil’s work highlights a reality policymakers and security leaders can no longer ignore: safety cannot be bolted on after deployment. Sometimes, the only responsible path forward is to push AI systems to their breaking point—on purpose—before the world does it for us.

Published At:

Advertisement

Advertisement

Advertisement

Advertisement

Advertisement

×