Jack Li

Earlier this month, Anthropic released the most capable AI models it has built. The public version, Claude Fable 5, ships with safeguards that hold it back in sensitive areas such as cybersecurity; a more powerful, restricted sibling, Claude Mythos 5, with stronger cybersecurity capabilities, was made available only to a small, vetted group of cyber-defenders and government agencies. Whatever one makes of the rollout, it was a reminder of how quickly AI is improving at one side of software security: finding the weaknesses buried in the world's code. Can AI also help mitigate those weaknesses?

Most of the money and noise in artificial intelligence has gone to models that purely generate: that write, draw, code, and answer. But an arguably larger opportunity is opening on the other side of the boom: getting AI not just to produce output, but to invent, to build, and to verify and prove its own results — the kind of capability that could help secure the world's software. Automating that loop is one of the most strategically important problems in technology, and almost no one has fully solved it.

One of the few teams making real headway is Asari AI, a well-backed San Francisco startup, and the engineer at the center of its breakthrough is Jack Li. As the Head of Infrastructure at Asari AI, Li leads the buildout that lets AI agents not just build but verify their own work — proving results to standards that don't grade on a curve. It is the kind of capability the rest of the industry has been circling for years without cracking.

Reputation in frontier AI is earned through results outsiders can actually check, not in-house demos, and Asari AI holds itself to the highest standards. As a showcase of what its agents can do, the company gave them a deliberately unforgiving problem: translating four production C libraries into Safe Rust, the compiler-checked subset of Rust, a modern programming language built to prevent memory errors. One of these libraries is used in spacecraft flight software. Working with an independent evaluator, the company had the rewritten code verified against European Space Agency standards, a benchmark reserved for software that simply cannot fail. Asari AI's Safe Rust code passed that bar and more: its agents found and fixed bugs that had gone undetected by humans for years. The translated code is also open source, published for anyone to read and check for themselves.

What makes those results trustworthy is less the model that writes the Rust than the system around it that decides whether to believe the output — the part Li helped build. The agents don't treat a translation as finished the moment it compiles. They work iteratively, built on a deceptively simple idea: a program's own test suite is the closest thing it has to a written definition of "correct." During a translation, the agents first rewrite the tests from C into Rust, then rewrite the program from C into Rust, and then run the original and the translation against the tests. The agents then compare what comes back — not merely whether it passes, but whether every value matches, down to the last digit. The entire process is then repeated as often as needed; a rewrite is accepted only when the Rust behaves exactly like the C it replaced.
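The comparison step described above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions, not Asari AI's actual system: the names `Verdict`, `differential_check`, and `accept` are hypothetical, the inputs are plain floating-point numbers, and an ordinary Rust closure stands in for the original C library, which a real pipeline would have to call through a foreign-function interface or a separate process.

```rust
/// Outcome of running one test input through both implementations.
/// (Hypothetical type, introduced only for this sketch.)
#[derive(Debug, PartialEq)]
pub enum Verdict {
    Match,
    Mismatch { input: f64, reference: f64, candidate: f64 },
}

/// Run the reference (standing in for the original C code) and the candidate
/// (standing in for the Rust translation) on every input and compare results.
/// Values are compared by bit pattern, so two results count as equal only if
/// they agree down to the last digit.
pub fn differential_check<R, C>(inputs: &[f64], reference: R, candidate: C) -> Vec<Verdict>
where
    R: Fn(f64) -> f64,
    C: Fn(f64) -> f64,
{
    inputs
        .iter()
        .map(|&input| {
            let expected = reference(input);
            let actual = candidate(input);
            if expected.to_bits() == actual.to_bits() {
                Verdict::Match
            } else {
                Verdict::Mismatch { input, reference: expected, candidate: actual }
            }
        })
        .collect()
}

/// A translation is accepted only if every single comparison matched.
pub fn accept(verdicts: &[Verdict]) -> bool {
    verdicts.iter().all(|v| matches!(v, Verdict::Match))
}
```

In this sketch, a candidate that differs from the reference by even a billionth on one input is rejected, and the list of mismatches tells the next iteration exactly which inputs to revisit.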

Two things keep that from being brute force. Rust does part of the safety work by construction: moving memory that C manages by hand — raw pointers, manual allocation, manual freeing — into Rust's own type system means entire families of notorious failures, like use-after-free or reading the wrong region of memory, simply cannot be written in the first place. And the agents run largely on their own, pausing for human judgment only at a handful of genuinely ambiguous system-design choices or implementation issues per library. The Safe Rust they produce reads as if an expert Rust programmer wrote it; one senior Rust engineer who reviewed it said he would be fooling himself to claim he could tell it had been machine-translated from C.
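A small example shows what "by construction" means here. The sketch below is illustrative only and is not taken from the translated libraries; `Buffer` and `consume` are hypothetical names. Where C would hold a raw pointer and a length and rely on the programmer to free the memory exactly once, the Rust version gives the storage a single owner and checks every index.

```rust
/// A byte buffer that owns its storage. In C this would typically be a
/// manually allocated pointer plus a length that the caller must free.
pub struct Buffer {
    data: Vec<u8>,
}

impl Buffer {
    /// Allocate a zero-filled buffer; the memory is released automatically
    /// when the `Buffer` goes out of scope.
    pub fn new(len: usize) -> Self {
        Buffer { data: vec![0; len] }
    }

    /// Bounds-checked read: an out-of-range index yields `None` instead of
    /// reading whatever happens to sit next to the buffer in memory.
    pub fn get(&self, index: usize) -> Option<u8> {
        self.data.get(index).copied()
    }

    /// Bounds-checked write: returns `false` rather than writing out of range.
    pub fn set(&mut self, index: usize, value: u8) -> bool {
        match self.data.get_mut(index) {
            Some(slot) => {
                *slot = value;
                true
            }
            None => false,
        }
    }
}

/// Takes ownership of the buffer, which is freed when this function returns.
pub fn consume(buffer: Buffer) -> usize {
    buffer.data.len()
}

// After `consume(buf)`, a later `buf.get(0)` does not compile: the compiler
// rejects it as a use of a moved value. The use-after-free that C would allow
// at this point cannot be expressed in safe Rust.
```

The design choice is that mistakes surface either at compile time (using a value after it has been given away) or as an explicit `None` or `false` at run time, rather than as silent memory corruption.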

That entire apparatus — the system deciding whether to trust the output — is the infrastructure Li designed and built. The project is, in miniature, proof of what the company's agents can verify, to a standard no one can fudge, and it paves the way for rigorous invention.

Before building AI infrastructure, Li did research at Caltech on the mathematics of how networks of neurons stay stable while they keep learning. It is an unusual grounding for an infrastructure engineer, but a closer fit than it looks: both come down to whether a system can be trusted to behave correctly as it changes.

"A capable model isn't enough," Li says, "if the system around it can't make the results trusted and dependable." That insistence on verifiable results is precisely what makes the work valuable to the industries that need it most: defense, aerospace, finance, and critical infrastructure, where trust is the product.

Asari AI was founded with a mission to build AI agents that can learn, reason, plan, build, and verify their own work at scale — co-invention, the automation of the scientific process. The company's scientific foundation is deeply rooted in Caltech, with its founders and advisors remaining at the forefront of AI research and innovation.

What ultimately makes an innovation bet like this credible is the caliber of the people already behind it. Asari AI's backers read like a short list of people who helped set the direction of modern AI: the seed round was led by Eric Schmidt, the former chief executive of Google, included Caltech among its institutional investors, and drew angels such as Jeff Dean, the chief scientist of Google DeepMind. These are not investors who scatter small bets widely. They tend to concentrate on the rare teams they believe can define a category.

The frontier of AI is moving from producing answers to inventing and validating results in an automated scientific loop. This capability promises to help solve some of the hardest problems, like securing the world's software. The teams that can make that loop trustworthy are scarce, and the capability is wanted by the best-funded buyers on Earth. Asari AI is one of the few that can show it works, backed by a demonstration that cleared one of the most demanding standards in engineering, and by talent the field's leaders are watching closely. Every new era is built by investing early in the foundations everything else runs on: roads, power, rails. Automating invention dependably is that foundation, and Li and his team are building it.