Anthropic Reportedly Explores Fractile’s Memory Compute Fusion for Faster, Cheaper AI Inference

Anthropic is reportedly in early talks over a potential deal with Fractile, a London startup developing a new inference-focused chip architecture built around what it calls Memory Compute Fusion, according to The Information. The reported interest fits Anthropic's broader strategy of diversifying its compute stack as AI demand keeps climbing and inference becomes a larger share of the cost equation.

Fractile's core idea is to reduce the amount of data that has to travel back and forth to external DRAM by keeping much more of the workload on chip through a large, SRAM-centric design. On its official site, Fractile says it is building systems to run frontier model inference up to 25x faster and at one-tenth the cost, while third-party reporting tied to the Anthropic discussions says the company has pitched an even more aggressive long-term claim of up to 100x faster inference and 10x lower cost in some comparisons. That gap matters: it suggests the technology is still at an early, partly aspirational stage rather than a proven commercial product.

The broader market logic is easy to understand. Inference is increasingly constrained not just by raw compute, but by memory movement, latency, and energy efficiency. Fractile is trying to tackle that bottleneck with a design that keeps data close to compute instead of constantly pushing it out to off-chip memory. If that works at scale, it could make Fractile attractive to an AI company like Anthropic that wants lower latency and better cost control without depending entirely on a single supplier ecosystem.
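To see why memory movement, rather than raw compute, tends to dominate, a rough back-of-envelope calculation helps: during autoregressive decoding, every generated token forces the chip to stream the model's weights from memory, while the matching arithmetic is only about 2 FLOPs per parameter. The sketch below uses purely illustrative numbers (a hypothetical 70B-parameter model and made-up hardware figures, not Fractile's or anyone's actual specs):

```python
# Back-of-envelope: why off-chip memory bandwidth, not FLOPs, often
# bounds per-token latency in large-model inference.
# All figures are illustrative, not specs for any real chip.

def decode_step_time_s(params_b, bytes_per_param, mem_bw_gbs, flops_ts):
    """Lower-bound time for one decode step of a dense model.

    Each generated token must stream every weight from memory once,
    while the matching compute is roughly 2 FLOPs per parameter.
    """
    weight_bytes = params_b * 1e9 * bytes_per_param
    mem_time = weight_bytes / (mem_bw_gbs * 1e9)        # seconds to read weights
    flop_time = 2 * params_b * 1e9 / (flops_ts * 1e12)  # seconds of arithmetic
    return max(mem_time, flop_time), mem_time, flop_time

# Hypothetical example: 70B parameters in 8-bit weights, 3 TB/s of
# memory bandwidth, 1000 TFLOP/s of compute.
total, mem_t, flop_t = decode_step_time_s(
    params_b=70, bytes_per_param=1, mem_bw_gbs=3000, flops_ts=1000)

print(f"memory-bound time:  {mem_t * 1000:.1f} ms/token")
print(f"compute-bound time: {flop_t * 1000:.2f} ms/token")
print(f"bottleneck: {'memory' if mem_t > flop_t else 'compute'}")
```

Under these assumed numbers the memory term is two orders of magnitude larger than the compute term, which is the general imbalance an SRAM-heavy, on-chip design is trying to erase.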

One point from earlier coverage is worth correcting: NVIDIA has not completed a full acquisition of Groq, based on the best available public reporting. Reuters reported that NVIDIA licensed Groq technology and hired key executives, while Groq was said to remain an independent company. That distinction matters, because it places Fractile in a market where SRAM-heavy inference architectures are attracting serious attention, but not necessarily through outright acquisitions.

Anthropic's infrastructure strategy already reflects this kind of diversification. The company recently announced a new agreement with Google and Broadcom for multiple gigawatts of next-generation TPU capacity starting in 2027, and Reuters has also reported that Anthropic uses a mix of NVIDIA GPUs, Google TPUs, and AWS Trainium. Any future Fractile tie-up would therefore likely expand Anthropic's supplier portfolio rather than replace its current vendors.

Still, this is very early. Reporting on the talks says Fractile has not yet produced a test chip, and its commercial timeline appears to be further out, with broad availability not expected before 2027. That makes the current story less about immediate deployment and more about positioning: Anthropic seems to be exploring whether a fresh inference architecture could eventually give it an edge in cost, latency, and supply flexibility as the AI market shifts from training dominance toward massive-scale inference.

For now, Fractile remains one of the more intriguing names in the next wave of AI silicon. The company’s claims are bold, the technology is not yet validated in shipping silicon, and Anthropic has not publicly confirmed a deal. But if Memory Compute Fusion can deliver even a meaningful portion of what is being promised, this could become one of the more important inference hardware stories to watch over the next few years.

What do you think will matter more in the next AI hardware race: bigger GPUs with more memory, or new chip architectures built specifically to cut inference cost and latency?

Angel Morales

Founder and lead writer at Duck-IT Tech News, dedicated to delivering the latest news, reviews, and insights in technology, gaming, and AI. With experience in the tech and business sectors, he combines a deep passion for technology with a talent for clear and engaging writing.
