Enterprises that have been rushing to adopt agentic artificial intelligence (AI) systems are consuming tokens at an unprecedented rate, resulting in exorbitant monthly bills from the major public cloud suppliers. As these costs spiral, Dell Technologies is looking to capitalise on the ensuing bill shock with new hardware and software offerings unveiled at its customer conference this week, betting that the future of enterprise AI is local, secure and shielded from variable cloud pricing.
The token cost crisis
The core issue driving Dell's strategy is the rapidly increasing volume of AI tokens consumed by enterprises. Tokens are the basic units of text or data processed by large language models, and agentic AI systems, which can autonomously perform complex tasks, consume them far faster than earlier generative AI applications. According to Varun Chhabra, senior vice-president of the infrastructure solutions group at Dell, the number of tokens generated is increasing faster than token costs are coming down. This means that despite falling per-token prices, the overall bill for customers is rising sharply.
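The dynamic Chhabra describes can be sketched with simple arithmetic: a bill is token volume multiplied by unit price, so it rises whenever volume grows faster than price falls. The Python sketch below is illustrative only. The function name, the starting volume, the starting price and both rates are hypothetical assumptions, not figures from Dell.

```python
def projected_bills(tokens, price_per_million, volume_growth, price_decline, periods):
    """Return the bill for each period as volume grows and unit price falls.

    All inputs are hypothetical: `volume_growth` and `price_decline` are
    fractional changes per period (1.0 = volume doubles, 0.25 = price drops 25%).
    """
    bills = []
    for _ in range(periods):
        # Bill for this period = tokens (in millions) x price per million tokens
        bills.append(tokens / 1_000_000 * price_per_million)
        tokens *= 1 + volume_growth
        price_per_million *= 1 - price_decline
    return bills


# Hypothetical scenario: volume doubles each period while price falls by a quarter
for period, bill in enumerate(projected_bills(100_000_000, 4.0, 1.0, 0.25, 3)):
    print(f"period {period}: ${bill:,.2f}")
```

In this assumed scenario the bill grows by half each period even though the unit price keeps dropping, because the doubling in volume outweighs the 25% price cut.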
To illustrate the point, Jon Siegal, senior vice-president for Dell’s client solutions group, noted that a single developer within Dell recently burned through one billion tokens in 24 hours, racking up a $3,400 cloud bill in a single day. Such anecdotal evidence underscores the financial pressure that agentic AI places on cloud budgets, especially as more enterprises deploy multiple AI agents across their operations. The phenomenon has been dubbed “token shock” by industry analysts, and it is driving a search for more cost-predictable alternatives.
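Siegal's anecdote implies a blended unit price that the article does not state directly. As a rough check, the sketch below divides the quoted bill by the quoted token count. The function name is hypothetical, and the result is an inferred average across whatever mix of models and input and output tokens the developer used, not a published rate.

```python
def cost_per_million_tokens(total_cost, tokens):
    """Blended cost per one million tokens, given a total bill and a token count."""
    return total_cost / (tokens / 1_000_000)


# Figures quoted in the anecdote: $3,400 for one billion tokens in 24 hours
rate = cost_per_million_tokens(3_400, 1_000_000_000)
print(f"Implied blended rate: ${rate:.2f} per million tokens")
```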
Dell's on-premise AI sandbox
In response, Dell is introducing Dell Deskside Agentic AI, an on-premise sandbox for building, testing and running AI agents locally. Powered by Nvidia NemoClaw and running on high-performance Dell workstations capable of supporting models from 30 billion up to a trillion parameters, the offering ensures sensitive data never leaves the corporate environment. Siegal noted that running agentic AI entirely on-premise with open models can reduce enterprise spend by up to 87% over a two-year horizon compared to public cloud APIs, with a break-even point in as little as three months. "The workstation is really becoming that free token generator for the right use cases," Siegal explained. "Agentic AI, more than anything else, is most cost-effective when it's near the data."
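Dell has not published the model behind the 87% and three-month figures, but the shape of such a comparison is straightforward: an upfront hardware cost plus a running cost, set against a recurring cloud bill. The sketch below is a simplified model under stated assumptions. The function names and every input value are hypothetical, and real comparisons would also account for power, staffing, depreciation and utilisation.

```python
import math


def break_even_month(upfront, onprem_monthly, cloud_monthly):
    """First whole month in which cumulative on-premise cost is no higher than cloud.

    Returns None when on-premise running costs are not lower than the cloud bill,
    in which case the upfront spend is never recovered.
    """
    if cloud_monthly <= onprem_monthly:
        return None
    return math.ceil(upfront / (cloud_monthly - onprem_monthly))


def savings_fraction(upfront, onprem_monthly, cloud_monthly, months=24):
    """Fraction of cloud spend saved over the horizon (negative means a loss)."""
    cloud_total = cloud_monthly * months
    onprem_total = upfront + onprem_monthly * months
    return 1 - onprem_total / cloud_total


# Hypothetical inputs: $30,000 workstation, $100/month to run, $10,000/month cloud bill
months = break_even_month(30_000, 100, 10_000)
saved = savings_fraction(30_000, 100, 10_000)
print(f"break-even in month {months}, two-year saving {saved:.1%}")
```

The model makes clear why the claim is hedged with "up to" and "for the right use cases": the saving depends almost entirely on how large the cloud bill being displaced is relative to the hardware outlay.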
This approach appeals to organisations in sectors such as healthcare, finance, and manufacturing, where data sovereignty and low latency are critical. By bringing the compute closer to where the data resides, enterprises can avoid the network latency and egress costs associated with cloud AI services. Moreover, the use of open models provides flexibility and avoids vendor lock-in, as enterprises can fine-tune or replace models without being tied to a specific cloud provider's API.
Bringing frontier models to the datacentre
Historically, the most powerful frontier models have been locked behind public cloud walls. But Dell is looking to tear down those walls through a series of high-profile partnerships, bringing advanced models on-premise or into hybrid settings for data sovereignty and performance. Dell announced that Google Gemini models will now run on-premise via Google Distributed Cloud on Dell PowerEdge servers. Additionally, the company is collaborating with Palantir to bring its Foundry and AI platforms on-premise, and teaming up with SpaceX AI to bring Grok’s advanced reasoning and multimodal capabilities to on-premise or hybrid environments. "I cannot stress how big of a deal this is," Chhabra said. "These are some of the world’s most powerful frontier models that have so far only been available in the cloud...giving customers more choice, flexibility on where they want to run these models, and bringing all of these models closer to their data and their enterprise workloads."
During the conference keynote, executives from major industrial and pharmaceutical giants also took the stage to detail how they are using on-premise AI infrastructure. Diogo Rau, executive vice-president and chief information and digital officer at Eli Lilly, explained how the company relies on a Dell supercomputer equipped with more than 1,000 Nvidia GPUs to simulate complex protein interactions for drug discovery and digitally inspect manufacturing lines in milliseconds. Meanwhile, Suresh Venkatarayalu, chief technology officer of Honeywell, described deploying physical AI servers directly at industrial sites to drive autonomous operations where real-time decision-making is critical. These examples illustrate that on-premise AI is not just a theoretical proposition but a practical reality for some of the world's largest enterprises.
Hardware and storage refresh
To support intensive AI models side-by-side with traditional workloads, Dell also announced a total hardware and software refresh for its flagship storage array: PowerStore Elite. Boasting up to three times the input/output operations per second (IOPS), density and throughput of previous generations, PowerStore Elite uses new E3 drives, removes the NVRAM cache to maximise usable capacity, and pushes Dell’s data reduction guarantee to an industry-leading 6:1 ratio. "The question isn’t just what can this platform do today? It is, will this decision still make sense a year from now? What happens when my workloads change? What happens when my costs shift?" Chhabra noted. "This is exactly why PowerStore Elite matters."
On the compute side, Dell unveiled the 18th-generation PowerEdge servers, touted as the broadest single-socket lineup the company has ever shipped. Delivering up to a 70% performance improvement over the previous generation and a 13:1 server consolidation ratio, the new servers will all ship with quantum-safe firmware in preparation for 2027 post-quantum cryptography mandates. For organisations grappling with the physical deployment of AI fabrics, Dell also introduced two products: the Dell PowerRack, in which AI compute, network and storage are engineered as a scalable unit, and the Dell PowerCool CDU-C7000, a compact cooling distribution unit delivering over 220 kW of liquid cooling capacity for high-density GPUs such as Nvidia’s Rubin.
Security and resilience
Dell is also streamlining its security portfolio. The company introduced PowerProtect One, a unified cyber resilience platform that merges the capabilities of PowerProtect Data Manager and Data Domain into a single control plane, reducing deployment time by up to 75%. To help organisations improve resilience against cyber attacks, Dell also unveiled CyberDetect, an AI-powered analytics tool that deeply inspects data at the byte level to identify ransomware corruption. Boasting 99.99% accuracy, it allows IT teams to definitively know which data is clean after an attack, turning ransomware recovery "from uncertainty into AI-powered, evidence-based assurance".
These security enhancements are particularly important as organisations move sensitive AI workloads on-premise, where they may not have the same built-in security protections as public cloud environments. By offering integrated cybersecurity tools, Dell aims to provide a comprehensive platform that addresses both performance and safety concerns.
As Dell brings these major updates to market, its message to enterprise IT leaders is clear: the infrastructure to support scalable, cost-predictable AI is ready today and the financial models of the past no longer apply. "AI is not just changing technology, it’s changing the economics of technology in favour of enterprise infrastructure," Dell Technologies’ CEO Michael Dell said in his keynote address. "Now is the time to decide how you can most cost-effectively generate the tokens that you’re going to need for the long term."
The implications for the enterprise IT landscape are profound. As cloud bills continue to climb, the value proposition of on-premise AI becomes harder to ignore. Dell's aggressive push into this space, backed by partnerships with major model providers, signals a shift in the balance of power between cloud and on-premise infrastructure. Enterprises now have a viable alternative to the cloud for their most demanding AI workloads—one that promises cost predictability, data sovereignty, and performance without compromise.
Source: ComputerWeekly.com News