
In July 2022, Dominion Energy paused new data center connections in Northern Virginia’s Loudoun County - the grid couldn’t keep up. They’ve since resumed with reduced capacity, but the constraint is clear: data center demand jumped from 33 GW to 47 GW in less than a year, and the utility is scrambling with $50 billion in infrastructure upgrades.
This is “Data Center Alley” - 70% of the world's internet traffic flows through here. And it’s hitting limits. Dublin rejected a Google data center in August 2024 that would consume more electricity than all the city's homes combined. Singapore lifted its data center moratorium in 2022 but with strict sustainability requirements - approvals remain scarce. The pattern is global: data centers are growing faster than Earth can power them.
The bottleneck isn’t silicon anymore. NVIDIA’s Blackwell GB200 GPUs consume up to 1,200 watts per chip and require liquid cooling. At those power densities, facilities need entirely new electrical and cooling infrastructure just to turn them on.
By 2030, data centers are on track to consume 10-12% of global electricity. That's not hype. That's physics meeting exponential demand.
At Davos in January 2026, Elon Musk said something that made the room go quiet.
“The lowest cost place to put AI will be space. And that’ll be true within two years, maybe three at the latest.”
Not eventually. Not someday. Two years.
Six days later, SpaceX filed with the FCC. The application requested approval for up to 1 million satellites - not for internet, but for data centers. The filing called it "a first step towards becoming a Kardashev Type II civilization."
Musk posted on X: "I thought we'd start small and work our way up."
Jeff Bezos had been saying it quieter for years. Blue Origin's long-term goal isn't just space tourism - it's moving all heavy industry off Earth. “If you want a whole solar system full of people,” Bezos said in November 2024, “you need gigawatt-scale data center capacity in space.”
The billionaires aren't fantasizing. They’re solving a bottleneck that threatens to choke the AI revolution.
Here’s what changes when you leave Earth:
Solar panels in the right orbit receive 1,361 watts per square meter, nearly 24/7. No clouds. No winter. Only brief eclipse passes. A data center on Earth might get 6-8 hours of equivalent sunlight daily - if it's sunny.
Heat rejection becomes trivial. Point a radiator at space - at 3 Kelvin, nearly absolute zero - and watch physics handle the cooling. No chillers. No water. No 40% power overhead just to keep servers from melting.
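The physics is simple even if the engineering isn't. A back-of-envelope Stefan-Boltzmann estimate shows what it takes to radiate a megawatt into deep space - the radiator temperature and emissivity below are illustrative assumptions, not numbers from any flown system:

```python
# Back-of-envelope radiator sizing via the Stefan-Boltzmann law.
# Temperature and emissivity are assumed illustrative values.

SIGMA = 5.670e-8  # Stefan-Boltzmann constant, W/(m^2 K^4)

def radiator_area(power_w, t_radiator_k=300.0, t_sink_k=3.0, emissivity=0.9):
    """Radiator area (m^2) needed to reject power_w to a ~3 K sink."""
    net_flux = emissivity * SIGMA * (t_radiator_k**4 - t_sink_k**4)
    return power_w / net_flux

# How much radiator does a 1 MW GPU cluster need at ~room temperature?
area = radiator_area(1_000_000)
print(f"~{area:,.0f} m^2 per megawatt")  # roughly 2,400 m^2
```

Note what the number implies: the cooling is free, but the radiator area runs to thousands of square meters per megawatt, which is why deployable radiator panels are serious engineering.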
And launch costs? They've collapsed. What cost $20,000 per kilogram a decade ago now runs $2,700 on Falcon 9. Starship is targeting $100/kg. At that price, launching a 20-ton data center module costs $2 million - less than building equivalent capacity on Earth.
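The arithmetic behind that claim, using the article's own figures (a 20-ton module at each price point):

```python
# Launch-cost arithmetic from the figures above. Module mass and
# per-kilogram prices are the article's numbers; nothing else is assumed.

def launch_cost(mass_kg, price_per_kg):
    """Total launch cost in dollars for a payload of mass_kg."""
    return mass_kg * price_per_kg

MODULE_KG = 20_000  # one 20-ton data center module

for label, price in [("decade ago", 20_000),
                     ("Falcon 9 today", 2_700),
                     ("Starship target", 100)]:
    print(f"{label}: ${launch_cost(MODULE_KG, price):,.0f}")
# prints $400,000,000 / $54,000,000 / $2,000,000 respectively
```

A 200x collapse in a decade, with another ~27x still on the table.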
The physics always worked. Now the economics do too.
In November 2025, a startup called Starcloud - backed by Y Combinator and NVIDIA - launched a satellite carrying an H100 GPU. The first AI-optimized processor in orbit. It’s up there right now, running Google's Gemma language model, proving the concept works.
Google isn’t waiting either. Project Suncatcher launches in 2027 - two satellites carrying custom TPU chips that already survived five years of simulated radiation in a particle accelerator. Sundar Pichai's vision: 81-satellite clusters communicating at 1.6 terabits per second via laser links. “This will be normal within a decade,” Pichai said.
Lumen Orbit is building 20-ton modules with 500-kilowatt solar arrays. Axiom Space is launching station modules with data center capacity. Thales Alenia Space is developing "Secure Data in Space" for the European Space Agency.
The race isn’t starting. It started.
But here’s what the press releases don't tell you:
Nobody knows if robots can reliably repair a $50 million satellite when a $200 component fails. Nobody’s proven that commercial processors can survive years of cosmic ray bombardment with just error correction - or if we need expensive radiation-hardened chips. Nobody's built megawatt-scale radiator panels that deploy reliably in vacuum.
And then there are Silent Data Errors - cases where radiation corrupts data without triggering any error detection. On Earth they're rare enough to tolerate. In orbit, bombarded by cosmic rays? They won't be silent. They'll be catastrophic.
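A toy illustration of how corruption goes "silent": a single parity bit catches any odd number of bit flips but misses even ones. This is a teaching sketch, not how real server ECC (e.g., SECDED) works - but the failure mode generalizes: every detection scheme has a flip pattern it cannot see.

```python
# Toy demonstration: single-bit parity misses even numbers of bit flips.

def parity(bits):
    """Even parity over a list of 0/1 values."""
    return sum(bits) % 2

word = [1, 0, 1, 1, 0, 0, 1, 0]
check = parity(word)

# One cosmic-ray flip: detected.
one_flip = word.copy(); one_flip[3] ^= 1
print(parity(one_flip) != check)   # True -> error detected

# Two flips in the same word: parity matches, corruption goes unnoticed.
two_flips = word.copy(); two_flips[1] ^= 1; two_flips[6] ^= 1
print(parity(two_flips) != check)  # False -> silent data error
```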
The questions are specific and brutal - and they aren't solved problems. They're active engineering challenges that will determine whether the Orbital Cloud becomes infrastructure or vaporware.
Over the next 12 articles, I’m going deep on the engineering, the real barriers between billion-dollar announcements and working data centers in orbit.
1. The Terrestrial Bottleneck - Why Earth's infrastructure can't scale with AI demand.
2. Engineering the Vacuum - Cooling megawatt GPU clusters without air or water.
3. The Robotic Workforce - Autonomous repair when humans can't reach your hardware.
4. Radiation Hardening - Surviving cosmic rays, solar particles, and Silent Data Errors that won't stay silent.
5. Orbital Edge Computing - Why processing data in orbit beats downlinking to Earth.
6. High-Yield Solar - Deployable arrays, eclipse management, power per kilogram.
7. Legal Frontiers - Data sovereignty beyond Earth's jurisdiction.
8. The Starship Effect - Launch economics: at what $/kg does space actually win?
9. Laser Backbones - Inter-satellite optical links building the Orbital Cloud.
10. Circular Sustainability - Recycling orbital hardware, avoiding e-waste 500km up.
11. Real-time Telemetry - Monitoring constellations with AI anomaly detection at scale.
12. The Multi-Planetary Cloud - Lunar data centers, Mars infrastructure, interplanetary internet.
Each article examines what’s different in space, what’s harder, and what becomes possible when you remove Earth’s constraints.
As a certain goddess of death once said: “You have no idea what's possible.”
SpaceX filed for 1 million satellites. Google launches in 2027. An H100 is already orbiting above you.
The Orbital Cloud isn’t science fiction. It’s engineering happening now, and the engineering is everything.
Next: Part 1: The Terrestrial Bottleneck

TechArena is proud to announce its role as a media partner for the AI Infra Summit, the full-stack AI and ML infrastructure event bringing together hardware providers, hyperscalers, and enterprise practitioners under one roof. This September, 8,000 of the industry’s leading builders, buyers, and decision makers will gather for the summit at the Santa Clara Convention Center—and TechArena will be there with them, in the room and on the platform.
This partnership is a natural one. In nine years, the AI Infra Summit, run by Kisaco Research, has grown from a niche event into a hub for the industry: a meaningful, technical event where the conversations that happen between sessions are as valuable as the ones on stage. TechArena was founded on exactly that instinct: that the most important insights in technology happen when practitioners and innovators have a space to share their ideas. The AI Infra community is our community.
Starting this spring, TechArena will integrate AI Infra Summit content and coverage into our editorial calendar and work directly with the event’s exhibitors and sponsors to amplify the discussions that they’ll be driving in Santa Clara. That means pre-event content that builds awareness and drives the right conversations before attendees ever arrive, real-time coverage from the floor, and post-event distribution that keeps the ideas in motion.
AI infrastructure is the foundational layer on which everything else gets built, and the pace of investment, vendor selection, and architectural decision-making has never been faster. The AI Infra Summit draws companies representing every layer of the stack alongside hyperscaler attendees and enterprise practitioners actively evaluating vendors and budgeting for infrastructure upgrades.
Events like AI Infra Summit are where that decision making gets accelerated. TechArena exists to make sure those conversations don’t stop when the badges come off.
As part of our partnership, we're pleased to offer benefits to TechArena followers who want to join the conversation in Santa Clara this fall.
Look for TechArena editorial coverage of AI Infra Summit in the months ahead. We’ll be publishing practitioner-focused content, spotlighting the innovations that will be on the expo floor, and giving our community early visibility into what’s shaping up to be one of the most consequential gatherings in the AI infrastructure calendar.
If you’re exhibiting at the AI Infra Summit and want to talk about how TechArena can help amplify your presence, reach out directly. There’s no boilerplate here: every engagement starts with a conversation.
We’ll see you in Santa Clara.

As Q1 2026 winds down, the AI industry is undergoing a turbulent structural realignment, pivoting from a race for smarter models to a desperate land grab for the power, pipes, and provenance that make them functional.
If the last two years were defined by the “Model Wars,” as enterprises sprinted to produce the most capable large language model (LLM), 2026 is emerging as the year of vertical integration and middleware dominance.
The era of experimental pilots is over. Major tech incumbents and specialized neoclouds are no longer just buying intelligence; they are acquiring the infrastructure required to turn that intelligence into a functional enterprise operating system.

1. The Rise of Sovereign AI (Nscale & Future-tech)
The concept of Sovereign AI has moved from a policy aspiration to a massive commercial driver. With the UK and Canada actively funding domestic AI stacks, neoclouds like Nscale are seeing record valuations. Nscale’s $2 billion Series C is fueled by its ability to build AI factories that comply with local data residency laws, a mission bolstered by its 2025 acquisition of Future-tech, which gave it the in-house engineering muscle to design facilities faster than traditional hyperscalers.
2. The Orbital Escape (The SpaceX/xAI Merger)
Perhaps the most audacious deal in tech history, the merger of SpaceX and xAI values the combined entity at $1.25 trillion. The strategic rationale is purely physical: terrestrial data centers are hitting power grid limits. By merging with SpaceX, xAI aims to move massive training and inference workloads to orbital, solar-powered data centers, effectively leveraging the infinite square footage of outer space. (Stay tuned for a series about data centers in space from TechArena Voice of Innovation Niv Sundharam).
3. The Social Infrastructure of Agency: Meta Acquires Moltbook
The shift from isolated chatbots to social participants was cemented today, with Meta’s confirmed acquisition of Moltbook. Moltbook is an AI-agent social network designed specifically for autonomous systems to interact, share context, and coordinate tasks. By folding founders Matt Schlicht and Ben Parr into Meta’s AI division, the company is securing the social layer of the agentic era. This move signals that the next phase of competition isn’t just about how smart an agent is, but how effectively it can collaborate within a broader network.
We are witnessing the industrialization of intelligence.
For the past two years, the industry has been focused on the brain (the LLM); today, the focus has shifted to the nervous system and the skeleton. The rush to acquire middleware giants like Confluent and safety frameworks like Promptfoo proves that the “model moat” has evaporated.
In its place, a new barrier to entry is forming: architectural integration. Companies that can seamlessly connect real-time data to autonomous agents while maintaining a “moat of trust” will dominate the second half of this decade. For startups, the integration gap has expanded; if your product only identifies an AI problem without possessing the infrastructure to fix or govern it in real-time, you are an acquisition target, not a platform.

Maher Hanafi of Betterworks joins TechArena Data Insights to discuss AI in enterprise SaaS, why many AI proof-of-concepts fail, and how engineering leaders can successfully move AI into production.

Cloud expert Venkata Gopi Kolla joins Allyson Klein to discuss the CDN "single point of failure" and a new IETF protocol for sub-second edge recovery and AI correctness. A must-listen for infrastructure leads.

As organizations build private AI clouds to control costs and protect their data, they face a familiar dilemma: the trade-off between performance and operational simplicity. Hyperscalers (like AWS or Google) have both, but only because they have armies of engineers to build custom software that tames their hardware.
My recent conversation with Solidigm's Jeniece Wnorowski and Marc Austin, CEO and co-founder of Hedgehog, revealed how enterprises can now access that same "Hyperscaler Agility"—without the army of engineers.
The key? Decoupling the control plane from the hardware.
Hedgehog’s mission centers on enabling enterprises, government agencies, and neoclouds to “network like a hyperscaler.” This means moving beyond rigid trade-offs. Instead of being forced to choose between the stability of validated reference architectures or the flexibility of open standards, Hedgehog allows organizations to leverage both, orchestrated by a single software platform.
This approach offers a massive strategic advantage: Supply Chain Resilience.
As Marc explained, a diversified hardware strategy is critical for risk management. “If you have a supply shock—like a global pandemic or a trade war—that can limit your ability to scale because supply becomes constrained,” he noted. “You can’t add capacity to your network when you need to.”
By running open-source software on OCP standards-based servers, organizations can acquire equipment from whichever vendor offers the best price and availability at that moment. And because Hedgehog’s control plane is hardware-agnostic, it can eventually extend this same flexibility to other high-performance reference architectures, ensuring that the software experience remains consistent regardless of the underlying silicon.
Hardware diversity is only half the battle; the other half is operational speed. Hedgehog delivers all the software needed to automatically install, configure, and operate AI networks as a turnkey "appliance." This eliminates weeks of manual configuration work by network architects.
More importantly, it democratizes access. By providing a Virtual Private Cloud (VPC) service, Hedgehog allows enterprise users or neocloud tenants to operate within a private, secure segment—consuming on-premise AI infrastructure with the same self-service ease they expect from a public cloud provider.
The power of this "Universal Control Plane" is evident in how customers are using it to bypass traditional infrastructure bottlenecks.
Zipline, an automated drone delivery company, utilized Hedgehog to build a private cloud that cut infrastructure costs by 70% while keeping their delivery data secure. The critical win wasn't just the hardware savings—it was the operational model. They managed the deployment with their existing DevOps team, without hiring specialized network engineers, because Hedgehog abstracted the physical switching complexity into simple software commands.
In the high-performance arena, FarmGPU (operating the Solidigm AI Central Lab) used Hedgehog to orchestrate an 800G fabric for AI training. Independent testing by SemiAnalysis highlighted that Hedgehog’s software-defined congestion management maximized bandwidth and GPU utilization.
This proves a vital point for the future of AI: The software you use to manage the network matters just as much as the wire itself.
Agility isn't just about the switch fabric; it's about how data enters the building. FarmGPU faced a challenge familiar to many AI operators: ingesting terabytes of training data through a limited enterprise firewall.
Legacy solutions required expensive, proprietary hardware routers. Hedgehog’s software-defined gateway turns standard x86 servers into high-performance routers. This effectively brings the functionality of a public cloud "Transit Gateway" on-premise, allowing secure, multi-tenant segmentation for AI workloads.
Hedgehog is redefining the role of the network in the AI stack. By focusing on a hardware-agnostic control plane, they are ensuring that the "Brain" of the network (the automation) is distinct from the "Body" (the switch).
This is the architecture of the future. It gives enterprises the ultimate luxury: Choice. It allows IT leaders to select the best hardware for their specific workload—optimizing for cost, performance, or supply chain availability—while maintaining a consistent, automated operating experience across the entire fleet.
For organizations that view data as their competitive moat, this ability to unify diverse infrastructure under one automated standard is the key to scaling AI.

In the race to advance technology, time, funding, and attention often go to immediately monetizable applications. While industry roadmaps certainly drive technological advancement, basic science - which deepens our fundamental understanding of the universe - can produce breakthrough findings with wide-reaching applications and effects.
My recent conversation with Silvia Zorzetti from Fermilab and Solidigm’s Jeniece Wnorowski revealed how such research into the convergence of high-energy physics and quantum technology is creating outstanding developments for quantum computing.
As a U.S. particle accelerator laboratory, Fermilab has spent decades perfecting superconducting cavities that accelerate particle beams to near light speed. Through years of study, researchers at the lab have identified several sources of noise that can make these superconducting cavities less efficient, and they have worked to eliminate those sources of loss.
In 2017, researchers began studying these same cavities at the quantum level. As Silvia explained, at this single photon level, the energy is much lower, which means there are new potential sources of loss compared to the higher energy levels. “We can focus on the basic science and the basic understanding of those mechanisms,” Silvia explained.
At the same time, Fermilab is finding ways to adapt superconducting cavities for quantum computing. By placing qubits inside the cavities, Fermilab has achieved 20 milliseconds of coherence - a critical advance over the typically rapid decay of quantum information. “And we know that it is possible to achieve more coherence,” Silvia said.
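To see why 20 milliseconds matters, a rough sketch modeling decoherence as simple exponential decay. The 20 ms figure is from the article; the ~100 ns gate time is an assumed illustrative value, not a Fermilab number:

```python
# Rough illustration of coherence time vs. usable computation.
# GATE_TIME is an assumed example value, not from the article.

import math

T_COHERENCE = 20e-3   # 20 ms cavity-qubit coherence, per the article
GATE_TIME = 100e-9    # assumed ~100 ns per gate operation

def fidelity_fraction(elapsed_s, coherence_s=T_COHERENCE):
    """Approximate surviving fraction of quantum information,
    modeling decoherence as exponential decay exp(-t/T)."""
    return math.exp(-elapsed_s / coherence_s)

# Gate operations that fit inside one coherence time constant:
print(int(T_COHERENCE / GATE_TIME))        # 200000
# Fraction of information surviving after 1 ms of computation:
print(round(fidelity_fraction(1e-3), 3))   # 0.951
```

Longer coherence windows translate directly into deeper circuits before errors dominate.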
Quantum computers won’t replace classical computers for all tasks. The technology excels at specific problems, including Shor’s algorithm for integer factorization and Grover’s algorithm for unstructured search. As threats to digital security grow, Shor’s algorithm is of extreme interest because it can efficiently factor the large numbers on which much of modern cryptography depends. “This means that we can have systems that are very secure because they cannot be broken by someone else with a more advanced cryptography system, let’s say an alien, because here on Earth, we all have to comply with the quantum mechanics rule,” she explained.
Beyond these practical applications, quantum computing excels at simulating quantum field theories, which involve many interacting components. “We have these huge many body problems in which there are so many entities. We know how to describe them if they are alone,” Silvia explained. “But then when they start to interact to each other, it becomes a very complex model. So, we know that quantum computing is very good for those kinds of applications.”
While quantum computing has seen great progress, challenges remain to realizing its full potential. Silvia identified two types of hurdles to be overcome: “engineering problems,” where technical challenges are understood and need to be addressed, and problems where more fundamental research is required.
“Those are mainly between the interconnects,” she said. “The interconnects are needed for scaling quantum computing…and in particular, interconnects with very low losses.” Quantum information is a weak signal, so research to find solutions that will prevent the loss of this information remains a priority.
Quantum error correction presents another major hurdle. Quantum states are inherently fragile, and errors can arise in quantum information due to decoherence and other issues. The community is developing algorithmic techniques that make computations more robust to noise, while simultaneously working on hardware improvements.
The environmental sensitivity that creates problems for quantum computing is actually an advantage for quantum sensing, which is already delivering results. Fermilab is leveraging this sensitivity to detect dark matter, dark photons, and axions. It is also studying how radiation, such as gamma rays, affects quantum computers, and how quantum computers could be built to be robust to radiation.
Fermilab’s approach to quantum computing demonstrates how domain expertise from one field can catalyze breakthrough innovations in another. By leveraging decades of superconducting cavity development for particle accelerators, the laboratory has achieved amazing quantum coherence times. The pursuit of fundamental knowledge about how materials behave at the quantum level has yielded practical breakthroughs that are now accelerating the entire field forward.
For those interested in learning more, Fermilab hosts regular symposiums and outreach programs, including their Quantum 101 track designed for non-experts. Learn more about Fermilab’s quantum research at fnal.gov.

Software development has matured over the years but fundamentally still comes down to beating your head against the wall while figuring out the art of giving directions to a machine that, at its foundation, “speaks” in on and off switches. The computers powering NASA in the race to the moon required punch cards that were literal stacks of paper fed into a machine, where the presence or absence of a hole represented a code translated into an electrical signal. A deck of punch cards represented a program. Later, programmers became experts in binary code and firmware until the introduction of high-level language compilers. Ironically, compilers and increasingly higher levels of programming languages led some to predict the demise of the art of software programming itself.
Since December 2025, there has been a lot of marketing and PR from AI model companies claiming that "software engineering is dead" because of their internal experience consuming their own AI tools to write software. While researching that claim, I confirmed that the same companies continue to post healthy numbers of openings for software engineering roles on their career sites. How can both things be true? I believe the answer lies in the history of computing, where many dire predictions were made about the disappearance of engineering and management jobs - yet today those professions continue to evolve and grow.
We were overdue for a major update in how humans interact with machines, given the maturity of web-based design. Punch cards gave way to compilers and command lines, then to the GUI and web interfaces, and later to mobile. Each progression was revolutionary compared to how interaction worked in the prior generation. For example, web design and search engine optimization were refactored once mobile phone interfaces and app stores began to dominate customer experiences.
The original GenAI chatbots were seamless because web interfaces and mobile computers were already ubiquitous. But they weren’t the breakthrough that brought us closer to the dream of harnessing computing to help with nagging, annoying tasks. Interfacing with and harnessing the computer continued to be the nagging, annoying task, requiring proficiency in rote learning and grinding through lines of code.
Apple has a long history of shaking up the computing industry through breakthroughs in how we interact with devices—from the first Apple computer with a graphical user interface to the iPod, iPhone, and beyond. Open Claw represents a breakthrough because operating it does not depend on the old modes of human–computer interaction that chatbots required. There are many downsides related to security and vulnerabilities, but that isn’t the point. For meaningful differentiation in software solutions, finding ways to interact with users the way they want—rather than the way software developers envision—will be sustainable differentiation.
There are opportunities for an Open Claw–like corporate offering that includes lifecycle management, auditing, and governance-as-code, with built-in FinOps to ensure there are no unpleasant budgeting surprises from token usage.
For the last 15 years, discussion forums like Reddit and Blind have been packed with prospective developers looking for the recipe for the fastest path into one of the “FAANG” companies. How many Big Tech parents pushed their kids into coding and robotics clubs to set them up for a career in SaaS? The goal often seems to be: get in, suck it up, work at a relentless pace, watch your stock go up, cash out, and move to the next rung.
As long as companies needed armies of developers to work in a production line—one that LeetCode could prepare you for—the assembly line from code camp to university to long-term employment at a large software company ran smoothly. GenAI and coding tools have been rattling these markets, and I’ve lost count of the social media influencers who have preached, “You won’t be replaced by AI; you’ll be replaced by someone using it,” especially to software developers over the last three years.
To those who believe that learning the mechanics of AI coding tools faster than the other 90% of equally panicked peers is the path to safety—as if it’s another “ace interviews at Google” course—here’s the reality: your use of AI tools is not a sustainable differentiator between long-term employment and becoming a bartender. CEOs—many of whose admins still print emails for them to read—may tell you that using AI tools is the goal and that you should start working with them or else. Tool usage and leverage are not sustainable differentiators, even if they serve as a short-term sifting mechanism.
Speed in the mechanics or methods of producing software is now a baseline, but it does nothing to explain why that work is worth doing in the first place. Software development has a long history of ever-higher levels of abstraction that demand more compute power but deliver better scale, faster feature development, and improved debugging. The increased efficiency of how humans direct machines isn’t interesting to anyone outside the tech industry unless it materially improves business outcomes for customers.
Yes, learn AI coding tools—they are the next level of abstraction in a long history of attempts to relieve developer frustration from grinding out lines of code. But for your own sake, learn them while working on problems that matter, and do it in a maintainable, enterprise-ready way.
There are countless AI tools and applications competing for business, and just as many non-developers working with coding tools. For software engineers who have ideas but no experience doing market research, there are now ready-made research support systems from multiple vendors to help vet and verify the market opportunity for your idea. The fact that you no longer have to pitch an idea up through management to get a 100-person coding team assigned to build the first version is both freeing and frightening. You are no longer held back by business analysis - just by your decision on what to make or buy and how to take it to market. For non-technical business owners who have AI products pitched to them nearly round the clock: focus on where you want your business to be unique, and offload the mechanics that are, for you, a necessary chore or cost.
While there are many decision frameworks for when to make or buy AI, AI agents, AI management platforms, and AI applications, very few weigh ease of use against long-term supplier oversight and management. The decision is not just a technology-skill question; it must consider whether your supplier will change access to their technology, or be forced to change pricing models because their board or shareholders demand margin improvements. Talking to small and large enterprises, I find that mid-to-large companies with access to technical skills focus on technology sustainability - but perhaps more focus is needed on FinOps for AI where autonomous agents are being scaled. At the smaller companies I advise, decisions start from a financial lens but could use structured thinking about long-term differentiation: where does the owner or leader need to own or retain a key capability to set the business apart in the minds of customers?
The next few articles in this series will walk through, at a deeper level, some of the decision tradeoffs, strategic questions, and frameworks I have used to drive clarity for the major decisions customers and partners have faced.

Ishween Kaur, Generative AI Lead at Salesforce, shares practical lessons on building multilingual AI systems, closing language gaps, and scaling with trust and real-world feedback.

Enterprise AI conversations still revolve around models. Benchmarks, context windows, and release cycles dominate the discussion.
But inside production environments, the real shift is happening elsewhere.
The competitive advantage in enterprise AI is moving away from model selection and toward the runtime architecture that surrounds it.
Once AI leaves the demo environment and enters core workflows, it must be governed, monitored, cost-controlled, and made resilient. That is not a model problem. It is an operational one.
Traditional enterprise systems were built around deterministic pipelines. Data moved from source to warehouse to dashboard. Outputs were reproducible. Monitoring focused on throughput and uptime.
AI systems depend on something different: dynamic context.
Large language models (LLMs) rely on embeddings, retrieval layers, policy documents, customer history, and transactional signals that are often refreshed on tight cycles. When that context degrades, output quality degrades.
In production deployments, many perceived model quality issues are actually context failures.
The architecture therefore shifts from static data pipelines to context engines. These systems are designed around freshness service level agreements (SLAs), versioned vector stores, and controlled retrieval boundaries.
The model generates the answer. The context determines its reliability.
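A freshness SLA gate might look like the following sketch. The record type, SLA thresholds, and source names are all illustrative assumptions, not any specific vendor's API:

```python
# Hypothetical sketch of a freshness-SLA gate in front of retrieval.
# All names and thresholds below are illustrative assumptions.

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class ContextRecord:
    source: str             # e.g. "policy_docs", "crm_notes"
    refreshed_at: datetime  # last successful sync of this context layer

FRESHNESS_SLA = {
    "policy_docs": timedelta(hours=24),
    "crm_notes": timedelta(minutes=15),
}

def stale_sources(records, now=None):
    """Return context sources that have breached their freshness SLA."""
    now = now or datetime.now(timezone.utc)
    return [r.source for r in records
            if now - r.refreshed_at > FRESHNESS_SLA[r.source]]

records = [
    ContextRecord("policy_docs",
                  datetime.now(timezone.utc) - timedelta(hours=30)),
    ContextRecord("crm_notes",
                  datetime.now(timezone.utc) - timedelta(minutes=5)),
]
print(stale_sources(records))  # ['policy_docs']
```

A runtime can then refuse or downgrade answers built on stale context rather than letting "model quality" silently degrade.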
Earlier governance models focused on who could access a system. AI introduces a different challenge: how the system behaves once accessed.
Outputs are probabilistic. Prompts vary. Users experiment. Sensitive data can surface in unexpected ways.
Modern AI runtimes increasingly include behavioral guardrails that govern how the system behaves once accessed, not just who can access it.
Consider a financial services copilot assisting relationship managers. A user asks for a client summary. The model has access to customer relationship management (CRM) notes, transaction data, and compliance documentation.
Without behavioral guardrails, the system could surface sensitive transaction or compliance details in ways no policy ever reviewed.
The runtime must intercept, evaluate, and shape responses before delivery.
Governance is no longer static access enforcement. It becomes active runtime mediation.
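A minimal sketch of that mediation step, assuming a toy PII pattern standing in for a real policy engine:

```python
# Minimal sketch of runtime mediation: every draft answer passes through
# guardrail checks before delivery. The SSN-style regex is a toy stand-in
# for a real policy engine, not a production PII detector.

import re

PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # toy SSN-style match

def mediate(draft: str) -> str:
    """Intercept, evaluate, and shape a model response before delivery."""
    if PII_PATTERN.search(draft):
        draft = PII_PATTERN.sub("[REDACTED]", draft)
    return draft

print(mediate("Client SSN is 123-45-6789, per CRM notes."))
# -> Client SSN is [REDACTED], per CRM notes.
```

In practice this layer would chain several checks (PII, policy citations, tone, jurisdiction) and log every intervention for audit.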
Traditional infrastructure cost planning followed relatively stable patterns. Workloads were forecastable. Capacity was provisioned accordingly.
AI inference introduces volatility.
Token consumption fluctuates based on prompt size. Adoption spikes increase inference load. Copilots embedded into daily workflows can quietly multiply request volumes.
Enterprises are responding by embedding cost awareness into the runtime layer: metering token consumption per request, attributing spend to teams and workflows, and tying budgets directly to request routing.
Without runtime-level cost instrumentation, AI initiatives can scale faster than financial oversight.
In this environment, cost architecture is structural.
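A minimal sketch of what runtime-level cost instrumentation might look like. The prices, budgets, and team names are made up for illustration:

```python
from collections import defaultdict

# Illustrative sketch: meter token consumption per team and gate
# requests against a budget before routing them to the model.

PRICE_PER_1K_TOKENS = {"input": 0.003, "output": 0.015}  # hypothetical rates

class CostMeter:
    def __init__(self, budgets: dict[str, float]):
        self.budgets = budgets          # monthly budget per team, in dollars
        self.spend = defaultdict(float)

    def record(self, team: str, input_tokens: int, output_tokens: int) -> float:
        cost = (input_tokens / 1000) * PRICE_PER_1K_TOKENS["input"] \
             + (output_tokens / 1000) * PRICE_PER_1K_TOKENS["output"]
        self.spend[team] += cost
        return cost

    def allow(self, team: str) -> bool:
        """Gate new requests once a team's spend reaches its budget."""
        return self.spend[team] < self.budgets.get(team, 0.0)

meter = CostMeter(budgets={"support-copilot": 100.0})
meter.record("support-copilot", input_tokens=2000, output_tokens=500)
print(round(meter.spend["support-copilot"], 4))  # 0.0135
print(meter.allow("support-copilot"))            # True
```

Because the meter sits in the request path, a quiet tenfold increase in copilot traffic shows up as spend, not as a surprise at invoice time.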
Traditional observability focuses on infrastructure metrics such as central processing unit (CPU) utilization, memory pressure, and latency.
AI systems require something more nuanced. A model can respond quickly and consistently while producing degraded or risky decisions. Decision health expands observability into the quality and impact of AI outputs.
In practice, this includes monitoring signals such as user override rates, fallback activations, and the downstream impact of AI-driven decisions.
If an AI assistant’s recommendations are increasingly overridden by users, the system may be technically healthy but operationally degrading. A rise in fallback activations may signal retrieval gaps or tightening policy enforcement.
AI systems are not just infrastructure components. They are decision amplifiers. Observability must reflect that reality.
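These decision-health signals are straightforward to compute once the runtime logs them. A hypothetical sketch with an invented event schema:

```python
# Illustrative sketch: derive decision-health metrics from runtime event
# logs. The event schema and the 0.5 alert threshold are assumptions.

def override_rate(events: list[dict]) -> float:
    """Fraction of recommendations the user overrode."""
    recs = [e for e in events if e["type"] == "recommendation"]
    if not recs:
        return 0.0
    return sum(e["overridden"] for e in recs) / len(recs)

def fallback_rate(events: list[dict]) -> float:
    """Fraction of all events that were fallback activations."""
    total = len(events)
    return sum(e["type"] == "fallback" for e in events) / total if total else 0.0

events = [
    {"type": "recommendation", "overridden": False},
    {"type": "recommendation", "overridden": True},
    {"type": "recommendation", "overridden": True},
    {"type": "fallback"},
]
print(round(override_rate(events), 3))  # 0.667: two of three recommendations overridden
if override_rate(events) > 0.5:
    print("decision health degraded: investigate context or policy drift")
```

The system producing these events can be fully healthy by infrastructure metrics while this dashboard is flashing red, which is exactly the gap decision health closes.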
Across industries, a common architecture is taking shape beneath enterprise AI deployments. Organizations are building structured data backbones, versioned embedding layers, policy-aware orchestration engines, guardrail services that mediate outputs, integrated cost monitoring tied directly to request routing, and decision-level observability with full audit trails.
The visible model can change. The runtime scaffolding persists.
Over time, the reliability, governance posture, and economic efficiency of that runtime determine whether AI systems become trusted infrastructure or remain isolated pilots.
Models will continue to evolve. Benchmarks will continue to shift.
Inside enterprise environments, models are becoming modular components.
The durable advantage lies in the architecture that governs them. The runtime manages context freshness, enforces policy, instruments cost, and monitors decision integrity.
Enterprise AI is no longer just a capability layer. It is an operational layer.
As with every operational layer before it, organizations that engineer it with discipline, not enthusiasm, will outperform those that treat it as novelty.
The future of enterprise AI will not be defined by who selects the best model.
It will be defined by who builds the most resilient system around it.

At VAST Forward 2026, VAST Data expanded its vision for its AI operating system. The company dropped five major announcements in a single day: new agentic AI capabilities, a fully accelerated hardware stack built with NVIDIA, a global infrastructure control plane, a video intelligence partnership, and a formalized partner ecosystem. Taken together, they represent a vision of what enterprise AI infrastructure looks like when it stops being a collection of components and starts behaving like a unified, intelligent system.
The marquee announcement was the unveiling of two new capabilities of the VAST AI Operating System, VAST Data PolicyEngine and VAST Data TuningEngine. These two new services, slated for release by end of 2026, are designed to work in tandem inside the VAST DataEngine to create what the company calls a “thinking machine” — a system that doesn’t just execute AI pipelines but governs them, evaluates them, and improves on them automatically.
PolicyEngine functions as an inline enforcement layer for agentic workflows, applying fine-grained, tamper-proof controls on what agents can access, what tools they can invoke, and how they communicate with other agents. TuningEngine captures outcomes from those workflows and feeds them into fine-tuning pipelines using methods like LoRA, supervised fine tuning, and reinforcement learning, automatically generating candidate models for evaluation and deployment.
The result is a closed operational loop: observe, act, evaluate, improve. VAST co-founder Jeff Denworth framed it plainly: “Just as people are always learning, so should tomorrow’s applications.” For enterprises trying to deploy AI in regulated or mission-critical environments, the combination of zero-trust governance and automated model improvement in a single platform is a major step toward trusted, autonomous systems.
The software story gets hardware muscle through an expanding collaboration with NVIDIA on an end-to-end, fully CUDA-accelerated AI data stack. The companies introduced CNode-X, a new class of NVIDIA-Certified servers that run the VAST AI Operating System directly on GPU-powered infrastructure. The deeper integration embeds NVIDIA libraries—cuVS for vector search, cuDF-based SQL acceleration via an engine called Sirius, and support for NVIDIA’s Context Memory Storage (CMX) platform—directly into VAST’s core data services.
The goal is to eliminate the fragmented stack problem: separate storage, database, and AI compute tiers that slow enterprise AI pipelines from pilot to production and add operational complexity at every seam. In doing so, it clears the path for agentic AI-enabled workloads to fulfill their promised potential. “CNode-X is CUDA-accelerated at every layer to give AI agents persistent memory so they can work on complex problems over days or weeks, and eventually years, without forgetting—opening the world to the next frontier of AI,” said NVIDIA Founder and CEO Jensen Huang.
CNode-X will come to market through OEM partners, including Cisco and Supermicro, giving enterprises a path to GPU-accelerated VAST infrastructure through vendors they already buy from.
VAST also announced Polaris, a global control plane purpose-built to provision, operate, and orchestrate distributed AI infrastructure across public cloud, neocloud, and on-premises environments. As AI pipelines span regions and providers between data collection, training, and inference, Polaris offers centralized service delivery, converting disparate infrastructure instances into one operational environment.
Polaris is built on a Kubernetes-based architecture with a lightweight agent on every VAST node and operates as an intent-driven management layer. Administrators define the desired state of infrastructure, and Polaris coordinates the cloud-native services and VAST software to get there and keep it there. It supports cloud service provider partners, sovereign deployments, and multi-site, multi-cluster configurations under centralized management. It is available as part of VAST cloud deployments.
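The intent-driven pattern described above (administrators declare a desired state, the control plane drives infrastructure toward it) is essentially a reconciliation loop. A generic, hypothetical sketch of that pattern, not VAST's implementation:

```python
# Conceptual sketch of intent-driven management: compare declared
# desired state against observed actual state and emit the actions
# needed to close the gap. Site names and counts are illustrative.

desired = {"site-a": {"clusters": 3}, "site-b": {"clusters": 2}}
actual  = {"site-a": {"clusters": 3}, "site-b": {"clusters": 1}}

def reconcile(desired: dict, actual: dict) -> list[str]:
    """Compute the actions that move actual state toward desired state."""
    actions = []
    for site, want in desired.items():
        have = actual.get(site, {"clusters": 0})
        delta = want["clusters"] - have["clusters"]
        if delta > 0:
            actions.append(f"{site}: provision {delta} cluster(s)")
        elif delta < 0:
            actions.append(f"{site}: decommission {-delta} cluster(s)")
    return actions

print(reconcile(desired, actual))  # ['site-b: provision 1 cluster(s)']
```

A real control plane runs this loop continuously, so drift (a failed node, a manually deleted cluster) is corrected without an administrator re-issuing commands.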
VAST also announced a partnership with TwelveLabs, which develops video foundation models, to help organizations extend video intelligence beyond public cloud deployments. Through the collaboration, the companies will support demand for deploying advanced video intelligence closer to where the data originates and is governed. The pitch is aimed squarely at enterprises and government agencies sitting on massive video archives that can’t or won’t push data to a hyperscaler: media companies, financial services firms running surveillance-based fraud detection, and public sector agencies where data sovereignty is non-negotiable. TwelveLabs gains a deployment path into on-prem and neocloud environments, and VAST gains a compelling vertical use case to anchor its platform story.
Finally, VAST formalized its Cosmos partner ecosystem into a unified global partner program, consolidating resellers, system integrators, independent software vendors, cloud providers, and advisory partners under a single framework with structured onboarding, tiered benefits, deal registration, and a centralized partner portal. Cosmos offers a clear engagement model for each partner type, from hardware platform partners validating architectures to consulting firms running deployment practices. H2O.ai and WWT are among the early participants.
Today’s announcements represent the most concrete evidence yet that VAST is building out its “thinking machine” vision in a coherent, layered way, and is doing so with the right partners. The strategic logic is sound: if AI pipelines are becoming continuous, always-on systems, then the infrastructure layer needs to behave like an operating system—governing, learning, and adapting in real time. PolicyEngine, TuningEngine, and Polaris are all designed to elevate the company’s AI OS position.
Meanwhile, its work with TwelveLabs, the formalization of its partner ecosystem, and the CNode-X collaboration with NVIDIA demonstrate that VAST is assembling a coalition to maximize its reach and impact. Each partnership extends VAST’s reach into a different part of the enterprise buying process, including technical validation, vertical use cases, and channel distribution. Together, they suggest a company that understands the AI OS can’t succeed alone.

Brookfield Asset Management has officially launched Radiant, a vertically integrated AI infrastructure company, through its acquisition of Ori Industries, a UK-based distributed AI cloud provider. The announcement marks the transition of Radiant from concept to active operations and signals a significant bet in the AI infrastructure space.
Radiant enters the market as one of the first bets from Brookfield’s AI Infrastructure Fund (BAIIF), which itself serves as the anchor for a broader $100 billion investment program. Financial terms of the Ori acquisition were not disclosed.
As it launches, Radiant is targeting the delivery of high-performance, purpose-built AI compute to sovereign governments, telecom providers, and select large enterprises under long-term contracts.
Radiant will expand on Ori’s existing integrated AI Cloud assets, which operate out of more than 20 data centers around the world. The new infrastructure will be built on the NVIDIA DSX reference design, NVIDIA’s blueprint for what it calls AI factories, making Radiant an NVIDIA Cloud Partner. That designation matters because NVIDIA’s DSX architecture, which is designed to be Vera Rubin–ready, provides a standardized, scalable foundation for high-throughput AI workloads.
Alongside the long-term contract business, Radiant will continue to operate the Ori Global AI Cloud, a GPU-as-a-Service platform Ori built over seven years, for customers that need on-demand capacity and rapid deployment.
Brookfield’s head of AI infrastructure, Sikander Rashid, described the model plainly: “I think of it as a leasing business.” Radiant structures contracts to lock in revenue across the estimated five-year useful life of a chip cluster, with investment-grade customers committed to pay regardless of utilization. Brookfield has been explicit that it will not be taking on technology obsolescence risk in this model.
One differentiator Radiant is emphasizing is vertical integration that extends all the way to energy. AI compute is an energy consumption problem, and Brookfield’s existing portfolio includes power utilities and renewable generation assets.
Radiant’s ability to pair data center operations directly with behind-the-meter power generation represents a structural cost and reliability advantage. In a blog published with Radiant’s launch, Head of Product João Coelho noted, “Our behind-the-meter model co-locates AI Factories directly with massive-scale hydro, wind, solar, or nuclear generation. This is not an optimization of the datacenter; it is a re-architecture of the entire supply chain.”
A growing number of national governments require that AI workloads processed on their behalf remain within their borders, and Radiant is explicitly positioning itself to meet that demand. Its sovereign framework is designed to go beyond simply deploying compute in-country.
As Radiant CMO Jonathan Symonds said, “Compute sovereignty depends on ownership of the supply chain: land, power, and capital.” Radiant plans to address all of these elements.
In terms of the technology, that means air-gapped control planes, hardware-rooted security, and single-tenant bare metal configurations that ensure sensitive datasets and model weights remain invisible to external parties, including Radiant’s own engineers. Open-weight model architectures are supported to reduce dependence on any single vendor’s proprietary stack.
The capital structure reinforces the sovereign pitch. Radiant argues that most AI infrastructure has been financed with short-term, high-cost capital—venture equity or private credit carrying hurdle rates around 20%—which creates incentive structures poorly suited to the stable, long-duration assets that national AI programs require. By financing at infrastructure-grade rates of approximately 5%, Radiant contends it can offer sovereigns compute offtake contracts of 3, 5, or 10 years with predictable, hedgeable pricing.
The launch of Radiant is a meaningful development in AI infrastructure. Sovereign governments and large enterprises increasingly want AI compute that is predictable in cost, physically located in-country, and not delivered by a US hyperscaler carrying geopolitical and data residency complications.
Radiant is designed precisely for that buyer. With long-term contract structures, NVIDIA-validated architecture, and the energy integration to underpin reliable operations, Brookfield has assembled a credible stack.
The risk, as with any infrastructure-scale bet, lies in execution timing. AI chip generations turn over quickly, demand patterns from sovereign customers are still maturing, and $100 billion programs are easier to announce than to deploy. Brookfield’s contractual approach—locking customers into full payment regardless of utilization—reduces its downside but will require winning and maintaining the confidence of investment-grade counterparties in a competitive market.
Still, the resource moat is significant. Very few organizations can bring Brookfield’s combination of infrastructure experience, energy assets, and long-duration capital to bear on a compute leasing business. If the sovereign AI infrastructure market develops the way Brookfield is betting it will, Radiant will be well-positioned. Technology leaders evaluating AI infrastructure partnerships should pay close attention to what Radiant builds in its first 18 months of operation. As of today, the proof of concept is underway.

Back in November 2024, I wrote a blog that discussed the implications of cybersecurity in automotive. In that blog, I outlined some of the efforts and frameworks established to address this evolving field as it applies to automotive, highlighting the ISO 21434 cybersecurity framework.
With the advent of quantum computing, which promises to deliver far greater computing performance than previously imagined, existing cybersecurity solutions are expected to be at risk once Cryptographically Relevant Quantum Computers (CRQCs) become available. While a CRQC is not widely expected before 2029, the importance of recognizing and addressing this new form of cybersecurity threat today cannot be overstated. As an update to my previous blog, it seemed appropriate to explore the potential impact of Post Quantum Cryptography (PQC) on next-generation vehicles and the approaches and considerations required to address this looming threat.

While automotive OEMs race to deploy Software Defined Vehicles (SDV) with sophisticated connectivity, artificial intelligence (AI), and Over-the-Air (OTA) update capabilities, quantum computing, once confined to research laboratories, is rapidly approaching commercial viability, and with it comes the harsh reality of cryptographic obsolescence. Post Quantum Cryptography (PQC) represents the industry's attempt to address this looming threat, yet its implementation presents challenges as complex as the vehicles it aims to protect. For an industry already grappling with ISO 21434 compliance and the expanding attack surface of connected cars, the transition to quantum-resistant security cannot be an afterthought.
PQC refers to a new generation of cryptographic algorithms designed to withstand attacks from both classical and quantum computers. Public key systems currently in use, such as RSA and ECC (Elliptic Curve Cryptography), have been effective to date because they rest on the premise that available computing resources are insufficient to crack them. With the arrival of a CRQC, today’s public key systems are expected to be readily cracked, fully compromising today’s security infrastructure.
Unlike current public key systems such as RSA and ECC which rely on mathematical problems that quantum computers can solve exponentially faster, PQC algorithms are built upon hard mathematical problems that are believed to remain difficult to solve even for quantum systems. These include lattice-based cryptography, hash-based signatures, and multivariate polynomial equations. The National Institute of Standards and Technology (NIST) has standardized the first of these algorithms, with CRYSTALS-Kyber (now ML-KEM) for key encapsulation and CRYSTALS-Dilithium (now ML-DSA) for digital signatures emerging as the primary choices. As a quick explanation, the key exchange mechanism can be thought of as a “secret encoder and decoder ring” used to encrypt and decrypt the message, whereas the digital signature ensures that encrypted messages come from a trusted, known source. For automotive systems, the transition to PQC isn't merely an upgrade; it's a fundamental architectural transformation.
The urgency to address this looming problem stems from a phenomenon security professionals call “harvest now, decrypt later.” Adversaries with access to quantum capabilities in the future could capture encrypted vehicle communications today and store them for decryption once quantum supremacy arrives. Given that vehicles remain operational for 15 to 20 years, data transmitted through V2X communications, OTA update channels, and telematics systems in 2024 could be vulnerable to retrospective decryption in 2035. The automotive supply chain compounds this risk; cryptographic vulnerabilities in Tier 2 or Tier 3 components may not manifest until years after deployment, creating liability exposure that extends across decades and multiple ownership transfers.
The attack surface for quantum-enabled threats mirrors and magnifies existing automotive vulnerabilities. Consider the Common Exposure Library identified in current threat modeling: WiFi, cellular connections, Bluetooth, TPMS (tire pressure monitoring systems), OBD-II (on board diagnostic) ports, USB interfaces, EV charging infrastructure, and V2X communications. Each of these vectors currently relies on cryptographic protocols that quantum computers will eventually compromise. V2X communications, particularly, present an acute concern; these systems depend on low-latency cryptographic handshakes between vehicles and infrastructure to prevent collisions and coordinate traffic flow. The computational overhead of PQC algorithms, often requiring larger key sizes and more processing cycles, threatens to introduce latency that could degrade safety-critical response times.
Over-the-Air updates, already identified as high-risk vectors for malware injection, face compound quantum threats. The digital signatures that authenticate OTA packages have historically employed traditional ECDSA or RSA schemes, which will be vulnerable to quantum attacks. A malicious actor with future quantum capabilities could forge signatures for malicious firmware updates, effectively weaponizing the vehicle’s own maintenance infrastructure. The Software-Defined Vehicle architecture, with its billion lines of code and continuous update cycles, requires cryptographic agility—the ability to rotate algorithms without hardware replacement. Yet current automotive ECUs were designed with static cryptographic implementations, often burned into hardware security modules with decade-long lifecycles.
The intersection of Functional Safety (ISO 26262) and cybersecurity (ISO 21434) will become particularly challenging in the PQC transition. Safety-critical systems such as steering control, braking, and ADAS depend on predictable, high-speed timing and deterministic behavior. Many PQC candidates, while mathematically robust, exhibit variable and extended execution times or impose system-level requirements that could violate existing safety elements. Lattice-based algorithms, for instance, because of their inherent extended computational needs, can require memory allocations that may trigger watchdog timers or interfere with real-time operating systems. Additionally, the extended computational time associated with these calculations may lead to exceeding the FTTI (Fault Tolerant Time Interval) of the vehicle, which is effectively the deadline that the system must beat once a fault is detected to prevent an accident. Furthermore, the threat analysis and risk assessment (TARA) processes mandated by ISO 21434 must now incorporate quantum-capable adversaries—nation-state actors with access to cryptanalytic quantum resources or organized criminal groups leasing quantum computing time through cloud services. In short, a lot of complexity now must be added to address critical threats.
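The FTTI concern reduces to a budget check: fault detection, cryptographic verification, and the safety reaction must together beat the deadline. A back-of-the-envelope sketch with purely illustrative numbers:

```python
# Illustrative FTTI budget check. The 100 ms deadline and the latency
# figures are made-up numbers, not measurements of any real ECU.

FTTI_MS = 100.0  # hypothetical Fault Tolerant Time Interval for one function

def fits_ftti(detect_ms: float, crypto_ms: float, react_ms: float) -> bool:
    """True if the whole detect-verify-react chain beats the deadline."""
    return detect_ms + crypto_ms + react_ms <= FTTI_MS

print(fits_ftti(detect_ms=20, crypto_ms=15, react_ms=40))  # True: fast classical check fits
print(fits_ftti(detect_ms=20, crypto_ms=65, react_ms=40))  # False: slower verification blows the budget
```

The arithmetic is trivial, but it is exactly this kind of budget that a worst-case PQC execution time, rather than an average, must be proven against.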
Hardware constraints in general present formidable barriers to PQC deployment. Automotive microcontrollers, selected for cost efficiency and environmental resilience rather than computational headroom, often lack the memory and processing capabilities to execute post-quantum algorithms efficiently. A typical vehicle contains dozens of ECUs ranging from 32-bit microcontrollers with kilobytes of RAM to sophisticated infotainment processors and highly complex ADAS SoCs. Retrofitting PQC across this heterogeneous landscape requires either hardware replacement, which is prohibitively expensive for vehicles already in service, or careful algorithm selection that balances security margins against resource constraints. Hybrid approaches, combining classical and post-quantum algorithms during transition periods, effectively double the cryptographic overhead.
Supply chain complexity amplifies these challenges. Automotive components source semiconductors from global foundries, incorporate software from hundreds of vendors, and integrate cryptographic modules from specialized providers. Coordinating a PQC transition requires synchronization across this ecosystem; OEMs must specify quantum-resistant requirements, chip vendors must implement hardware acceleration for lattice operations, and software suppliers must refactor cryptographic libraries. The “should strongly consider” language of ISO 21434, while providing flexibility, may prove insufficient to drive the coordinated industry response that PQC demands. Unlike the Jeep Cherokee vulnerability, which prompted immediate patches, the quantum threat offers no dramatic demonstration—only mathematical certainty of future compromise.
The data privacy implications extend beyond vehicle control into the personal information ecosystem now embedded in modern automobiles. Biometric authentication data, payment credentials for EV charging, location histories, and occupant behavior patterns encrypted with current standards may persist in vehicle storage, cloud backups, and third-party databases for decades. Quantum-enabled decryption of this archive would expose not just current owners but entire household networks connected through vehicle telematics. A November 2024 podcast I participated in offered real insights into EV charging infrastructure security, a related and perhaps overlooked exposure. Charging network operators must simultaneously protect real-time transaction integrity and ensure that historical charging patterns remain confidential against future quantum analysis.
Addressing these challenges requires immediate architectural decisions with long-term consequences. Automotive cybersecurity teams must begin crypto-agility engineering; designing systems where cryptographic algorithms can be updated without hardware replacement, where certificate chains support algorithm diversity, and where secure boot processes can accommodate evolving signature schemes.
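Crypto-agility in practice often means the verification routine is looked up from an updatable registry rather than compiled in, so an OTA update can rotate algorithms without touching hardware. A simplified sketch using HMAC as a stand-in for a real signature scheme; the registry pattern, not the placeholder algorithm, is the point:

```python
import hashlib
import hmac

def verify_hmac_sha256(payload: bytes, tag: bytes, key: bytes) -> bool:
    # Placeholder "algorithm" for the sketch; a real ECU would call its
    # hardware security module here.
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

SIGNATURE_REGISTRY = {
    "hmac-sha256": verify_hmac_sha256,
    # a future OTA update could register e.g. a PQC verifier here
}

def verify_update(algorithm: str, payload: bytes, tag: bytes, key: bytes) -> bool:
    """Look up the verifier by name so algorithms can rotate over the fleet's life."""
    verifier = SIGNATURE_REGISTRY.get(algorithm)
    if verifier is None:
        return False  # unknown algorithm: reject rather than guess
    return verifier(payload, tag, key)

key = b"\x01" * 32
payload = b"firmware-v2.4"
tag = hmac.new(key, payload, hashlib.sha256).digest()
print(verify_update("hmac-sha256", payload, tag, key))  # True
print(verify_update("rsa-2048", payload, tag, key))     # False: not registered
```

Note the failure mode: an unrecognized algorithm is rejected outright, which is the conservative default when certificate chains must support algorithm diversity.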
Algorithm diversity, in my opinion, is an admission that there is a real concern the lattice-based algorithms may be cracked down the road, so alternative algorithms built on different mathematics, such as code-based (Hamming) and hash-based schemes, are kept available. That said, one algorithm proposed in NIST's post-quantum process for digital signatures, deemed uncrackable after many years of review, was cracked within weeks of introduction using a relatively low-end microprocessor. In short, because CRQCs are not yet available, there is no guarantee that PQC algorithms cannot be cracked, leading to architectures that require extreme agility and flexibility.
In summary, the transition to PQC cannot follow the automotive industry’s traditional model of generational updates; it must occur as a continuous capability evolution. As vehicles become software-defined platforms with connectivity lifespans exceeding their mechanical longevity, post-quantum readiness becomes not merely a security feature but a fundamental requirement for market viability.
TechArena Founder and Principal Allyson Klein was named Female Founder of the Year today at the 2026 Global Business Tech Awards in London.
Judged by an independent panel of leading technology experts, the awards honor companies and individuals whose work has added measurable, tangible value across customer experience, business management, data intelligence, and emerging innovation.
“I founded TechArena to find a voice for the bold innovation driven by creators across the tech landscape,” Allyson said. “Since its inception, our platform has fostered a community of leadership from the world’s tech titans to the next wave of visionary startups. Our collaborations with inventors and market makers from across the value chain have accelerated IT strategy for navigating the inception of the AI era. This recognition is about the fantastic team we’ve built at TechArena, the choice collectively to dedicate our career aspirations to the north star of collaborative innovation, and all of the brilliant people who have shared their stories.”
Allyson grew up in Silicon Valley during the birth of semiconductors, in a home where technology wasn't abstract; it was dinner table conversation. Her father was an international marketing executive; her mother worked as a nurse for chip plants, bringing home stories about the intricate chemical processes behind the magic of fabrication. By the time Allyson spotted a glowing green Apple computer screen at a friend's house, she was already primed to find it mesmerizing.
Allyson spent 22 years at Intel, where her work went far beyond marketing individual products. She helped build ecosystems, crafted foundational industry narratives, and created initiatives that brought companies together around shared tech visions. She built the foundations of industry engagement that ushered in data center virtualization, cloud computing, 5G networks, and artificial intelligence. Her marketing strategy helped grow a $20 billion business for the company and establish unquestioned leadership in the industry.
In 2009, when her boss told her to "go figure out social media," one of Allyson's two resulting recommendations was to start a podcast. Chip Chat launched as a weekly show and ran for 754 episodes, reaching over 20 million listeners and winning numerous industry awards. The insight behind it was simple but powerful: the best conversations about technology were happening in tech cafeterias, not in board rooms. Engineers came alive when given permission to talk about what they’d invented, and how they felt when their visions came to life.
After Intel, Allyson led global marketing and communications at Micron, overseeing everything from CHIPS Act messaging to COVID-19 communications. But by 2022, something was missing.
“I missed creating content. I missed telling stories,” she said. “Those things gave me unique joy that leading massive marketing organizations never could.”
TechArena was founded on the premise that the industry’s pace of innovation had fundamentally shifted, and conversations on the sidelines weren’t as valuable as direct access to inventors. This drove a conviction that the most important voices in technology aren’t always the loudest ones, and that insider knowledge creates a different kind of journalism. As Allyson puts it, “Most tech journalists don't have the background of living inside tech companies. At TechArena, we understand the shorthand.” That perspective has attracted an impressive range of guests and clients: the platform has featured companies representing more than $9 trillion in market cap, alongside 84 founders and CEOs of emerging tech startups who’ve shared their stories with TechArena’s audience of IT and cloud architects and infrastructure operations teams.
Every piece of TechArena content includes what the team calls the “TechArena take,” an opinion grounded in genuine insider experience. It’s a deliberate editorial choice that sets the platform apart.
The Female Founder of the Year designation carries particular weight in an industry that still has significant ground to cover in terms of representation at the founding and leadership level. For the technology community TechArena serves, this award affirms that building something substantive, durable, and editorially credible is work worth recognizing. The community TechArena has built can reach farther and move faster in part because of the connections its members build together in the arena.
Allyson’s outlook on where technology is headed is characteristically optimistic. When cloud computing arrived, she recalls, the industry feared it would collapse the server market. Instead, new applications proliferated, new businesses were born, and human ingenuity found new expression. She sees AI the same way.
“Humans are going to have a renaissance in terms of what they can do based on AI innovation,” she said, “and while we re-calibrate on where intelligence is created between humans and machines, human to human interaction becomes even more essential and valued.” And she intends for TechArena to be there to tell those stories.
The entire TechArena team extends our heartfelt congratulations on this well-deserved honor. It is a reflection of every interview conducted, every story pursued, and every voice given space to be heard. We look forward to continuing to build something worthy of this recognition.
To learn more about Allyson Klein and explore her work, visit techarena.ai/innovator/allyson-klein.

In higher education, information technology infrastructure often operates behind the scenes, quietly enabling learning without drawing attention to itself. For Rose-Hulman Institute of Technology, that philosophy recently drove a significant infrastructure transformation. The goal was straightforward: remove barriers so faculty and students can focus on research, teaching, and learning rather than wrestling with technology limitations.
During my recent TechArena Data Insights episode with Solidigm’s Jeniece Wnorowski and Justin Baker, systems administrator lead at Rose-Hulman, Justin shared how the institution modernized their infrastructure. The results demonstrate how strategic infrastructure investments can dramatically improve operational efficiency while directly supporting educational outcomes.
Before its latest upgrade, Rose-Hulman’s previous infrastructure challenged system administrators in a variety of ways. Older, disparate systems pieced together over time slowed every maintenance task, from bringing systems back up after an outage to meeting the demand to roll out new software.
For a small IT team managing everything from student information systems to enterprise resource planning platforms and Microsoft 365 administration, these delays were a serious hindrance. The team needed infrastructure that would let them respond rapidly to emerging needs rather than constantly fighting the limitations of aging hardware.
“Upgrading made the most sense in terms of being able to get that speed and that ease of use…and making fewer points of failure,” Justin explained.
Rose-Hulman’s decision to upgrade by partnering with DataON and incorporating Solidigm solid-state drives (SSDs) as the storage foundation centered on technical compatibility. As a Microsoft shop running primarily Windows servers, Rose-Hulman saw DataON’s close collaboration with Microsoft as a perfect fit. In addition, DataON’s hardware expertise ensured the new infrastructure would support Rose-Hulman’s critical administrative and educational systems.
The performance improvements following the infrastructure upgrade were substantial. Scheduled maintenance windows that previously consumed six to eight hours are now completed in under three hours. Server deployment timelines have been compressed from up to two hours to 10 to 15 minutes. The team no longer needs to wait for after-hours time blocks to do maintenance or fine-tuning, and has more time to address critical institutional systems.
“We’re able to run more with less,” Justin explained. “So we can focus on the types of things that allow us to add reliability or backup or something like that to our environment versus having to front-load most of the infrastructure for it just to run everything.”
Beyond upgrading core infrastructure, Rose-Hulman is exploring how Azure Local paired with Azure Virtual Desktop (AVD) and NVIDIA L4 graphics processing units (GPUs) can transform software delivery for students. The pilot deployment runs demanding engineering applications through virtual desktop infrastructure, eliminating the traditional constraint of needing powerful local hardware.
This approach addresses a longstanding challenge in engineering education: ensuring every student can access resource-intensive applications regardless of the device they own. By centralizing compute resources and delivering applications virtually, Rose-Hulman can provide consistent performance and eliminate student concerns around having the right high-performance device, or needing to make time to get to a lab to complete coursework.
Rose-Hulman’s infrastructure transformation illustrates how strategic technology investments can directly support educational missions in higher education. By partnering with vendors who understand their technology ecosystem and deploying high-performance storage solutions, the institution is achieving measurable operational improvements that cascade into better student experiences. For educational institutions managing tight budgets and small IT teams, efficiency gains translate directly into capacity for innovation and improved service delivery.
As Rose-Hulman continues expanding their Azure Local deployment and virtual desktop capabilities, they’re positioned to offer students greater flexibility and access while maintaining the high-performance infrastructure that engineering education demands. This balance between operational efficiency and educational excellence reflects the thoughtful approach required when infrastructure decisions directly impact student success. Learn more about Rose-Hulman Institute of Technology at www.rose-hulman.edu.

Cloud security conversations have matured. We talk about identity, Zero Trust, workload isolation, posture management. But one layer still gets treated as background configuration: network architecture. And that’s where quiet failures begin.
Many cloud security issues don’t stem from advanced exploits. They stem from routing assumptions, Network Address Translation (NAT) shortcuts, Classless Inter-Domain Routing (CIDR) reuse, and peering decisions that were never revisited as the environment grew.
Cloud networking is easy to deploy. That does not make it easy to design correctly.
In cloud environments, routing tables determine more than reachability. They determine inspection paths. If traffic does not pass through a firewall, it is not inspected, regardless of how strong that firewall is.
Architecturally, this means inspection is a property of routing, not of firewall capability. A useful design question is simple:
Can any workload reach sensitive resources without crossing an inspection boundary?
If the answer is yes, the network design needs refinement.
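This design question can be answered mechanically. The sketch below is a simplified illustration, not any cloud provider’s actual API: routes are modeled as (destination CIDR, next hop) pairs, and the firewall next-hop name is an assumed placeholder. It encodes the rule above: traffic not routed through the inspection appliance is never inspected.

```python
import ipaddress

# Hypothetical next-hop name for the inspection appliance.
FIREWALL_HOP = "fw-appliance"

def uninspected_paths(route_table, sensitive_cidrs):
    """Return routes that can reach sensitive address space
    without crossing the inspection boundary."""
    findings = []
    for dest, next_hop in route_table:
        dest_net = ipaddress.ip_network(dest)
        for sensitive in sensitive_cidrs:
            if dest_net.overlaps(ipaddress.ip_network(sensitive)) and next_hop != FIREWALL_HOP:
                findings.append((dest, next_hop))
    return findings

routes = [
    ("10.0.0.0/16", "fw-appliance"),  # inspected path
    ("10.1.0.0/16", "vpc-peering"),   # direct peering, bypasses inspection
]
print(uninspected_paths(routes, ["10.1.2.0/24"]))
# flags the peering route that reaches sensitive space uninspected
```

A real audit would pull route tables from the provider’s API, but the decision logic is the same: for every path to sensitive resources, verify the next hop is an inspection point.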
NAT design affects attribution, monitoring, and policy enforcement.
When architecting egress, the design should align with your security assumptions: if your security model assumes consistent source identity, your NAT model must support it.
Otherwise, policy becomes guesswork.
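The attribution problem is easy to demonstrate with a toy model. The workload names and IP addresses below are hypothetical; the point is structural: when many workloads share one NAT egress IP, a remote firewall can no longer map traffic back to a single workload.

```python
# Hypothetical workloads sharing one NAT gateway collapse to a single
# source IP on the far side, so per-workload egress policy becomes guesswork.
shared_nat = {
    "billing-svc":  "203.0.113.10",
    "batch-worker": "203.0.113.10",  # same egress IP as billing
    "analytics":    "203.0.113.10",
}
dedicated_nat = {
    "billing-svc":  "203.0.113.10",
    "batch-worker": "203.0.113.11",
    "analytics":    "203.0.113.12",
}

def attributable(nat_map):
    """True if a remote observer can map each source IP back to exactly one workload."""
    ips = list(nat_map.values())
    return len(ips) == len(set(ips))

print(attributable(shared_nat))     # False: three workloads, one IP
print(attributable(dedicated_nat))  # True: source identity preserved
```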
IP address allocation is often treated as an early-stage task, but it defines long-term flexibility.
Intentional CIDR planning pays off over the life of the environment.
When address space overlaps or becomes fragmented, segmentation logic becomes complex. Complexity increases error rates.
Segmentation clarity starts with clean IP design.
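Overlap detection is one of the few parts of CIDR planning that can be fully automated, and Python’s standard `ipaddress` module is enough to do it. The sketch below assumes a flat name-to-CIDR registry; the allocation names are illustrative placeholders for whatever IPAM records an organization actually keeps.

```python
import ipaddress
from itertools import combinations

def find_overlaps(allocations):
    """Return pairs of named CIDR blocks that overlap.

    `allocations` maps a name to a CIDR string -- a stand-in
    for a real IP address management (IPAM) registry.
    """
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in allocations.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]

allocations = {
    "prod-vpc":    "10.0.0.0/16",
    "staging-vpc": "10.1.0.0/16",
    "partner-vpn": "10.0.128.0/20",  # collides with prod-vpc
}
print(find_overlaps(allocations))
# [('prod-vpc', 'partner-vpn')]
```

Running a check like this before every new allocation is far cheaper than untangling overlapping address space after workloads depend on it.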
Centralized connectivity models such as transit gateways, hub-and-spoke topologies, and virtual Wide Area Networks (WANs) are powerful.
They also centralize the blast radius of an attack.
Architecturally, connectivity should be intentional and constrained. Flatness in the cloud rarely happens by design. It happens by accumulation.
The ultimate test of network architecture is containment.
If a workload is compromised, how far can it reach? Network design is not just about uptime. It defines how far compromise can spread, and that is a security decision.
Strong cloud network design is rarely accidental. It is intentional. Cloud platforms abstract hardware, not responsibility. The network remains one of the few layers that can enforce unavoidable boundaries. When it is designed casually, security becomes fragile. When it is designed deliberately, it becomes a containment mechanism.
Cloud network architecture is not just foundational. It is decisive.

In a move that sent ripples through the burgeoning AI ecosystem, cloud computing giant Nebius announced its acquisition of Tavily, an Israeli startup making waves with its “agentic search” technology.
While official figures remain under wraps, reports peg the all-cash deal at an estimated $275 million, potentially climbing to $400 million with performance incentives. This isn't just another tech acquisition; it's a strategic chess move that could fundamentally reshape how AI agents are built, deployed, and scaled.
Tavily, founded in late 2024, has been a darling of the developer community, racking up over 3 million monthly SDK downloads and attracting a million-strong user base in record time. Their tech, specializing in real-time web retrieval for AI agents, addresses a critical pain point: hallucinations and outdated information that plague even the most advanced large language models (LLMs). With early funding from heavy hitters like Insight Partners and Alpha Wave Global, Tavily’s rapid, high-value exit underscores the intense demand for solutions that can ground AI in reality.
The combined entity aims to offer a full-stack solution for developers looking to build sophisticated AI agents. Imagine an AI that not only reasons effectively but can also instantaneously access and synthesize the latest information from the web. This integrated approach promises to streamline development, reduce latency, and, crucially, enhance the reliability of AI agents across various applications, from enterprise automation to customer service and beyond.
The market certainly seems to be listening. Nebius pointed to analyst projections that forecast the agentic AI market to explode from $7 billion in 2025 to a staggering $200 billion by 2034. This isn’t just growth; it’s a gold rush, and Nebius just staked a significant claim. Tavily’s continued operation under its own brand and the retention of its 30-person team, including CEO Rotem Weiss, suggests a smart integration strategy, preserving the innovative spirit that made Tavily so attractive in the first place.
This isn’t merely a strategic acquisition for Nebius; it’s a declarative statement. For too long, the narrative in AI cloud has been dominated by the hyperscalers – AWS, Google Cloud, Azure – with their vast, vertically integrated empires. Nebius, often seen as a formidable player in high-performance compute, has made a bold play to differentiate itself by becoming the go-to platform for autonomous AI agent development.
The integration of Tavily's agentic search is a stroke of genius because it tackles the “black box” problem of AI head-on. By providing real-time, verifiable data, Nebius is directly addressing the trust deficit that has plagued AI adoption. This move positions them as a champion of “grounded AI,” a concept that will only grow in importance as AI agents take on more critical roles in our lives and businesses.
Nebius isn’t just buying a company; they’re buying a crucial piece of the future. By offering a complete agentic stack, they’re competing on capability and, more importantly, trust. This acquisition is a clear signal that the AI agent arms race is heating up, and Nebius just fired a warning shot across the bows of every major cloud provider. Keep a close eye on this space; the game just changed.

The rise of AI is exposing a widening gap between what modern data centers were designed to do and what AI workloads now demand. Boards and executive teams expect faster time-to-value from AI investments. Quietly, the infrastructure has become the bottleneck.
At AI Infrastructure Field Day 4 (AIIFD4), the Cisco Data Center Networking team addressed this gap head-on. Cisco made it clear they are not walking away from Ethernet. Instead, they are rethinking what Ethernet needs to become to reliably support the unique demands of AI workloads.
AI workloads behave very differently from traditional enterprise applications. Training and large-scale inference generate long-lived, east-west, GPU-to-GPU flows that are extremely sensitive to latency, jitter, and packet loss. Even minor congestion can cascade into stalled jobs, underutilized GPUs, and missed business deadlines.
During the session, a critical business consequence became obvious: time-to-first-token (TTFT) now matters as much as raw performance. Delays caused by network misconfiguration, troubleshooting blind spots, or prolonged deployment cycles directly erode the return on multimillion-dollar GPU investments. In many cases, organizations lose months of effective depreciation time before AI clusters deliver meaningful value.
In other words, long TTFT means expensive GPUs sit idle while teams troubleshoot the network.
This is where the gap emerges. Traditional Ethernet is optimized for best-effort, north-south traffic. It was never designed for sustained, lossless, ultra-dense GPU communication. At the same time, many enterprises lack the operational appetite to introduce entirely separate fabrics just to support AI.
Surprisingly, one theme that came through clearly was that plain Ethernet is not enough for modern AI clusters.
Standard Ethernet assumes packet loss is acceptable and recoverable. AI training does not. When one GPU waits on another due to congestion or dropped packets, the entire job slows down. No amount of compute spend can compensate for unpredictable network behavior.
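The cost of one congested link is easy to see in a toy model. In a synchronized training step (for example, a collective operation like an allreduce), every GPU waits for the slowest path, so step time is the maximum of the per-link times, not the mean. The numbers below are illustrative assumptions, not measurements:

```python
# Toy model of a synchronized training step: every GPU waits for the
# slowest communication path, so step time is max(), not mean().
# All timings are illustrative, not measured values.

def step_time_ms(link_times_ms):
    """Step completes only when the slowest link finishes."""
    return max(link_times_ms)

healthy = [10.0] * 64              # 64 links, all nominal
congested = [10.0] * 63 + [50.0]   # a single congested link

print(step_time_ms(healthy))    # 10.0 ms per step
print(step_time_ms(congested))  # 50.0 ms: one bad link slows the whole job 5x
```

This is why tail behavior, not average throughput, governs AI fabric design: a single outlier link degrades every GPU in the job.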
Beyond performance, there is an operational issue. AI environments introduce unprecedented complexity across compute, storage, optics, and networking. Without deep visibility, network teams are often blamed first. But they usually don’t have the telemetry needed to prove where problems actually originate.
The scale of the challenge becomes clear when you look at the complexity of even a “small” 96-GPU network topology.

This is an executive-level risk. AI failure modes are no longer isolated to IT; they impact product timelines, research velocity, and competitive advantage.
InfiniBand has long been the gold standard for HPC and AI training. It delivers native losslessness and extremely low latency, and it performs exceptionally well in controlled environments.
However, Cisco drew a clear contrast at AIIFD4. While InfiniBand works technically, it introduces business and operational challenges for enterprises:
It creates a separate fabric with specialized tooling and skills.
It limits multitenancy and segmentation, which are essential for shared enterprise AI platforms.
It offers limited end-to-end observability, particularly outside the fabric itself.
It complicates convergence with storage and front-end networks.
InfiniBand excels as a purpose-built backend fabric. But most enterprises aren’t building isolated AI factories. They are trying to operationalize AI alongside everything else.
Cisco’s AIIFD4 appearance was not about replacing Ethernet; it was about evolving it.
Their approach combines Ethernet’s universality with AI-specific enhancements that deliver predictability and control. This transforms Ethernet from a best-effort transport into a deterministic system fabric, capable of supporting AI training and inference without introducing separate operational silos.

One of the most important themes from Cisco’s sessions was that security in AI data centers is about insight and control. It can’t be just about isolation.
Cisco’s AI-optimized Ethernet emphasizes:
Logical segmentation using EVPN-VXLAN, enabling strong multitenant isolation
Secure, TLS-based control plane communication in cloud-managed environments like Nexus Hyperfabric
Proactive detection of physical layer issues, such as optic degradation, before they impact workloads
Job-level analytics that tie performance anomalies directly to infrastructure causes
The common thread is control: seeing problems early, understanding their impact, and fixing them before GPUs go idle.
This level of visibility simply does not exist in traditional InfiniBand environments. Cisco’s argument is that what you can see, you can secure, and what you cannot see becomes a business risk.
Cisco’s appearance at AIIFD4 reframed the Ethernet versus InfiniBand debate as a business decision, not just a technical one.
For hyperscalers building single-purpose AI factories, InfiniBand may remain the right choice. But for enterprises building multiple AI clusters, often incrementally, across teams and use cases, Cisco’s AI-optimized Ethernet offers a compelling alternative: one fabric, one operating model, and one security posture.
The takeaway for executives is simple: the question is no longer whether Ethernet can support AI. The question is whether your Ethernet is engineered for determinism, visibility, and AI-scale operations.
Cisco’s answer at AIIFD4 was clear. Enterprises don’t need a second fabric to keep up with AI. They need Ethernet that has been deliberately engineered for determinism, visibility, and scale.
Q: What is AI Ethernet?
A: AI Ethernet is Ethernet that has been deliberately engineered for AI workloads, with deterministic performance, lossless behavior, and end-to-end observability to support large GPU clusters at scale.
Q: Why isn’t standard Ethernet sufficient for AI workloads?
A: Standard Ethernet assumes packet loss is acceptable. AI training workloads are tightly synchronized, so even small amounts of loss or congestion can stall jobs and leave expensive GPUs underutilized.
Q: How does deterministic networking improve AI performance?
A: Deterministic networking delivers predictable latency and controlled congestion, which leads to faster job completion, higher GPU utilization, and more reliable AI production timelines.
Q: When does InfiniBand make sense for AI?
A: InfiniBand can be a good fit for hyperscalers or single-purpose AI factories. Enterprises running shared, multitenant AI platforms often find its operational complexity and lack of convergence limiting.
Q: Why is observability critical for enterprise AI networking?
A: AI environments span GPUs, NICs, switches, and optics, making issues hard to diagnose without end-to-end visibility. Observability enables faster root cause analysis and reduces the risk of idle GPUs and lost value.
Q: Is AI Ethernet only about performance?
A: No. AI Ethernet also addresses operational simplicity, security, and risk by combining visibility, segmentation, and policy-driven control as AI platforms scale.

It’s fair to ask whether AI in 2026 is a bubble. The echoes of the early 2000s are real: valuations running ahead of revenues, plenty of compelling tech, and plenty of fuzzy business models. We’ve seen this movie before.
But here’s what feels different this time. We’ve now seen AI deliver real, tangible value, from agentic systems like self-driving cars to generative models like ChatGPT, Gemini, and Claude. New workflows are already reshaping engineering and productivity. The value is real, even if the business models are still forming. What’s no longer speculative is what AI demands in practice: massive compute, running continuously, coordinated across thousands, and soon millions, of processing elements.
And where compute goes, networking must follow.
Training isn’t just about FLOPS; it’s about keeping GPUs fed and synchronized—moving data between accelerators, memory tiers, and storage with tight timing. Inference at scale isn’t “lightweight” either. Agentic systems add constant coordination, state exchange, and feedback loops. This is persistent, symmetric traffic, less like consumer internet burstiness, more like an industrial control system that hates latency and variance.
So, while the top of the stack is still sorting itself out, the bottom of the stack is converging. Those infrastructure requirements are driving real decisions: AI-first data centers, power secured years out, liquid cooling systems designed in from day one, and campuses planned as a single distributed computer.
In the dot-com era, Alan Greenspan famously cautioned against “irrational exuberance.” What’s unfolding now feels more deliberate and methodical, albeit no less exuberant. It manifests not in pitch decks, but in data centers, power contracts, and miles of fiber.
Early in any technology cycle, progress is driven by ideas. Better algorithms. Smarter software. More elegant abstractions. Over time, however, the limiting factor shifts from what we can imagine to what we can physically deploy.
That shift is now unmistakable in AI.
Regardless of which hyperscaler wins, which model architecture dominates, or which application becomes the killer use case, the requirements inside the data center are converging quickly. AI systems must be dramatically faster, far denser, and far more tightly coupled than anything the industry has operated before—not just larger clusters, but clusters that behave as a single, synchronized system.
For years, optics and networking evolved as predictable plumbing. Bandwidth increased incrementally. Power budgets were manageable. Traffic patterns were relatively well behaved. That trajectory worked for cloud computing and the consumer internet.
AI introduces a discontinuity.
When that linear roadmap is mapped against the demands of large-scale training, generative inference, and agentic workloads, the gap becomes obvious. East–west traffic explodes. Latency consistency matters as much as raw throughput. GPUs grow intolerant of waiting. At scale, the cost, and energy, of moving data begins to rival the cost of computing on it.
This is how industries respond to step changes: they build the substrate first.
Hyperscalers and vendors are investing ahead of certainty—not betting on a single application or winner, but on the belief that AI will require fundamentally different physical systems. In doing so, they are running into a new reality: scaling AI is no longer gated by software ambition alone. It is increasingly constrained by three intertwined limits—speed, thermals, and power delivery.
Those constraints now define the AI infrastructure roadmap.
As AI systems scale, the industry is no longer debating abstract limits. It is colliding with three very concrete ones. They arrive together, reinforce each other, and cannot be solved independently.
These are the three walls now shaping AI infrastructure: speed, thermal envelope, and power delivery.
AI workloads demand orders of magnitude more data movement than previous generations of compute. Training large models requires constant synchronization across thousands of accelerators, while emerging agentic systems add persistent coordination and state exchange across distributed components.
To meet that demand, signaling speeds have been pushed relentlessly higher — and this is where physics intrudes.
At the frequencies required for modern AI interconnects, copper becomes a fundamental constraint. Signal integrity degrades rapidly with distance. Loss rises. Reach collapses dramatically from meters to centimeters. At scale, this creates a hard architectural ceiling.
This is not simply a matter of faster PHYs. As AI clusters expand beyond a single rack or building into “scale-across” systems, bandwidth and latency become inseparable. Propagation delay matters as much as throughput, and copper simply cannot preserve both over distance.
Optics relaxes this constraint by delivering far higher bandwidth while maintaining reach and latency as systems scale across racks, buildings, and campuses.
Even where copper can deliver sufficient speed, it increasingly fails on heat.
As electrical signaling rates rise, resistive losses convert a growing share of energy directly into heat. In high-density AI racks, this creates a feedback loop: higher speed drives more heat, which demands more cooling, which consumes more power and constrains further scaling.
This is why liquid cooling has moved from an optimization to a requirement in modern AI infrastructure. At rack densities well beyond 100 kW, thermals increasingly shift from an operational concern to an architectural one.
Optics changes this equation by reducing resistive loss at the source. Moving data as light — and shortening or eliminating electrical paths through approaches like co-packaged optics — lowers heat generation and expands the thermal envelope available for compute.
At AI scale, optics isn’t about going faster. It’s about not melting the system while doing so.
The final wall is the most unforgiving: power delivery.
In practice, many data centers are now constrained less by space or fiber availability than by access to electricity itself. New facilities are increasingly sited where power is available, near hydroelectric, nuclear, or renewable sources rather than where latency is most convenient.
In the cloud era, we measured success in Gigabits per second. In the Agentic era, one of the defining metrics increasingly becomes Joules per Inference. We are moving from a performance-constrained world to an energy-constrained one. Power must be budgeted hierarchically: per server, per rack, per row, per facility. One of the largest and fastest-growing consumers of that power is data movement, particularly the repeated conversion between electrical and optical domains.
The math is sobering. At scale, the energy spent moving bits can rival the energy spent computing on them.
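A back-of-envelope sketch makes the comparison concrete. All of the constants below are assumptions chosen only to show the shape of the math; published estimates for link and compute energy vary widely and are not vendor data:

```python
# Assumed energy figures (placeholders, not vendor data):
PJ_PER_BIT_ELECTRICAL = 5.0  # pJ/bit for an electrical SerDes path
PJ_PER_BIT_OPTICAL    = 1.0  # pJ/bit for an optical path
PJ_PER_FLOP           = 0.5  # pJ per FLOP of useful compute

tensor_bytes = 1e9  # move 1 GB of activations/gradients...
flops        = 8e9  # ...and spend 8 GFLOPs computing on it

# Energy in joules: bits * pJ/bit * 1e-12 (pJ -> J)
move_electrical_j = tensor_bytes * 8 * PJ_PER_BIT_ELECTRICAL * 1e-12
move_optical_j    = tensor_bytes * 8 * PJ_PER_BIT_OPTICAL * 1e-12
compute_j         = flops * PJ_PER_FLOP * 1e-12

print(f"move (electrical): {move_electrical_j:.3f} J")
print(f"move (optical):    {move_optical_j:.3f} J")
print(f"compute:           {compute_j:.3f} J")
```

Under these assumed figures, moving the data electrically costs several times the energy of computing on it, and switching the link to optics claws most of that back, which is the qualitative point: at scale, transport energy competes directly with compute energy for the same power budget.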
Optics is central here not just because it is efficient, but because it enables efficiency everywhere. By doing more in light and less in copper — and by pushing optical interfaces closer to compute — operators can reduce energy per bit, per port, and per rack, freeing scarce power for actual computation.
This is what allows power-constrained data centers to continue scaling, and what makes it feasible to couple multiple facilities into much larger virtual systems.
These three walls are tightly coupled. Solving one in isolation makes the others worse. Faster electrical signaling increases heat. More cooling increases power draw. Greater power demand stresses both facilities and the grid, capping further scale.
This coupling is what makes AI infrastructure different from previous compute cycles.
Optics is unique because it relaxes all three constraints simultaneously. It delivers the bandwidth and reach required for scale-across architectures, reduces thermal load by minimizing resistive loss, and lowers energy consumption per bit, freeing scarce power for computation rather than transport.
That combination is why optics has moved from predictable plumbing to a first-order architectural consideration. Across components, systems, and emerging approaches like optical switching and co-packaged optics, the industry is increasingly using light to break limits that electrons can no longer navigate efficiently.
This shift applies not only to new builds. Existing data centers are being retrofitted to accommodate AI workloads, driving additional optical demand as legacy, copper-heavy designs are reworked to survive higher speeds, tighter thermal envelopes, and stricter power budgets.
Optics doesn’t eliminate tradeoffs, but at AI scale, it expands the feasible design space in ways no other approach can.
We still don’t know which applications will dominate, which business models will endure, or which hyperscalers will capture the most value. Those questions remain open.
But one thing is no longer in doubt.
Whatever form AI ultimately takes, it will require a fundamentally new physical substrate — one that is faster, more deterministic, and dramatically more power-efficient than what came before. That substrate is being built now, and it is being driven by optics.
This is not speculation. It is infrastructure.
And infrastructure, once committed to at this scale, has a way of shaping the future regardless of who wins the race at the top of the stack.
History offers a useful parallel. After World War II, the United States embarked on an enormous infrastructure project: the interstate highway system. It was built without knowing exactly where people would live, which cities would boom, or which industries would dominate. It was built on a conviction that mobility would matter, and that the country would be better off prepared for wherever it led.
The AI infrastructure build-out has the same shape.
Data centers, power delivery, cooling systems, and optical interconnects are being constructed not because the industry has perfect clarity on applications or economics, but because it has conviction that AI will be foundational. Once that conviction takes hold, infrastructure becomes destiny.
This is why this moment feels different from past bubbles. Software cycles can inflate and deflate. Markets can overshoot and correct. But when an industry runs into hard physical limits, the response is not debate. It is construction.
Many AI companies will fail. Some valuations will reset. Entire categories will consolidate or disappear. That is how every major cycle unfolds.
But the infrastructure being built now will not vanish with the noise. Like the highways of the last century, it will outlive the narratives that justified its construction and quietly shape everything that comes next.
After World War II, we paved the country with concrete and asphalt. Today, we are doing it again, this time with photons, lasers, and fiber.
We are building massive highways of light.
The applications will change. The winners will shift. The economics will evolve.
But the highways will remain.

The first sign of global disruption is rarely a system outage. It is a quiet rise in alerts, a spike in phishing volume, or a subtle misuse of valid credentials that looks ordinary until it is not.
During periods of instability, cyber risk does not suddenly appear. It compounds. Conflict acts as a force multiplier by exposing existing weaknesses, straining critical services, and pushing security teams into sustained high-alert mode. Recognizing this dynamic is essential for organizations that want resilience rather than reaction.
Periods of disruption do not create new classes of cyber risk. They reveal gaps that already exist but are often tolerated under normal conditions. Identity systems, access controls, and operational shortcuts become pressure points when speed and availability take priority. Data from IBM Security shows that compromised credentials and misuse of valid accounts remain among the most common initial access vectors in major breaches, and incidents involving valid credentials take longer to detect and cost more to remediate. When organizations rely heavily on cloud services and remote access, these weaknesses become easier to exploit, not harder.
The impact is most visible where failure carries immediate consequences. Energy, healthcare, transportation, and communications systems operate with little tolerance for disruption. Advisories from the Cybersecurity and Infrastructure Security Agency consistently warn that elevated risk environments increase attempted intrusions against critical services. Even short-lived outages or degraded performance can affect safety, continuity, and public confidence. In these environments, the perception of instability often causes as much damage as the technical event itself.
Cyber activity also becomes harder to classify during periods of instability. Analysis from Europol highlights how financially motivated attacks, espionage, and disruptive activity increasingly overlap. For defenders, this ambiguity complicates response decisions, regulatory obligations, and communication strategies. Familiar technical indicators can suddenly carry unfamiliar consequences, forcing teams to operate with incomplete information.
The strain is not limited to systems. Sustained high-alert conditions place continuous pressure on security teams, particularly those responsible for incident response. SOC surveys from the SANS Institute show rising fatigue and burnout across security operations roles. Prolonged stress reduces detection accuracy, slows response times, and increases the likelihood of error. In this context, burnout becomes a measurable security risk rather than a workforce concern.
It is tempting to assume that advanced tooling, automation, and threat intelligence can neutralize these challenges. While technology improves visibility and response speed, it does not eliminate structural weaknesses. Tools cannot replace clear decision-making, effective communication, or well-rested teams. Post-incident reviews repeatedly show that organizations fail not because of missing tools, but because coordination and judgment break down under pressure.
The World Economic Forum continues to rank cyber insecurity among the top global risks precisely because it compounds during uncertainty. Conflict does not pause cybersecurity. It accelerates it. Organizations that invest in identity protection, realistic incident planning, and sustainable operating models are better positioned to absorb prolonged instability.
The question for leaders is no longer whether disruption will occur; it is whether their systems, decisions, and people can sustain pressure when it does.

As organizations deploy AI models at scale, a new set of challenges has emerged around operational efficiency, developer velocity, and infrastructure optimization. A recent conversation with Solidigm’s Jeniece Wnorowski and Brennen Smith, head of engineering at Runpod, revealed how cloud platforms are rethinking the entire AI stack to help developers move from concept to production in minutes rather than months.
Runpod operates 32 data centers globally, providing graphics processing unit (GPU)-dense compute infrastructure for small companies and enterprises building and deploying AI systems. This service is crucial considering the economics of modern GPUs, where a single system with 8 GPUs can cost hundreds of thousands of dollars. Runpod understands that the compute hardware is only part of the equation. “Storage and networking…glue these systems together,” Brennen said. “By ensuring that there’s high quality storage paired up with these GPUs…we have been able to show that this results in a markedly better experience.”
On top of this, the company provides a sophisticated software stack that allows developers to go from their idea to production in minutes, across training and inference use cases. The goal is to “Make it so developers and AI researchers can focus on what they do best, which is actually delivering value to their customers,” Brennen said.
The ability to rely on optimized infrastructure is becoming even more important as organizations move from training to deployment. Smith likened training infrastructure to traditional business capital expenditures, noting that the high up-front costs see a return on investment over a long period of time. In inferencing, organizations deal with ongoing operational realities, grappling with scaling, efficiency, and delivering value to customers daily. As a result, Runpod has engineers specifically looking at inference optimization. With the rise of AI factories, “How well these systems are run from an operational excellence perspective will dictate the winners and losers,” Brennen said. “You run an inefficient factory, you’re out.”
One of the most important insights from our conversation addressed storage, which is now seen as a hidden bottleneck in AI. Brennen recounted how his engineering team recently investigated Docker image loading times. Though not tied to any specific large language model (LLM) activity, slow image loads were flagged by developers as hurting their overall workflow. Friction like this gets in the way of things needing “to magically just work.”
For the solution, Brennen reiterated that storage is what glues the system together. “What we have found is every time, as long as we are optimizing our storage, we are able to make the data move faster,” he said. And when data movement is optimized, entire development cycles accelerate.
Runpod recently launched ModelStore, a feature in public beta that leverages NVMe storage and global distribution to make AI models appear “like magic.” What previously took minutes or hours now happens seamlessly, compressing development iteration cycles. For organizations under pressure to deliver AI capabilities quickly, these time savings compound into significant competitive advantages.
Brennen emphasized that faster developer cycles enable teams to fail fast and iterate more effectively to deliver successful outcomes. When CTOs receive mandates to implement AI, their success depends on giving teams tools that accelerate innovation rather than creating additional friction.
Looking ahead, Brennen identified the convergence of infrastructure and software as a transformative trend. The goal is to enable code to self-declare and automatically establish the infrastructure required to run it, freeing developers from thinking about infrastructure so they can focus on their code and creating value aligned to business logic. “Anything we can do to make it even easier to get global distribution, that’s a hugely powerful paradigm,” he said.
Runpod’s emphasis on developer experience demonstrates that sustainable AI deployment requires thinking holistically about the entire infrastructure stack. The company’s focus on making complex infrastructure feel magical to developers reflects a broader industry recognition that reducing friction accelerates innovation.
As AI moves from experimentation to production deployment, organizations that optimize for developer velocity and operational efficiency will have a significant advantage from their ability to accelerate time to value. For organizations evaluating AI infrastructure partners, Runpod’s approach offers a model that balances performance, scalability, and ease of use.
Connect with Brennen Smith on LinkedIn to continue the conversation, or visit Runpod’s website and active Discord community to explore how their platform might support your AI initiatives.

Humans desperately want to find patterns, meaning in patterns, and to create and connect.
We anthropomorphize constellations, animals, elements, and now AI. From Pygmalion to Frankenstein to the internet and computer games like The Sims, our urge to be a creator as part of human Imago Dei is a thread throughout human history. Going back to Milton’s Paradise Lost, we desperately want to say, “Did I solicit you from darkness to life?” to a creation of ours.
In today’s viral moment of February 2026, we have Claudebot/Moltbot/OpenClaw/? patterned after what no one has ever called a pinnacle of human achievement: Reddit.
Yes, it is consistent that what fascinated humans with Victor Frankenstein continues to fascinate us now. It is clear that humans can’t help but step close to mistaking the technical with labels that evoke transcendence. Of course, it is curious and telling of a deep human need that training LLMs on Reddit would result in a pseudo-religion. The extent to which Moltbot/OpenClaw is a mirror reflecting back ourselves will be a subject of ongoing study, just as much as cybersecurity professionals are studying the security implications.
In all of this, there are some positives that point to directions the next innovator can build on. Like the iPhone, the fundamentals that came together in a novel, breakthrough approach weren’t necessarily new. Crucially, AI breakthroughs aren’t about models and model benchmarks anymore; we have shifted to applications and services that neutralize model identity. A few trends that were improved or extended include:
Messaging Apps: Internally at my “day job” company, individuals were building assistants/agents that they had to schedule a Teams call with to continue training their assistant like a junior employee. For SaaS/FAANG, Slack and WhatsApp have been the natural communications channels. Moltbot/OpenClaw messages you back proactively, extending other chatbots that require a “check back” from the user.
Personalization and Memory: Most chatbots have improved saving state over the last few years. Even free versions can hold a conversation history so you don’t experience 50 First Dates with every new chat. Private GPTs and avatar chatbots trained on years of an individual’s writing have been around for almost two years. Thanks to how the internet and remote work have conditioned us, those interactions were starting to feel like we were collaborating with a team member rather than a program. Tying into point #1, if the channel is the same for a human team and an AI agent, who or what is on the other side can start to matter less than the task that is being completed. It can even feel like you’re really connecting because Moltbot remembers you.
A Cruise Director for Your Life: Years ago, a woman I was in a leadership cohort with caused the entire room to burst into laughter because she said, “I need a work wife!” There is a reason a faithful and patient personal assistant is a constant sidekick in movies about rock stars and the rich. Someone who knows you and who proactively directs you on what to focus on, where to go, and even arranging your day will make “adulting” easier for us all. Personalized assistants are now democratized.
There are also some downsides. As a certified AIGP (AI Governance Professional) who has worked in tech for years, I have seen that this technology has been unruly even for its own creators. A technology that is powerful only when given full system access can be just as powerful against you and your system.
Vulnerability: LLMs still fall prey to prompt injection, data poisoning, and model drift. They are probabilistic rather than deterministic, and they can’t always distinguish a legitimate prompt from a prompt hidden in what should be benign information fields. Set limits up front, and mandate that on specific tasks the agent checks back with you before taking any action outside the guardrails you defined in advance.
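That “check back” guardrail can be sketched in a few lines. This is a hypothetical wrapper, not the API of any real agent framework: tool names, the allowlist, and the callbacks are all illustrative. Tools on a pre-approved allowlist run freely; anything else pauses and asks a human first.

```python
# Hypothetical guardrail wrapper (illustrative names, not a real agent API):
# pre-approved tools run unattended; everything else needs human sign-off.
ALLOWED = {"search_docs", "summarize", "draft_reply"}

def guarded_call(tool_name, run_tool, ask_human, **kwargs):
    """Run an agent tool only if allowlisted or explicitly confirmed."""
    if tool_name not in ALLOWED:
        # Not pre-approved: escalate to a human before doing anything.
        if not ask_human(f"Agent wants to run '{tool_name}' with {kwargs}. Allow?"):
            return {"status": "blocked", "tool": tool_name}
    return {"status": "ok", "result": run_tool(**kwargs)}

# A prompt-injected 'delete_files' request never runs unattended:
result = guarded_call("delete_files",
                      run_tool=lambda **kw: None,
                      ask_human=lambda msg: False,   # simulate a human saying no
                      path="/tmp/out")
assert result["status"] == "blocked"
```

The point is architectural, not the specific code: the confirmation gate lives outside the model, so a poisoned prompt cannot talk its way past it.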
Security: There was a joke years ago about how Gen X were raised with the fear of sharing personal information with strangers online or getting into a stranger’s car, and the following generations went on to pioneer social media and Uber. Today, cybersecurity pros are rightly raising alarms about Moltbot. For now, you have to be prepared to secure your system, your data, and your API keys and tokens, setting limits and mandating agent behaviors on specific tasks. Over time, agentic security controls and governance will catch up and become more off-the-shelf for average users. Until then, assume a defensive posture, as if you were riding a motorcycle without a helmet.
You Own It: More than anything, open-source agentic AI means you have to have agency yourself. It sounds great to be your own billion-dollar, one-person company. It sounds amazing to have your own personal assistant. But the quality of your ideas, your ability to reach farther, and your ability to refine faster with a critical eye will determine your success. Your technical ability to expand and secure your setup is something you own for yourself.

When I think back to the last OCP Global Summit 2025, one of the most memorable sights on the show floor wasn’t a chip or a server tray. It was the racks.
Meta’s Open Rack Wide (ORW) specification introduced a double-width form factor that looked, at first glance, almost counterintuitive, especially in an industry moving toward disaggregation.
But ORW is a useful clue about where AI infrastructure actually is right now. We may be headed toward disaggregated systems, but today’s highest-performance AI deployments are still heavily constrained by short-reach, high-lane-count copper connections, plus the physical sprawl of power delivery, networking, and cooling that modern platforms demand. In other words, the rack is increasingly behaving less like furniture and more like the computer.
The Open Rack specification has been a cornerstone of hyperscale data center design for years. Unlike traditional 19-inch racks, Open Rack was designed from the ground up for large-scale cloud and AI deployments. Its signature 21-inch width improves airflow, and its powered busbar simplifies power delivery while reducing cable clutter.
Over time, Open Rack evolved to meet the growing demands of AI and high-performance computing. The original ORV1 specification introduced a 12V busbar, ORV2 improved scalability and cooling, and ORV3 moved to 48V—enabling higher power density and making liquid cooling easier to integrate (via rear-mounted manifolds). Then came ORV3 HPR (High Power Rack), which pushed further with added depth and more robust power management to support the most demanding AI servers while maintaining compatibility with the ORV3 standard.
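The move from 12 V to 48 V is not cosmetic. For a fixed rack power, busbar current scales as 1/V and resistive loss as I²R, so quadrupling the voltage cuts current four-fold and busbar heat sixteen-fold. A back-of-envelope check (the rack power and resistance figures here are assumptions for illustration, not values from any OCP specification):

```python
# Illustrative busbar math: same rack power at 12 V vs 48 V.
# R_BUSBAR is an assumed distribution resistance, not a spec value.
R_BUSBAR = 0.0005  # ohms

def busbar(power_w, volts):
    """Return (current in amps, resistive loss in watts) for a given feed."""
    amps = power_w / volts
    return amps, amps ** 2 * R_BUSBAR  # loss = I^2 * R

for v in (12, 48):
    i, loss = busbar(30_000, v)  # a hypothetical 30 kW rack
    print(f"{v:>2} V: {i:7.1f} A, {loss:8.1f} W lost in the busbar")
```

With the assumed numbers, the 12 V feed carries 2,500 A and burns kilowatts in the busbar, while 48 V carries 625 A and loses a sixteenth as much, which is why higher-voltage distribution unlocks the power densities modern AI racks demand.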
For a while, ORV3 HPR seemed like the pinnacle of rack design. But as AI workloads continued to push the limits of power and cooling, even HPR began to show its constraints.
The industry is undeniably moving toward disaggregation—separating IT load, power, and cooling into distinct systems. Draft specifications and roadmaps for disaggregated power architectures targeting 100 kW today and up to 1 MW-class racks over time are already being shared through the OCP community, so a wider rack design might seem like a step backward. However, before we can fully embrace disaggregation at rack scale, we need to overcome the limitations of copper-based electrical connections. The sheer number of electrical and signaling leads—plus distance, loss, and power constraints—required to connect rack systems at scale presents significant challenges. Until those challenges are resolved, many AI deployments favor a “scale-up” architecture over a “scale-out” approach.
There’s another factor at play: the physical layout of compute systems is expanding. As GPU die sizes grow, so do the memory, networking, and power delivery systems that support them. In short, while we know disaggregated systems are the future, we still need an intermediate solution to bridge the gap. That’s where Open Rack Wide (ORW) comes in.
ORW scales up the HPR’s feature set to accommodate much larger, heavier, and more power-intensive AI systems. With double the width of ORV3 racks and a slightly taller frame, ORW provides the space and structural integrity needed for next-generation AI platforms.
ORW isn’t just a bigger rack—it’s a reimagined platform designed for the unique demands of AI. At 1200 mm wide (compared to ORV3’s 600 mm), ORW offers significantly more real estate for high-density compute trays, liquid cooling manifolds, and power distribution systems. It supports up to 3,500 kg of IT gear (more than double the capacity of ORV3 HPR) and is engineered to handle the thermal and electrical loads of modern AI workloads. (Fun fact: ORW is also affectionately known as “BFR”—Big Freaking Racks.)
One of the most compelling aspects of ORW is its flexibility. The specification supports multiple power architecture options, including legacy ORV3 power shelves, side power racks for low- or high-voltage DC input, and even native high-voltage busbars that distribute power directly within the rack. This adaptability ensures that ORW can evolve alongside AI infrastructure, whether for training clusters, inference workloads, or hybrid deployments.
Liquid cooling is another key feature. ORW’s design accommodates high-power liquid-cooled busbars, which are essential for managing the heat that the power delivery for today’s AI chips generates on the busbar itself. This focus on cooling efficiency aligns with the industry’s push toward sustainable, high-performance data centers.
ORW isn’t just a Meta project—it’s an open standard developed in collaboration with industry leaders. The base specification for ORW was announced by Meta at the OCP Global Summit 2025, and it quickly gained traction. Companies like AMD, Wiwynn, and Rittal debuted their own ORW-based designs at the summit, showcasing the specification’s potential. AMD’s "Helios" rack-scale reference system, for example, leverages ORW to deliver optimized performance for AI clusters, while Wiwynn unveiled its double-wide rack architecture for next-generation AI workloads. Rittal, meanwhile, is preparing ORW-compatible enclosures and accessories for mass production later in 2026. This collective effort underscores the importance of open standards in shaping the future of AI infrastructure.
It’s worth noting that not everyone is on board. NVIDIA, for instance, is advancing vertically integrated rack-scale systems and architectures that don’t necessarily map cleanly to ORW. But for those committed to open standards, ORW offers a compelling path forward. The AMD design exemplifies this as it integrates GPU, CPU and networking into a single, cohesive rack system for large-scale AI and High-Performance Computing (HPC) workloads.
Developing ORW wasn’t without its challenges. The increased size and weight of the rack required new manufacturing approaches, including automation and bolt-together assembly techniques to simplify production and shipping. Testing presented another hurdle: traditional test equipment couldn’t handle ORW’s 3500 kg payload, forcing the team to partner with automotive and aerospace testing facilities to validate the design.
Standardization is also critical. For ORW to succeed, the OCP community must continue to refine the specification and ensure interoperability across vendors. This collaborative approach is what makes open standards like ORW so powerful—they bring together hyperscalers, vendors, and researchers to solve shared challenges.
ORW represents a foundational shift in data center design. It addresses today’s power, cooling, and space constraints while laying the groundwork for future advancements. As the industry works toward full disaggregation, ORW provides a scalable, open platform that can evolve with the needs of AI workloads.
By providing a bridge to the future, ORW enables the industry to innovate today while preparing for the next wave of data center evolution.

Fermilab’s Silvia Zorzetti explains how quantum computing and sensing are evolving, where they outperform classical systems, and what’s next for the field.

For the last several years, the media industry has framed its future as a codec war: free versus licensed, open versus proprietary, AV1 versus HEVC and its successors. On the surface, the debate feels rational. Compression efficiency has always mattered, and it still does. Without it, global streaming at scale would not exist.
But the codec fixation has become a distraction.
The market is no longer defined by how efficiently bits are compressed in isolation. It is being reshaped by whether entire systems can guarantee experience behavior end-to-end. By “system,” I mean the full chain: encoding, transport, wireless edge, client buffering/playout, and the control loops that coordinate them. Consumers do not churn because of subtle compression artifacts; they churn because experiences fail—buffering during a live touchdown, audio drifting out of sync, latency breaking immersion. These failures are not codec failures. They are system failures.
Efficient bits cannot compensate for fragile delivery.
For two decades, the industry optimized the payload. Engineers worked relentlessly to represent more information per bit while preserving perceptual quality and creator intent. The results were extraordinary: lower bitrates, higher fidelity, and an explosion of global video delivery.
That work succeeded because the environment allowed it to succeed: media consumption was largely passive, buffers could mask uncertainty, and users tolerated occasional degradation when networks misbehaved.
That environment no longer exists.
In the agentic AI era, media consumption is no longer passive. It is increasingly mission-critical, and failures are no longer cosmetic—they can be catastrophic. Experiences now span real-time interaction, immersion, and safety-adjacent workloads where timing and continuity are non-negotiable.
Today’s dominant failure modes are not caused by insufficient compression. They are caused by path fragility, especially at the wireless edge. Interference, congestion, multipath fading, and contention are not engineering oversights — they are physical realities. Even the most deterministic core network cannot repeal the laws of radio physics.
If a media experience depends on a single path behaving perfectly, it does not matter how advanced the codec is or how efficient the compression may be. When that path degrades, the experience suffers—and too often, it breaks.
The codec debate keeps asking one component to solve problems that belong to the system.
Much of today’s codec discourse centers on cost. Royalty-free codecs are often presented as the inevitable future, eliminating licensing friction and unlocking innovation. For hyperscalers with vast engineering budgets, this trade can be rational. Royalties are exchanged for compute and internal optimization. But for much of the ecosystem, the economics are more complicated.
As the old systems engineering adage goes, complexity is conserved.
In any large system, removing one form of complexity does not make it disappear — it displaces it. When standardized licensing frameworks are removed, complexity migrates into less visible, more variable domains. Encoding efficiency often requires more compute. Hardware acceleration becomes fragmented across silicon platforms. Integration, validation, and debugging burdens shift from the ecosystem to individual product teams. IP risk moves from a shared framework onto each adopter’s balance sheet.
“Free” codecs do not eliminate cost; they transform a known, predictable expense into a distributed operational tax that grows with scale.
The real decision is not between free and paid. It is a choice about where complexity lives, and whether it is managed once at the ecosystem level or repeatedly inside every organization.
As media evolves toward real-time, immersive, and safety-adjacent use cases, the competitive frontier is moving decisively upstream. Differentiation no longer comes from compression efficiency alone. It comes from whether the system can guarantee behavior under non-deterministic, hostile edge conditions.
This is the defining transition underway: media is no longer optimized as a signal, but engineered as a system.
Instead of asking codecs to compensate for unpredictable networks, systems must be designed to tolerate unpredictability by construction. Reliability can no longer depend on a single path behaving perfectly. It must emerge from coordination across multiple paths and layers.
Redundancy becomes the new reliability.
Today’s media delivery architecture is largely an act of faith. The cloud compresses content. The player buffers it. The network does its best. Each layer operates with limited awareness of the others’ constraints or priorities.
The codec does not know when the Wi-Fi link is about to degrade.
The network does not know the next frame carries a safety alert.
The player hopes the buffer is deep enough to hide the chaos.
This architecture was sufficient for a world of passive viewing. It is insufficient for a world of precision and mission-critical applications.
Coded Multisource Media Format (CMMF) represents a critical architectural pivot. Rather than treating delivery as a single fragile stream, it enables cooperative, multisource systems where media can be reconstructed from multiple paths simultaneously.
In plain terms, CMMF is an industry-standard container that enables robust, low-latency media streaming by allowing content to be delivered simultaneously from multiple network sources, like different CDNs or network paths. Instead of sending identical copies of the data, CMMF uses linear, network, or channel coding to split media into coded “symbols”. A client can then pull unique coded pieces from several locations and reassemble the original stream once enough pieces are collected. This approach increases reliability, improves throughput, and reduces rebuffering—without the inefficiency of storing full duplicate streams everywhere, making it ideal for modern multisource and multipath delivery architectures.
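The “any k pieces reconstruct the stream” property can be made concrete with a toy erasure code. The sketch below uses polynomial evaluation over a small prime field and Lagrange interpolation purely for illustration; it is not the actual CMMF coding scheme, which the format leaves to linear, network, or channel codes. Encode k data bytes as a degree-(k−1) polynomial, evaluate it at n > k points spread across sources, and any k surviving symbols recover the original.

```python
# Toy erasure code (illustrative only, NOT the CMMF scheme): any k of the
# n coded symbols reconstruct the k original data bytes.
P = 257  # smallest prime above 255, so every byte value fits in the field

def encode(data, n):
    """Emit n coded symbols (x, f(x)) for f with the data bytes as coefficients."""
    return [(x, sum(b * pow(x, i, P) for i, b in enumerate(data)) % P)
            for x in range(1, n + 1)]

def decode(symbols, k):
    """Lagrange-interpolate f from any k distinct symbols, return its coefficients."""
    pts = symbols[:k]
    coeffs = [0] * k
    for j, (xj, yj) in enumerate(pts):
        basis, denom = [1], 1          # basis poly starts as the constant 1
        for m, (xm, _) in enumerate(pts):
            if m == j:
                continue
            new = [0] * (len(basis) + 1)
            for i, c in enumerate(basis):   # multiply basis by (x - xm)
                new[i] = (new[i] - xm * c) % P
                new[i + 1] = (new[i + 1] + c) % P
            basis = new
            denom = denom * (xj - xm) % P
        scale = yj * pow(denom, P - 2, P) % P   # yj / denom in GF(P)
        for i in range(k):
            coeffs[i] = (coeffs[i] + scale * basis[i]) % P
    return bytes(coeffs)

data = b"touchdown!"
symbols = encode(data, 16)                  # spread across CDNs / paths
survivors = symbols[2:5] + symbols[9:16]    # whichever 10 arrive first
assert decode(survivors, len(data)) == data
```

The design point this illustrates is the one CMMF exploits: no symbol is a duplicate of another, so a client never wastes bandwidth fetching the same bytes twice, yet losing any subset of paths is harmless as long as enough unique symbols get through.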
This is not about making one pipe bigger. It is about orchestrating multiple pipes intelligently.
Unlike basic connection bonding, multisource coding avoids redundant traffic while dramatically improving effective Network QoS. Wi-Fi and cellular links become a unified connectivity pool rather than mutually exclusive choices. The client assembles the experience from whichever paths are healthy at any given moment.
Physics remains hostile — but it is rarely hostile everywhere at once.
AI further amplifies this shift. Traditional streaming protocols are reactive by design. Quality drops after packets are lost. Buffers drain before adaptation begins. For real-time and immersive experiences, that response comes too late.
A cooperative system can observe conditions continuously, predict degradation, and adapt preemptively. Critical frames are rerouted before failure becomes visible. The experience does not stall or degrade — it simply continues.
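Preemptive adaptation can start as simply as tracking per-path health and steering traffic away from a degrading link before the buffer drains. A toy sketch follows; the exponentially weighted moving average (EWMA) scheduler, the smoothing factor, and the path names are all illustrative assumptions, not a mechanism from any standard:

```python
# Toy multipath health tracker (illustrative): an EWMA of observed loss per
# path lets the client abandon a fading Wi-Fi link while cellular is healthy.
ALPHA = 0.3  # smoothing factor: how fast recent observations dominate

class PathHealth:
    def __init__(self, names):
        self.loss = {n: 0.0 for n in names}  # EWMA loss estimate per path

    def observe(self, name, lost):
        """Fold one delivery result (lost or not) into the path's estimate."""
        sample = 1.0 if lost else 0.0
        self.loss[name] = (1 - ALPHA) * self.loss[name] + ALPHA * sample

    def best(self):
        """The path to prefer for the next symbols: lowest estimated loss."""
        return min(self.loss, key=self.loss.get)

paths = PathHealth(["wifi", "cellular"])
for lost in (True, True, True):   # Wi-Fi starts dropping packets
    paths.observe("wifi", lost)
paths.observe("cellular", False)
assert paths.best() == "cellular"
```

Real systems would predict from richer signals (RSSI trends, queue depth, jitter), but the control-loop shape is the same: observe continuously, estimate, and reroute before the failure becomes visible to the viewer.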
The technology to do this exists today. The challenge now is not invention; it is adoption: moving cooperative delivery from standards and trials into repeatable, mass-market deployment.
Advocates of “good enough” media often argue that consumers will not pay for this level of precision. And for TikTok dance videos or Instagram streams watched on a bus, they are right.
But the growth engines of the next decade are not passive or disposable. They are high-consequence, real-time, and immersive experiences where failure is not a minor annoyance—it is a liability. These are the domains where guarantees become the product.
In the automotive cockpit, media becomes mixed-criticality. Entertainment and safety signals coexist on the same system. A collision warning cannot buffer behind a map update or a game download. Entertainment can degrade; safety cannot.
In live sports, latency is no longer a technical metric — it is a business metric. When fans learn about a touchdown from social media before seeing it on screen, value is destroyed. Determinism sells time.
In XR and spatial computing, the governing constraint is biological. Motion-to-photon latency and its variance determine whether an experience feels natural or induces nausea. There is no buffer in XR. Timing must be exact, every time.
Across these domains, the pattern is unmistakable. “Good enough” fails not because quality is too low, but because time is no longer negotiable. These are the markets where determinism moves from a technical aspiration to a commercial and experiential requirement—and where system-level cooperation becomes the only viable path forward.
Vertically integrated stacks can deliver exceptional experiences when one company controls the entire pipeline. That model works — but it does not scale across global ecosystems of creators, silicon vendors, OEMs, operators, and platforms.
History is clear: when industries hit a complexity wall, they standardize.
Wi-Fi did not achieve mass adoption through proprietary turbo modes. It scaled when interoperability became the baseline and innovation moved up the stack. Media delivery is approaching the same inflection point.
Deterministic, cooperative delivery cannot scale as a collection of proprietary silos. It requires shared assumptions, reference behavior, certification, and long-term stewardship. Standards turn fragile integrations into predictable markets. They allow creative intent and timing guarantees to survive the journey intact — regardless of who built each layer.
Without standards, cooperative delivery remains a premium feature. With standards, it becomes infrastructure.
The era of competing on cheaper bits is ending. The era of competing on guaranteed experience has begun.
Over the last decade, the industry rebuilt the nervous system — more deterministic networks, faster optics, better wireless. Now it must upgrade the signal itself.
The codec, the transport, and the player are no longer independent optimization problems. They are a single system, and they must be designed as one.
Value is migrating from components to architecture.
From efficiency to reliability.
From isolated optimization to cooperation.
The winners of the next era will not be those who compress bits most aggressively, but those who ensure experiences arrive intact, on time, and without compromise — even when the environment in between is hostile.