
Don’t Skip Leg Day

Data Center
Lynn Comp
October 4, 2024

Weightlifting, like many sports, has a distinct culture that I started learning about when one of my sons started working towards maxing out his scores for a military PT test. He was oriented entirely around one objective and set of exercises until the test was updated. That military branch realized they needed to test for proficiency in movements more typical in the field. At this point, skipping leg day was not an option since chicken legs aren’t efficient or effective at transporting a payload amounting to 180 lbs of sheer upper body muscle.

Computing services have tripped over the same trap: over-engineering a single element because there is no time, no cash, or no clarity about which uses of a technology or system will prove commercially viable. It is inevitable that once you have enough processing power to solve a problem, your next challenge is keeping that processor from stalling because – like a no-leg-day curl bro with chicken legs – the transport layer is too small to move the payload at anything but a snail's pace. A friend at one of the hyperscaler cloud providers recently told me, "We have demonstrated we can multiply numbers effectively with AI GPUs. Now it's a mere matter of keeping the beast fed."

Having had the opportunity to work on 3DXP memory technology – which could sustain a data lake the depth of Lake Tahoe yet was hobbled by a connection to the rest of the world the width of a drinking straw – I do wonder who tames whom. Will the unique movement, uses, types, and amounts of data that feed the AI beast transform the network? Or, given how often the network has clapped back at compelling but unsustainable business models (true cloud gaming, for example), perhaps I should wonder what the network will do to the AI beast.

AI requires enormous amounts of data to support model advancements. Omdia recently stated that by 2030, 75% of all network application traffic will involve AI content generation, curation, or processing. When I was running a "Visual Cloud" business in 2020, Cisco's annual networking report made a near-identical claim: video represented 75%+ of all internet traffic. Despite the dominance of video, there were a number of network characteristics that video-based workloads had to work around to improve delivery of their payloads. The hardware at the endpoints processing those video payloads played a nearly inconsequential role in the service – the network in between enabled or eradicated video business economics.

When it comes to AI bandwidth, "more is more" – yet anyone who tells you bandwidth is cheap and plentiful is, in the words of the Dread Pirate Roberts, selling something. AI imposes the requirement for near real-time responses, and network responsiveness correlates with distance – within data centers, between data centers, and across the internet. AI also requires lossless delivery networks. In the 1980s, "sneakernet" was coined because interoperability and lossless communication were safer when transporting data on physical disks between compute networks. In the 2020s, "command streaming" seemed like the best way to make cloud gaming economically feasible, yet GPUs couldn't rely on a perfect, lossless transmission and often displayed garbage due to the occasional packet drop.
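The claim that responsiveness correlates with distance has a hard physical floor. As a back-of-the-envelope sketch (assuming light in optical fiber travels roughly 200,000 km/s, about two-thirds of its speed in vacuum), the best-case round-trip propagation delay alone – before any switching, queuing, or retransmission – grows linearly with distance:

```python
def fiber_rtt_ms(distance_km, fiber_km_per_s=200_000):
    """Best-case round-trip propagation delay over fiber, in milliseconds.

    Ignores switching, queuing, and protocol overhead, so real-world
    latency is strictly worse than this floor.
    """
    return 2 * distance_km / fiber_km_per_s * 1000

# Illustrative distances: a rack-to-rack hop inside a data center,
# a metro link between data centers, and a continental internet path.
for label, km in [("intra-DC", 0.1), ("metro", 100), ("continental", 4000)]:
    print(f"{label:12s} {fiber_rtt_ms(km):8.3f} ms")
```

Even this idealized floor shows why a model served from a continent away can never match the responsiveness of one served from a nearby facility: 4,000 km costs about 40 ms round trip on physics alone.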

If this isn't a tall enough order, the network must also be zero-trust – since the data used in AI is often sensitive – while sustaining extensive east-west traffic exchanges. All that, plus adhering to broad industry standards with a diversity of suppliers, while reaping the benefits of high-volume manufacturing economics.

I would love your perspectives here:

- Will AI force the industry to get serious about network "leg day," accepting that networks are a foundation for AI infrastructure as much as legs are a foundation for a fully optimized human body?

- Or, similar to the infamous "penguin walk" after leg day, are the past 30 years of network architecture buildout such a disincentive to change that AI infrastructure will have to adapt to the network?

Lynn Comp

Head of Global Sales and GTM, AI Center of Excellence Vice President
