
The Future of SuperComputing: Unlocking Access

Data Center
Allyson Klein
November 18, 2022

It was incredible to be back at a full-fledged, in-person SC’22 in Dallas this week. After two years of pandemic-limited interaction, the conference felt vibrant and essential to the sharing of ideas and innovation. I’m back in Portland, reflecting on the advances the largest research institutions have made in the past year: new entries in the Top500, a heightened focus on research collaboration spurred by a period of acute scientific demand, and hope for additional industry collaboration on new heterogeneous systems to fuel the proliferation of exascale computing and beyond. Farther afield, I’m keeping my eye on the advancement of chiplet architectures and how they’ll shape future systems.

Some quick takes from me on the silicon front. Yes, we’re seeing AMD advance, taking the top spot with the Frontier system and appearing in over 20% of the newest list of top supercomputers. This was expected, but for me the real story to watch in the coming year is the advancement of heterogeneous systems powered by CXL, providing more flexibility in design for matrix and vector processing requirements. The question is no longer which silicon but what complement of silicon provides the flexibility required for diverse HPC workloads. We also saw the announcement of the UCIe 1.0 specification, providing an industry-standard chiplet interconnect. We’ve talked about chiplets for a while now, but with support from all the major logic vendors AND many of the major cloud providers, plus integration with CXL for near-term volume attach, I anticipate vendor news on integration of UCIe into future products soon. The net net? The customer wins with more flexibility of silicon choice for computing needs, and industry innovation accelerates on a standards-based playing field.

Then there’s data. The takeaway is that researchers have a lot of it and need to manage it. I published my discussion with Jeff Denworth, co-founder of Vast Data, on their universal storage solution, an all-flash NAS that creates an efficient and scalable storage alternative. Jeff thinks this will disrupt the memory-storage paradigm, and we already know that with CXL invading platforms we’ll see “far memory” designs creating new opportunities for lower-latency data delivery as well. In his lecture at the conference, Turing Award winner Jack Dongarra laid out that data movement is the bottleneck for HPC systems today, which is why I was as intrigued by the advancements in the IO500 systems as I was by the Top500. The IO500 organization publishes interesting data not only on which systems deliver the best bandwidth, metadata performance, and overall performance, but also a cross-section of which storage platforms were submitted for analysis (with Lustre being the predominant class of storage system in this report). If you’re not familiar yet with IO500, I’d encourage you to dig into the results and review the presentation they delivered at SC’22.
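For context on how those rankings are computed: as I understand it, the IO500 score is the geometric mean of a bandwidth score (in GiB/s) and a metadata score (in kIOPS), so a system has to be good at both to rank well. A quick sketch (the submission numbers below are made up for illustration, not real IO500 entries):

```python
import math

def io500_score(bandwidth_gib_s: float, metadata_kiops: float) -> float:
    """IO500 ranks systems by the geometric mean of a bandwidth
    score (GiB/s) and a metadata score (kIOPS)."""
    return math.sqrt(bandwidth_gib_s * metadata_kiops)

# Hypothetical submissions: strong bandwidth alone doesn't win.
print(io500_score(100.0, 400.0))  # balanced system → 200.0
print(io500_score(400.0, 25.0))   # bandwidth-heavy, weak metadata → 100.0
```

The geometric mean is what makes the benchmark interesting: doubling bandwidth while metadata performance stagnates only raises the score by a factor of √2.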

Finally, there’s the research itself, and this is what makes SuperComputing such an inspirational conference. To hear directly from scientists about the challenges they’re solving with the help of supercomputing is always impressive. One example was Karissa Sanbonmatsu’s discussion of her Los Alamos team’s progress in unlocking genomes at the atomic level. She described the holy grail of cell-level research as studying a single human cell for ten days, requiring 10^12 yottaflops of compute power. The complexity? A single gene represents over a billion atoms, and measuring molecular dynamics for a gene requires more than 100 million calculations per second. Sanbonmatsu is famous for her study of ribosomes, the biological machines that connect mRNA and tRNA to synthesize polypeptides and proteins and are central to understanding how living systems operate as well as how drug and vaccine therapies work. The ribosome is a central player in how COVID-19 vaccinations protect us from the virus, and its continued study (and the underlying compute innovation required to keep unlocking it) will assist with the creation of other therapies to combat a myriad of diseases.
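To put that number in perspective, here’s my own back-of-envelope arithmetic (not from the talk), assuming the figure is a sustained rate and using Frontier’s roughly 1.1 exaFLOPS Linpack result from the November 2022 Top500 list:

```python
# Illustrative scale check: how far beyond today's fastest machine
# is the compute the cell-simulation "holy grail" would demand?
target = 1e12 * 1e24   # 10^12 yottaFLOPS, with 1 yottaFLOPS = 1e24 FLOP/s
frontier = 1.1e18      # Frontier's ~1.1 exaFLOPS HPL result (Top500, Nov 2022)
factor = target / frontier
print(f"{factor:.1e}x beyond Frontier")  # → 9.1e+17x beyond Frontier
```

Even granting generous algorithmic shortcuts, that gap of nearly eighteen orders of magnitude is why this remains a holy grail rather than a roadmap item.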

We also heard from NASA about their research into air pollution and its effect on the planet. My discussion with NASA researcher Megan Damon provided insight into how their supercomputing center is furthering our understanding of the human and natural contributors to air pollution, how these aerosols and particulates travel across the globe, and how they contribute to climate change and affect human health. One in eight premature deaths is partially attributed to air quality today, so this research will help us better understand the interrelation between what’s in our air and how we can mitigate its impact on humanity. Again, computing has a central role to play in delivering insight to the scientists working on this study.

And that’s a wrap from SC’22! Thanks for engaging - Allyson

