
The TechArena is a fan of what WEKA is delivering in the market, and we’ve covered the company’s data platform since last year as it has unveiled innovations to speed enterprise adoption of AI. It was, therefore, no surprise to see Contextual AI select WEKA as a strategic partner for delivery of enterprise AI services on Google Cloud this week. Contextual AI has made a name for itself with RAG 2.0, delivering fine-tuned models that give enterprises foundational tools for building and customizing specialized AI applications.
Contextual language models go a step beyond traditional RAG, integrating pre-training, fine-tuning, and alignment with human feedback into a single system. Traditional RAG stitches together an off-the-shelf embedding model, a vector database for retrieval, and a distinct language model for generation through an orchestration framework, which limits both the final value of the application and the efficiency of delivering it.
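To make the stitched-together pattern concrete, here is a minimal sketch of a traditional RAG pipeline. Everything in it is an illustrative stand-in: a toy bag-of-words retriever plays the role of the embedding model and vector database, and the generator is a stub where a real system would call a separate language model.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """The 'vector database' step: rank documents by similarity to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def generate(query: str, context: list[str]) -> str:
    """Stub for the distinct generation model; a real pipeline would prompt an LLM here."""
    return f"Answer to '{query}' using context: {context[0]}"

# The orchestration framework: frozen components glued together at inference time.
docs = ["WEKA provides a data platform for AI pipelines",
        "RAG retrieves documents before generating an answer"]
context = retrieve("what does RAG do", docs)
print(generate("what does RAG do", context))
```

The point of RAG 2.0, by contrast, is that retrieval and generation are optimized jointly rather than frozen separately and glued together like this.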
RAG 2.0 was delivered, you guessed it, by the same leadership team that first delivered RAG at Facebook AI Research, so they know a bit about the approach and how to make it better. RAG 2.0 has shown proof of benefit across various industry benchmarks, demonstrating improved accuracy versus vanilla RAG models based on GPT-4 and Mixtral. That’s cool stuff.
When you consider how much enterprises are focusing on RAG implementations to bring AI to the mainstream, you can understand why Contextual AI is garnering significant attention for delivery of its solution to customers. And when you consider the scale of data required for RAG 2.0, WEKA emerges as a perfect partner to support data pipeline requirements.
WEKA has delivered a fantastic data management platform that optimizes GPU utilization, delivering peak efficiency in model performance as well as value for the customer funding the Google Cloud instance. Contextual AI has deployed 100TB of WEKA data platform capacity thus far to fuel its data requirements and has seen increased developer productivity and faster model training times with the new solution. The delta in performance is eye-opening, with a stated 3X performance gain across key AI use cases and 4X faster AI model checkpointing. They’ve done all this while reducing cost per terabyte by 38%.
What’s the TechArena take? We love stories about taking technology innovation and actually turning it into customer value. What Contextual AI and WEKA are delivering here are useful tools that enterprises can tap today for real adoption of AI within their environments. We can’t wait to see more advancement in contextual language models as the industry at large continues its groundswell of support for RAG-style frameworks for AI fine-tuning and deployment. And we’re delighted to see WEKA’s data management solutions continue to garner momentum in the market during this once-in-a-lifetime moment of AI-era computing advancement.

While Level 3 ADAS (conditional automation) still requires the driver to be present and engaged, it produces a heterogeneous workload, a mix of general compute and AI processing, that is driving the need for new system-level architectures and state-of-the-art systems-on-chips (SoCs) based on leading-edge semiconductor processes and packaging technologies. More specifically, solutions for Level 3 ADAS and above are driving the automotive industry to embrace chiplets, which address these demanding workloads most efficiently and effectively by allocating each task to a compute engine optimized for it, in a footprint that is most efficient from both a power and an area perspective. Chiplet-based solutions can also be more cost-effective than an equivalent solution based on monolithic technologies.
Chiplet technology, still in its relative infancy, enables disparate technologies and semiconductor dice to be combined in a single package through die-to-die interconnects physically connected via a package substrate, while delivering performance equivalent to that of a monolithic device.
In March 2022, a universal interface for the interconnection of chiplets was released. Dubbed UCIe 1.0 (Universal Chiplet Interconnect Express), the introduction of this standard, in part, reflects the broader industry awareness that the traditional benefits of scaling through process technology are being challenged. To be clear, the major semiconductor foundries continue to invest heavily in developing advanced process nodes which continue to offer improved power, performance, and area benefits. However, the economics of these advanced nodes present significant barriers to adoption for a large percentage of the ASIC / ASSP design community. Additionally, the benefits of scaling associated with migrating to the most advanced semiconductor process node do not apply uniformly across all circuit types - most notably analog circuits.
As such, while scaling through Moore’s Law is still a very important vector for the semiconductor industry, the move to chiplets and the use of advanced packaging technologies are rapidly becoming another important vector - and in fact hold the promise of spawning a new industry where chiplets from different vendors can be readily interconnected via UCIe to rapidly build new products that combine best-in-class technologies in a single package, delivering fast time to market, low risk, and low development cost. The benefits of chiplets are quite meaningful.
UCIe is an open specification that defines the interconnect between chiplets within a package, enabling the formation of a chiplet ecosystem with a common interconnect footprint at the package level. In effect, through industry standardization, the equivalent of a common footprint, similar to Lego®, is being established. The UCIe standard is endorsed by 72 contributing members and 26 adopting members at the time of this writing.
The initial focus of the UCIe 1.0 specification was in the following areas:
The UCIe physical layer supports I/Os running at up to 32 gigatransfers per second (GT/s) across modules of 16 to 64 lanes, and uses a 256-byte Flow Control Unit (flit) for data, similar to PCIe 6.0. The protocol layer is based on Compute Express Link (CXL). In short, UCIe leverages tried and tested technologies with a strong legacy of adoption across many markets.
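As a back-of-the-envelope illustration of those physical-layer numbers, the raw per-direction bandwidth of a module falls out of simple arithmetic on the per-lane rate and the lane count (this ignores flit and protocol overhead, so usable throughput is lower):

```python
def raw_bandwidth_gbytes(gt_per_s: float, lanes: int) -> float:
    """Raw per-direction bandwidth in GB/s: one bit per transfer per lane,
    8 bits per byte. Flit/protocol overhead is ignored."""
    return gt_per_s * lanes / 8

# 32 GT/s at the 16-lane and 64-lane module widths mentioned above.
print(raw_bandwidth_gbytes(32, 16))  # 64.0 GB/s
print(raw_bandwidth_gbytes(32, 64))  # 256.0 GB/s
```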
As a testament to the importance of UCIe to the automotive industry, in August 2023 the UCIe consortium introduced a 1.1 version of the specification with a focus on the following areas:
While all of these components of the 1.1 specification reflect a keen focus on the critical next-level details associated with chip-to-chip interconnectivity and device-to-device communication, the specific considerations for automotive applications in the UCIe 1.1 specification underscore the strong anticipated adoption of chiplets in safety-critical automotive applications.
Addressing those safety concerns, the UCIe 1.1 specification includes preventive monitoring to ensure that the die-to-die signaling “eye” height and width remain optimal, with the link re-trained as needed. (As a note, the “eye” diagram is a way to measure the signal integrity of the link; the more open the “eye,” the greater the signal integrity.) There is also the addition of run-time testability of link health, which includes periodic parity flit injection to monitor the health of each lane, with the ability to repair lanes as required.
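The idea behind periodic parity injection can be shown with a toy lane-health check. To be clear, this is an illustrative sketch, not the actual UCIe flit format or encoding: compute parity over a flit at the transmitter, carry it alongside the data, and recompute at the receiver to flag a lane that needs repair or retraining.

```python
def parity(bits: list[int]) -> int:
    """Parity over a toy 'flit': 1 if an odd number of bits are set."""
    return sum(bits) % 2

def check_lane(tx_flit: list[int], rx_flit: list[int]) -> bool:
    """Receiver-side health check: parity injected at the transmitter must
    match the parity recomputed over the received bits."""
    tx_parity = parity(tx_flit)          # injected alongside the data
    return parity(rx_flit) == tx_parity  # False => flag the lane for repair/retrain

flit = [1, 0, 1, 1, 0, 1, 0, 0]
assert check_lane(flit, flit)                 # clean lane: parity matches
corrupted = flit.copy()
corrupted[2] ^= 1                             # single-bit error on the lane
assert not check_lane(flit, corrupted)        # detected: trigger lane repair
```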
The adoption of state-of-the-art technologies marks a sea change in the automotive industry, where historically automotive electronics employed mature products typically based on mature semiconductor process technologies. With the advent of ADAS and autonomous driving (AD), this has now changed. AI performance requirements that can reach PetaOPS levels for Level 5 are best addressed via heterogeneous computing solutions employing highly energy-efficient, high-TOPS AI accelerators in a chiplet form factor.
At the time of this writing, the UCIe consortium is preparing to release a 2.0 version of the specification. When it is made publicly available, we will take a look at what additional features and considerations have been added, with a focus on those that specifically address the automotive market.

Learning to learn in a structured manner is the only way to maintain a long career in technology. I tell those I mentor that they should settle into the reality that they will likely have 5+ careers in this industry.
“Don’t be scared, just stay curious,” I say. “Do everything in your power to be the positive influence on every collaboration you engage in.”
Technology domains are small communities, and being the individual who consistently demonstrates high EIQ and dignifies rather than degrades others (regardless of the role they play) is a superpower for the long haul.
Many of us with longer tenure in technology see patterns in industry transformations, both for good and for ill. For example: years ago, a long-tenured mainframe/UNIX expert informed me that virtualization had been a mainframe feature for decades. If a server solution was considered the equivalent of “…a toy you find at the bottom of a cereal box,” why bother with virtualization technology that limits performance? (Extra credit: name the CEO who gave us the soundbite comparing servers to Froot Loops without doing a web search!)
To kick off my blog series with Tech Arena, I decided to share three patterns I have consistently seen across ~30 years of technology inflections.
The “best” solution in the eyes of a customer rarely offers the best raw performance.
As a hardcore engineer fresh out of college, I got my hands dirty coding some of the most complex SOCs with firmware and compiled languages. I was convinced RISC was the ‘best’ architecture. I was so excited to promote the first >100MHz System on Chip (SOC) for the ARM ISA, thinking it would be easy to sell ‘better technology.’ Instead, I learned about legacy code and ‘good enough’ x86/68K/PowerPC incumbency. “The best technology” may cause such an operational hassle that customers will admire, but never purchase, “the best.” A company can’t maintain a profitable business on glowing reviews from technology analysts or journalists.
Corollary Lesson: Being boring, reliable, easy to debug, and offering ways to sample operational data without impacting runtime performance is surprisingly underrated by technology providers. To service providers, this list is REALLY important for delivering a good customer experience. Using a car analogy, a Lamborghini may be the “best” auto on a racetrack as measured by its top-end speed, but when driving grade school carpools, a Lamborghini is limited to the same 25 MPH zones that a Toyota Corolla or a minivan comfortably achieves.
“What compute performance giveth, the network can taketh away.”
Maintaining optimal compute/throughput/storage is a never-ending battle. Equilibrium with enough compute power that processes data, balanced by enough storage in the right tiers, plus the right network capacity to keep the data moving – it just never lasts for long.
There are so many examples that I’ll just point you to two deeper articles to explore. Both The Register on high core count CPUs and GPUs, and Broadcom’s article with Meta’s data on AI instance efficiency, offer depth on the consequences of imbalanced systems. Now, imagine the ‘inside voice’ conversations within infrastructure operators who pay ~$1B for AI processing hardware and only see ~$500M of work done on it.
“Immutable ‘laws’ vs. Ideology and narrative”
A favorite comic in my family included something related to exercise in every routine. A classic quote: “I don’t do ups, I do DOWNS, because gravity is the law, and I obey the law!” In commercial airplanes, applied physics temporarily suspends the effects of the law of gravity, but eventually gravity wins, and the plane must land. Similarly, in every technology era, exuberance for a hot technology seems to temporarily suspend the laws of economics and the principles of supply and demand. Countless breakthroughs have promised a new utopia, to replace or better humanity, and to keep humanity from…well, being itself! The internet evangelists in 2001 claimed ‘borders didn’t matter’ and ‘information wants to be free!’ Here in 2024, how many paywalls have you run into this week? Economics clapped back on utopia, because it turns out that you need to make more money than you spend to stay in business. There are a few hard lessons in the fact that customers/consumers will not spend at the level of the difficult engineering that built the product. Do consumers save more money? Do enterprises see benefits well beyond what they must spend to afford the product?
Concluding thoughts
Many of these ‘rules of thumb’ don’t need to be stated when business and technology together offer a solution that legitimately solves more problems than it causes. That often happens years after the discovery of a new technology innovation – so stay curious, keep learning and share your thoughts in the comments!
Lynn A. Comp is a Vice President and Director of the Microsoft Global Account Team in the Intel Sales and Marketing Group. Lynn’s mission is to align the unique benefits that Intel and Microsoft offer their customers through access to unparalleled technology at scale, enabling the broadest ecosystems to innovate solutions to the most challenging business problems. Lynn returns to Intel from AMD where she served as the VP and GM of the EPYC CPU Cloud Business and the VP of EPYC Product Marketing.
Lynn has a wide range of experiences spanning her ~30 years in the tech industry, from strategic planning and go-to-market for RISC SOCs for both communications infrastructure and mobile phones, to software offerings laying the groundwork for rapid video-based services innovation, to pioneering the foundational libraries that paved the way for ‘software defined’ networking with telecommunications operators.
Lynn has extensive experience in marketing, product management, product planning, and strategy development across software, hardware, cloud, and communications service providers (CoSPs).
Lynn has a Bachelor of Science in electrical engineering from Virginia Tech and an MBA from the University of Phoenix.

TechArena's Allyson Klein and Jeniece Wnorowski from Solidigm sit down with Kelley Mullick, Vice President of Technology Advancement and Alliances at Iceotope, to discuss the latest in data center cooling technology. They dive into the role of liquid cooling in supporting AI workloads, the sustainability benefits of advanced cooling solutions, and the future of edge computing.

While no day in tech is technically slow, we’ve hit mid-summer, which means that conferences have slowed, Europeans are jetting off on holiday, and everything else is feeling a bit like wading through molasses compared to the regular maelstrom of advancement we’ve become accustomed to in the AI era. A day like today makes me look forward to what is on the horizon, and there is plenty to look forward to in the months ahead. Today’s focus is the OCP Summit coming up October 15-17 in San Jose, and if you have peered into the roster of industry sponsors like I have, you’ve noticed that OCP has…well…super-sized.
I remember the days when this Summit would feature about three top-tier sponsors vying for the spotlight; this year features nineteen. NINETEEN. And that’s just the top-tier players; as you scroll through the list of companies spending major money to feature their tech at the show, you realize that OCP has become the nexus of what is happening in AI infrastructure. Silicon providers, infrastructure builders, power and cooling entities, oil companies – all showing up to talk about the future of the data center. All likely engaged in OCP configuration delivery and participating in conference proceedings in a major way.
The TechArena is excited to be part of the OCP Summit again this year as a media sponsor, and we are especially excited to dip into conversations regarding the advancement of sustainable infrastructure, how the fabric of the future will be shaped, including the advancement of SONiC (which we covered extensively in Lisbon this April), what’s next in memory advancement, and of course the power and cooling innovations required to fuel these ridiculously hungry racks. We’ll be podcasting daily, engaging with vendors in video discussions from the show floor, and writing insight stories for those unable to make it to San Jose. We’re keen to know what is top of mind for you regarding hyperscale infrastructure priorities as well! Please take part in our daily polls on my LinkedIn feed this week to share your views on what will grab the world’s attention during the summit in October.

“We need NASA, because we need a tower of flame 60 miles high burning at temperatures hotter than the surface of the sun.” - Nova, Sharknado III
Technology: it’s all about the sublime new features. That intricate blend of performance and programming, design and development, brilliant insights leading to giant leaps of innovation. Right? We’re talking about the best and the brightest innovators, those ultra-cool kids who opted to take all those math classes, knowing that the Psychology majors were already hitting quarter beer night while they wrote up their fourth page of derivatives. But it was all worth it, because someday…some wonderful day…the world would recognize that Linear Algebra would make things safe for, um, whatever they’d get paid a lot to algebrate.
As cool as technology is, the reality of any good high-tech solution involves more boring logistics than elegant engineering. For all the results, the hours of sweat are more about jamming six new ideas in a space that holds three and hoping that something doesn’t break once the test suite is finished. Innovation is built on the shoulders of a lot of swearing in small rooms with smudged whiteboards. And maybe the archaeology of those smudges reminds some about how it was done before.
The modern data center began sometime in the early ‘00s, when commodity servers began to fill the racks, displacing RISC machines. (Sometime later, we’ll talk about how it was a stock market bust that revolutionized the datacenter.) Virtualization hadn’t hit volume yet, so most servers were single-task machines. The really innovative IT organizations were just starting to think at a rack level with stateless transactions.
The move from computer rooms to data centers was only twenty years old or so. The common architecture was a raised floor that delivered power and pulled air downward (it was fun to watch the airflow using the cigarette smoke back in the day). We were moving on from a legacy of big servers that filled a tile or two and pulled in the range of 6-8kW of power, and now facing a world where evolving commodity servers in a rack were - gasp - demanding 10kW, maybe 12kW, per tile. This was also when the data center was generally located in prime real-estate on the buffer floor in between IT and management, so forgive us for the infrastructure costs back then. We thought local, not global.
The conversations back then went to two simple vectors. One: make servers with less power. It would take a bit for us to figure out how to turn off portions of silicon that weren’t doing anything, and performance dictated that we’d probably claim that power back anyway. Two, and usually prevalent: brute-force the data center to adapt. In some cases, that meant that a rack or two of servers on a tile would be isolated by a few blank tiles to keep up with the limited cooling capacity in the room. It was not entirely uncommon to find a box fan or two on top of a rack to shove air over to another air intake.
The next innovations focused on logistics, mechanics, and air flow over server engineering. The data center began to evolve as the cloud companies led the charge to dedicated buildings with specific architecture to optimize power delivery and air movement. Hot/cold aisles started to appear, as did plenums and upward air flow. That alone bought a few kW of breathing room, even as the racks began to evolve to support a mix of compute, network, and storage to handle virtualized workloads. These days, it’s not uncommon to see 20kW racks in a vanilla data center, standing on those shoulders of innovation.
Of course, the demands never end. The commodity server isn’t as cool (pun intended) as it used to be; we processor folks blame the memory. And the rise of dedicated GPU workloads is again driving the power of individual rack units to new heights. It’s not uncommon for a full rack of AI/ML servers to demand up to 60kW of power. As before, the brute-force solutions appear to be back in vogue. Depopulating space to support the boutique solutions is back, minus the box fans. We don’t allow humans in the data center anymore; they’re just wasted heat. It’s likely that commodity will start to pull in AI innovation, and a bit of re-architecture will save some power, but the end result is likely the same. It’s likely time for another innovation in the actual buildings to accommodate the new rack reality.
It’s not hard to imagine another twenty years when we’re all chuckling about those old days where 50kW created a week of meetings in the conference room, and maybe the AI can peer back through the smudges to realize the old solutions might be useful again. So for now, here’s to the next generation of geeks skipping out on the vape party to do homework. We’ll need you then like we always did.

The world is rapidly transforming with new applications of AI across numerous industries. I’ve been fascinated by the advancement of AI in the world of pharma research in particular, given the complex process to identify an early drug candidate, conduct preclinical research and run phases of clinical trials before reaching mass market introduction.
We recently discussed pharma within our paper on semiconductor requirements for AI advancement. I am reminded of this quote from Oii’s CEO and founder Bob Rogers:
“For each application area in drug development, the speedups reported by AI vendors are 3x to 10x. These accelerations by themselves are significant, but the real magic lies in the fact that every step of the drug development process is currently built from interlocking, inefficient human tasks. Replacement with AI tooling will result in wholesale reductions in the time it takes to propose, test, and report on new drugs in the market.”
So where is the low-hanging fruit? I’d argue one is the world of microscopy, where early-stage research is performed under electron microscopes to seek information about the chemical structures of compounds with potential efficacy against disease. This stage of the process is incredibly human-intensive, with work that – frankly – can be done faster and with more precision by AI visualization techniques.
Enter Health Technology Innovations, Inc. (HTI), a National Science Foundation-funded AI operation that is targeting the space through collaborations with the leading microscope vendors and research institutions. Led by industry veteran Tuan Phamdo, HTI has built a comprehensive Cryo-FAST software platform that helps accelerate pharma research. According to the company’s claims, HTI has driven average screening time down from six hours to five minutes, halved scope collection time, and cut annual lab costs for this function by 50%. Today, the folks at HTI announced the fantastic news that their software is ready for mass market adoption as a 1.0 offering.
This advancement has gotten the attention of large players in the sector. Gatan, JEOL, and Thermo Fisher – all major players in electron microscopy – have collaborated with HTI on the delivery of this solution. The Northwest Cryo-EM center at Oregon Health & Science University has also engaged deeply, and I’d expect to start seeing major research labs deploying HTI’s software in tight integration with their scope of choice.
What’s the TechArena take? Well, we’ll go where this blog started. AI is a massive disruptive force, and solutions like HTI’s will help advance scientific discovery at record pace. We’re excited to see this as a proof point of the larger transformation happening in pharma and can’t wait to see societal benefits from solution adoption. We also expect HTI to become much better known quickly in the world of pharma research as a player that is delivering real technology to customers with the help and support of industry leaders. We can’t wait to hear more about solution deployments in the months ahead.

TechArena host Allyson Klein chats with Eric Dahlen from Intel and Alex Rakow from Schneider Electric about their roles in the Compute Sustainability group of the Open Compute Project. They discuss AI's impact on data center sustainability, power and cooling innovations, and the upcoming OCP Summit.

The pace of innovation in AI is moving at light speed, so it was only a small surprise recently when Sema4’s Antti Karjalainen shared his company’s vision beyond large language models (LLMs).
“So what is an agent? It's a piece of software that, contrasting to traditional software that we've been used to, does not just make you more efficient at your work. Agents will actually complete the work for you. So think about them as sort of a knowledge worker or software that can reason, collaborate and act with humans.”
Antti went on to explain that what we’ve seen with generative AI is just the beginning, and it’s in the application of these models, in his view with agents, that the real transformation of our industries and society will start taking shape.
What Sema4 is delivering is fascinating in terms of its broad reach across industries and the disruptive productivity it places in the hands of human collaborators. In Sema4’s vision, each individual can be unleashed with unlimited AI workers that can research, reason, and deliver task work, freeing the human for more value-added work. At first blush, this vision looks similar to Microsoft Copilot. But upon further analysis, it’s more optimized for unique industries or business processes. Antti explained the difference: “the AI agent is something that you can actually interact with to get a full work product done.”
“Instead of just assisting on the side, an AI agent is kind of the main thing. An AI agent is going to be a new way to interact with enterprise applications, whereas a co-pilot is more of an additive layer on top,” he said.
When you consider the autonomy in this statement, you can start to understand the power that agents represent for holistic, autonomous control of digital functions, and start imagining the complexity of delivering this level of control to the enterprise. The team certainly has the technology chops and enterprise awareness to deliver the goods, with far-ranging backgrounds from tech leaders including AWS, Cloudera, Docker, Hortonworks and more. Today, the Sema4 team is working with a range of business functions, including customer support and finance operations, to deploy initial agents into enterprise environments. Looking ahead, Antti described additional functions, including HR and software development, as part of a near-term vision for deployment.
So what about guardrails? Sema4 is tackling security as well as data integrity flowing into training agents as keys to success. Layer on top of that the organizational compliance and enterprise-class resilience of model adoption, and you can understand that the human trust in bringing AI agents into the enterprise is likely the main time investment in this powerful technology transition. Antti expects to see a wealth of use case deployment examples emerge in the second half of 2024 as the first signal of broad agent adoption.
What’s the TechArena take? We aren’t surprised anymore by the speed of innovation being driven by AI. IT organizations are being pushed to their limits to be agents of change, no pun intended, within the enterprise, and we expect business functions to directly adopt agents with or without internal IT support – much as they did with the initial activation of cloud services to fuel business agility. 2024 and 2025 will be rife with stories of success in this arena, and we can’t wait to hear them. As to worker response to agent deployment…we’ll leave that topic for an upcoming blog as it’s a lot to unpack and urgently needs to be discussed in order to get full value out of these powerful tools.

Advanced Driver Assistance Systems (ADAS) are all the rage. Increasingly, consumer car purchasing decisions are based on ADAS features rather than vehicle style, engine size, or branding, which historically drove the choice.
A quick internet search will give one a serviceable understanding of the capabilities associated with each ADAS level. A high-level summary of the different ADAS levels is as follows:
Level 0 - No Driving Automation
Level 1 - Driver Assistance
Level 2 - Partial Driving Automation
Level 3 - Conditional Driving Automation
Level 4 - High Driving Automation
Level 5 - Full Driving Automation
What is profound, at least in my mind, is that there is no consistency regarding which specific ADAS capabilities are associated with a given ADAS level. Furthermore, the industry has introduced yet another level, called L3+: not quite High Driving Automation, but approaching autonomy. Apparently, good enough isn’t good enough.
The general goal of ADAS is to improve the overall safety of the driving experience. To keep the tone upbeat, I won’t focus on the mortality rate associated with driving; however, one key point to note is that an estimated 97% of all auto accidents are due to driver negligence. The key takeaway is that most accidents can be avoided through the use of technology, which is what ADAS brings to the table.
At the lower ADAS levels, the goal is to provide the driver with safety assistance features so the driver is more aware of their surroundings and can respond to the environment appropriately. Features like blind spot detection or backup cameras, typically considered Level 1/Level 2, don’t have any physical impact on the operation of the vehicle itself but provide valuable insights to the driver, helping avoid what might otherwise have been a significant incident.
The progression from Level 2 to Level 3 and beyond is significant in terms of the electronic content employed to actively prevent accidents, leading to the point where, at Levels 4 and 5, the vehicle drives autonomously with little or no human intervention.
The reduction in accidents associated with each progressive ADAS level, along with the freedom that comes from eliminating the need for a driver, is profound. The benefits range from enabling a “greener” relationship between our cars and the environment to providing mobility to those who otherwise might be immobile.
The technology required to pull off these increasing levels of ADAS is pushing the state of the art in every category. The overused phrase “data center on wheels” is an understatement regarding the level of complexity and technology under the hood of the modern car. Achieving Level 3 ADAS typically requires 18 or more high-resolution cameras in addition to a dozen radar sensors and typically a LIDAR (light detection and ranging) sensor. The underlying computing engine that processes these massive amounts of sensor data typically employs AI computing that delivers hundreds of tera operations per second (TOPS).
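To get a feel for just how massive those sensor data amounts are, here is a rough, assumption-laden estimate. The resolution, frame rate, and bit depth below are illustrative guesses for the sake of the arithmetic, not figures from any particular vehicle; only the 18-camera count comes from the discussion above.

```python
def camera_stream_gbits(width: int, height: int, fps: int, bits_per_pixel: int) -> float:
    """Uncompressed data rate of one camera stream in Gbit/s."""
    return width * height * fps * bits_per_pixel / 1e9

# Assumed: 18 cameras at 1080p, 30 fps, 24 bits/pixel (illustrative values only).
per_camera = camera_stream_gbits(1920, 1080, 30, 24)
total = 18 * per_camera
print(f"{per_camera:.2f} Gbit/s per camera, ~{total:.0f} Gbit/s aggregate")
```

Even before radar and LIDAR are added, tens of gigabits per second of raw pixels have to be moved and processed continuously, which is what pushes the compute engine into the TOPS regime.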
To date, few passenger vehicles have been certified as Level 3 compliant. And yet, Level 3 still requires a driver to be present and ready to take over control of the vehicle in cases where the capabilities of the electronics are exceeded – a far cry from the vision of the self-driving car. Achieving Level 3 ADAS and beyond is pushing semiconductor process technology limits – TSMC has announced an automotive-qualified 3 nm process technology expressly to address the demands of this market, as the number of transistors and their associated thermal footprint have become meaningful. One of the hottest technologies – chiplets – is also being readily embraced by the automotive market to most effectively address the mismatch in technologies required to achieve the vision of the self-driving car.
The emerging industry standard UCIe™ (Universal Chiplet Interconnect Express™) announced a 1.1 version of the specification expressly to address the automotive application space, focusing on areas such as data integrity and reliability. More about chiplets and the UCIe specification will come in future blogs. In short, outside of quantum computing, it seems the automotive market is taking a leading role in defining the future of many emerging technologies, which is a good reason to follow the exciting innovations that the automotive market is now “driving.”

Cornelis Networks fired a huge salvo into the fabric community today with the announcement that Intel veteran Lisa Spelman is assuming the CEO seat at the startup. Rumblings started on the Internet yesterday that a major Intel executive was leaving the firm for a new home. Spelman served in long-standing leadership roles in Intel’s data center and AI group and led the company’s Xeon processor strategy. She has been widely hailed as a face to watch for higher levels of corporate leadership, making this departure yet another strategic talent exit for the chip giant.
What does this mean for Cornelis? The company has a history of leadership in the HPC arena, delivering Omni-Path fabrics to data centers and building a robust portfolio of high-speed fabric solutions. For those who may remember, Omni-Path was originally designed within the walls of Intel before Cornelis was re-born as an independent company four years ago. And while HPC fabrics are interesting, the strategic value of Cornelis, and likely the reason Spelman was interested in driving this tech further into the market, is AI’s insatiable fabric demand. We’ve written at length on the TechArena about the need for a fabric alternative to NVIDIA’s InfiniBand solutions, and the Ultra Ethernet Consortium is certainly building a groundswell of momentum for alternative solutions for AI clusters.
If you’d asked me yesterday who was going to lead new fabric innovation for AI, I likely wouldn’t have had Cornelis at the top of my list. But with the addition of Spelman, her deep knowledge of the AI landscape, the ecosystem and customer relationships built during her time driving Xeon processors, and, frankly, her business savvy in creating categories and growing tech leadership with customers, Cornelis has placed itself squarely in the mix. I’m interested to learn more about how the Omni-Path based IP maps to AI customer requirements, how Lisa will place a different focus on technology delivery than previous leadership, and how strategic collaborations will be brought to bear to scale Cornelis’s ascent in the marketplace. Watch this space for more in the weeks ahead as Cornelis begins to unveil its expected evolution.

TechArena host Allyson Klein chats with Sema4.ai co-founder Antti Karjalainen about his vision for AI agents and how he sees these powerful tools surpassing even what current AI models deliver today.

In our latest TechArena Data Insights interview, Jeniece Wnorowski and I had the pleasure of chatting with Doug Emby, Vice President of Cheetah RAID. We delved into the fascinating world of cutting-edge storage solutions tailored for edge environments, which are crucial for industries such as entertainment, defense, and autonomous vehicles.
Doug shared insights about the remarkable Cheetah RAID Raptor and Prowler servers. These servers are designed to handle the rigorous demands of media and entertainment as well as military applications. The Raptor 2U server, in particular, is a powerhouse with its high storage capacity, robust performance, and rugged design. It boasts up to 737.28TB of storage capacity (using Solidigm D5-P5336 61.44TB SSDs), hot-swappable NVMe canisters, and support for PCIe Gen4, ensuring rapid data transfer and reliable performance even in extreme conditions. You can imagine how these hot-swappable canisters could be used to quickly capture data in rugged environments and transport it swiftly at the end of a shoot day or in a mobile deployment. Given the reliability of SSD technology, this use case works effortlessly without significant risk of data loss or drive degradation.
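As a quick sanity check on that headline capacity, here is the arithmetic behind the 737.28TB figure (the drive count is our inference from the numbers, not a configuration detail Doug specified):

```python
# Raptor 2U capacity check: how many 61.44 TB Solidigm D5-P5336
# drives does it take to reach the stated 737.28 TB?
drive_tb = 61.44
total_tb = 737.28
drive_count = round(total_tb / drive_tb)
print(drive_count)  # → 12
```

In other words, the stated maximum works out to a dozen of the 61.44TB drives, which lines up neatly with a 2U chassis built around swappable NVMe canisters.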
How much data can these solutions handle? A key part of our discussion was about the importance of scalable storage solutions for managing vast amounts of data at the edge. The Solidigm D5-P5336 SSD, which is integrated into Cheetah RAID’s systems, stood out for its high capacity and performance. This SSD is optimized for data-intensive workloads, including AI-driven data lakes, big data analytics, and scale-out NAS, providing efficient and rapid storage and retrieval of extensive datasets. Jeniece shared some insights on how Solidigm’s SSD technology is integral to these advancements. She highlighted features such as enhanced power loss data protection, hardware encryption, and temperature monitoring, all of which are essential for maintaining data integrity and performance in various edge applications.
One of the most compelling parts of our conversation was understanding the synergistic innovation between the two companies. Their collaboration has resulted in powerful and efficient storage solutions that IT leaders across industries rely on. These servers ensure that critical data is stored securely and accessed quickly when needed, making a significant impact in real-world scenarios. Cheetah RAID’s high-performance server/storage, featuring hot-swap drive canisters and Solidigm’s 61.44TB drives, makes a powerful statement unmatched by others in the market.
For those interested in the technical details and practical applications of these innovations, the full interview provides a wealth of information on the future of scalable storage solutions at the edge. To dive deeper into our discussion, you can visit the TechArena interview here and learn more about Cheetah RAID's innovative products on their official page here.

The Road to the AI Era is Paved in Semiconductor Manufacturing Innovation
This report provides insight into the force and speed of innovation required to propel artificial intelligence (AI), new requirements from across the computing landscape, and why foundational principles of semiconductor manufacturing require re-invention to deliver the performance and scale of this new age. We cover the impact of generative AI and large language models (LLMs) across industries, current challenges in delivering performance to meet LLM requirements, how High Bandwidth Memory (HBM) has emerged as a foundational element of AI compute platforms, and a foreshadowing use case of upcoming chiplet-based processing solutions. We also look at how advanced 2.5D and 3D packaging, delivered in collaboration with market leader Lam Research, ensures the future of AI and continued semiconductor innovation.
Introduction
We’re seeing our world transform at warp speed, and the opportunities AI will unleash are just beginning to surface. What once seemed science fiction is actually closer than we may realize. Figure, working with OpenAI, blew the doors off this arena with its recent demonstration of the Figure 01 robot performing complicated tasks and demonstrating complex decision processes. NVIDIA CEO Jensen Huang underscored this innovation, stating that robots will integrate across industries providing support for manual tasks and more.1 While robotics captures incredible inspiration for many of us, it is the tip of the iceberg for potential societal advancement in the years ahead. One of the most compelling areas for near-term benefit is pharmaceutical discovery. Here we’ve seen major tech players collaborating with traditional pharma and biotech startups to fuel a new generation of drug research that’s estimated to improve profitability in the sector by up to 25% according to McKinsey.2 What’s driving this investment? Bob Rogers, co-founder and Chief Scientific Advisor at leading healthcare AI startup BeeKeeperAI, explains,
“For each application area in drug development, the speedups reported by AI vendors are 3x to 10x. These accelerations by themselves are significant, but the real magic lies in the fact that every step of the drug development process is currently built from interlocking, inefficient human tasks. Replacement with AI tooling will result in wholesale reductions in the time it takes to propose, test, and report on new drugs in the market.”
Pharma transformation is being echoed across industries, and while each industry carries different near and far-term potential and will move at different speeds, it’s easy to conclude that entire industries will be transformed and our definition of work reshaped. In fact, we’re seeing the rapid evolution of a symbiotic relationship between humans and machines where AI can take on routine or tedious tasks, freeing humans to focus on innovation. This symbiosis even extends into the most complex invention undertaken by humans — the continued advancement of semiconductor manufacturing.

I’m excited to kick off a blog series on All Things Automotive. This topic is close to my heart, and it’s easy to garner interest from a broad audience as we enter an era where decades of fascination with cars that could drive themselves are becoming a reality. I have been fortunate to have held many different roles throughout my career, both technical and business, centered around the automotive market, giving me a front-row seat to witness the transformation of this industry.
Cars are cool.
Why do I say that? Well, everyone who has ever owned a car has their own story about their car. It’s personal. For us boomers in the crowd, we knew exactly how to feather the accelerator when trying to start the car. It was somewhat of an art. If you didn’t understand what it took to get the car started, you would end up flooding the engine or running down the battery. Ultimately, over time you developed a personal relationship with your car and “understood” it.
Cars are personal.
When you ask someone about the first car they owned, you typically will hear a story that reflects some kind of unique relationship between them and their car. Many people have given their cars names, and everyone seems to have a mix of good and bad memories to share. If you’re ever looking for a good icebreaker, ask someone about their first car. You probably won’t be disappointed.
My first car was a ’71 Ford LTD. Canary yellow with an olive-green interior – what a color combo. This car was enormous. The trunk was big enough to allow me to pack all my possessions to go back and forth to college. With gasoline at 60 cents a gallon, fuel economy wasn’t even a thought. Complete with rotor and distributor cap, the car had a weakness: on a rainy day, if water got into the distributor cap, the engine, and with it the power steering, power brakes, power everything, would fail. The first time this happened to me, it was quite a traumatic experience. In fact, every time it happened it was a traumatic experience. Today you probably need to go to a museum to see a distributor cap, and most certainly if you want to find a feeler gauge to adjust the points. These have long since been replaced by electronic ignition.
For more than a century after the car was first invented, the adoption of electronics, and technology in general, in the car was slow. Seat belts were first introduced in 1885 and weren’t mandatory until around 1968. In California, it wasn’t until 1986 that you would receive a ticket for not wearing a seat belt. Growing up in the ’60s, I used to enjoy watching Batman on TV. It was a light-hearted, tongue-in-cheek adaptation of the comic book series. Not only was it one of the early shows to be filmed in color, something that was lost on me because we didn’t own a color TV, but every time Batman and Robin got into the Batmobile, they paused to show both of them putting on their seatbelts. Kind of a public safety message at the time.
Fast forward to 2024: the advances in automobiles through the adoption of the most advanced semiconductor technologies are, in my mind, even more profound than the introduction of the automobile itself. Self-driving cars, long the stuff of science fiction, are today a reality, albeit with some restrictions. These advances, which have only happened in the past 10 to 15 years, have led to safety improvements significantly beyond the seat belts, anti-lock brakes, and airbags that were the mainstays for several decades.
Historically, the electronics in the vehicle were based upon mature, low-technology semiconductors at the level of 8-bit microcontrollers fabricated on semiconductor nodes that were at least a decade old. For many of the major automotive OEMs, electronics was considered context, not core to their business. This led to the introduction of Tier 1’s including Visteon and Delphi which were the spin-out of the electronics groups within Ford and GM respectively. Ford’s spin-out of their electronics group, named Visteon happened in 2000.
Today, auto OEMs have come to realize that the use of technology to enable a safer and more enjoyable driving experience is driving consumer purchasing decisions, not brand loyalty. This is a dramatic change from the past, when brand, styling, engine, and transmission were the traditional factors that drove consumer purchase decisions. The technologies adopted in today’s vehicles are not for the faint of heart; they represent some of the most leading-edge technologies across multiple disciplines, including semiconductor technologies, packaging, artificial intelligence, and computing architectures, with many still on the drawing board as they are being defined by some of the best minds across many different industries.
Automotive OEMs, grappling with the fact that electronics has quickly moved from context to core, are now starting to “spin in” electronics organizations – a pattern I have been told is referred to as “the double helix.” In 2017, just 17 years after Ford spun out Visteon, Ford hired 400 engineers from BlackBerry to accelerate the development of vehicle electronics. This is just one small example of the expansive disruption that has been and continues to occur across the entire automotive value chain. The disruptions are profound and make for rich stories for a blog series; I will most definitely talk about these in future blogs.
As an introductory blog, I thought that this would serve as a good backdrop to understand the motivations for the adoption and development of leading technologies across multiple disciplines. Future blogs will be more technical, covering topics including chiplets, cybersecurity, functional safety, artificial intelligence, sensors, the evolution of automotive architectures, and the very popular software-defined vehicle, amongst many others.

TechArena host Allyson Klein and Solidigm’s Jeniece Wnorowski chat with Cheetah RAID VP Doug Emby about the innovative solutions his company is delivering to edge environments across a wide swath of applications from the entertainment industry to defense, and how innovative SSD designs from Solidigm help provide a foundation for storage performance and efficiency.

TechArena host Allyson Klein and Solidigm’s Jeniece Wnorowski chat with Taboola Vice President of Information Technology and Cyber, Ariel Pisetzky, about how his company is reshaping the marketing landscape with AI infused customer engagement tools.

TechArena spoke to over a dozen industry experts from Circle B, Credo, London South Bank University, the Open Compute Project, Palo Alto Electron, PLVision, Qarnot, the Research Institutes of Sweden, and ZeroPoint Technologies to publish this comprehensive report on the state of open compute infrastructure innovation and how organizations should align data center planning and oversight with sustainability and performance objectives. If you manage an IT organization or oversee data center infrastructure, software, or sustainability initiatives, this report offers practical value for your organization.

I sat in on Andrew Dieckmann and Nidhi Chappell’s session at MS Build today to learn more about how Microsoft is delivering new AI capability leveraging the MI300X accelerator. Andrew leads Instinct accelerator development at AMD, and Nidhi oversees Azure AI and HPC infrastructure at Microsoft. While today’s MI300X instance delivery is a tremendous milestone for the companies, it has been a multi-year journey in the making, starting in 2020 with the MI50 accelerator instance, which was focused on a small-scale cluster implementation.
Andrew called out that generative AI is the most demanding data center workload, requiring incredible performance and capability from infrastructure and specifically silicon. The MI300X has been designed to integrate AMD technologies and manufacturing prowess to deliver a compelling choice of solutions to the marketplace. Nidhi furthered this concept, stating that until today’s launch, Microsoft did not have a choice of solutions to offer customers, being limited to NVIDIA instances. For those customers seeking higher memory capacity and better price performance, the AMD-based instances provide notable value.
She extended this thought by stating that this is not just a silicon optimization but a holistic view across data center, AI accelerator, CPU, IO, and network optimization, delivering the infrastructure environment that allows Microsoft to keep pace with broader corporate objectives on scaling both performance capability and energy efficiency within the Azure environment. Nidhi’s team is leveraging MI300X for Microsoft’s own Azure AI, a stunning group of workloads that collectively delivers 7.5 trillion characters translated per month, 54 million meeting hours transcribed in Teams per month, and 100 million monthly active users of AI text predictions. While she didn’t go into detail on how much of this work is delivered using MI300X today, we recommend watching this space for growth of platform usage given the value the platform represents to generative AI.
One notable observation from the discussion was the centrality of low-latency access to data for generative AI. Nidhi and Andrew both discussed the capabilities of the MI300X platform in HBM memory support as well as platform memory capacity scale. Another attribute of note was the central focus on Hugging Face and its use of MI300X services, with callouts to software optimizations as well as core platform capability as differentiating factors.
Congrats to Microsoft and AMD for this great milestone. We at the TechArena can’t wait to see more collaborative innovation.

MS Build has always been a fantastic conference for developer innovation within the Microsoft environment. In 2024 it has transformed into a must-attend event to track AI innovation. Today, Satya Nadella and team did not disappoint as they delivered a maelstrom of new announcements for Azure AI, Copilot, and more. The speed of announcements in the keynote was reflective of the speed of Microsoft innovation, and it starts with the foundational innovation of Azure infrastructure.
Satya shared a massive buildout of Azure data centers across the world, from Thailand and Malaysia to Spain and Wisconsin. Microsoft announced the world’s largest supercomputing cluster last fall, and Satya shared that they’ve grown this supercomputing capability by 30X in the last six months, an incredible pace of deployment reflective of the customer demand for Azure AI services.
Silicon Collaborations Fuel Azure AI Growth
They’re delivering this through tight partnerships with industry leaders along with home-grown innovation in Microsoft silicon. It starts with their deep partnership with NVIDIA. The collaboration was discussed earlier this year at GTC and covered on the TechArena. Microsoft’s Nidhi Chappell described the nature of this collaboration as true co-invention in her interview on TechArena last week, and this is reflected in plans for delivery of H200-based instances later this year and expectations that Blackwell platforms will be among the first available cloud instances on Azure. These will be available to fuel MS365 and Copilot acceleration.
NVIDIA, however, is not the only game in town for Azure, and Satya stressed a commitment to the broadest choice of acceleration. Today, Satya announced an expansion of the strategic collaboration with AMD with delivery of the industry’s first ND MI300X instances for customers. This is an enormous milestone for the two companies, offering the best price-performance instances for GPT-4o. I expect to hear more about this collaboration at the conference, reflective of AI providers’ desire to support competitors to NVIDIA’s dominance of the AI acceleration arena.
Microsoft extends their investment in this space with their own silicon, and Satya did give a shout-out to Microsoft Maia acceleration. However, more attention for home-grown silicon was given to Microsoft Cobalt processors. Satya announced the public preview of Cobalt-based VMs for cloud native computing. These Arm-based solutions are being delivered to customers including Elastic, MongoDB, Snowflake, and more, and put the silicon industry on notice that while Microsoft was comparatively late to indigenous silicon development, it is not slowly exploring this space but rapidly integrating it into customer services.
With this rapid development of compute capacity and capability, we need to consider Microsoft’s utility bill to power this infrastructure. Satya gave an update on his team’s energy efficiency goals, stating that Microsoft is on track to meet 100% renewable energy use across global Azure data centers by next year. He pointed to specific innovations in advanced power and cooling technologies helping Azure meet these commitments. While this is a fantastic achievement, especially given the challenge of renewable energy availability across the diverse geographical landscape in which Microsoft operates, I'd like to learn more about advancements on embodied carbon and true circularity given the speed of innovation investment.
Infrastructure Innovation Fuels AI Integration and Societal Transformation
So what does this buildout and innovation deliver? Satya spoke to the performance and efficiency advancements that Microsoft is delivering to customers, giving the example of ChatGPT achieving 12X cost savings and 3X performance improvements since its launch in Q4 2022. That’s 1.5X performance gains vs. Moore’s Law, in case you’re tracking.
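One way to arrive at that 1.5X figure (this is our interpretation of the claim, not Microsoft's stated methodology) is to read Moore's Law as a doubling of performance over the roughly two-year window since the Q4 2022 launch:

```python
# Rough check of the "1.5X vs. Moore's Law" claim (our interpretation):
# treat Moore's Law as a 2x performance doubling over the ~2-year
# window since Q4 2022, and compare it with the cited 3x gain.
moores_law_gain = 2.0   # assumed doubling over the window
reported_gain = 3.0     # performance improvement cited in the keynote
ratio = reported_gain / moores_law_gain
print(ratio)  # → 1.5
```

Under that reading, the 3X software-and-infrastructure gain outpaces the assumed 2X silicon-only gain by a factor of 1.5.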
But GPT-4 is not the only LLM being delivered in Azure AI. Satya spoke to broad model support being tapped by over 50K organizations around the world, all grounded on the foundational partnership with OpenAI. GPT-4o, the industry’s top-performing model announced just last week, has already been integrated with MS Copilot and in Azure AI.
Microsoft has also delivered Model as a Service (MaaS) capabilities with a handful of partners including NTT Data and expanded their ongoing open source collaboration with Hugging Face with new capabilities for developers. Satya also claimed leadership on small language models, including expansion of the Phi-3 family. Microsoft is delivering Phi-3 vision as well as Phi-3 small, medium, and mini models, all with sizes to fit developer needs from roughly 3.8 billion to 14 billion parameters.
All of this capability fuels opportunity for integration across industries, and Satya briefly covered examples of customers taking advantage of this technology. A notable example of society-changing integration of AI into our world is a new collaboration with Khan Academy propelling AI’s power directly into US classrooms. Khanmigo, a Khan Academy AI tool, will help US educators offload some of the crushing operational work of managing the classroom, freeing time for educator engagement with students. And while the capability of AI will transform industries, deliver new revenue streams, and create eye-opening efficiency at work, this example provides a glimpse of how transformational a time we live in. We’re excited to see more and are thrilled to see what Microsoft is delivering to help usher in this new AI era.

TechArena host Allyson Klein chats with Microsoft’s Vice President of Azure AI and HPC Infrastructure, Nidhi Chappell, in advance of Microsoft Build 2024. Nidhi shares how her organization is accelerating deployments of critical technology to fuel the insatiable demand for AI around the world and how Microsoft’s AI tools, including Copilot, OpenAI models, and more, have been met with overwhelming engagement from developers. She also talks about Microsoft’s silicon plans and strategic collaborations with NVIDIA and AMD.

Supermicro has been a player in the tech industry for over 30 years, focusing on building breakthrough solutions for data center compute requirements. Their history as a nimble infrastructure supplier has driven them ahead as a leader in AI era compute delivery. This is why I was so excited to invite Supermicro’s Paul McLeod to the TechArena Data Insights podcast sponsored by Solidigm. My co-host Jeniece Wnorowski and I put Paul through his paces to discuss Supermicro’s perspective on AI era computing, what customers are demanding of infrastructure, and how the data pipeline is a central innovation focus for today’s deployment targets.
The changing landscape of data management in AI workloads
Paul started by discussing the history of data management across data center environments, stating that traditionally, IT has involved infrastructure silos for specific storage needs, with limited data accessibility across storage solutions. Paul added that with AI this is changing: AI demands that all these data types and pipeline workloads function simultaneously. Supermicro is leveraging this evolving requirement to deliver value to customers. Paul pointed out that Supermicro’s heritage includes early use of NVMe technology, giving them valuable experience in storage solutions for AI.
This has been shaped by a flattening of the traditional tiered storage model. Previously, cold tiers existed for data that rarely needed to be accessed. With AI, however, fast access to almost all data has become critical, meaning the cold tier is heating up into warm-tier storage where flash alternatives shine. For this transition, Supermicro's solutions have featured Solidigm’s D5-P5430 SSDs. These SSDs were designed to solve the unique challenges of data center environments, including delivery of the high-density, high-performance storage drives needed for AI training. The D5-P5430, Solidigm’s premier QLC-based offering, is available in various form factors to accommodate different server designs and thermal requirements, and it delivers impressive capacity, reaching up to 30 terabytes. Paul noted that the technology was dialed in for Supermicro’s requirements, highlighting that bottlenecks have shifted from storage to compute and network. This is made even better by key collaborations with storage partners taking advantage of the underlying infrastructure to fuel even the most grueling customer requirements.
Looking ahead: The future of data storage for AI workloads
Where does the market take platform innovation next? Paul pointed out the need for continued innovation of the data pipeline to reach additional scale in performance, compute density, and efficiency. As large language models scale and customers demand more compute to train algorithms, keeping the data pipeline in balance and fed will rely on continued industry collaborations with partners like VAST Data and Solidigm. Be sure to visit Supermicro’s and Solidigm’s websites for more information about storage and compute solutions for the AI era, and continue following the TechArena as we explore data insights.

TechArena host Allyson Klein chats with Research Institute of Sweden’s Jon Summers about the latest research his team has conducted on efficient infrastructure and data center buildout in the wake of massive data center growth for the AI era.

TechArena host Allyson Klein chats with Palo Alto Electron CEO Jawad Nasrullah about his vision for an open chiplet economy, the semiconductor manufacturing hurdles standing in the way of broad chiplet market delivery, and how he plans to play a role in shaping this next evolution of the semiconductor landscape.

TechArena host Allyson Klein chats with OCP’s Raul Alvarez on his new charter accelerating growth of the data center market in Europe as well as his ongoing work in immersion cooling technologies from OCP Lisbon 2024.