Are We Headed for ‘LLMs on Wheels’?
Since today’s car is effectively a data center on wheels, it should come as no surprise that the neural networks gaining so much attention these days are also deployed in the automobile, and have been for some time now.
In a previous blog, “Why a Self-Driving Car Might Run a ‘STOB’ Sign: To ViT or Not To ViT,” I wrote about some of the different classes of neural networks used for vision processing. In this blog, I’m going to discuss another class of neural networks employed in the vehicle: Large Language Models (LLMs).
As a brief genealogy, an LLM is a type of neural network based on the transformer architecture, which falls under the larger generative AI umbrella. Keeping up with the latest developments in these models is difficult, as the number of neural networks being introduced into the public domain is growing at an explosive rate. As a proxy for this growth, more than 100 machine learning (ML) papers are now being published every day.
Natural language processing (NLP), the domain LLMs now dominate, most likely first caught the attention of the broader population with Apple’s introduction of Siri. Suddenly, the futuristic concept of computers recognizing speech had gone from an idea right out of Star Trek to a hand-held reality. And while those early versions still had room to improve, Siri’s capabilities were nonetheless quite impressive. Speaking from personal experience, having relied on an early speech-to-text program after a bicycle accident left me unable to type for several months, I can say the improvement has been profound.
More recently, the release and rapid growth of ChatGPT, which provides an accessible front end for various OpenAI LLMs, has marked the next major milestone in raising the broader population’s appreciation of the power, potential, and capabilities of AI. This, in turn, has caused a frenzied uptick in the pace of AI research, both in neural networks and in the semiconductor architectures that improve their power efficiency and performance. It has also given rise to the explosive growth of AI in the data center and the “AI bubble” we are currently experiencing. The power of AI, and of LLMs in particular, is so profound that governments are aggressively evaluating approaches to manage and regulate the technology.
So how does all of this fit into the automobile? In-car NLP had its humble beginnings a few decades ago with simple keyword recognition that, when properly enunciated, typically allowed control over body functions or the stereo system. Much like my early experience with speech-to-text, the capability was clumsy and inconsistent, and at some point it was easier to simply adjust the car’s thermostat manually than to keep repeating keywords and waiting for a response.
Fast forward to today, where in-car NLP is on par in capability with Alexa-enabled devices. In fact, Alexa itself is finding its way into the car, as there is a land grab underway to capture as much information about user behavior as possible. Beyond Alexa, multiple other digital assistants are typically resident in high-end vehicles, enabling voice control over vehicle functions, searches of the owner’s manual, voice-controlled navigation, and more. Despite all this original equipment manufacturer (OEM) attention to NLP, however, it’s been shown that these features don’t see much use today.
One of the greatest challenges manufacturers face in deploying NLP for more useful tasks stems from onboard compute limitations. The compute demands of NLP are so significant that the data is generally sent to the cloud for processing, avoiding the need for very high-performance, power-hungry AI processors in the vehicle. All is well when connectivity to the cloud can be guaranteed, but when driving in remote locations where cell service is spotty or non-existent, NLP will simply fail. Hence, there is a race underway to design more power-efficient compute architectures that can run LLMs locally.
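To make that tradeoff concrete, here is a minimal sketch of the cloud-first, on-device-fallback pattern such a design implies. Everything here is hypothetical: the endpoint and helper functions are placeholders for illustration, not any OEM’s actual stack.

```python
# Minimal sketch of a cloud-first voice pipeline with an on-device fallback.
# All helper functions and the endpoint below are hypothetical placeholders.

import socket


def cloud_reachable(host: str = "nlp.example-oem.com", timeout_s: float = 0.5) -> bool:
    """Cheap connectivity probe; a real system would also track latency history."""
    try:
        socket.create_connection((host, 443), timeout=timeout_s).close()
        return True
    except OSError:
        return False


def transcribe_in_cloud(audio: bytes) -> str:
    # Placeholder for a request to a large, cloud-hosted speech/LLM service.
    return "set cabin temperature to 21 degrees"


def transcribe_on_device(audio: bytes) -> str:
    # Placeholder for a smaller, quantized model running on the vehicle's SoC.
    return "set cabin temperature to 21 degrees"


def handle_utterance(audio: bytes) -> str:
    # Prefer the cloud for quality; fall back locally when coverage is spotty.
    if cloud_reachable():
        return transcribe_in_cloud(audio)
    return transcribe_on_device(audio)


if __name__ == "__main__":
    print(handle_utterance(b"\x00\x01"))  # dummy audio payload
```

The hard engineering lives inside the local fallback: squeezing a useful language model into the vehicle’s power and thermal budget is exactly the architecture race described above.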
While NLP accounts for the majority of LLM use cases in today’s automobile, there are other key areas where LLMs will soon take on more significant tasks, yielding greater value than voice control over the car’s stereo or heater. Combined with computer vision and other AI technologies, LLMs can be used for scene understanding, which can lead to greater driver confidence and safety in ADAS systems. This is similar to advanced adaptive cruise control that uses the heads-up display to show the driver that the system does indeed “see” the car it is following and will keep a safe distance. Scene understanding goes further, presenting image captions on a central display so the driver can be confident that the vehicle understands its surroundings and can explain the rationale behind an action it is about to take.
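As a small illustration of the captioning piece, the sketch below runs a single camera frame through an off-the-shelf vision-language model via the Hugging Face transformers library. BLIP is used here only because it is readily available; a production ADAS stack would use its own models, sensors, and safety-qualified pipeline, and the frame filename is a stand-in.

```python
# Minimal sketch: caption a single camera frame with an off-the-shelf
# vision-language model, as a stand-in for production scene understanding.
# Requires: pip install transformers pillow torch

from PIL import Image
from transformers import pipeline

# BLIP serves here purely as a readily available example model.
captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")

frame = Image.open("front_camera_frame.jpg")  # hypothetical camera frame
result = captioner(frame)
print(result[0]["generated_text"])  # e.g. "a car driving down a city street"
```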
This is typically referred to as contextual analysis, and the concept is similar to the way a human takes in the entire environment (pedestrians, street signs, road construction, road hazards, and so on) and acts accordingly. Contextual analysis, in conjunction with NLP, will also allow for more human interaction with the vehicle, such as, “What was that building we just passed?” or, “What brand of car is in front of us?” The driver could then ask, “How much does this car cost, and where can I purchase it?” (That’s Alexa’s fantasy, anyway.) I’m sure you get the picture by now (pun intended).
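The question-and-answer flavor of that interaction can be sketched the same way with an off-the-shelf visual question answering model. Again, the model choice and the frame are illustrative assumptions, not a production design; a fixed-vocabulary VQA model like this one would not reliably recognize car brands, so the example asks something simpler.

```python
# Minimal sketch: ask a natural-language question about a camera frame.
# Requires: pip install transformers pillow torch

from PIL import Image
from transformers import pipeline

# ViLT VQA is used only as a readily available example model.
vqa = pipeline("visual-question-answering", model="dandelin/vilt-b32-finetuned-vqa")

frame = Image.open("front_camera_frame.jpg")  # hypothetical camera frame
answers = vqa(image=frame, question="What color is the car in front of us?", top_k=1)
print(answers[0]["answer"], answers[0]["score"])
```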
Contextual awareness can also enable the vehicle to more closely mimic a driver’s behavior, for example showing caution when entering an intersection crowded with pedestrians despite having the full right-of-way. And while V2X (vehicle-to-everything communication) offers the promise of understanding the traffic ahead, deployment to date has been spotty. With contextual awareness, the driver can be alerted that there is a traffic jam ahead; when V2X is fully deployed, it is not only belt-and-suspenders, it can also “see” traffic in zero visibility. Here again, if the driver suddenly experiences extreme deceleration with no understanding of why, contextual awareness, in conjunction with V2X, can tell the driver that there is a pile-up ahead and that is why the brakes are being applied so aggressively.
So will LLMs end up in the car? I believe so. I also believe that, as they become more integral to the safety of the vehicle, compute architectures that can deliver the requisite performance in vanishingly small power envelopes will come to market to address this need – necessity being the mother of invention.
For now, relying on a connection to the cloud with unpredictable, spotty coverage and indeterminate latency for such an integral function will be a show-stopper. Power is a topic that the AI community, in general, likes to avoid talking about, but it’s currently an “inconvenient truth.”
Based on current AI compute architectures, GPT-3, itself an LLM, is estimated to have required roughly 1,300 megawatt-hours of electricity to train, and a single query is estimated to consume almost 3 watt-hours. These are astonishingly high levels of energy, which is why proposals exist to immerse data centers in the San Francisco Bay for low-cost cooling and why the Three Mile Island nuclear plant is being restarted to power a data center. Just as automotive is now said to be the technology driver for the memory industry, it may well become the driver for future AI compute architectures.
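To put those numbers in perspective, here is a quick back-of-envelope calculation using the figures cited above, plus an assumed 75 kWh EV battery pack purely for comparison.

```python
# Back-of-envelope energy math using the figures cited above.
# The 75 kWh EV pack size is an assumption for illustration only.

TRAINING_ENERGY_MWH = 1_300   # reported GPT-3 training estimate
QUERY_ENERGY_WH = 3           # rough per-query estimate
EV_PACK_KWH = 75              # assumed mid-size EV battery pack

training_kwh = TRAINING_ENERGY_MWH * 1_000
print(f"Training ≈ {training_kwh / EV_PACK_KWH:,.0f} full EV battery packs")
print(f"One kWh covers ≈ {1_000 / QUERY_ENERGY_WH:,.0f} queries")
```

That works out to roughly 17,000 full EV battery packs’ worth of energy for a single training run, and only a few hundred queries per kilowatt-hour, which is why fitting inference into an automotive power budget is such a hard problem.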