The automotive industry is seeing significant advancements with the integration of AI in autonomous driving, enhancing the scalability, generalizability, and robustness of these systems. As autonomous driving becomes more complex, the need for innovative methods to train, test, and validate these technologies across varying conditions intensifies.
A recent paradigm shift in the autonomy industry, fueled by emerging generative AI and machine learning technologies, has transformed both the onboard autonomy stack and its offboard development pipeline. Powered by large-scale data and advanced computing, these innovations promise greatly enhanced efficiency and expanded capabilities.
This blog post explores three research areas expected to play key roles in the evolution of autonomous vehicles.
Multimodal Foundation Models for Differentiable Autonomy Stacks
Autonomy stacks are increasingly shifting toward a data-driven, differentiable paradigm that consolidates the conventional perception, prediction, and planning modules. A key challenge is improving the generalizability of the differentiable stack so that it can both exploit vast amounts of internet-scale data and interpret the logic underlying driving decisions. The language modality shows strong potential for bridging these demands by augmenting the differentiable stack with multimodal foundation models (MMFMs). MMFMs take in vision, language, and various other modalities, harnessing large-scale internet and driving data to tackle corner cases and align decisions with human reasoning, and they show great promise for generalizable autonomy orchestrated by a differentiable stack.
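To make the idea of a differentiable stack concrete, here is a toy sketch in plain Python. All function names, the scalar "features," and the simple fusion rule are hypothetical illustrations, not any production architecture; the point is only that when perception, prediction, and planning are composed as differentiable functions, a gradient of the final plan flows back through every module, including a language input.

```python
# Toy differentiable autonomy stack: perception, prediction, and planning
# composed as differentiable functions. All names/values are illustrative.

def perception(camera_pixels, text_embedding):
    """Fuse a toy vision feature with a language embedding (scalar fusion)."""
    vision_feature = sum(camera_pixels) / len(camera_pixels)
    return vision_feature + 0.1 * text_embedding

def prediction(scene_feature):
    """Predict other agents' motion from the fused scene feature."""
    return 0.5 * scene_feature

def planning(scene_feature, predicted_motion):
    """Produce a steering command from the scene and prediction features."""
    return 2.0 * scene_feature - predicted_motion

def stack(camera_pixels, text_embedding):
    """End-to-end: camera + language in, driving command out."""
    scene = perception(camera_pixels, text_embedding)
    motion = prediction(scene)
    return planning(scene, motion)

def grad_wrt_language(camera_pixels, text_embedding, eps=1e-6):
    """Finite-difference gradient of the plan w.r.t. the language input.
    Its existence is the property that lets a real differentiable stack
    train all modules from a single planning loss."""
    return (stack(camera_pixels, text_embedding + eps)
            - stack(camera_pixels, text_embedding)) / eps
```

In a real system each module would be a neural network and the gradient would come from automatic differentiation, but the structure, one loss propagating through the whole consolidated stack, is the same.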
Closed-Loop Simulation and Data Engine with Generative AI
Training and testing a differentiable autonomy stack at scale requires a photorealistic, behavior-realistic simulation toolchain that runs in a closed loop. The toolchain must also generate reactive behavior for road users, create or modify diverse, realistic scenarios, and produce synthetic data from multi-modal sensors. To meet these demands in a scalable, automated manner, emerging generative AI and 3D reconstruction technologies are being intensively incorporated to transform content creation in simulation into a data-driven, AI-powered paradigm. Combined with such a reactive data generator, MMFMs have also demonstrated great potential for curating and mining real-world data, yielding an automated and efficient data engine and simulation toolchain.
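The "closed-loop" and "reactive" properties can be sketched with a minimal toy simulation. The agent model, gains, and logged fields below are all hypothetical; the sketch only illustrates the structure: a road user reacts to the ego vehicle at every tick, so the scenario unfolds differently depending on what the ego does, and each tick emits a synthetic record that a data engine could later mine.

```python
# Toy closed-loop simulation: a reactive road user keeps a target gap
# behind the ego vehicle. All dynamics and parameters are illustrative.

def reactive_agent(agent_pos, ego_pos, gap_target=10.0, gain=0.2):
    """Velocity command: speed up/slow down to hold a target gap."""
    gap = ego_pos - agent_pos
    return gain * (gap - gap_target)

def simulate(steps=50, ego_speed=1.0):
    """Run the loop and log synthetic per-tick records for a data engine."""
    ego_pos, agent_pos = 20.0, 0.0
    log = []
    for t in range(steps):
        agent_v = reactive_agent(agent_pos, ego_pos)  # agent reacts to ego
        agent_pos += agent_v
        ego_pos += ego_speed
        log.append({"t": t, "ego": ego_pos, "agent": agent_pos,
                    "gap": ego_pos - agent_pos})
    return log
```

Because the agent closes the loop on the ego's state, the gap converges to a steady-state value (here 15.0: the 10.0 target plus the offset needed to match the ego's speed) rather than replaying a fixed log, which is exactly what distinguishes closed-loop from open-loop replay testing.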
Reinforcing and Aligning Autonomy Stacks in Closed Loop
While imitative pre-training of differentiable autonomy stacks can exploit large-scale data with real-world demonstrations, it remains an open question how to robustify, refine, and customize the stack to handle out-of-distribution cases and meet under-demonstrated requirements. Closed-loop reinforcement learning offers the potential to improve the autonomy stack beyond existing and collectable data, but it also creates demand for rewards that accurately capture driving requirements and align with human preferences. Approaches being explored would allow the AI stack to learn by trial and error without predefined reward structures, addressing these challenges.
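One family of techniques for learning rewards without hand-coding them is preference-based reward learning (Bradley-Terry style). The sketch below is a toy stand-in, with a single synthetic trajectory feature and a scalar weight, both hypothetical; it shows only the core mechanism: fit a reward model so it ranks human-preferred trajectories above rejected ones, then that learned reward can drive reinforcement learning in the closed loop.

```python
import math

# Toy preference-based reward learning (Bradley-Terry model).
# Trajectories are reduced to one synthetic scalar feature; all data
# and parameters are illustrative, not a production method.

def learned_reward(w, trajectory_feature):
    """Linear reward model with a single learned weight."""
    return w * trajectory_feature

def train(pairs, lr=0.5, epochs=200):
    """pairs: list of (preferred_feature, rejected_feature) from a human.
    Gradient ascent on the Bradley-Terry log-likelihood."""
    w = 0.0
    for _ in range(epochs):
        for fp, fr in pairs:
            # P(preferred beats rejected) under the current reward model
            p = 1.0 / (1.0 + math.exp(-(learned_reward(w, fp)
                                        - learned_reward(w, fr))))
            w += lr * (1.0 - p) * (fp - fr)
    return w

# Synthetic preferences: the "human" consistently prefers the
# trajectory with the higher feature value.
pairs = [(0.9, 0.2), (0.8, 0.1), (0.7, 0.3)]
```

After training, the learned reward reproduces the human's ranking on the preference pairs, giving the closed-loop learner an objective that was never written down explicitly.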
Applied Intuition’s Approach
Applied Intuition collaborates with top industry partners to enhance our autonomy stacks, vehicle software, and development platform for autonomy. We are committed to advancing AI technologies to deliver superior products and solutions.
To advance the cutting-edge AI research revolutionizing autonomous systems, Applied Intuition is assembling an AI Research team that includes Wei Zhan, who recently joined the company as chief scientist and is co-director of Berkeley DeepDrive. The team is pursuing multimodal foundation models and differentiable stacks that integrate language with vision and other modalities in scalable autonomous systems, leveraging large-scale data from various sources to develop robust, mass-producible autonomous solutions.
By combining Applied Intuition's industry-leading simulation toolchain with cutting-edge 3D reconstruction and generative AI, the AI Research team is creating the AI technology for a next-generation closed-loop simulation and data engine with high fidelity, usability, scalability, and controllability.
Developing fully autonomous driving systems capable of functioning in real-world conditions demands continuous innovation and technology integration. Our work on closed-loop simulations and language-integrated autonomy stacks is progressing well towards this objective.
“The key to advancing autonomous driving technology lies in our ability to harness large-scale, multi-modal data from both real world and AI,” Zhan said. “Our ongoing research is focused on creating and transforming such data into usable assets and actionable insights toward scalable and safe autonomous systems in the new era.”
Visit Applied Intuition's Careers page to explore opportunities with the AI research team.