Achieving autonomous driving safely requires nearly endless hours of training software on every situation a vehicle might face before it is put on the road. Historically, autonomous vehicle companies have collected reams of real-world data on which to train their algorithms, but it’s impossible to teach a system how to handle edge cases from real-world data alone. On top of that, collecting, sorting and labeling all that data is time-consuming in the first place.
Most self-driving vehicle companies, such as Cruise, Waymo and Waabi, use synthetic data to train and test their models with a speed and level of control that real-world data cannot offer. Parallel Domain, a startup that has built a data generation platform for autonomous vehicle companies, says synthetic data is a critical component in scaling the AI that powers vision and perception systems and in preparing those systems for the unpredictability of the physical world.
The startup has closed a $30 million Series B round led by March Capital, with participation from return investors Costanoa Ventures, Foundry Group, Calibrate Ventures and Ubiquity Ventures. Parallel Domain has been focused on the automotive market, supplying synthetic data to some major OEMs building advanced driver assistance systems and to autonomous driving companies building more advanced self-driving systems. Now, Parallel Domain is poised to expand into drones and mobile computing, says co-founder and CEO Kevin McNamara.
“We’re also doubling down on generative AI approaches for content generation,” McNamara told TechCrunch. “How can we use some of the advances in generative AI to bring a much wider range of things and people and behaviors into our world? Because again, the hard part here is really, once you have a physically accurate descriptor, how do you build the million different scenarios that the car is going to experience?”
According to McNamara, the startup is looking to hire a team to support its growing customer base in North America, Europe and Asia.
Building a virtual world
When Parallel Domain was founded in 2017, the startup focused on creating virtual worlds based on real-world map data. Over the past five years, Parallel Domain has added to its world generation, populating it with cars, people, different times of day, weather and all the other features that make the world interesting. This enables customers, among which Parallel Domain counts Google, Continental, Woven Planet and Toyota Research Institute, to generate the dynamic camera, radar and lidar data they need to accurately train and test their vision and perception systems, McNamara said.
Parallel Domain’s synthetic data platform consists of two modes: training and testing. When training, customers specify the high-level parameters on which they want to train their model (for example, highway driving with 50% rain, 20% at night and an ambulance in every sequence), and the system generates hundreds of thousands of examples that meet those parameters.
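To make the training mode concrete, here is a minimal Python sketch of how a request like the one McNamara describes could be expressed and expanded into individual sequences. It is an illustration only, not Parallel Domain’s actual interface: the names `ScenarioSpec`, `Sequence` and `sample_sequences` are hypothetical, and the sketch produces plain scenario descriptions rather than rendered sensor data.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ScenarioSpec:
    """Hypothetical high-level request: road type, weather/time mix, required agents."""
    road_type: str
    rain_fraction: float      # share of sequences with rain, e.g. 0.5
    night_fraction: float     # share of sequences at night, e.g. 0.2
    required_agents: tuple    # agents that must appear in every sequence


@dataclass(frozen=True)
class Sequence:
    """One generated scenario description (not rendered sensor data)."""
    road_type: str
    raining: bool
    night: bool
    agents: tuple


def _flags(count, fraction, rng):
    """Return `count` booleans with exactly round(count * fraction) set, shuffled."""
    n_true = round(count * fraction)
    flags = [True] * n_true + [False] * (count - n_true)
    rng.shuffle(flags)
    return flags


def sample_sequences(spec, count, seed=0):
    """Expand a spec into `count` sequences that match the requested proportions."""
    rng = random.Random(seed)  # seeded so the same request reproduces the same set
    rain = _flags(count, spec.rain_fraction, rng)
    night = _flags(count, spec.night_fraction, rng)
    return [
        Sequence(spec.road_type, r, n, spec.required_agents)
        for r, n in zip(rain, night)
    ]


# The example from the article: highway, 50% rain, 20% night, ambulance in each sequence.
spec = ScenarioSpec("highway", 0.5, 0.2, ("ambulance",))
sequences = sample_sequences(spec, count=1000)
```

The sketch fixes the exact share of rainy and night-time sequences instead of flipping a coin for each one, so a small batch cannot drift away from the requested mix. Whether the real platform works this way is not stated in the reporting.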
On the testing side, Parallel Domain provides an API that lets the customer control the placement of dynamic elements in the world. The API can be connected to the customer’s own simulator to test specific scenarios.
Waymo, for example, is interested in using synthetic data to test for different weather conditions, the company told TechCrunch. (Disclosure: Waymo is not a confirmed Parallel Domain customer.) Waymo sees weather as a new lens it can apply to all the miles it has driven in the real world and in simulation, because it would be impossible to re-collect all of those experiences under arbitrary weather conditions.
Whether for testing or training, Parallel Domain’s software automatically generates labels for each simulated agent whenever it creates a simulation. This helps machine learning teams perform supervised learning and testing without going through the tedious process of labeling data themselves.
Parallel Domain envisions a world where autonomous vehicle companies use synthetic data for most of their training and testing needs. Today, the ratio of synthetic to real-world data varies from company to company. More established businesses that have had the resources to collect a lot of data over the years use synthetic data for 20% to 40% of their needs, while companies earlier in their product development process rely on 80% synthetic and 20% real-world data, according to McNamara.
Julia Klein, a partner at March Capital and now one of Parallel Domain’s board members, said she thinks synthetic data will play an important role in the future of machine learning.
“Getting the real-world data needed to train computer vision models is often a hurdle, and there are holdups in getting that data in, labeling that data, and getting it ready to where it can be used,” Klein told TechCrunch. “What we’ve seen with Parallel Domain is that they’re accelerating that process exponentially, and they’re solving problems that you can’t even find in real-world datasets.”