Proteins are the molecules that get work done in nature, and there is an entire industry dedicated to modifying and manufacturing them for a variety of uses. But doing so is time-consuming and haphazard; Cradle aims to change that with an AI-powered tool that tells scientists what new structures and sequences will make a protein do what they want. The company emerged from stealth today with a substantial seed round.
AI and proteins have been in the news lately, but mostly because of the efforts of research outfits like DeepMind and Baker Lab. Their machine learning models take in easily collected RNA sequence data and predict the structure a protein will take, a step that used to require weeks and expensive specialized equipment.
But as impressive as that capability is in some domains, it is only a starting point in others. Modifying a protein to make it more stable or to bind to a particular molecule involves much more than understanding its overall shape and size.
“If you’re a protein engineer and you want to design a certain property or function into a protein, just knowing what it looks like doesn’t help you. If you have a picture of a bridge, that doesn’t tell you whether it’s going to fall down or not,” explained Cradle CEO and co-founder Stef van Grieken.
“AlphaFold takes a sequence and predicts what the protein will look like,” he continued. “We’re the generative brother of that: you pick the properties you want to engineer, and the model generates sequences that you can test in your lab.”
Predicting what proteins, especially ones new to science, will do in situ is a difficult task for many reasons, but in the context of machine learning the biggest issue is the lack of sufficient data. So Cradle generated much of its own data set in a wet lab, testing protein after protein and seeing which changes in sequence led to which results.
Interestingly, the model itself is not specific to biotech but is derived from the same “large language models” that have produced text-generation engines like GPT-3. Van Grieken noted that the way these models understand and predict data is not strictly limited to language, an interesting “generalization” characteristic that researchers are still exploring.
The protein sequences that Cradle ingests and predicts are not in any language we know, but they are relatively straightforward linear sequences of text with associated meanings. “It’s like an alien programming language,” van Grieken said.
Protein engineers are not helpless, but their work necessarily involves a lot of guesswork. An engineer may know for sure that the answer lies in some combination of 100 different sequences, but not which one.
The model works in three basic layers, he explained. First, it assesses whether a given sequence is “natural,” i.e. whether it is a meaningful sequence of amino acids or just a random one. This is akin to a language model being able to say with 99 percent confidence that a sentence is in English (or Swedish, in van Grieken’s example) and that the words are in the correct order. It knows this from “reading” millions of such sequences determined by laboratory analysis.
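To make the "naturalness" idea concrete, here is a minimal sketch that scores a sequence by how familiar its adjacent amino-acid pairs are. It is a toy bigram counter, not Cradle's model, which the article says is derived from large language models; the function names and the three training strings are invented for illustration and are not real proteins.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

def train_bigram(sequences):
    """Count adjacent residue pairs in a set of example sequences."""
    pair_counts = Counter()
    first_counts = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return pair_counts, first_counts

def naturalness(seq, model):
    """Average log-probability per residue-to-residue step, add-one smoothed."""
    pair_counts, first_counts = model
    total = 0.0
    for a, b in zip(seq, seq[1:]):
        p = (pair_counts[(a, b)] + 1) / (first_counts[a] + len(AMINO_ACIDS))
        total += math.log(p)
    return total / max(len(seq) - 1, 1)

# Hypothetical training strings, made up for this example.
examples = ["MKTAYIAKQR", "MKTAYLAKQR", "MKSAYIAKQR"]
model = train_bigram(examples)

seen_like = naturalness("MKTAYIAKQR", model)  # resembles the training data
scrambled = naturalness("QRKAIYATKM", model)  # same letters, shuffled order
print(seen_like > scrambled)
```

A real system would replace the pair counts with a trained neural language model, but the comparison it enables is the same: sequences that resemble the training data score higher than shuffled ones.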
Next, it looks at the actual or potential meaning in the protein’s alien language. “Let’s say we give you a sequence, and this is the temperature at which the sequence falls apart,” he said. “If you do this for a lot of sequences, you can say not only ‘this looks natural,’ but ‘this looks like 26 degrees Celsius.’” This helps the model work out which regions of the protein to focus on.
The model can then suggest sequences to slot in: educated guesses, basically, but a much stronger starting point than scratch. The engineer or lab can then test them and bring that data back to the Cradle platform, where it can be re-ingested and used to fine-tune the model for the situation.
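The suggest, test and feed-back cycle can be sketched as a small loop. This is an illustration under heavy assumptions rather than Cradle's method: `fit_predictor` is a deliberately crude per-position average, `fake_assay` is an invented stand-in for a wet-lab measurement, and the sequences are made up.

```python
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def fit_predictor(measurements):
    """Toy property model: mean measured value per (position, residue) seen so far."""
    sums, counts = defaultdict(float), defaultdict(int)
    for seq, value in measurements:
        for pos, res in enumerate(seq):
            sums[(pos, res)] += value
            counts[(pos, res)] += 1
    overall = sum(v for _, v in measurements) / len(measurements)
    table = {key: sums[key] / counts[key] for key in sums}

    def predict(seq):
        # Unseen (position, residue) pairs fall back to the overall mean.
        scores = [table.get((pos, res), overall) for pos, res in enumerate(seq)]
        return sum(scores) / len(scores)

    return predict

def single_mutants(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for pos, old in enumerate(seq):
        for new in AMINO_ACIDS:
            if new != old:
                yield seq[:pos] + new + seq[pos + 1:]

def design_round(parent, measurements, assay, batch_size=3):
    """Fit on current data, propose candidates, 'test' the best, return all data."""
    predict = fit_predictor(measurements)
    tested = {seq for seq, _ in measurements}
    candidates = [s for s in single_mutants(parent) if s not in tested]
    candidates.sort(key=predict, reverse=True)
    new_data = [(s, assay(s)) for s in candidates[:batch_size]]
    return measurements + new_data

def fake_assay(seq):
    # Invented rule standing in for a lab result: each 'W' adds 2 degrees.
    return 40.0 + 2.0 * seq.count("W")

parent = "MKTAYI"
data = [(parent, fake_assay(parent)), ("MKWAYI", fake_assay("MKWAYI"))]
data = design_round(parent, data, fake_assay)
print(len(data))
```

Each call to `design_round` refits the predictor on everything measured so far, which is the toy counterpart of bringing lab results back to adjust the model.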
Modifying proteins for a variety of purposes is useful across biotech, from drug design to biomanufacturing, and the path from a vanilla molecule to a customized, effective, and efficient one can be long and expensive. Any way to shorten it will likely be welcomed, at the very least by the lab technicians who have to run hundreds of experiments to get just one good result.
Cradle has been operating in stealth, and it is now emerging after raising $5.5 million in a seed round co-led by Index Ventures and Kindred Capital, with participation from angels John Zimmer, Feike Sijbesma and Emily Leproust.
Van Grieken said the funding will allow the team to scale up data collection (the more the better when it comes to machine learning) and work to make the product “more self-service.”
“Our goal is to reduce the cost and time of getting a bio-based product to market by an order of magnitude, so that anyone, even two kids in a garage, can bring a bio-based product to market,” van Grieken said in a press release.