Betterdata uses synthetic data to protect real data


Singapore-based Betterdata, a startup that uses programmable synthetic data to keep real data safe, announced today that it has raised $1.55 million. The oversubscribed seed round was led by Investible, with participation from Franklin Templeton, Xcel Next, Singapore University of Technology and Design, Bon Auxilium, Tenity, Plug and Play and Entrepreneur First.

The startup was founded in 2021 by Dr. Uzair Javaid, its CEO, and technologist Kevin Yee. It aims to make data sharing faster and safer as data protection regulations multiply around the world. The company is currently in research and development partnerships with two major universities, one in Singapore and one in the United States, whose identities cannot be publicly disclosed. Its clients include the Shanghai Pudong Development Bank.

Betterdata says it differs from traditional data sharing methods, which rely on data anonymization, because it uses generative AI and privacy engineering instead.

Yee explained to TechCrunch that programmable synthetic data uses generative models to create and augment new datasets, including the deep learning models used in deepfakes, transformers, and the diffusion models used in Stable Diffusion.

These synthetic data sets have the same characteristics and structure as real-world data without revealing sensitive or personal information about individuals.

“The idea is to create a virtual version of a real dataset that can be used securely for a variety of purposes, including protecting confidential information, reducing bias, and improving machine learning models,” he said.
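To make the idea of a "virtual version" of a dataset concrete, here is a minimal sketch in Python. It is not Betterdata's method: it swaps in the simplest possible stand-in for a generative model (an independent normal distribution fitted to each column), it ignores correlations between columns, and it offers no privacy guarantee. The records, column names and helper functions are all invented for illustration.

```python
import random
import statistics

# Toy "real" records (hypothetical values, invented for illustration).
real_rows = [
    {"age": 34, "balance": 5200.0},
    {"age": 45, "balance": 8100.0},
    {"age": 29, "balance": 3100.0},
    {"age": 52, "balance": 9900.0},
    {"age": 41, "balance": 7000.0},
]


def fit_columns(rows):
    """Estimate a mean and standard deviation for each numeric column."""
    params = {}
    for col in rows[0]:
        values = [row[col] for row in rows]
        params[col] = (statistics.mean(values), statistics.stdev(values))
    return params


def sample_synthetic(params, n, seed=0):
    """Draw n new rows from the fitted per-column normal distributions.

    No real row is copied: every value is freshly sampled, so the output
    shares the schema and rough statistics of the input, not its records.
    """
    rng = random.Random(seed)
    return [
        {col: rng.gauss(mu, sigma) for col, (mu, sigma) in params.items()}
        for _ in range(n)
    ]


params = fit_columns(real_rows)
synthetic_rows = sample_synthetic(params, n=1000)

real_mean_age = statistics.mean([r["age"] for r in real_rows])
synthetic_mean_age = statistics.mean([r["age"] for r in synthetic_rows])
print(f"real mean age: {real_mean_age:.1f}")
print(f"synthetic mean age: {synthetic_mean_age:.1f}")
```

The synthetic rows keep the same columns and roughly the same averages as the originals. A production system of the kind the article describes would replace the per-column normal fit with a learned generative model and add privacy safeguards.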

Programmable synthetic data helps developers in several ways. Examples include protecting sensitive data, complying with data protection regulations such as GDPR and HIPAA, increasing data sharing between teams, creating additional records for underrepresented groups to train, test and validate machine learning models, and resolving data inconsistency issues.

Betterdata’s funding will be used to launch the product and enhance its programmable synthetic data tech stack, including support for single-table, multi-table and time-series datasets. These are different types of tabular datasets, and Yee explains that the main differences are their structures and the problems they are designed to solve.

For example, single-table datasets focus on independent tables, multi-table datasets consider relationships between multiple tables, and time-series datasets deal with data collected over time.
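The three dataset types can be sketched with plain Python structures. The tables, field names and helper functions below are hypothetical, chosen only to show what each structure looks like and which property a synthetic version would plausibly need to preserve: nothing extra for a single table, valid cross-table references for multi-table data, and chronological order for a time series.

```python
from datetime import date

# Single-table: each row stands on its own.
customers = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 45},
]

# Multi-table: a second table points back at the first through a shared key,
# so the relationship between tables is part of the data.
transactions = [
    {"txn_id": 10, "customer_id": 1, "amount": 25.0},
    {"txn_id": 11, "customer_id": 1, "amount": 40.0},
    {"txn_id": 12, "customer_id": 2, "amount": 15.5},
]

# Time series: observations whose order in time carries meaning.
daily_balance = [
    (date(2023, 1, 1), 100.0),
    (date(2023, 1, 2), 75.0),
    (date(2023, 1, 3), 90.0),
]


def orphaned_rows(child, parent, key):
    """Return child rows whose key value does not appear in the parent table."""
    parent_keys = {row[key] for row in parent}
    return [row for row in child if row[key] not in parent_keys]


def is_time_ordered(series):
    """Check that timestamps are strictly increasing."""
    return all(a[0] < b[0] for a, b in zip(series, series[1:]))


print(orphaned_rows(transactions, customers, "customer_id"))
print(is_time_ordered(daily_balance))
```

Checks like these are one way to sanity-test a synthetic dataset: a generated transactions table that refers to customers who do not exist, or a generated series whose dates run backwards, has lost the structure of the original.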

Betterdata plans to hire more people, including sales and marketing staff, and to expand beyond Singapore into the wider Asia-Pacific region over the next one to two years.

In a statement about Investible’s investment, CEO Khairu Rajal said, “Betterdata solves one of the biggest issues facing the AI industry today: the lack of high-quality data that meets privacy requirements. With its powerful platform, Betterdata generates synthetic data that mimics real-world data without compromising quality and privacy, helping businesses meet global compliance and privacy laws at scale.”


