How Can the Life Sciences Benefit from Using Simulated Data?

The life sciences, like many sectors, are putting a higher
and higher premium on good data. In our digital world, access to reliable data
is a necessity, whether it’s healthcare providers counting on data analytics to
inform patient care or drug developers requiring insightful data to create
effective new therapies.

But there are many obstacles surrounding access to and use
of real-world data. It is often siloed, hard to find or hoarded by competitors,
and it exists in numerous different formats. While there are now information
solutions and platforms that address some of those challenges, privacy and
security concerns are still problematic, particularly when it comes to
sensitive health and medical information.

Computer-Created Data

Given these complications, there is a growing interest in
simulated or synthetic data, which TechCrunch
describes as “computer-generated data that mimics real data.” Software
algorithms can take in data samples from the real world (e.g., clinical trial
data or medical records) and generate simulated datasets that researchers can
work with, without compromising patient confidentiality.
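
To make the idea concrete, here is a minimal sketch in Python. It assumes a tiny, hypothetical table of ages and incomes and a deliberately simple model: each column is sampled independently from a normal distribution and then rescaled to match the real mean and spread. The function names and sample records are invented for illustration, and this is not the method used by any of the sources cited in this article.

```python
import random
import statistics


def synthesize_column(real_values, n, rng):
    """Draw n synthetic values whose mean and standard deviation match real_values."""
    mu = statistics.mean(real_values)
    sigma = statistics.pstdev(real_values)
    # Draw from a standard normal, then standardize the draw itself so the
    # synthetic column reproduces the real mean and spread (up to float rounding).
    draws = [rng.gauss(0.0, 1.0) for _ in range(n)]
    d_mu = statistics.mean(draws)
    d_sigma = statistics.pstdev(draws)
    return [mu + sigma * (d - d_mu) / d_sigma for d in draws]


def synthesize(real_rows, n, seed=0):
    """Build n imaginary (age, income) records from the real ones."""
    rng = random.Random(seed)
    ages = synthesize_column([r[0] for r in real_rows], n, rng)
    incomes = synthesize_column([r[1] for r in real_rows], n, rng)
    return list(zip(ages, incomes))


# Hypothetical "confidential" records: (age in years, annual income in dollars).
real = [(34, 52000), (45, 61000), (29, 48000), (52, 75000), (41, 58000)]
synthetic = synthesize(real, n=len(real))

real_mean_age = statistics.mean(r[0] for r in real)
synth_mean_age = statistics.mean(s[0] for s in synthetic)
real_mean_income = statistics.mean(r[1] for r in real)
synth_mean_income = statistics.mean(s[1] for s in synthetic)

# The two means in each pair should agree up to rounding.
print(round(real_mean_age, 2), round(synth_mean_age, 2))
print(round(real_mean_income, 2), round(synth_mean_income, 2))
```

None of the synthetic records corresponds to a real person, yet the summary statistics line up. This toy version ignores relationships between columns and offers no formal privacy guarantee, so a production approach would need a richer statistical model and a proper privacy review.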

In a piece titled “How Fake Data Could Protect Real People’s
Privacy,” author Viviane Callier offers a case study on how university
researchers using Census data sought to avoid privacy issues by modifying the
dataset. “In their approach, the researchers feed the original Census data,
which is kept confidential, into a complex statistical model that generates a
simulated population that has the same general features as the original data,”
writes Callier. “If you have a confidential dataset of 100 individuals’ ages
and incomes, for example, a corresponding synthetic dataset composed of 100
imaginary individuals would have the same mean age and mean income as the

A Vision of How Simulated Data Could Boost
Development of Therapies

The implications for the life sciences, where the demand
for patient data far outpaces the supply, are clear. In Virtusa’s white paper Unlock
the Power of Simulated Data to Accelerate Research,
author Santanu Sen considers the possibilities for rheumatoid arthritis (RA),
an area where better therapies are much needed:

“If we can generate simulated data and simulate our study
area (for example, the entire US population for RA), that will lead to novel
insights and multiple research areas,” explains Sen. “This is where AI
models/algorithms can help determine patient profile-specific therapeutic
regimens, leverage evidence-based predictive model to help identify and
mitigate side effects on therapies, optimize drug synergy in multi-drug
regimens, and further perform drug-dose optimization to simultaneously realize
maximal therapeutic efficacy and clinical safety.”

The Many Benefits of Simulated Data

The advantages that simulated data provides to
companies and researchers are significant and include:

* The ability to maintain people’s privacy

* Fewer concerns about regulations

* Greater cost-effectiveness and efficiency

* Flexibility, customizability and easier control

* The ability to generate and work with much larger
datasets than might be found in the real world

* A leg up for start-ups that don’t already have
access to large collections of expensive data like their established

Brave New Synthetic World

These advantages could apply across sectors ranging from
tech to the social sciences. But how will synthetic data apply specifically to
life science organizations? According to Dane Stout’s white paper titled The
New Synthetic, simulated data could be used to accelerate
clinical studies, improve patient safety post-market, and incorporate care data
earlier in the medical device design and development process.

“A promising example from clinical research may be to serve
as a synthetic control arm for clinical trials,” adds Stout. “Rather than
collecting data from patients who have been assigned to the control or
standard-of-care arm, it’s possible to model those comparators using real-world
data that has previously been collected.”

Simulated data represents a new horizon, and it is one that
is still being tentatively explored. Real-world data is already complicated
enough to deal with, so there is still much to learn about utilizing synthetic
data in its place. But the advantages that it could provide for
industries like pharma and biotech are so significant that it shouldn’t be long
before we see it more broadly adopted.

