Getting closer to cancer research's holy grail: The Clinico-Genomic Database

By Pierre Valette, Emmy and Peabody Award-winning journalist and TV producer

It’s been called the Holy Grail of medicine: combining genomic and clinical data to understand what drives disease. A new partnership gets us one step closer.

The irony of precision medicine is that it requires scale. Massive scale.

This is especially true of cancer, where the treatment paradigm is moving away from a one-size-fits-all standard-of-care approach to more targeted treatments based on the unique characteristics of each patient’s disease. This is the promise of personalised healthcare.

“We're snowflakes when it comes to cancer,” explains David Fabrizio, Vice President of Product Development at Foundation Medicine, whose mother died from cancer before the advent of precision medicine. “To improve outcomes for patients like my mother,” he continues, “we have to understand what makes us unique, what drives our cancer, what drives the disease.”

In order to accelerate the pace of drug discovery and approval, researchers need access to large databases so that even for the rarest forms of cancer, they have a sample size large enough to better understand the drivers of the specific cancer.

For the past nine years, Foundation Medicine, a molecular insights company in Cambridge, Massachusetts, has been building such a database—one of the largest in the world, containing the genomic profiles of more than 300,000 cancer patients.

However, a patient’s genomic profile only tells part of her story. What about her clinical and personal history, such as how much she exercises, her diet, past and current illnesses and the efficacy of her treatments over time? Although doctors have been capturing this information for more than a hundred years, they’ve recorded it in disparate ways, and, unless a patient is part of a clinical trial, the data is usually lost.

Bobby Green, M.D., Flatiron Health’s Chief Medical Officer, who continues to treat patients part-time at his clinic in Florida, sees this loss of data as a setback not just for researchers, but for cancer patients. “Patients who are diagnosed with cancer are incredibly altruistic,” Green says. “At some point they’ll invariably ask me, ‘Is there any way other people can learn from what I’m going through?’”

The answer, he says, “has routinely and repetitively been, ‘No.’” Flatiron Health and Foundation Medicine are working to change that.

While Foundation Medicine has been building its genomic database, Flatiron has spent the past seven years developing a process for curating and harmonising clinical data from the electronic health records of hundreds of thousands of patients in American academic medical centers and community clinics. The result is one of the largest, most-representative de-identified clinical data sets in the United States, which, like the Foundation Medicine database, is used by cancer researchers worldwide.

In 2014, as Foundation Medicine and Flatiron continued to build and refine their respective data sets, they partnered on an ambitious project to combine a subset of their data into the first-of-its-kind clinico-genomic database, or CGDB. This partnership is now being supported by Roche Group through a recent acquisition of both companies.

“This data set has been the holy grail of all data-oriented medicine for as long as I can remember,” says Gaurav Singal, M.D., Chief Data Officer at Foundation Medicine and a faculty member at Harvard Medical School. “The notion that we can have high quality, research-grade, even regulatory-grade longitudinal clinical data, combined with high quality genomic diagnostic data, has been the dream that can transform the field.”

Overcoming the challenges of combining clinical and genomic data: the “eureka moments”

It’s one thing to dream of the CGDB’s potential; it’s another to make it a reality. While Flatiron Health and Foundation Medicine had already overcome major technological and logistical hurdles in developing their separate datasets, combining them presented new challenges.

The most complex challenge was linking the data at the patient level, while ensuring patients’ privacy and protection, says Amy Abernethy, M.D., Flatiron’s former Chief Medical & Scientific Officer. She was interviewed at Flatiron on January 31st, 2019, her last day at the company before taking on her new role as the U.S. Food and Drug Administration’s Principal Deputy Commissioner.

She speaks proudly of the team’s first “eureka moment,” when they figured out how to generate patient-specific tokens representing individuals in the Flatiron and Foundation Medicine databases, allowing the study of de-identified, patient-level information.

“That was something that had been felt to be almost insurmountable. But wow,” she recounts enthusiastically, “we did it!”
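The article does not describe how the tokens are generated. As a rough illustration of the general idea only, one common approach is a keyed hash: each organisation turns the same normalised identifying fields into an opaque token using a shared secret, and records are then joined on the token rather than on names or birth dates. Everything in the sketch below (the key, the field choice, the function names and the sample records) is a hypothetical assumption, not the method Flatiron Health and Foundation Medicine actually use.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical shared secret. In a real system this would be managed by a
# trusted party and never stored alongside the data.
SECRET_KEY = b"example-key-not-for-production"


def normalise(value: str) -> str:
    """Lower-case, strip accents and drop punctuation so that trivial
    formatting differences do not change the resulting token."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed.lower() if ch.isalnum())


def patient_token(first: str, last: str, birth_date: str,
                  key: bytes = SECRET_KEY) -> str:
    """Derive an opaque token from identifying fields using HMAC-SHA256."""
    message = "|".join(normalise(part) for part in (first, last, birth_date))
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(clinical: list, genomic: list) -> list:
    """Join two de-identified record sets on their tokens."""
    genomic_by_token = {record["token"]: record for record in genomic}
    linked = []
    for record in clinical:
        match = genomic_by_token.get(record["token"])
        if match is not None:
            linked.append({**record, **match})
    return linked


# Made-up example records: each side holds only a token plus its own data.
clinical_rows = [
    {"token": patient_token("José", "O'Brien", "1950-03-07"),
     "diagnosis": "NSCLC"},
]
genomic_rows = [
    {"token": patient_token("jose", "OBrien", "1950/03/07"),
     "tmb": "high"},
]
linked_rows = link_records(clinical_rows, genomic_rows)
```

Because both sides normalise their inputs before hashing, the two differently formatted entries above produce the same token, so the clinical and genomic rows link without either dataset exposing the underlying name or date of birth. A production system would also need to handle misspellings, key management and re-identification risk, none of which this sketch attempts.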

A second “eureka moment,” she says, was realising the pace at which the CGDB could grow.

“Often when you think about these datasets growing in size,” she explains, “they sort of inch along. They get five or ten percent bigger every year. But here, just because of the rapid growth of Flatiron and Foundation Medicine simultaneously, the dataset is nearly doubling in size every two years. So it's a massive increase in the number of patients represented.” As of February 2019, the CGDB includes data from nearly 50,000 patients across thirty-eight tumour types.
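To put the two growth rates Abernethy mentions on the same footing, a back-of-the-envelope calculation helps. It assumes steady compound growth, which is a simplification; the article gives no year-by-year figures. Here r is the annual growth rate and T the time a dataset takes to double:

```latex
% Doubling every two years implies an annual growth rate r of about 41%:
(1 + r)^{2} = 2 \quad\Longrightarrow\quad r = \sqrt{2} - 1 \approx 0.41

% Doubling time T at the "five or ten percent" rates:
T_{10\%} = \frac{\ln 2}{\ln 1.10} \approx 7.3 \text{ years}, \qquad
T_{5\%} = \frac{\ln 2}{\ln 1.05} \approx 14.2 \text{ years}
```

On those assumptions, a dataset that doubles every two years is growing four to eight times faster per year than one that “inches along” at five or ten percent.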

Singal recalls when he first recognised the power of the database. In early 2017, to demonstrate the CGDB’s research potential, Foundation Medicine and Flatiron created a proof-of-concept study.

Specifically, they examined how lung cancer patients responded to an approved immunotherapy treatment based on two biomarkers: PD-L1 and Tumour Mutational Burden (TMB). Using a sample size of just over 2,000 patients with non-small cell lung cancer, they discovered that high versus low TMB has a far greater impact than high versus low PD-L1 on response to immunotherapy.

Their results were nearly identical to those derived by a drug manufacturer from a post-hoc analysis of a failed clinical trial. The striking difference was that using the CGDB, the researchers completed their work in a matter of weeks at a relatively low cost.

Singal and his colleagues, inspired by this case study, conducted a systematic comparison of analyses of the CGDB with over a dozen seminal findings in the molecular treatment of lung cancer discovered through traditional approaches. They found that they were able to recapitulate each and every one of these findings. This validation study, published on April 9th, 2019 in The Journal of the American Medical Association, helps establish the necessary groundwork for this dataset to be used to advance cancer research.

Bringing precision medicine to the world faster

The CGDB presents a range of opportunities for supplementing and amplifying existing research. One potential use case is label expansion of existing treatments to rare cancer types for which the treatment hasn’t been approved. “Those small groups of patients,” Abernethy explains, “are not only hard to identify, but you have to be able to identify enough of them so that you can randomise them to two different treatments. So conducting a meaningful trial is almost impossible.”

In this use case, researchers could use retrospective data from the CGDB to create or supplement a de facto control arm—so they’d only need to find enough real-world patients for the experimental arm of the study. The CGDB would further drive efficiencies by helping researchers quickly identify patients best suited for the experimental arm.

“This type of use,” Singal says, “could massively accelerate our ability to bring precision medicine to the world in a responsible and scientifically appropriate way.”

Dr. Green elaborates that researchers could use the CGDB to study biomarkers they might not have thought were significant, and determine their importance to cancer treatment. Or they could look beyond the traditional “gold standard” metric of overall survival in a clinical trial to overall survival in the real world, which Dr. Green considers far more relevant.

“All that information,” he continues, “can be leveraged in developing novel treatments and more efficient clinical trials.”

Better data, more informed decisions

To truly unlock the potential of the data, Flatiron and Foundation Medicine are working to make the CGDB available to researchers worldwide. But for all of the far-reaching possibilities of the project, Green says, the most important benefit will come at the provider-and-patient level.

Today, doctors do their best with the data they have, Green says. That means they’re making imperfect comparisons: making decisions, say, for an 85-year-old cancer patient with diabetes based on a patient population in a clinical trial that might have looked very different.

“What we’re hoping to build to is a world where we can eventually start to answer the key questions for patients: ‘What’s the best treatment for me? How am I going to do?’” Green says. “Better data will help doctors and patients make better, more informed decisions.”
