Going with the flow: faster, more-flexible sequencing data analysis

Part of series: Diagnostic innovation starts here

Data are the puzzle pieces of the DNA sequencing world. When a sequencer finishes its work, billions of genetic fragments pile up as terabytes of raw data. Imagine dozens of freight train containers filled to the brim with jigsaw puzzle pieces. That’s the amount of data generated from a single high-throughput sequencing run.

DNA sequencing data analysis is the behind-the-scenes process that transforms this chaos into organized, usable information — turning those masses of digital puzzle pieces into insights. That includes capturing the raw signal; turning it into readable genetic code; analyzing, organizing and storing those files; and sharing the data so scientists can use it in their research.

“In a hospital setting, the data analysis process can provide doctors with the sequencing results of a patient’s tumor sample. And if you’ve ever been waiting anxiously by the phone to get a call back from your doctor, you know how important it is to get answers quickly.”

Jaedon Scott, Senior International Product Manager for Sequencing Platform Management Software and Bioinformatics at Roche

With so many critical diseases like cancer still in need of effective treatments, researchers share that same urgency.

Speed is just one of several fundamental barriers that currently prevent researchers from achieving an efficient and cost-effective sequencing workflow in genomic discovery.

“Genomics researchers are constantly needing to balance cost efficiency, flexibility and turnaround time,” Scott explains. “In order to drive cost efficiency, researchers often maximize the number of samples they can get from a sequencing run, and that can often mean waiting until you have all of those samples and maximizing the amount of data generated.”

Once enough samples have been collected, it can take one to two days just to generate the raw data, and the entire run must finish before downstream analysis can begin.

Flexible sequencing

Rethinking the (work) flow

Roche is working to improve DNA sequencing data analysis. One element in this process is the development of sequencing by expansion (SBX), a new category of next-generation sequencing (NGS) that will enable faster and more-flexible sequencing.

John Mannion, Vice President of Computational Sciences, Molecular Lab Systems, and his team are innovating new approaches that fundamentally change when and where DNA sequencing data gets processed. Their aim? To analyse data as it’s generated, where it’s generated.

The approach, which leverages the inherent nature of SBX, centers on accelerated real-time downstream analysis, while offering flexibility to users through "data offramps." These offramps are changeable exit points within the data analysis workflow that let researchers choose exactly what they need.

“If a user wants to work with ‘raw reads,’ ‘consensus reads’ or ‘aligned reads,’ they're able to. We provide different offramps for them based on their application and development needs.”

John Mannion, Vice President of Computational Sciences, Molecular Lab Systems, at Roche

Researchers can use various offramps to fit their needs, regardless of batch size. Additional features of the approach allow researchers to more specifically identify data of interest, enabling labs to discard unwanted data at the source, which helps to mitigate downstream data transfer and storage challenges.

Hours, not days

These flexible options show great promise in helping researchers with a broader range of data analysis and management needs. But what about that speed? 

Working with an early access partner, Roche demonstrated at the European Human Genetics Conference (ESHG) in May 2025 that the technology can go from library preparation to identifying genetic variants for a single whole-genome sample in under five hours¹, a process that can often take one to two days.

"Within seconds of sequencing starting, we were doing downstream analysis," Scott explains.  

For researchers working on critical applications, such as breakthrough cancer research, this real-time data processing, combined with flexible batch sizes and data management features, enables the flow state that Scott envisions: a future where we spend less time on how to generate and manage data, without sacrificing data quality, and can give undivided focus to the results and insights the data can deliver.

In a field where breakthroughs could transform millions of lives, achieving that flow state isn't just about novel approaches to sequencing and data analysis — it's a shift that can help unleash human potential to tackle the questions that matter most.

Removing barriers

Given the unique properties of SBX reads, the Roche team has developed a suite of permissive, free, SBX-optimised open-source analysis tools (XOOS) to serve as the foundation upon which the SBX ecosystem will be built.

In addition to accessibility and transparency, Chen Zhao, Vice President of Computational Biology, Molecular Lab, and his team are continuously innovating to deliver high-performance, accurate and scalable tools: “Our goal is to empower the NGS community and lower the barrier of adoption for SBX technology.”

In parallel, Zhao’s team has been working closely with the Google DeepVariant team to adapt DeepVariant for SBX, achieving strong performance on whole-genome germline variant analysis.

SBX-Duplex Data Webinar


Join us for an in-depth webinar on SBX-Duplex data and whole-genome germline small variant calling. Register now to expand your knowledge just in time for the release of our public WGS data set.

Get ready to analyze SBX-D data with confidence.

See you on September 10 at 9am PDT/6pm CET!

Disclaimer: The SBX technology and the analysis tools are in development and not commercially available. The content of this page reflects current study results or design goals.

References

  1. F. Hoffmann-La Roche Ltd. Refining the SBX Fast workflow for reproducible speed and performance [Internet; cited 2025 August 21]. Available from: https://sequencing.roche.com/global/en/article-listing/sequencing-platform-technologies.html