Addressing Key Data Issues Arising from Next-Generation Sequencing

What is the scale of next-generation sequencing’s data problem?

Sponsored content brought to you by Dell

As the use of next-generation sequencing (NGS) increases in the life sciences industry, and the practice is further adopted in healthcare for personalized medicine, data generation is growing at an accelerated rate. NGS can create massive data sets. With each analysis averaging 120 GB per genome, a single sequencer can generate up to 4 TB of data per day. That volume increases when looking for variants or comparing genomic data from tumors. Because many facilities employ multiple sequencers to support their workloads, the total volume of data multiplies with every instrument added.

Then consider that sequencing the genome is only part of the process: analysis, annotation, and analytics also generate data and add to the overall IT requirements of a sequencing environment.

Dell Technologies & NVIDIA Clara Parabricks illustration
Credit: Dell

Why is IT infrastructure so important?

Big data sets pose big questions. Where should the data be stored? How should the data be analyzed? How can analysis be accelerated? The answer is that IT infrastructure can help. To make informed decisions, speed analysis, and draw clinical findings, an organization needs an IT environment that can keep up.

To ensure that the rate of secondary analysis keeps pace with the rate of raw NGS data generation, organizations should, at a minimum, ensure they have sufficient computing and storage resources matched to the output capacity of their fleet of sequencing instruments. Without this, the organization risks an analysis backlog.
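As a rough illustration of matching analysis capacity to sequencer output, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope sketch, not a sizing tool. It reuses figures quoted in this article (up to 4 TB per sequencer per day, 120 GB per genome, and 20 genomes per day for a typical validated design). It also assumes decimal units (1 TB = 1,000 GB) and treats the 120 GB and 50x-coverage figures as describing comparable genomes. The function names are hypothetical.

```python
import math

GB_PER_TB = 1000  # assumption: decimal units


def genomes_per_day(sequencers, tb_per_sequencer_per_day=4.0, gb_per_genome=120.0):
    """Upper bound on genomes' worth of raw data a fleet produces per day."""
    return sequencers * tb_per_sequencer_per_day * GB_PER_TB / gb_per_genome


def analysis_units_needed(genomes_daily, genomes_per_unit_per_day=20.0):
    """Smallest number of analysis systems whose combined rate keeps pace."""
    return math.ceil(genomes_daily / genomes_per_unit_per_day)


def backlog_after(days, genomes_daily, units, genomes_per_unit_per_day=20.0):
    """Genomes left unanalyzed after `days` if capacity lags production."""
    capacity = units * genomes_per_unit_per_day
    return max(0.0, (genomes_daily - capacity) * days)


if __name__ == "__main__":
    daily = genomes_per_day(sequencers=3)
    print("genomes/day:", daily)
    print("analysis systems needed:", analysis_units_needed(daily))
    print("30-day backlog with 4 systems:", backlog_after(30, daily, units=4))
```

Under these assumptions, any shortfall between daily production and daily analysis capacity accumulates linearly, which is the backlog risk described above.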

An organization’s archive capacity requirements will vary according to organizational goals and processes, but that capacity is typically determined by the organization size and type, data access types, frequency of access, retention periods, and intended use—for example, research or clinical use.

What is Dell’s part in genomic sequencing?

As an IT vendor, Dell Technologies is a trusted and strategic partner to many companies in the healthcare, life sciences, and pharmaceutical industries. We offer not only the individual pieces of the IT puzzle but also a validated design for genomics that leverages Dell PowerEdge servers with NVIDIA® Ampere GPUs, NVIDIA Clara™ Parabricks® software, Dell PowerSwitch networking, and Dell PowerScale file storage. This architecture combines the IT resources required for various forms of genomic data analysis in a compact, easily scalable solution. A typical solution is capable of processing 20 human genomes per day (50x coverage).

Who are your customers? How do you service your customers?

Dell Technologies is a trusted partner to our customers. We are uniquely positioned to offer true end-to-end solutions with monitors, PCs, HPC, data storage, and data protection. We serve large and small life sciences companies, including pharmaceutical, medtech, food processing, and environmental companies, as well as higher-education, private, and government research institutions, in addition to many healthcare customers who have started their transformation to delivering personalized medicine.

What strategic relationships does Dell have that enable the company to be a leader in the space?

Dell Technologies has an extensive partner ecosystem that allows us to better serve our customers. In the field of genomics, we have strong relationships with companies such as NVIDIA and PetaGene.

  • NVIDIA offers Clara Parabricks, a GPU-accelerated computational genomics application framework that can greatly accelerate analysis.
  • PetaGene specializes in genomic data compression, offering a 60–90% reduction in data size while remaining lossless, with transparent readback.
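To put the compression range in storage terms, the short Python sketch below applies a 60% and a 90% reduction to the 120 GB per-genome figure quoted earlier in this article. It is simple arithmetic under stated assumptions, not a model of any product's behavior. The function name is hypothetical, and the actual reduction achieved will depend on the data.

```python
def compressed_size_gb(original_gb, reduction):
    """Size after lossless compression, given a fractional reduction (0.6 = 60%)."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    return original_gb * (1.0 - reduction)


if __name__ == "__main__":
    genome_gb = 120.0  # per-genome figure quoted in the article
    for reduction in (0.60, 0.90):
        size = compressed_size_gb(genome_gb, reduction)
        print(f"{reduction:.0%} reduction: about {size:.0f} GB per genome")
```

Because the compression is lossless, the original data can be recovered exactly, so the saving applies to stored capacity rather than to the information retained.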

What other workloads does Dell focus on in the life sciences?

In addition to sequencing, we focus on image file and object management workloads including digital pathology, cryo-EM, medical imaging, and other data-intensive tasks. Dell can support any computational workload in the life sciences space with industry subject matter experts and end-to-end solutions from the edge to the core to the cloud.
