A wealth of biological data has enabled scientists and healthcare researchers to apply powerful computational methods to discover new insights. High-performance computer systems and domain-specific software frameworks underpin this revolution in digital biology.
Two of the most powerful supercomputers in the world, NVIDIA’s Cambridge-1 and Recursion’s BioHive-1, both based on the NVIDIA DGX SuperPOD reference architecture, were recently named in the TOP500 ranking of the most powerful systems. Last year, researchers at the Athinoula A. Martinos Center for Biomedical Imaging used a variety of NVIDIA AI systems, including NVIDIA DGX-1, to speed up their computational research, which included assessing the severity of lung disease from X-ray images to predict outcomes in COVID patients.
Alongside these advances, next-generation sequencing (NGS) work is now being accelerated by the NVIDIA Clara Parabricks suite of genomic libraries and reference applications, pushing back the barriers of genetic research.
What is Clara Parabricks?
NVIDIA Clara Parabricks is a computational framework for genomic applications ranging from DNA to RNA. It provides GPU-accelerated libraries, pipelines, and reference application workflows for primary, secondary, and tertiary analysis, built on NVIDIA’s CUDA, AI, HPC, and data analytics stacks. It is a complete solution for genomics laboratories that also makes it easier to develop new applications.
Clara Parabricks pipelines are based on the Broad Institute’s Genome Analysis Toolkit (GATK) and provide GPU-accelerated versions of GATK and of third-party tools such as Google’s DeepVariant caller. Starting from DNA sequencing reads, Clara Parabricks maps, aligns, filters, and calls variants to detect germline or somatic variants. For RNA-based projects, STAR and STAR-Fusion align the sequencing reads, allowing reads to be split across exon-intron boundaries, and the alignment is followed by variant calling.
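The DNA workflow described above (align the reads, then call variants) can be sketched as two command lines. The Python below only builds the commands and does not run them. It assumes the `pbrun` launcher with its `fq2bam` and `haplotypecaller` tools; the flag names reflect the Parabricks documentation as commonly published and should be checked against the installed version. The function name, file names, and sample ID are hypothetical.

```python
import shlex
from typing import List


def build_germline_commands(ref: str, fastq_r1: str, fastq_r2: str,
                            sample: str) -> List[List[str]]:
    """Build (but do not run) a two-step germline workflow.

    Step 1 aligns paired-end reads to a reference and writes a BAM file.
    Step 2 calls germline variants from that BAM file and writes a VCF file.
    Flag names are assumptions to verify against your Parabricks version.
    """
    bam = f"{sample}.bam"
    vcf = f"{sample}.vcf"
    align = ["pbrun", "fq2bam", "--ref", ref,
             "--in-fq", fastq_r1, fastq_r2, "--out-bam", bam]
    call = ["pbrun", "haplotypecaller", "--ref", ref,
            "--in-bam", bam, "--out-variants", vcf]
    return [align, call]


if __name__ == "__main__":
    # Hypothetical input paths, used only to show the resulting commands.
    commands = build_germline_commands(
        "reference.fasta", "sample_1.fq.gz", "sample_2.fq.gz", "sample")
    for cmd in commands:
        print(shlex.join(cmd))
```

Keeping command construction separate from execution makes it possible to inspect or log the commands before handing them to a scheduler or to `subprocess.run`.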
Speed up research with Clara Parabricks
Parabricks Pipelines are a robust set of genomic tools that can be customized to meet the demands of scientific research and laboratories. Researchers run Parabricks Pipelines workloads on NVIDIA GPU systems ranging from desktop workstations to GPU-accelerated clouds and some of the world’s fastest supercomputers.
This month, Shanghai-based Mingma Biotechnology became the first research center in China to deploy Clara Parabricks Pipelines, which it uses to support its work in precision medicine. The deployment follows large-scale genomics programs launched earlier this year in Thailand and Japan.
Houston-based Greffex is also using Parabricks Pipelines and NVIDIA Clara Discovery to ramp up its efforts to create a universal flu vaccine, just weeks after getting started on an NVIDIA RTX data science workstation. The startup combines genomic sequences, molecular dynamics techniques, and wet lab research to study how influenza strains evolve over time and how these changes affect vaccine effectiveness.
To monitor the course of influenza, Greffex collects tens of thousands of influenza genomes from around the world and runs massive alignment jobs on NVIDIA RTX 8000 GPUs to recognize changes in the genetic code of the virus. The company saves up to 13 hours per sample when running genomic workloads on GPUs, giving its team time to refine alignment results.
Genomic information for population studies
On NVIDIA GPUs, researchers using Parabricks Pipelines can speed up DNA- and RNA-based projects by up to 50 times, allowing scientists to extract as much usable information as possible from the hundreds of gigabytes of data that sequencing instruments generate every day. This acceleration is particularly important for public health institutions and research laboratories carrying out population studies in which tens of thousands of genomes must be processed.
Mingma Biotechnology has set up Parabricks Pipelines on NVIDIA T4 Tensor Core GPUs to accelerate its sequencing and multi-omic data analysis. The company helps medical institutions, pharmaceutical companies, and researchers conduct medical research by providing genetic information to identify and explore the root causes of diseases.
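To see why this matters at population scale, here is a back-of-the-envelope sketch of total processing time. The 50x figure is the upper bound quoted above. The 30 hours per genome on a CPU-only system is a hypothetical baseline chosen only for illustration, not a measured value, and the function name is likewise made up.

```python
def total_days(genomes: int, hours_per_genome: float,
               speedup: float = 1.0, parallel_systems: int = 1) -> float:
    """Estimate wall-clock days needed to process a cohort.

    genomes           number of genomes in the study
    hours_per_genome  baseline processing time per genome (assumed value)
    speedup           acceleration factor applied to the baseline
    parallel_systems  number of identical systems working at once
    """
    if genomes < 0 or hours_per_genome < 0:
        raise ValueError("genomes and hours_per_genome must be non-negative")
    if speedup <= 0 or parallel_systems <= 0:
        raise ValueError("speedup and parallel_systems must be positive")
    total_hours = genomes * hours_per_genome / speedup
    return total_hours / parallel_systems / 24


if __name__ == "__main__":
    # Hypothetical cohort of 50,000 genomes at an assumed 30 hours each.
    baseline = total_days(50_000, 30)
    accelerated = total_days(50_000, 30, speedup=50)
    print(f"baseline:    {baseline:,.0f} system-days")
    print(f"accelerated: {accelerated:,.0f} system-days")
```

Under these assumed inputs, a single system would need 62,500 days without acceleration and 1,250 days at the full 50x, which is why such studies pair acceleration with many systems running in parallel.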
Fueling genetic analysis
In Thailand, the Genomics Thailand initiative, which aims to provide genomic medicine as a standard health service, is powered by an NVIDIA DGX A100 system at Thailand’s National Biobank. Using Parabricks Pipelines, the research organization analyzes genetic variants in whole-genome sequencing data from 50,000 Thai participants.
NVIDIA DGX A100 is the world’s first five-petaFLOPS AI system, delivering unprecedented compute density, performance, and flexibility. It features the NVIDIA A100 Tensor Core GPU, which NVIDIA describes as the world’s most advanced accelerator, allowing businesses to consolidate training, inference, and analytics into a single, easy-to-deploy AI infrastructure with direct access to NVIDIA AI experts.
Combining the DGX system with Parabricks Pipelines reduced the project’s genomic data processing time by four months. The results of the study will help researchers better understand genetic variance in the Thai population.
In Japan, the Human Genome Center at the University of Tokyo recently introduced SHIROKANE, the country’s fastest supercomputer for the life sciences. The DGX A100-powered system uses Parabricks Pipelines to sequence the entire genomes of 92,000 patients, producing a database that will serve as the cornerstone for precision medicine efforts in cancer and other complex diseases.
As NVIDIA’s Parabricks pipeline acts as a powerful tool in the hands of genetic researchers, it will be interesting to see how it paves the way for genetics and genomics with future developments and updates.