The UC Santa Cruz Genomics Institute announced on Monday that it is working with Amazon Web Services (AWS) to integrate Dockstore, a repository for scientific and biomedical workflows, with the recently released Amazon Genomics Command Line Interface (CLI).
Dockstore is a joint development between the UC Santa Cruz Genomics Institute and the Ontario Institute for Cancer Research. It provides a global cloud library of analytical workflows so that researchers can easily find and use existing analysis tools, facilitating large-scale biomedical research collaborations. The resource is set up as an app store for bioinformatics tools that allows researchers worldwide to access and use a wide range of genomics workflows.
“Dockstore’s ability to share bioinformatics workflows has already proven critical in federally funded projects such as NHLBI BioData Catalyst and NHGRI AnVIL that allow for secure, cloud-based genomics analyses,” said Benedict Paten, associate director of the UC Santa Cruz Genomics Institute, in a press release. “We are excited about this new collaboration, as it unlocks an entirely new category of users that can quickly utilize available workflows in the cloud to accelerate their research.”
Integrating Dockstore with the Amazon Genomics CLI is intended to align the workflows and data with the technical standards for genomics research projects established by the Global Alliance for Genomics and Health (GA4GH). The ultimate goal is to facilitate faster research discoveries by providing interconnectivity between different computational platforms via a set of Application Programming Interfaces (APIs).
In particular, the integration allows Dockstore to execute workflows by using the GA4GH Workflow Execution Service (WES) API. Amazon Genomics CLI provides the WES endpoint that Dockstore utilizes, which allows researchers to launch analyses on AWS cloud resources with little coding or intervention.
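For readers unfamiliar with WES, the following minimal Python sketch shows roughly what a workflow submission looks like under the GA4GH WES 1.0 specification, which defines a `POST` to `/ga4gh/wes/v1/runs` carrying form fields such as `workflow_url`, `workflow_type`, `workflow_type_version`, and `workflow_params`. The host name, workflow URL, and input parameters below are hypothetical placeholders, and the helper `build_wes_run_request` is invented for illustration. The sketch only assembles the request and does not send it; it makes no claim about how Dockstore or Amazon Genomics CLI implement or authenticate these calls.

```python
import json


def build_wes_run_request(base_url, workflow_url, workflow_type,
                          workflow_type_version, params):
    """Assemble the endpoint and form fields for a WES 1.0 run submission.

    Illustrative helper only: nothing is sent over the network, and
    authentication is deliberately left out.
    """
    # WES 1.0 defines run submission as POST {base}/ga4gh/wes/v1/runs
    endpoint = base_url.rstrip("/") + "/ga4gh/wes/v1/runs"
    fields = {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        # Workflow inputs travel as a JSON-encoded string
        "workflow_params": json.dumps(params),
    }
    return endpoint, fields


# Hypothetical values; a real endpoint would come from the WES provider.
endpoint, fields = build_wes_run_request(
    base_url="https://wes.example.org/",
    workflow_url="https://example.org/workflows/hello.wdl",
    workflow_type="WDL",
    workflow_type_version="1.0",
    params={"hello.name": "world"},
)

print("POST", endpoint)
for name, value in fields.items():
    print(f"  {name} = {value}")
```

In practice a client would send these fields as a multipart form with whatever credentials the WES provider requires, then poll the returned run ID for status.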
“Amazon Genomics CLI promises to simplify genomics analysis in the cloud,” said Taha Kass-Hout, director of machine learning at Amazon Web Services. “Our new collaboration with UCSC allows quick utilization of existing bioinformatics workflows via the Dockstore repository, and will further enhance the opportunities for computational biologists to rapidly ramp up new research directions while using AWS’s proven global infrastructure.”
Amazon Genomics CLI is an open-source tool for genomics and life sciences built to simplify and automate the deployment of cloud infrastructure. The goal is to provide an easy-to-use tool that reduces the time needed to set up genomics workflows in the cloud. According to AWS, this will allow software developers and researchers to automatically provision, configure and scale cloud resources, enabling faster and more cost-effective population-level genetics studies, drug discovery cycles, and precision medicine.
It promises to automate the configuration and deployment of workflow engines, create data access policies, and tune compute clusters for operation at scale. AWS says that by running genomics workflows at larger scale and in less time, the CLI will shorten the path to useful insights such as variant identification and disease diagnosis.