Data Management Trends: Machine Learning, Artificial Intelligence, and High Performance Computing in the Life Sciences Industry

by Catherine Chiang – April 10, 2018

Explosive data growth in the life sciences industry is nothing new; what’s truly exciting is how this data can be used!

Research and healthcare organizations have been generating huge amounts of data due to developments in scientific equipment, but for years lacked the tools to use their data to its fullest potential. Now, thanks to technological advancements, including machine learning, artificial intelligence, and high performance computing, scientists are harnessing the power of all that data.

ML, AI, and HPC are Revolutionizing the Life Sciences

Enormous amounts of data enable organizations to perform deeper analytics, build more accurate machine learning algorithms, and develop better artificial intelligence models.

One area of the life sciences where using more data has improved outcomes is healthcare. A Fortune article describes how patients can use personalized health data to treat chronic diseases, hospitals use AI to send crisis victims to facilities best prepared to treat them, and pharmaceutical companies use vast stores of genetic data to validate new drugs.

The huge scale of data and the computational power required for ML/AI have pushed life sciences organizations to adopt HPC.

Aaron Gardner, Director of Technology at BioTeam, says, “People are processing more samples, doing more studies, more genomes, more data sets that are larger and larger. All this is creating scale pushing for larger HPC resources in core compute, storage, and networking even though analytics is the key application.”

Data Management for ML/HPC Workflows

We have customers using Igneous Hybrid Storage Cloud to manage petabytes of scientific data, such as tumor imaging data and brain scan data, that must be processed or computed. Automating data protection and data movement, and making analysis easier, yields faster time to market and quicker results for our users.

Managing the massive scale of data, administering machine learning processes, and having the IT skill set to manage an HPC environment are all common challenges for life sciences organizations. Furthermore, the cost of IT resources at this scale, combined with workflows that are new to many organizations, creates a need to align resource consumption with specific projects or pipelines.
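To make the idea of aligning consumption with projects concrete, here is a minimal Python sketch of a per-project storage tally. It is only an illustration: the record layout, the project and pipeline names, and the terabyte figures are all hypothetical, and it does not represent any Igneous API or report.

```python
from collections import defaultdict

# Hypothetical usage records: (project, pipeline, terabytes stored).
# Names and figures are invented for illustration only.
USAGE_RECORDS = [
    ("tumor-imaging", "segmentation", 120.0),
    ("tumor-imaging", "classification", 80.0),
    ("brain-scans", "registration", 200.0),
]


def usage_by_project(records):
    """Sum stored terabytes per project across all of its pipelines."""
    totals = defaultdict(float)
    for project, _pipeline, terabytes in records:
        totals[project] += terabytes
    return dict(totals)


def cost_share(totals):
    """Return each project's fraction of total consumption."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Avoid dividing by zero when nothing has been stored yet.
        return {project: 0.0 for project in totals}
    return {project: tb / grand_total for project, tb in totals.items()}


if __name__ == "__main__":
    totals = usage_by_project(USAGE_RECORDS)
    for project, share in cost_share(totals).items():
        print(f"{project}: {totals[project]:.0f} TB, {share:.0%} of total")
```

A tally like this could feed a chargeback or showback report, so that each research group sees the share of storage its pipelines consume.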

Gardner says, “I think for consumers of HPC, the skills required to kind of get going and do your science are decreasing, but, would say that to properly provide HPC on premise or in the cloud, the skills and the knowledge required are increasing.”

In the case of machine learning workflows, Igneous Hybrid Storage Cloud acts as a strong, API-driven archive tier that not only ingests enormous amounts of data, but also enables the data to be used. This means that Igneous Hybrid Storage Cloud can plug into customer applications that use the data, or easily send data to the public cloud for computing.

Importantly, Igneous Hybrid Storage Cloud is delivered as-a-Service. Our remotely managed infrastructure and software reduce management overhead for our customers, so they can keep their IT departments lean and focus on research.

In addition, having a secondary storage solution capable of highly efficient data movement, like Igneous Hybrid Storage Cloud, enables organizations to take advantage of pay-as-you-go cloud economics for HPC in the public cloud.

If you would like to learn more about how Igneous Hybrid Storage Cloud can support your machine learning and high performance computing workflows, contact us!
