Data Management Trends: Machine Learning, Artificial Intelligence, and High Performance Computing in the Life Sciences Industry

by Catherine Chiang – April 10, 2018

Explosive data growth in the life sciences industry is nothing new; what’s truly exciting is how this data can be used!

Research and healthcare organizations have been generating huge amounts of data due to developments in scientific equipment, but for years lacked the tools to use their data to its fullest potential. Now, thanks to technological advancements, including machine learning, artificial intelligence, and high performance computing, scientists are harnessing the power of all that data.

ML, AI, and HPC are Revolutionizing the Life Sciences

Enormous amounts of data enable organizations to perform deeper analytics, build more accurate machine learning algorithms, and develop better artificial intelligence models.

One area of the life sciences where using more data has improved outcomes is healthcare. A Fortune article describes how patients can use personalized health data to treat chronic diseases, hospitals use AI to send crisis victims to facilities best prepared to treat them, and pharmaceutical companies use vast stores of genetic data to validate new drugs.

The huge scale of data and the computational power required for ML/AI have pushed life sciences organizations to adopt HPC.

Aaron Gardner, Director of Technology at BioTeam, says, “People are processing more samples, doing more studies, more genomes, more data sets that are larger and larger. All this is creating scale pushing for larger HPC resources in core compute, storage, and networking even though analytics is the key application.”

Data Management for ML/HPC Workflows

We have customers using Igneous Hybrid Storage Cloud to manage petabytes of scientific data, such as tumor imaging data and brain scan data, that must be processed or computed. Automating data protection and data movement, and making analysis easier, yields faster time to market and quicker results for our users.

Managing the massive scale of data, administering machine learning processes, and having the IT skill set to manage an HPC environment are all common challenges for life sciences organizations. Furthermore, the cost of IT resources at this scale, combined with workflows that are new to many organizations, creates a need to align resource consumption with a specific project or pipeline.
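Aligning resource consumption with a project or pipeline can be pictured as a simple cost rollup. The sketch below is illustrative only: the project names, resource types, amounts, and unit prices are hypothetical and do not come from Igneous or any real billing system.

```python
from collections import defaultdict

# Hypothetical usage records: (project, resource, amount, unit_cost_usd).
# Every name and figure here is made up for illustration.
USAGE = [
    ("tumor-imaging", "storage_tb_month", 120.0, 10.0),
    ("tumor-imaging", "compute_core_hours", 1000.0, 0.25),
    ("brain-scans", "storage_tb_month", 80.0, 10.0),
]


def cost_by_project(records):
    """Return total cost per project as the sum of amount * unit cost."""
    totals = defaultdict(float)
    for project, _resource, amount, unit_cost in records:
        totals[project] += amount * unit_cost
    return dict(totals)


if __name__ == "__main__":
    # Print one line per project so spend can be compared across pipelines.
    for project, cost in sorted(cost_by_project(USAGE).items()):
        print(f"{project}: ${cost:,.2f}")
```

In practice the records would come from whatever metering the storage and compute platforms expose; the point is only that tagging usage by project makes the spend per pipeline visible.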

Gardner says, “I think for consumers of HPC, the skills required to kind of get going and do your science are decreasing, but, would say that to properly provide HPC on premise or in the cloud, the skills and the knowledge required are increasing.”

In the case of machine learning workflows, Igneous Hybrid Storage Cloud acts as a strong, API-driven archive tier that not only ingests enormous amounts of data, but also enables the data to be used. This means that Igneous Hybrid Storage Cloud can plug into customer applications that use the data, or easily send data to the public cloud for computing.

Importantly, Igneous Hybrid Storage Cloud is delivered as-a-Service. Our remotely managed infrastructure and software reduce management overhead for our customers, so they can keep their IT departments lean and focus on research.

In addition, having a secondary storage solution capable of highly efficient data movement, like Igneous Hybrid Storage Cloud, enables organizations to take advantage of pay-as-you-go cloud economics for HPC in the public cloud.

If you would like to learn more about how Igneous Hybrid Storage Cloud can support your machine learning and high performance computing workflows, contact us!
