Accelerating Image Analysis and Cancer Diagnosis with AIRI from Pure Storage and Igneous

by Shaun Walsh – November 28, 2018

Artificial Intelligence (AI) has many applications today, from self-driving vehicles to optimizing workflows in manufacturing operations to detecting malware on the internet. Deep learning is a form of AI in which multi-layer neural networks transform input data into progressively more defined and useful outputs. Deep learning is a subset of machine learning (ML): where traditional ML tends to focus on task-specific algorithms applied to specific problems, often with hand-engineered features, deep learning focuses on extracting information from the data at multiple levels.
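To make the idea of a multi-layer neural network concrete, here is a minimal sketch in plain Python. It is only an illustration, not Paige.ai's model: the layer sizes, the weights, and the names `dense`, `relu`, `forward`, `HIDDEN_LAYERS`, and `OUTPUT_LAYER` are all hypothetical, and a real network would learn its weights from training data rather than use hand-picked values.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Non-linearity applied between layers; negative values become zero."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Squash a single number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def forward(inputs, hidden_layers, output_layer):
    """Pass the input through each hidden layer, then the output layer.

    Returns the list of intermediate activations and the final score.
    """
    activations = [inputs]
    for weights, biases in hidden_layers:
        activations.append(relu(dense(activations[-1], weights, biases)))
    out_weights, out_biases = output_layer
    logit = dense(activations[-1], out_weights, out_biases)[0]
    return activations, sigmoid(logit)


# Hypothetical, hand-picked weights (assumption: a trained model learns these).
HIDDEN_LAYERS = [
    ([[0.5, -0.25, 0.75, 0.0],
      [-0.5, 0.5, 0.25, 1.0],
      [0.25, 0.25, -0.75, 0.5]], [0.0, 0.1, -0.1]),   # 4 inputs -> 3 units
    ([[1.0, -1.0, 0.5],
      [0.5, 0.5, -0.5]], [0.0, 0.0]),                 # 3 units -> 2 units
]
OUTPUT_LAYER = ([[1.0, -1.0]], [0.0])                  # 2 units -> 1 score

pixels = [0.2, 0.8, 0.4, 0.6]  # stand-in for a tiny patch of image intensities
activations, score = forward(pixels, HIDDEN_LAYERS, OUTPUT_LAYER)

for depth, layer_output in enumerate(activations):
    print(f"layer {depth}: {layer_output}")
print(f"score: {score:.3f}")
```

Each layer consumes the previous layer's output, which is the sense in which the data is refined level by level. Production models have many more layers and millions of weights, which is why training them calls for GPU systems.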

The application of deep learning to radiomics can provide significant value in the diagnosis of disease and other medical conditions, as well as in the design of therapy for these conditions. The use of deep learning for radiomics is particularly well-developed in oncology, where it is used to support diagnostics and treatment formulation for various forms of cancer. Paige.ai is one of the pioneers in the application of deep learning to cancer detection and diagnosis, with the goal of providing a decision support tool to aid medical professionals in the rapid diagnosis and treatment of oncology cases.

Paige.ai’s name states its mission directly: to combine Pathology images and Artificial Intelligence to provide a Guidance Engine that accelerates the clinical diagnosis of cancer cases. Paige.ai uses a novel deep learning model built on a database of hundreds of thousands of anonymized digital images and case notes. Paired with a best-in-class scale-out architecture, the model provides a clinical decision support system for oncology pathologists, clinicians, and researchers.

At the heart of the Paige.ai solution is Pure Storage's Artificial Intelligence Ready Infrastructure (AIRI), the infrastructure on which the Paige.ai deep learning models run, both during training and for clinical diagnosis. AIRI comes in two sizes (“Mini-AIRI” and the full-size AIRI model), and combines Pure Storage FlashBlade storage, an Arista-based 40GbE/100GbE network, and multiple NVIDIA DGX-1 platforms (each DGX-1 has eight NVIDIA Tesla V100 GPUs). The full-size AIRI system can hold up to 179TB of data (usable space before data reduction). The AIRI system is shown below:

Paige.ai currently scans about 30,000 pathology slides each month into its database, and the AIRI configuration is sized to support scanning up to 100,000 archived cancer slides per month. The Igneous data management software manages access to this data (including its movement through the workflow), as well as its protection and long-term retention. Igneous enables Paige.ai to protect the work product of the machine learning process through easy, automated policies, allowing for reuse or for data restoration in case of accidental deletion. The combination of Pure Storage and Igneous provides high-performance storage for the AI/ML training process, as well as an easy-to-use method to protect and store the highly valuable results of that process. The workflow can be broken down into the following steps:


Paige.ai is one of many companies that NVIDIA, Pure Storage, and Igneous are working with in the life sciences space. Another example is the Altius Institute for Biomedical Sciences. To learn more, read the case study.
