If your organization tackles machine learning, artificial intelligence, or high-performance computing applications, you’re also dealing with a ton of unstructured data. Here are the best sessions at Supercomputing 2018 that will provide industry expertise and insights to help you effectively manage your data and architect your infrastructure.
Supercomputing 2018 will be held in Dallas, TX, at the Kay Bailey Hutchison Convention Center. While you're at #SC18, be sure to stop by our booth #4244!
1. Data Protection Solutions for ML/AI
Wednesday, November 14, 2pm - 2:30pm
ML and AI applications hold enormous opportunities, but these workflows require massive amounts of data. How do you store, protect, move, and manage this data? Igneous’ own Allison Armstrong, VP of Marketing, and David Clements, Solutions Architect, will present real-life examples of data protection for ML/AI.
Igneous has customers building artificial intelligence models that revolutionize the diagnosis and treatment of cancer and other life-threatening diseases. Their machine learning workflows utilize a proprietary storage solution as an archive tier, working in conjunction with high-performance storage and computing.
Igneous acts as the central repository for petabytes of imaging data. Small subsets of data (< 2% of the entire dataset) are active at any one time for high-performance compute, which acts as a “hot edge” for the data to be processed by image processing software running on a high-performance deep learning platform. In addition, Igneous acts as the central repository to archive all computational results, enabling the “hot edge” to be cleared for subsequent workloads. Delivered as-a-Service and remotely managed, Igneous enables organizations to keep their IT departments lean so that they can focus on groundbreaking research instead.
2. Dac-Man: Data Change Management for Scientific Datasets on HPC Systems
Thursday, November 15, 3:30pm - 4pm
Scientific data is growing rapidly and often changes due to instrument configurations, software updates, or quality assessments. These changes in datasets can result in significant waste of compute and storage resources on HPC systems as downstream pipelines are reprocessed. Data changes need to be detected, tracked, and analyzed in order to understand their impact, manage data provenance, and make efficient, effective decisions about reprocessing and use of HPC resources. Existing methods for identifying and capturing change are often manual, domain-specific, and error-prone, and they do not scale to large scientific datasets. In this paper, we describe the design and implementation of the Dac-Man framework, which identifies, captures, and manages change in large scientific datasets, and enables plug-in of domain-specific change analyses with minimal user effort. Our evaluations show that it can retrieve file changes from directories containing millions of files and terabytes of data in less than a minute.
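The abstract doesn't detail Dac-Man's internals, but the core problem it addresses can be sketched in a few lines: snapshot a dataset's contents, then classify files as added, deleted, or modified between two snapshots. The sketch below is a minimal, illustrative example of that idea (the function names `snapshot` and `diff` are assumptions for illustration, not Dac-Man's actual API, and a real tool would parallelize hashing across millions of files):

```python
import hashlib
import os

def snapshot(root):
    """Map each file's path (relative to root) to a SHA-256 digest of its bytes."""
    digests = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as f:
                digests[rel] = hashlib.sha256(f.read()).hexdigest()
    return digests

def diff(old, new):
    """Classify changes between two snapshots as added, deleted, or modified."""
    added = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return {"added": added, "deleted": deleted, "modified": modified}
```

Comparing digests rather than timestamps catches content changes even when modification times are unreliable, which matters when data is rewritten by instrument or QA pipelines.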
3. Analytic Based Monitoring of High Performance Computing Applications
Tuesday, November 13, 4pm - 4:30pm
The complexity of High Performance Computing (HPC) systems and the innate premium on system efficiency necessitate the use of automated tools to monitor not only system-level health and status, but also job performance. Current vendor-provided and third-party monitoring tools, such as Nagios or Ganglia, enable system-level monitoring using features that reflect the state of system resources. None of those tools, however, is designed to determine the health and status of a user’s application, or job, as it executes.
This presentation introduces the concept of job-level, analytics-based monitoring using system features external to the job, like those reported by Ganglia. Preliminary results show these features contain sufficient information content to characterize key behaviors of an executing job when incorporated into a job-specific, application-neutral analytic model; transitions between computational phases, onset of a load imbalance, and anomalous activity on a compute node may each be detected using this approach.
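The presenters' analytic model isn't described in the abstract, but one of the behaviors they mention, onset of a load imbalance, can be illustrated with a toy check on exactly the kind of application-neutral, system-level feature Ganglia reports: per-node CPU utilization. The function name and the 0.25 threshold below are assumptions for illustration only, not the presentation's method:

```python
from statistics import mean, pstdev

def load_imbalance(node_utilizations, threshold=0.25):
    """Flag a load imbalance when the coefficient of variation (stddev / mean)
    of per-node CPU utilization exceeds the given threshold.

    node_utilizations: one utilization fraction (0.0-1.0) per compute node.
    """
    avg = mean(node_utilizations)
    if avg == 0:
        return False  # idle job: no signal to judge balance from
    return pstdev(node_utilizations) / avg > threshold
```

A real job-level monitor would track such statistics over time and per computational phase, so that a transition between phases isn't mistaken for an anomaly.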
4. Managing the Convergence of HPC and AI
Wednesday, November 14, 1:30pm - 2pm
HPC environments are the most significant source of processing capacity in many organizations, and more users want to leverage the power of the “SuperComputer” for their workloads to get performance beyond a single box. These “new customers” for your HPC cluster may have little knowledge of how to access, configure, and deploy workloads in environments built on typical open-source cluster management solutions, which drives a significant amount of handholding by administrators. In particular, users wanting to experiment with or deploy AI training on the cluster may simply have data and a desire to train, without the technical expertise to configure the scripts, resources, frameworks, libraries, etc. needed to run. Bringing these new users into the HPC environment is a significant opportunity to grow and expand the value of your infrastructure – but only if it is easy to use, easy to manage, and consistent for both HPC and AI workloads.
5. HPC Meets Real-Time Data: Interactive Supercomputing for Urgent Decision Making
Thursday, November 15, 12:15pm - 1:15pm
Big data and digitalization are creating exciting new opportunities that have the potential to move HPC well beyond traditional computational workloads. Given the availability of fast-growing social and sensor networks, the combination of HPC simulation with high-velocity data and live analytics is likely to be of great importance in the future for solving societal, economic, and environmental problems. The idea of this BoF is to bring together those interested in the challenges and opportunities of enhancing HPC with real-time data flows and interactive exploration capabilities to support urgent decision making.
While you're at Supercomputing, be sure to swing by booth #4244 or schedule a time with us!