Machine learning…yes, it’s all over the industry press and every tech vendor uses the term in their marketing to jump into the chatter. So, here is yet another blog from another tech vendor on machine learning! Well, not quite.
I’m more interested in the intersection between this technology-based marvel—which is poised to help solve some very gnarly and real problems (from cybersecurity to cancer prevention)—and the people involved in building and using the elaborate systems underneath the flash of machine-based learning outcomes.
More specifically, how are today’s I&O leaders educating, preparing, strategizing and executing against what will be a very different unstructured, file data-centric IT environment?
Some facts to lay the groundwork:
- Machine learning (ML) and data-science initiatives are real and have strong momentum across industries: ML patents grew at a 34% compound annual growth rate (CAGR) between 2013 and 2017, making ML the third-fastest-growing category of all patents granted. Moreover, 61% of organizations picked ML/AI as their company’s most significant data initiative for next year. (Source: 2018 Outlook: Machine Learning and Artificial Intelligence, A Survey of 1,600+ Data professionals)
- Organizations are investing…heavily: International Data Corporation (IDC) forecasts that spending on
- ML is creating value today: Google and MIT recently completed a study showing the impact of ML on an organization’s competitive advantage. 35% of organizations using ML say they can complete data analysis faster, delivering greater insight and acuity to their organizations, and another 35% say that machine learning is enhancing their R&D capabilities for product development. (Source: Google & MIT Technology Review Study: Machine Learning: The New Proving Ground for Competitive Advantage)
- Many I&O leaders are still early in learning how to enable data management within ML pipelines: Gartner’s Chirag Dekate published a great report titled “Three Elements of a Scalable Enterprise Machine Learning Infrastructure Strategy” in which he notes that I&O leaders “struggle with data management challenges due to the inability to locate/access data across organizational silos, execute cleanup and seamlessly integrate data pipelines.”
Questions I&O leaders are, or should be, asking to prepare for the machine learning shift:
The uncertainty around how I&O can support—and proactively enable—ML use cases for their organizations is a key barrier to I&O taking the leadership position. So, what questions are leaders grappling with and what guidance is being given?
1. Should we architect for Scale or Data Accessibility out of the gate?
The advice from Gartner’s Chirag Dekate centers on architecting for data movement: “Deliver scalable compute and data infrastructures by aligning systems architecture with underlying workload requirements and adopting technologies that accelerate data movement.”
According to an article in Information Age, “Flexibility for AI means addressing data maneuverability. As AI-enabled data centers move from initial prototyping and testing towards production and scale, a flexible data platform should provide the means to independently scale in multiple areas: performance, capacity, ingest capability, flash-HDD ratio and responsiveness for data scientists.”
2. How much of our legacy infrastructure and data management investments can be used for new machine learning initiatives?
Guidance here is an “it depends.” Lee Beardmore, VP and CTO at Capgemini Business Services, states that “businesses must ensure they have a well-structured architecture framework that enables CIOs to respond with the flexibility required to incorporate the new and replace the old.”
Lenny Pruss from Amplify Partners takes a much stronger stance, advocating for a full revamp of infrastructure (see the Amplify Partners Infrastructure 3.0 stack below) to support ML pipelines: “Going forward, an entirely new toolchain is necessary to unlock the potential of ML/AI, to make it operational and usable—let alone approachable—for developers and enterprises.”
3. What about our data protection strategy—does this need to change to address machine-generated data?
This question is bigger and more involved than a single blog can address, but…the short answer is YES. Jennifer Nelson at Network World wrote a great article last year addressing this—not from a technical or architectural perspective, but from a business lens. “The secret to knowing which data to retain is knowing your company and what value you and your customers get from each individual piece of data.” She continues by reminding us of the simple—albeit easier said than done—steps to take: “Once you know which KPIs are critical to understanding your bottom line, identifying the data from which those KPIs are derived is the next step. Retaining any other analytic data not related to those KPIs will consume unnecessary resources.”
There are other data protection considerations surrounding the privacy of the data used in machine learning and AI models as well, but that’s for another blog.
How are your teams evolving?
So, what are the steps your team is taking to conquer the new infrastructure landscape demanded by machine learning initiatives and the applications being created to drive them?
We’d love to hear from you on the top questions and considerations your teams are investigating and the learnings you’re cultivating.
Allison Armstrong is Igneous’ new VP of Marketing. Prior to Igneous, Allison was VP, Product and Technical Marketing at Alert Logic. She has successfully led teams and growth strategies in the data protection, IT analytics, cloud security and IT transformation spaces, with companies including Apptio, Quantum and Symantec.