The Backup Window is an Outdated Concept

by Steve Pao – July 5, 2017

Backup windows for unstructured file data will go the way of the rotary phone. Why? Backup windows were designed when tape was the primary backup target, before the proliferation of unstructured data. Let’s explore how legacy backup software backed into the concept of backup windows and how a modern approach eliminates them.

Originally Designed for Tape

Prior to disk-based backups, the primary medium for backup was tape. The medium imposed a fundamental constraint: data had to be written to, and read from, tape serially. Moreover, backup systems had to keep data flowing at a continuous rate as the tape’s physical reels rotated; if the stream stalled, the drive had to stop and reposition before writing could resume.

As a result, single-threaded streams and continuous rates evolved as fundamental concepts in legacy backup protocols (such as Network Data Management Protocol or NDMP) used in primary storage systems and in legacy backup software that supports those protocols. Even after disk became a viable backup target, these concepts persisted in legacy backup software and protocols.
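To make the legacy model concrete, here's a minimal sketch of a single-threaded, rate-paced stream in Python. It is an illustration under stated assumptions (the drive rate, chunk size, and in-memory "tape" are hypothetical), not NDMP or any real backup product:

    import io
    import time

    # Tape-era model: one serial stream, paced so the drive is fed at a
    # continuous rate and never stops to reposition. Numbers are assumptions.
    TAPE_RATE = 80 * 1024 * 1024  # bytes/sec the drive expects to be fed
    CHUNK = 64 * 1024             # bytes per write operation

    def backup_serially(files: list[tuple[str, bytes]], tape: io.BytesIO) -> None:
        """Write files strictly one at a time: a single-threaded stream."""
        for _, data in files:
            for i in range(0, len(data), CHUNK):
                chunk = data[i:i + CHUNK]
                tape.write(chunk)                   # serial append, like a reel
                time.sleep(len(chunk) / TAPE_RATE)  # hold a constant feed rate

    backup_serially([("home/alice/report.docx", b"x" * 1_000_000)], io.BytesIO())

Every downstream design decision (jobs, schedules, windows) flows from this one-file-at-a-time, constant-rate loop.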

The Backup Window Emerges

All backup software reads data from the primary storage system, potentially impacting the performance of user and application access to data. With the concepts of single-threaded streams (or “jobs”) and continuous data rates, backup administrators had to balance how many concurrent jobs they ran (which impacted system performance) against how frequently they backed up their data.

To meet daily backup requirements, IT administrators generally ran backups at night; to meet weekly requirements, they ran them over weekends. This practice let administrators run as many backup jobs as their systems could handle during those discrete periods.
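The arithmetic behind a window is straightforward. As a back-of-the-envelope sketch in Python, with the stream rate, job count, and window length all being illustrative assumptions:

    # Capacity of a nightly backup window (all numbers are assumptions).
    stream_rate_mb_s = 100   # sustained throughput of one backup stream
    concurrent_jobs = 4      # jobs the primary system can tolerate at once
    window_hours = 8         # e.g., 10pm to 6am

    capacity_tb = stream_rate_mb_s * concurrent_jobs * window_hours * 3600 / 1e6
    print(f"Max data per nightly window: {capacity_tb:.1f} TB")  # 11.5 TB

As long as the data that changed each day fit comfortably under that ceiling, windows worked.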

The Growth of Unstructured File Data

Historically, when data sizes were small and when users and applications accessed primary storage systems only during regular working hours, the practice of scheduling backup windows outside those hours largely worked.

However, humans are no longer the only ones generating unstructured file data in the form of Word documents or Excel spreadsheets in their home directories. Most of today's data is generated by machines (e.g., medical equipment, cameras, and now autonomous vehicles) and software applications (e.g., design automation software, image rendering, and scientific computing).

The growth both in file count and in total data volume strains the concept of jobs using single-threaded streams. Often, there's simply too much data to move during a backup window!
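To see how quickly the math breaks down, suppose a file system has grown to 100 TB and a single stream sustains 100 MB/s (both figures are illustrative assumptions):

    # Time for one single-threaded stream to move a full file system.
    dataset_tb = 100
    stream_rate_mb_s = 100

    hours = dataset_tb * 1e6 / stream_rate_mb_s / 3600
    print(f"One stream needs {hours:.0f} hours (~{hours / 24:.1f} days)")
    # One stream needs 278 hours (~11.6 days)

No nightly or even weekend window can absorb that.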

When backup windows extend into working hours, user complaints often force backup administrators to cancel in-progress backups altogether, leaving no complete backup set for the day or week.

In some organizations, continuous processing and 24/7 operations challenge the notion of backup windows because data must be available to users and applications every minute of the day.

Rethinking Backup

It's time to rethink backup. A modern approach can eliminate backup windows altogether. Consider these two approaches:

  • Multi-streaming: Freed from the requirement to stream data serially to tape in a single thread, backups can move data faster in dynamic, parallel streams, without administrators having to manually split backups into separate, discrete jobs.
  • Latency awareness: Freed from the requirement to stream data at a continuous rate, backups can move data faster when primary system load from users and applications is low, and “back off” when users and applications are accessing the data.

With these approaches, it is possible to run backups all the time without impacting users or applications. In essence, backup jobs run at maximum speeds when usage of the primary systems is low, and automatically slow down when needed.
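Here's a minimal sketch of the two ideas working together in Python. The latency probe, the 20 ms budget, and the copy helper are hypothetical stand-ins for illustration, not Igneous's implementation:

    import random
    import time
    from concurrent.futures import ThreadPoolExecutor

    LATENCY_BUDGET_MS = 20  # assumed threshold above which users would notice

    def probe_latency_ms() -> float:
        """Hypothetical stand-in: measure primary NAS response time."""
        return random.uniform(0, 40)

    def copy_file(path: str) -> None:
        """Hypothetical stand-in: stream one file to the backup target."""
        time.sleep(0.01)

    def copy_with_backoff(path: str) -> None:
        # Latency awareness: yield while users and applications are busy.
        while probe_latency_ms() > LATENCY_BUDGET_MS:
            time.sleep(1)
        copy_file(path)

    def backup(paths: list[str], streams: int = 16) -> None:
        # Multi-streaming: many dynamic parallel streams, no manual job splits.
        with ThreadPoolExecutor(max_workers=streams) as pool:
            list(pool.map(copy_with_backoff, paths))

    backup([f"/fs/projects/file{i}.dat" for i in range(64)])

Because the throttle is per-stream and continuous, there is no window to schedule: the backup is always on, and the primary system's own load dictates how fast data moves.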

The ideas here are pretty straightforward, and they work! Igneous customers use Igneous Hybrid Storage Cloud to back up primary storage file systems that previously couldn't be backed up within acceptable backup windows.

The trick here is making these concepts work together, and this is where our engineering comes in. Look for future posts about how we overcame challenges to implement our unique secondary storage approach, including:

  • Removing the reliance on NDMP to track changes
  • Integrating with NAS systems to enforce read consistency
  • Providing a horizontally scalable, performant backend target for multi-streamed backups

Download our "Secondary Storage for the Cloud Era" whitepaper for more insights on today's secondary storage challenges and solutions for overcoming them. 
