Thinking About Archiving to Cloud? Read This First.

by Andy Ferris – April 19, 2019

Archiving data for long-term retention is a common use for cloud storage, with compelling benefits. However, as with many data protection strategies, cloud archive encounters problems when datasets are large—at the scale of hundreds of terabytes or petabytes.

Read on to discover the pros and cons of cloud archive, and how to solve the problems of cloud archive at scale.

What is Cloud Archive?


Cloud archive involves tiering old data to the cold storage tiers of a cloud provider for long-term retention.

Benefits of Cloud Archive: Cost and Accessibility


Traditional archive solutions rely on either disk or tape for long-term storage, but both have significant drawbacks that make cloud archive a more appealing solution.

An archive strategy using disk involves moving old data to a lower-performance tier of NAS that is typically less expensive than primary NAS storage. However, even the most cost-effective NAS is still extremely expensive once the true cost of ownership includes managing the datacenter, ongoing software and hardware maintenance, cooling, power, and leasing or buying the physical space for the datacenter.

The other traditional solution for long-term storage is tape, which is cheaper per unit. However, data recovery with tape is a tedious, time-consuming, and error-prone process, making archived data on tape effectively inaccessible to end users. End users therefore become highly resistant to archiving their data to tape because they feel they'll never be able to get it back. As a result, organizations end up keeping hundreds of terabytes or petabytes of cold data on expensive primary storage, driving up storage costs and defeating the purpose of having an archive solution at all.

Cloud archive offers the best of both worlds, with a cheaper price point than any on-premises disk solution and far better accessibility than tape. Cost-wise, AWS' recently launched S3 Glacier Deep Archive tier is priced at $0.00099 per GB per month, which adds up to about $12 per TB per year, far less than just the hardware and software for any on-premises disk storage. From a data recovery standpoint, recalls can be simplified to make data easily accessible to end users without the labor-intensive, error-prone workflows that tape-based solutions require.
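The arithmetic behind the roughly $12-per-TB-per-year figure can be checked in a few lines. This sketch assumes the published Glacier Deep Archive storage rate of $0.00099 per GB-month and 1,000 GB per TB:

```python
# Back-of-the-envelope annual storage cost for AWS S3 Glacier Deep Archive.
RATE_PER_GB_MONTH = 0.00099  # published storage rate, USD per GB per month

def annual_cost_per_tb(rate_per_gb_month: float = RATE_PER_GB_MONTH) -> float:
    """Annual cost of storing 1 TB (1,000 GB) at the given per-GB-month rate."""
    gb_per_tb = 1_000
    months_per_year = 12
    return rate_per_gb_month * gb_per_tb * months_per_year

print(f"${annual_cost_per_tb():.2f} per TB per year")  # → $11.88 per TB per year
```

Note that this covers storage only; retrieval, request, and egress charges are billed separately.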

Disadvantages of Cloud Archive at Scale


However, cloud archive is not without its disadvantages. Although it helps organizations save time and money compared to a labor-intensive solution like tape, it becomes cumbersome to manage once data grows to the scale of hundreds of terabytes or petabytes.

Moving data to the cloud becomes labor- and time-intensive when there's lots of data. One problem is cost: cloud providers charge a put fee for each object moved to the cloud, so if IT moves millions or billions of files, the per-request charges add up quickly. Another challenge is simply keeping track of what data is in the cloud; it takes massive IT effort to track individual files as they are moved to cloud buckets so that a specific dataset can be found among billions of files when recovery is needed. Additionally, transactional costs for puts, gets, retrievals, lists, scans, transfers, and a whole host of other operations in the cloud can accumulate quickly if an organization isn't managing data efficiently.
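As a rough illustration of how per-object request fees scale, the sketch below estimates PUT charges for a billion files uploaded individually versus the same files grouped into larger objects. The $0.05-per-1,000-PUT-requests rate is an assumption for illustration only; actual pricing varies by provider and storage tier:

```python
# Illustrative PUT-request fee comparison: one object per file vs. many files
# grouped into one uploaded object. The rate below is an assumed example.
PUT_COST_PER_1000 = 0.05  # assumed USD per 1,000 PUT requests

def put_fees(num_objects: int, cost_per_1000: float = PUT_COST_PER_1000) -> float:
    """Total PUT-request charges for uploading num_objects objects."""
    return num_objects / 1_000 * cost_per_1000

files = 1_000_000_000        # one billion small files
files_per_object = 10_000    # group many files into each uploaded object

print(f"one object per file: ${put_fees(files):,.2f}")
print(f"grouped objects:     ${put_fees(files // files_per_object):,.2f}")
```

At these assumed rates, uploading every file as its own object costs tens of thousands of dollars in request fees alone, while grouping files first drops that to a few dollars, which is the motivation for the blob format discussed below.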

Cloud archive at scale requires an intelligent system or service to simplify the process of moving data to cloud, keep track of the data moved to cloud, and minimize the total costs of archiving data to cloud.

Solving the Problems of Cloud Archive at Scale


Igneous was designed to handle the problems of moving and managing unstructured data at scale. In the case of cloud archive, a number of Igneous features address the drawbacks that come up when data grows to the scale of hundreds of terabytes or petabytes.

  • Direct API Integration: Integrate seamlessly with all tiers of AWS, Microsoft Azure, and GCP.
  • Automated, Policy-Driven Workflows: All IT needs to do is identify what data to archive and define policies.
  • Efficient Format: Many files are grouped into single objects called blobs, drastically reducing total put costs.
  • Metadata Index: Keeps track of what is stored in cloud.
  • Metadata Access through Read-Only NFS: Users can browse metadata directly over read-only NFS to verify which files they need, identifying the correct files to restore and saving IT time.
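The blob-plus-index idea in the features above can be sketched in a few lines: pack many small files into one archive object and keep a separate metadata index recording which blob holds each file. This is a minimal illustration only; the function and file names are hypothetical, and Igneous' actual on-cloud format is not described here:

```python
import json
import tarfile
from pathlib import Path

def pack_blob(files: list[Path], blob_path: Path, index_path: Path) -> None:
    """Pack many small files into one tar 'blob' and write a JSON index
    mapping each file name to the blob that contains it."""
    with tarfile.open(blob_path, "w") as blob:
        for f in files:
            blob.add(f, arcname=f.name)
    index = {
        f.name: {"blob": blob_path.name, "size": f.stat().st_size}
        for f in files
    }
    index_path.write_text(json.dumps(index, indent=2))
```

Uploading one blob instead of thousands of individual files turns thousands of PUT requests into one, and the index answers "which blob holds this file?" locally, without listing or scanning objects in the cloud.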

To learn more about how Igneous DataProtect handles backup and archive of massive unstructured datasets, download the datasheet.

