
AWS DataSync, AWS Storage Gateway and Igneous DataProtect

By Scott Stanton on June 30, 2020

“If your only tool is a hammer, every problem looks like a nail.”

One question we hear a lot at Igneous is: what’s the difference between AWS DataSync, AWS Storage Gateway, and our DataProtect backup solution? The answer is simple: it’s about having the right tool for the job.

A small tacking hammer and a 10 lb. sledgehammer are both hammers - designed to drive things into other things. Could you use the sledgehammer to drive tacks? Yes. Would it be overkill? Yes. Now, consider the reverse: can you drive a splitting wedge with a tacking hammer? No, probably not. They share the same basic function, but were each built for radically different tasks.

Like Igneous DataProtect, both AWS DataSync and AWS Storage Gateway are delivered as virtual machines that you deploy in your own datacenter (Storage Gateway and Igneous can both also be hosted on dedicated hardware, but this comparison will focus on the VM versions only). So what’s the difference? 

Moving Data at Different Scales

DataSync, Storage Gateway, and DataProtect can all move data to the cloud, but just like a tack hammer and a sledgehammer, they are designed for very different jobs, specifically in the scale of data they are built to move. This distinction is most evident when you consider how quickly each solution is able to move data. At Igneous we ran a couple of experiments in our testing environment to see what the practical, real-world performance would be for each product. Since all of these products are deployed in the form of a VM, it was easy to test them in the same environment.

We started with a dataset that represents a typical workload for data-centric organizations. It consisted of 32 million files with an average size of 125 KB each (about 4 TB total). The source NAS was a Pure Storage FlashBlade, and we moved the data to AWS over a 10 Gb/s AWS Direct Connect link. The VMs were configured with the recommended resources for each product.
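As a quick sanity check on those figures, the short Python sketch below works out the total dataset size and the theoretical minimum transfer time if the 10 Gb/s link were fully saturated. It assumes decimal units (1 KB = 1,000 bytes) and ignores protocol and per-file overhead, so it is a lower bound for comparison and not a measured result.

```python
# Back-of-the-envelope numbers for the test dataset described above.
# Assumptions: decimal units (1 KB = 1,000 bytes) and a perfectly
# saturated link with no protocol or per-file overhead.
FILE_COUNT = 32_000_000
AVG_FILE_BYTES = 125_000              # 125 KB average file size
LINK_BITS_PER_SEC = 10_000_000_000    # 10 Gb/s link


def dataset_bytes(file_count, avg_file_bytes):
    """Total dataset size in bytes."""
    return file_count * avg_file_bytes


def line_rate_seconds(total_bytes, link_bits_per_sec):
    """Seconds needed to move total_bytes at full line rate."""
    return total_bytes * 8 / link_bits_per_sec


total = dataset_bytes(FILE_COUNT, AVG_FILE_BYTES)
seconds = line_rate_seconds(total, LINK_BITS_PER_SEC)
print(f"Dataset size: {total / 1e12:.1f} TB")
print(f"Best case at line rate: {seconds / 60:.0f} minutes")
```

Any real transfer will take longer than this ideal, and with tens of millions of small files the per-file overhead tends to dominate over raw bandwidth.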

[Chart: AWS DataSync and AWS Storage Gateway performance versus Igneous DataProtect]


The chart above shows how much time it took to move the data, and the effective data transfer rate each solution delivered. It is clear that scale plays a huge role in how effective each product will be in moving data. The more data you have, the more you need a tool that is designed to handle scale.

Igneous DataProtect was built for scale. It scans and moves files at the same time, which is especially important if you have small to medium-sized files. Compare that to DataSync, which doesn’t maintain an index, so it has to scan both the source and the target each time it does an incremental move. As a result of this limitation, AWS recommends that each VM instance be limited to 20 million files or fewer.
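To illustrate the general idea of scanning and moving at the same time, here is a minimal Python sketch of a producer/consumer pipeline: one thread walks the source tree while several workers copy files as soon as they are discovered. This is not Igneous’ implementation; the function names (`scan`, `move_worker`, `scan_and_move`) are hypothetical, and a local directory copy stands in for a cloud upload.

```python
import os
import queue
import shutil
import tempfile
import threading

_DONE = object()  # sentinel that tells the workers the scan has finished


def scan(root, q):
    """Walk the source tree and queue each file path as soon as it is found."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            q.put(os.path.join(dirpath, name))
    q.put(_DONE)


def move_worker(src_root, dst_root, q, copied):
    """Copy queued files while the scan is still running."""
    while True:
        path = q.get()
        if path is _DONE:
            q.put(_DONE)  # pass the sentinel on to the next worker
            return
        rel = os.path.relpath(path, src_root)
        target = os.path.join(dst_root, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(path, target)
        copied.append(rel)


def scan_and_move(src_root, dst_root, workers=4):
    """Run one scanner and several movers concurrently; return copied paths."""
    q = queue.Queue(maxsize=1000)  # bounded so the scan cannot run far ahead
    copied = []
    threads = [threading.Thread(target=scan, args=(src_root, q))]
    threads += [
        threading.Thread(target=move_worker, args=(src_root, dst_root, q, copied))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(copied)


# Small demonstration using temporary directories.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for i in range(3):
        with open(os.path.join(src, f"file{i}.txt"), "w") as fh:
            fh.write(f"payload {i}")
    print(scan_and_move(src, dst))
```

The contrast is with a scan-then-move design, where no data moves until the full listing is complete; with tens of millions of files, that listing phase alone can take a long time.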

Other Important Considerations

Protecting data is more than just moving it to a backup system, especially when you are dealing with petabytes or more of data. AWS DataSync is designed to sync files between on-prem primary storage and an AWS cloud bucket. It doesn’t offer basic backup functions like file versioning, or even the ability to target AWS’ least expensive storage tiers (Glacier and Glacier Deep Archive). DataSync also doesn’t keep track of where it has moved data, so finding that data when you need to restore could be challenging.

Storage Gateway is intended to trick your legacy, cloud-unaware data management tools into thinking that the cloud is a local storage system like a backup NAS system or a tape library. That means you’ll still need a separate backup software solution to handle the actual data movement. But remember, as the performance data shows, you’ll need enough local storage cache to handle your largest backup image, or it will never keep up.

DataSync is more like legacy systems where each share and target needs to be individually set up. Igneous DataProtect, by comparison, features an intuitive and easy-to-use dashboard. Add API integration with the most popular NAS systems and setup is literally a 15-second task. The ability to create default backup policies means that entire NAS systems can be bulk-enabled for backup all at once. In addition to a simpler setup, Igneous testing found that DataProtect is much faster than DataSync, meaning that you can protect more data in less time.

When it comes to restores, DataSync is able to restore single files or whole shares but nothing in between. With Igneous you can search for a single file or sub-directory, and it only takes a single click to restore to any attached system (regardless of data source).

Igneous is also specifically engineered to limit exposure to the high transaction costs associated with moving files to and from AWS cloud storage. Igneous software intelligently groups small files and then compresses them to reduce PUT and GET request costs. Igneous can also automatically delete files from cloud storage based on user-defined policies, and do so cost-effectively by minimizing the number of individual delete requests.
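The effect of grouping small files can be sketched in a few lines of Python. The example below packs in-memory files into compressed tar archives of a bounded size and compares the number of objects that would need to be uploaded. It is an illustration of the general technique only, not Igneous’ actual format or grouping logic; `pack_batches`, `_build_archive`, and the 100 KB batch limit are assumptions made for the example, and no AWS calls are made.

```python
import io
import tarfile


def pack_batches(files, max_batch_bytes):
    """Group {name: bytes} files into gzip-compressed tar archives.

    A new archive is started whenever adding the next file would push the
    uncompressed batch size past max_batch_bytes.
    """
    batches = []
    current, current_size = [], 0
    for name, data in files.items():
        if current and current_size + len(data) > max_batch_bytes:
            batches.append(_build_archive(current))
            current, current_size = [], 0
        current.append((name, data))
        current_size += len(data)
    if current:
        batches.append(_build_archive(current))
    return batches


def _build_archive(members):
    """Build one in-memory .tar.gz blob from (name, bytes) pairs."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# 1,000 small files of 1,000 bytes each, packed into batches of up to 100 KB.
files = {f"file{i:04d}.txt": b"x" * 1000 for i in range(1000)}
archives = pack_batches(files, max_batch_bytes=100_000)
print(len(files), "individual uploads vs", len(archives), "batched uploads")
```

Because object storage requests are billed per request, uploading a handful of archives instead of a thousand individual objects reduces the request count proportionally, and compression shrinks the bytes stored as well.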

Simply put, neither DataSync nor Storage Gateway can match Igneous DataProtect for backup and archive at scale. Igneous advantages include:  

  • Minimize Costs: Cloud-native architecture ensures that costs other than just raw storage costs are taken into account and controlled

  • Flexibility: Any source, any target, any protocol - on-prem or in the cloud

  • Simplicity: Configuration and management are designed for scale, and super simple

  • Success Assurance: Remote monitoring and ‘as-a-service’ management to ensure SLAs are met

Using the right tool saves you time, money, and pain.

Having the right tool for the job isn’t just more convenient, sometimes it makes the difference between success and failure. If you have to split a cord of wood and all you have is a tack hammer, you are not likely to succeed. 

