
How Igneous Selects Weekly Release Candidates for Production

by Rob Homan – August 14, 2018

Streaming out a weekly software update brings joy to customers and engineers alike. Customers receive cutting-edge features and timely bug fixes, while engineers transform bright ideas into production realities with minimal turnaround.

At the same time, validating each weekly software update is no small feat. It takes thoughtful development and serious automation to select a high-quality release candidate each week and to ensure it is free from critical bugs or performance degradations.

At Igneous, we rely on a variety of automated techniques to identify a release candidate and further validate that it is ready for the production primetime spotlight. Today, I’ll cover how we use comprehensive test automation to identify high-quality release candidates amongst the barrage of changes that are integrated into our source code each day.

Test, Test, Test

Testing is fundamental to any software organization’s success, and as such, testing is baked into our DNA here at Igneous. We’ve invested heavily in integrating testing with our developers’ workflows in a powerful yet streamlined fashion, which routinely pays quality and efficiency dividends. It has led to the creation of an in-house continuous integration automaton dubbed “Iggybot,” which manages all test execution grunt work for us. Thanks to the unique way in which Iggybot integrates with developer workflows, our process contains multiple layers of defense against the introduction of bugs and regressions.

Our first layer of defense engages for every proposed code change. Whenever a proposed change is posted, Iggybot automatically dispatches an extensive series of test suites. For a code change to even be eligible to merge into mainline, it must pass every single one of these checks. These test suites include everything from unit tests to component-level tests to end-to-end tests of customer workflows. Each of these tests is executed across an excruciating number of product variants with massive parallelism.


A visualization of the automated checks dispatched by Iggybot for every proposed code change. If one of these checks had failed, the proposed code change would not be allowed to merge.
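This fan-out of checks can be sketched as a small merge gate. Everything here is illustrative (the suite names, variant names, and `run_suite` stub are hypothetical; Iggybot's internals are not public), but it shows the core policy: every suite must pass on every product variant, in parallel, before a change is merge-eligible.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical suites and product variants; the real matrix is much larger.
SUITES = ["unit", "component", "end-to-end"]
VARIANTS = ["variant-a", "variant-b", "variant-c"]

def run_suite(suite, variant):
    # Stand-in for actually executing a test suite against one variant.
    return {"suite": suite, "variant": variant, "passed": True}

def eligible_to_merge(change_id):
    """A change is merge-eligible only if every (suite, variant) check passes."""
    jobs = list(product(SUITES, VARIANTS))
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        results = list(pool.map(lambda job: run_suite(*job), jobs))
    return all(r["passed"] for r in results)

print(eligible_to_merge("change-123"))
```

A single failing cell in the suite-by-variant matrix makes `all(...)` false, which is exactly the "must pass every single one of these checks" gate described above.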

Once the proposed change passes all checks and a thorough code review, it is marked “ready” for merge into mainline. We delegate the honors to Iggybot, however, to actually execute the merge and dispatch the test suites once more, just in case a concurrent merge introduces a conflict. This is our second layer of defense, and automating it provides a major advantage: it allows us to halt merges at the first sign of trouble. Iggybot smartly monitors for test failures on recent merges. When it observes too many, it refuses to merge further changes until those failures are addressed.
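The halt-on-failure policy amounts to a sliding window over recent merge builds. A minimal sketch, assuming an invented `MergeGate` class and thresholds (the window size and failure limit are illustrative, not Igneous' actual values):

```python
from collections import deque

class MergeGate:
    """Refuse further merges once too many recent merge builds have failed."""

    def __init__(self, window=10, max_failures=2):
        self.recent = deque(maxlen=window)  # True = merge build passed
        self.max_failures = max_failures

    def record(self, passed):
        self.recent.append(passed)

    def merges_allowed(self):
        failures = sum(1 for ok in self.recent if not ok)
        return failures <= self.max_failures

gate = MergeGate(window=10, max_failures=2)
for ok in [True, True, False, True, False, False]:
    gate.record(ok)
print(gate.merges_allowed())  # 3 failures in the window > 2 allowed -> False
```

The bounded `deque` means old failures age out on their own once fixes land and green builds push them out of the window.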

The nightly tests are where things really get interesting: our third layer of defense. Each night at midnight, Iggybot selects a daily release candidate and slams it with multiple repetitions of simulated testing in order to root out pesky transient failures. Concurrently, it brutalizes that version by spinning it up on real hardware and running load tests, stress tests, and upset tests. These tests verify that our system fails over properly and degrades gracefully and safely in catastrophe scenarios. Finally, Iggybot runs the selected release candidate on real hardware systems to verify that backup and restore performance meets our benchmarks. When failures occur, we triage every single one and meticulously address every possible product concern. All told, 419 machine hours (over 17 machine days) are spent validating our product each night.


A series of merged code changes to mainline is displayed above. The entry for “Merge pull request #11602” was subjected to nightly tests. All 5 test failures were investigated for bugs. Since no critical product bugs caused any of the failures, this software version is eligible for selection as a release candidate.
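The value of those repeated nightly runs is easy to demonstrate. In this toy sketch (the `flaky_suite` stub and a 20-repetition count are assumptions for illustration), a suite with a transient failure passes on most runs; a single execution could easily miss it, while repetition surfaces it for triage:

```python
def flaky_suite(run_index):
    # Stand-in for a suite with a transient failure: it fails on exactly
    # one repetition, mimicking a rare race or timing-dependent bug.
    return run_index != 7

def repeat_to_catch_flakes(suite, repetitions=20):
    """Run the suite many times; return which repetitions failed, for triage."""
    return [i for i in range(repetitions) if not suite(i)]

print(repeat_to_catch_flakes(flaky_suite))  # -> [7]
```

A one-shot run has a 95% chance of looking green here; twenty repetitions catch the transient failure every time, which is the point of hammering the daily candidate overnight.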

Enter the Pipeline

After all this testing work (thank you, Iggybot!), we are now ready to select a high-quality release candidate for our pipeline. We do this by starting with a software version that performed well in our nightly testing and triaging all issues to ensure there are no critical product bugs. If it looks clean, we submit that release candidate into a validation pipeline.
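The selection rule itself reduces to a filter and a sort. A hedged sketch (the build names, field names, and tie-breaking by fewest failures are all assumptions, not Igneous' published criteria): any build whose failures were triaged down to zero critical product bugs is eligible, matching the pull-request example in the screenshot above where five non-critical failures did not disqualify the build.

```python
# Hypothetical nightly results; in practice these come from triage records.
nightly_results = [
    {"version": "build-1101", "failures": 5, "critical_bugs": 0},
    {"version": "build-1102", "failures": 1, "critical_bugs": 1},
    {"version": "build-1103", "failures": 0, "critical_bugs": 0},
]

def select_release_candidate(results):
    """Pick the cleanest nightly build that has no critical product bugs."""
    eligible = [r for r in results if r["critical_bugs"] == 0]
    if not eligible:
        return None  # no safe candidate this week; hold the pipeline
    return min(eligible, key=lambda r: r["failures"])["version"]

print(select_release_candidate(nightly_results))  # -> build-1103
```

Note that build-1102, despite having the fewest raw failures, is excluded outright: a single critical bug vetoes a candidate no matter how green it otherwise looks.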

From here, the release candidate journeys through a series of pre-production environments and automated checks before it becomes eligible for production deployment. This journey and the automation that drives it bring forth yet another collection of defense mechanisms: longevity testing, deployment testing, and internal rehearsals of our Critical Incident Process, to name a few, all of which will be covered in depth in future blog posts.

If you enjoyed learning about our production pipeline, be sure to check out our open positions!
