Rapid Vulnerability Management in the Cloud with Blue/Green Deployments

When a critical vulnerability announcement affects a majority of your cloud infrastructure, how do you mitigate it quickly and with minimal downtime? Vulnerability management is challenging in a cloud IaaS environment because many systems are ephemeral and launched from baseline images (Gold Disk, AWS AMI, etc.). To gain better visibility into these dynamic environments, we must assess vulnerabilities on baseline images as well as on running systems.

By assessing baseline images for vulnerabilities, we close security gaps introduced by ephemeral instances that scale automatically. What happens if a software repository used for updating software and security packages is unavailable when a new system is launched? If the baseline image is out of date and this ‘patch after deployment’ method fails, we could be putting our organization at risk. Quick ‘restacking’ of infrastructure in the cloud environment is necessary to avoid reintroducing old vulnerabilities.

When a new vulnerability is disclosed, we can quickly create a new, patched baseline image and share it with the entire organization. This process consists of launching a system from the previous baseline, installing security updates, and creating a new baseline from this system. When new security patches are released, the security team should coordinate with the organization’s cloud infrastructure team to create and release a new image. Once this image is created, it should be thoroughly tested, including with vulnerability scanning. Its availability should then be communicated to the entire organization, along with instructions for rapid integration into production applications.
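The rebake steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name rebake_baseline, the BakeResult type, and the five callables (launch, apply_updates, snapshot, scan_is_clean, terminate) are hypothetical placeholders for whatever your cloud provider's SDK and vulnerability scanner actually offer. They are not real API calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BakeResult:
    image_id: Optional[str]
    released: bool
    reason: str


def rebake_baseline(
    previous_image_id: str,
    launch: Callable[[str], str],          # hypothetical: image id -> instance id
    apply_updates: Callable[[str], None],  # hypothetical: patch the instance
    snapshot: Callable[[str], str],        # hypothetical: instance id -> new image id
    scan_is_clean: Callable[[str], bool],  # hypothetical: scanner verdict for an image
    terminate: Callable[[str], None],      # hypothetical: destroy the build instance
) -> BakeResult:
    """Launch from the old baseline, patch it, and capture a new baseline."""
    instance_id = launch(previous_image_id)
    try:
        apply_updates(instance_id)
        new_image_id = snapshot(instance_id)
    finally:
        # The build instance is only scaffolding; always clean it up.
        terminate(instance_id)

    # Gate the release on the scan so an unverified image is never announced.
    if not scan_is_clean(new_image_id):
        return BakeResult(new_image_id, False, "vulnerability scan failed")
    return BakeResult(new_image_id, True, "ready to announce")
```

Passing the cloud operations in as callables keeps the sketch independent of any one provider and makes the sequence easy to exercise with fakes before wiring it to real infrastructure.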

Blue/Green deployment is a popular strategy for automating application releases. It involves running two identical production stacks (environments) simultaneously, with the ability to quickly switch production workloads between them. While one environment runs in production, the other is used to test final changes or updates to the application. Once testing is complete, production traffic is routed to the new stack. Several different Blue/Green deployment architectures are used throughout organizations, but the end goal of each remains the same: rapid updates to applications with minimal downtime.

By using Blue/Green deployments as part of our vulnerability management strategy, we can have the entire organization running on the patched baseline image in production extremely quickly. As an example, let’s say an application is currently running on a ‘Green’ environment. When a new baseline is released with appropriate security patches, each application team would stand up a ‘Blue’ environment built from this new image. If application deployments are completely automated (using CloudFormation or Chef, for example), this should require minimal time and effort. After rollout of the Blue stack is completed, QA testing and vulnerability scanning can be performed before switching the new Blue environment to production. When the application team is confident in the stability of this new environment, the old Green environment can be destroyed.
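As a rough sketch of the switch itself, the following Python models the rule that traffic moves to the Blue stack only after every check passes. The Stack and Router classes and the cut_over function are hypothetical illustrations. In practice the router would be a load balancer or DNS record, and the checks would call your QA suite and vulnerability scanner.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Stack:
    name: str
    image_id: str


class Router:
    """Toy stand-in for whatever directs production traffic."""

    def __init__(self, live: Stack) -> None:
        self.live = live


def cut_over(
    router: Router,
    candidate: Stack,
    checks: Dict[str, Callable[[Stack], bool]],
) -> List[str]:
    """Switch traffic to the candidate only if all checks pass.

    Returns the names of failed checks; an empty list means the switch happened.
    """
    failed = [name for name, check in checks.items() if not check(candidate)]
    if not failed:
        router.live = candidate
    return failed


green = Stack("green", image_id="baseline-old")
blue = Stack("blue", image_id="baseline-patched")
router = Router(live=green)

failed = cut_over(
    router,
    blue,
    checks={
        "qa": lambda stack: True,  # placeholder for a real QA run
        "vuln_scan": lambda stack: stack.image_id == "baseline-patched",
    },
)
```

The sketch deliberately leaves the Green stack in place after the switch. Destroying it is a separate step, taken once the team is confident in the new environment.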

Knowing that a system is using an up-to-date baseline image does not tell us whether its software packages have been downgraded, or whether other vulnerabilities have been introduced after deployment. It does, however, provide a level of confidence that the system was not vulnerable when it was first launched. By using a Blue/Green deployment strategy alongside automated restacking on baseline images, we reduce risk because patching no longer depends on external systems, such as package repositories, being available at launch time. With this method, the security team can quickly develop a baseline vulnerability risk score for their entire cloud environment by determining the percentage of systems running on vulnerable images. Correlating this score with real-time vulnerability data from running instances can provide enhanced visibility for the organization.
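The baseline risk score described above is simple arithmetic: the number of systems running on vulnerable images, divided by the total number of systems. A minimal Python sketch follows. The function name and the image identifiers are made up for illustration, and treating an empty inventory as a score of zero is an assumption.

```python
from typing import Iterable, Set


def baseline_risk_score(instance_images: Iterable[str], vulnerable_images: Set[str]) -> float:
    """Return the percentage of instances launched from a vulnerable image.

    instance_images holds one image id per running instance.
    """
    images = list(instance_images)
    if not images:
        return 0.0  # assumption: an empty inventory carries no baseline risk
    vulnerable = sum(1 for image_id in images if image_id in vulnerable_images)
    return 100.0 * vulnerable / len(images)


# Hypothetical inventory: four instances, two still on the old baseline.
running = ["baseline-old", "baseline-old", "baseline-patched", "baseline-patched"]
score = baseline_risk_score(running, vulnerable_images={"baseline-old"})
```

With this inventory the score is 50.0, meaning half of the fleet was launched from a vulnerable baseline and is due for restacking.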

How are you performing vulnerability management in your Cloud or virtualized environment?