Automating your VPC with Terraform and Jenkins

04 Apr 2016

In DevOps, the saying goes: once you have to complete a task manually more than once, it's time to automate it. One of the tasks we do repeatedly in AWS is duplicating environments for Dev, QA, Staging, and Production. So it's time to create some templates and automate the process with Terraform and Jenkins. This will allow us to implement push-button creation of our infrastructure as needed, on demand, with thoroughly tested code that is predictable, efficient, and reliable.

First, we need to decide how we are going to separate environments in AWS. In our experience, the best way to do so is to use different VPCs for each special use case:

  • Infrastructure VPC
  • Production VPC
  • Development VPC
  • Staging VPC
  • QA VPC

The first VPC we want to create is for our company infrastructure. We want to model our current legacy/bare metal or VMware environment as closely as possible in the new cloud infrastructure, while upgrading and improving as needed. This means we will have a mixture of Windows and Linux servers, so plan accordingly. Now is a great time to plan for OS upgrades and security update automation. I mean, who doesn't like a greenfield project with the flexibility to fix all the mistakes of one's current infrastructure environment? One thing to keep in mind is to use replication and/or master/slave setups so that your local office does not go down if the link to AWS goes down. After all, it's way up in the cloud :)

Now, let's set up our initial Terraform script to create our Infrastructure VPC. We are going to use the AWS stack from segment.io for all our Terraform automation because it is thoroughly tested code and works out of the box. This is a good place to start learning Terraform if you have never used it. All you have to do is give most of the variables a default value, and Terraform won't prompt you for interactive input. So head over to GitHub and clone the segment.io AWS Terraform stack so we can customize it to fit our needs. You can always fork it and point the scripts to pull from your own repository. Oh, and don't make the rookie mistake of pushing your AWS keys up to GitHub, or you will be hosting a botnet in a matter of minutes. It's best to put your AWS keys in your ~/.bash_profile as environment variables and use them that way.

https://github.com/segmentio/stack
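For example, a minimal sketch of those exports in ~/.bash_profile (the values here are placeholders); Terraform's AWS provider and the AWS CLI both pick these up automatically:

# ~/.bash_profile -- AWS credentials as environment variables
# (placeholder values; never commit real keys to version control)
export AWS_ACCESS_KEY_ID="AKIAXXXXXXXXXXXXXXXX"
export AWS_SECRET_ACCESS_KEY="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export AWS_DEFAULT_REGION="us-west-1"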

When setting up the Infrastructure VPC (public and private subnets, NAT gateway, initial security groups, and basic routes), keep your current network subnets in mind so the address ranges do not overlap; overlapping ranges will break connectivity between the networks. Remember that AWS provides multiple ways to connect to the cloud network, such as Direct Connect or VPN. Most enterprise networks already have multiple private subnets, and the VPC's additional private or public subnets will effectively extend that network.
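As a hypothetical illustration, if your office network already owns 10.0.0.0/16, you might carve the VPC out of a separate range. The CIDR values and resource names below are my own assumptions, not a recommendation:

# Hypothetical layout: the office LAN owns 10.0.0.0/16,
# so the Infrastructure VPC takes a non-overlapping 10.30.0.0/16.
resource "aws_vpc" "infrastructure" {
  cidr_block           = "10.30.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags {
    Name = "infrastructure"
  }
}

# One public and one private subnet in a single availability zone.
resource "aws_subnet" "public" {
  vpc_id            = "${aws_vpc.infrastructure.id}"
  cidr_block        = "10.30.0.0/24"
  availability_zone = "us-west-1a"
}

resource "aws_subnet" "private" {
  vpc_id            = "${aws_vpc.infrastructure.id}"
  cidr_block        = "10.30.100.0/24"
  availability_zone = "us-west-1a"
}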


We also want to decide what region to designate for the Infrastructure VPC. We can duplicate it in multiple regions for HA/DR, and we can load balance between multiple availability zones as well. Replication and backups will have to be planned accordingly, because AWS does not provide this capability out of the box unless you're using RDS or some other managed infrastructure.
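One way to make the region easy to swap is to parameterize it; this is a sketch, and the variable name is my own rather than part of the segment.io stack:

# Parameterize the region so the same stack can be stamped
# into another region for HA/DR by changing one variable.
variable "region" {
  default = "us-west-1"
}

provider "aws" {
  region = "${var.region}"
}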

Plan the security of your infrastructure environment for future support by creating separate security keys for access to the servers. You can stage those keys on the Jenkins server so they are available for deployments, for connecting to needed resources, and for uploading to other regions if necessary. Just make sure you lock down the files and folders on the Jenkins server so that only root and the jenkins user have access, to limit the damage if the server is compromised. When writing the first Terraform script for your Infrastructure VPC, expect to refactor your code a few times until you get it right. We will be keeping the Terraform state file in an S3 bucket in AWS, so our scripts will always have access to the current objects and names in AWS.
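A minimal sketch of that lockdown, assuming the keys are staged under /var/lib/jenkins/.ssh (adjust the path to wherever you keep them):

# Restrict the staged keys to the jenkins user (root can always read them)
sudo chown -R jenkins:jenkins /var/lib/jenkins/.ssh
sudo chmod 700 /var/lib/jenkins/.ssh
sudo chmod 600 /var/lib/jenkins/.ssh/*.pem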

The VPC Terraform file from segment.io only creates the base network plumbing, which is what we want so we can duplicate the environment at the most basic level and modularize the build as needed. After the initial plumbing is in, we can start building infrastructure servers. The first server you want to build is your configuration management server. Many DevOps engineers automate the build of as many infrastructure servers as possible because it is predictable and reliable. Once your configuration management server is up and running, start downloading cookbooks, recipes, and playbooks to automate the setup of the rest of your servers.

My recommendation is to start with a t2.medium and scale as needed. Here is an easy way to create instances with Terraform and script the installation of your configuration management server. If the infrastructure service uses a shared database or backend with separate frontend web servers, you can easily create multiple frontend web servers and put them behind an ELB for HA; Graylog is one service whose architecture fits this pattern nicely.

resource "aws_instance" "chef" {
   count = 1
   ami = "ami-a9a8e4c9"
   instance_type = "t2.medium"
   key_name = "chef_server.pem"
   subnet_id = "${element(aws_subnet.external.*.id, count.index)}"
   vpc_security_group_ids =  ["${aws_security_group.chef.id}"]
   connection {
     user = "ubuntu"
     key_file = "~/.ssh/chef_server.pem"
   }

   provisioner "remote-exec" {
     inline = [
       "sudo apt-get update && sudo apt-get upgrade"
       “wget https://web-dl.packagecloud.io/chef/stable/packages/ubuntu/trusty/chef-server-core_12.8.0-1_amd64.deb”
       “sudo dpkg -i chef-server-core_*”
       “sudo chef-server-ctl reconfigure”
       “sudo chef-server-ctl status”      
     ]
   }
}

We will be using Route 53 for local DNS resolution and standard DHCP from AWS. After you create a local, private DNS zone in Route 53, make sure you swap in a new DHCP options set so that your instances get the correct DNS information for your new zone. Here is how to create records with Terraform:

variable "aws_zone_id" {
 default = "ZASDFASDGASDG"
}

variable "hostname" {
 default = "chef.sphereinclabs.com"
}

resource "aws_route53_record" "chef" {
 zone_id = "${var.aws_zone_id}"
 name = "${var.hostname}"
 type = "A"

 alias {
   name = "${aws_elb.chef_server.dns_name}"
   zone_id = "${aws_elb.chef_server.zone_id}"
   evaluate_target_health = false
 }
}
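And here is a sketch of the DHCP options swap mentioned above, assuming the VPC resource from the earlier CIDR sketch; the domain name is this post's example zone, and AmazonProvidedDNS keeps the standard AWS resolver in place:

# Hand out the private zone's domain name via DHCP while keeping
# the AWS-provided DNS resolver.
resource "aws_vpc_dhcp_options" "infrastructure" {
  domain_name         = "sphereinclabs.com"
  domain_name_servers = ["AmazonProvidedDNS"]
}

resource "aws_vpc_dhcp_options_association" "infrastructure" {
  vpc_id          = "${aws_vpc.infrastructure.id}"
  dhcp_options_id = "${aws_vpc_dhcp_options.infrastructure.id}"
}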

If you're currently running a Linux-only infrastructure, you will want to set up an LDAP server, a configuration management server, an email server, OpenVPN (unless you have Direct Connect), a file server, a central logging server like Graylog, a monitoring server like Sensu, Prometheus, or Nagios, and whatever else you currently run in your bare metal/VMware environment.


If Windows is your main infrastructure platform, you will probably have a few DNS servers, domain controllers, an anti-virus management console, maybe SCCM/SCOM, SQL Server, WSUS (Windows Server Update Services), and web servers for different services.

I won’t get into how to provision those infrastructure servers because you have so many options, but for those of you who have never done this task, here are a few suggestions:

  1. Use Terraform to create the instance, then use a provisioning script to kick off a Chef cookbook/recipe.
  2. Create an AMI image of the pre-configured server, then reference that AMI in the server's template (see the sketch after this list).
  3. Create Docker images, or deploy a preconfigured Docker image.
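A minimal sketch of option 2, assuming you have already baked an AMI and noted its ID; the AMI ID and the ldap_ami variable are placeholders of my own, and the subnet reference mirrors the chef example above:

# Launch an instance from a pre-baked AMI; all configuration is
# already inside the image, so no provisioner is needed.
variable "ldap_ami" {
  default = "ami-00000000" # placeholder: your pre-baked AMI ID
}

resource "aws_instance" "ldap" {
  ami           = "${var.ldap_ami}"
  instance_type = "t2.medium"
  key_name      = "chef_server"
  subnet_id     = "${element(aws_subnet.external.*.id, 0)}"
}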

After you have created your Terraform Infrastructure VPC, we want to automate it with Jenkins. We will give the user three options: Plan (test the code), Apply (add or update infrastructure), or Destroy (start over). To allow multiple users to add or update infrastructure, we will use an S3 bucket to store our Terraform state file, which holds the current objects and names of each AWS resource.

Here is how to set up Jenkins to run the Terraform scripts against an S3 bucket that holds our state file. Make sure the bucket exists and the Jenkins server has the proper permissions so this process will succeed.
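Creating the bucket itself is a one-time step; with the AWS CLI, for example:

# One-time setup: create the bucket that will hold our Terraform state
aws s3 mb s3://terraform-vpc-state-file --region us-west-1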

Here is what you need to automate the state file updates to an S3 bucket. This pushes and pulls your state file between the bucket and the current Jenkins workspace.

#!/bin/bash -e

# Derive the state file key from the repo directory name
PROJECT="$(basename `pwd`)"
BUCKET="terraform-vpc-state-file"

init() {
  if [ -d .terraform ]; then
    if [ -e .terraform/terraform.tfstate ]; then
      echo "Remote state already exists!"
      if [ -z "$IGNORE_INIT" ]; then
        exit 1
      fi
    fi
  fi

  # Wire up the S3 remote state backend for this project
  terraform remote config \
    -backend=s3 \
    -backend-config="bucket=${BUCKET}" \
    -backend-config="key=${PROJECT}/terraform.tfstate" \
    -backend-config="region=us-west-1"
}

# Pass -i to re-initialize even if remote state already exists
while getopts "i" opt; do
  case "$opt" in
    i)
      IGNORE_INIT="true"
      ;;
  esac
done

shift $((OPTIND-1))

init
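Save this script as init at the root of the repository and make it executable (chmod +x init); the Jenkins job below invokes it as ./init.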

Here is the code you need for your Jenkins job:

node {

   // Mark the code checkout 'Checkout'....
   stage 'Checkout'

   // Get some code from a GitHub repository
   git url: 'git@github.com:knott-sphere/infrastructure.git'

   // Get the Terraform tool.
   def tfHome = tool name: 'Terraform', type: 'com.cloudbees.jenkins.plugins.customtools.CustomTool'
   env.PATH = "${tfHome}:${env.PATH}"
   wrap([$class: 'AnsiColorBuildWrapper', colorMapName: 'xterm']) {

           // Mark the code build 'plan'....
           stage name: 'Plan', concurrency: 1
           // Output Terraform version
           sh "terraform --version"
           //Remove the terraform state file so we always start from a clean state
           if (fileExists(".terraform/terraform.tfstate")) {
               sh "rm -rf .terraform/terraform.tfstate"
           }
           if (fileExists("status")) {
               sh "rm status"
           }
           sh "./init"
           sh "terraform get"
           sh "echo \$PWD"
           sh "whoami"
           sh "terraform plan -out=plan.out;echo \$? > status"
           def exitCode = readFile('status').trim()
           def apply = false
           echo "Terraform Plan Exit Code: ${exitCode}"
           if (exitCode == "0") {
               echo "Terraform Plan Exit Code: ${exitCode}"
               slackSend channel: '#midwesthackerschool', color: '#0080ff', message: "Plan Failed: ${env.JOB_NAME} - ${env.BUILD_NUMBER} ()"
               currentBuild.result = 'SUCCESS'
           }
           if (exitCode == "1") {
               sh "terraform destroy -force"
               echo "Terraform Plan Exit Code: ${exitCode}"
               slackSend channel: '#midwesthackerschool', color: '#0080ff', message: "Infrastructure Destroyed: ${env.JOB_NAME} - ${env.BUILD_NUMBER} ()"
               currentBuild.result = 'FAILURE'
           }
           if (exitCode == "0") {
               echo "Terraform Plan Exit Code: ${exitCode}"
               //stash name: "plan", includes: "plan.out"
               slackSend channel: '#midwesthackerschool', color: 'good', message: "Plan Awaiting Approval: ${env.JOB_NAME} - ${env.BUILD_NUMBER} ()"
               try {
                   input message: 'Apply Plan?', ok: 'Apply'
                   apply = true
               } catch (err) {
                   slackSend channel: '#midwesthackerschool', color: 'warning', message: "Plan Discarded: ${env.JOB_NAME} - ${env.BUILD_NUMBER} ()"
                   apply = false
                   sh "terraform destroy -force"
                   currentBuild.result = 'UNSTABLE'
               }
           }
           if (apply) {
               stage name: 'Apply', concurrency: 1
               //unstash 'plan'
               if (fileExists("status.apply")) {
                   sh "rm status.apply"
               }
                // Apply the exact plan that was approved above
                sh 'terraform apply plan.out; echo \$? > status.apply'
               def applyExitCode = readFile('status.apply').trim()
               if (applyExitCode == "0") {
                   slackSend channel: '#midwesthackerschool', color: 'good', message: "Changes Applied ${env.JOB_NAME} - ${env.BUILD_NUMBER} ()"
               } else {
                   slackSend channel: '#midwesthackerschool', color: 'danger', message: "Apply Failed: ${env.JOB_NAME} - ${env.BUILD_NUMBER} ()"
                   sh "terraform destroy -force"
                   currentBuild.result = 'FAILURE'
               }
           }
   }
}

When you have your Infrastructure VPC completed and scripted in Jenkins, you can change the region and availability zones to make another copy in another region for HA/DR.

In my next blog post, I will show you how to create the Terraform scripts to automate your DevOps Pipeline with Jenkins so that your environments are identical, and you can move code to another environment when necessary.
