Leaving the Forge — part 2 - Phylos Bioscience

In our last post, we talked about how we build up the container images needed to develop and run our Laravel app locally, with all the same dependencies across each team member’s environment. To minimize issues that may arise between development and production, we go through a series of steps to promote the exact images we use in local development to those we run in production.

There are a few things we deliberately wanted to work differently in production. Mainly, we didn’t want any volume mounts at all. All the code, libraries, dependencies, interpreters, etc. needed to run should be present in the image itself. This lets us deploy and roll back with added confidence. We also wanted 0-downtime deploys and the ability to scale horizontally should we encounter large traffic spikes or some other event. With those goals in mind, let’s dive in.

Base Images & Self-Containment

To build images that have all the dependencies and source needed to run the Laravel app, we decided to make it a two-step process. The first step builds “base” images on top of our local images, adding the proper versions of dependencies as defined in composer.json. The second step copies the application source onto those base images. The reason for the split is image layer size.

Since we’re creating a new layer each time we copy in the source tree, we want to keep it as small as possible. This has ripple effects throughout the deploy process: smaller image layer deltas mean less to push and pull from repositories, which results in faster deploy times.

The base image build itself has two parts, which we only run when dependencies change. The first takes the source tree and installs all the needed dependencies using composer install, then archives the entire dependency tree. The second creates “base” images containing only the vendor/ tree and supporting files. An important thing to note is that we build this base image and the nginx production image from the same local images we run in development:

FROM local_example_php_fpm:latest

COPY --chown=www-data \
     example/composer.json \
     example/composer.lock \
     example/composer.phar \
     /var/www/example/
COPY --chown=www-data example/vendor /var/www/example/vendor

When building our production images, we copy in only the source tree, without dependencies, because the base images already contain the dependency files. This saves hundreds of megabytes in the layer that changes with every build.

FROM base_prod_example_php_fpm:latest

COPY --chown=www-data example /var/www/example/

We now have a PHP FPM image with all the source and dependencies. Once pushed to our repository, it can be deployed without any external requirements.

Deployment using AWS ECS

After looking at a number of different frameworks and technologies for managing and deploying containers, we settled on Amazon’s Elastic Container Service. The “Service” feature provided by ECS handles everything we need for 0-downtime deploys and scaling.

We start by building the cluster that runs the PHP/FPM containers. Configuring the ECS Service to keep at least 1 instance of the app running, with a maximum of 200% capacity, gives us a very simple rolling deploy. The cluster consists of 2 EC2 instances, and the ECS service knows enough to do the following:

Pull the container image(s) specified in the task definition
Run the task on a cluster instance, at levels defined
Register “healthy” tasks with a load balancer target group
Deregister and drain old tasks from the load balancer target group
Stop old tasks after draining is complete
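
The rotation above falls out of two deployment settings on the ECS service: the minimum healthy percent and the maximum percent. As a rough sketch of the arithmetic, assuming a desired count of 1 and a minimum healthy percent of 100 (the post only says “at least 1 instance” and “200%”, so the 100 is our assumption), the bounds on running tasks during a deploy can be worked out like this. The function name is hypothetical. ECS rounds the lower bound up and the upper bound down.

```python
def deployment_task_bounds(desired_count, minimum_healthy_percent, maximum_percent):
    """Return (lower, upper) limits on running tasks during a rolling deploy.

    Integer arithmetic only: the lower bound is rounded up and the
    upper bound is rounded down.
    """
    lower = -(-desired_count * minimum_healthy_percent // 100)  # ceiling division
    upper = desired_count * maximum_percent // 100              # floor division
    return lower, upper


# Assumed settings: 1 desired task, 100% minimum healthy, 200% maximum.
lower, upper = deployment_task_bounds(1, 100, 200)
print(f"tasks running during a deploy: between {lower} and {upper}")
```

With these numbers the old task keeps serving while one new task starts alongside it, and the old one is only stopped once the new one is healthy.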

An ECS Task Definition maps somewhat loosely to a Docker Compose YAML. A major difference between our local Docker Compose YAML and our task definitions is the scope of each service. We want to be able to deploy each service without affecting the others, so we split our ECS Services into groups running on separate clusters behind various load balancers. For example, one task definition describes our nginx tasks, and another describes our PHP/FPM tasks.
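
To make the comparison with Docker Compose concrete, here is a minimal sketch of what a PHP/FPM-only task definition might look like, built as a Python dict and printed as JSON. The family name, container name, registry URL, memory size and port 9000 (PHP-FPM’s usual default) are all illustrative assumptions, not our actual values.

```python
import json


def php_fpm_task_definition(image, memory_mib=512):
    """Build a single-container task definition for the PHP/FPM service.

    All names and sizes here are placeholders for illustration.
    """
    return {
        "family": "example-php-fpm",  # hypothetical family name
        "containerDefinitions": [
            {
                "name": "php-fpm",
                "image": image,
                "memory": memory_mib,
                "essential": True,
                # No volume mounts: code and dependencies live in the image.
                "portMappings": [{"containerPort": 9000, "protocol": "tcp"}],
            }
        ],
    }


# Hypothetical registry and image name.
task_def = php_fpm_task_definition("registry.example.com/prod_example_php_fpm:latest")
print(json.dumps(task_def, indent=2))
```

Where a Compose file would list nginx and PHP/FPM side by side, each task definition here carries only one of them, which is what lets the services deploy independently.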

Load Balancing is for Lovers

Between our nginx tasks and our PHP/FPM tasks, we have a Network Load Balancer. Since FastCGI is not HTTP, we needed a load balancer that works at the TCP level. The PHP/FPM tasks register with the NLB’s target group, and our production nginx configuration specifies the NLB as the upstream service to proxy to. This lets us update the PHP/FPM service without affecting nginx whatsoever. Private Route 53 zones resolve the load balancer names, so we don’t need complicated service discovery.
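
As an illustration of the nginx side, the sketch below renders a PHP location block that points fastcgi_pass at the load balancer’s private DNS name. It is a small Python helper standing in for whatever templating you use. The hostname, the port 9000 and the helper name are assumptions made for the example.

```python
# nginx location block with a placeholder for the FastCGI upstream.
NGINX_PHP_LOCATION = r"""location ~ \.php$ {
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    # FastCGI goes over plain TCP to the load balancer's DNS name.
    fastcgi_pass @UPSTREAM@;
}
"""


def render_php_location(upstream_host, upstream_port=9000):
    """Fill in the upstream placeholder with host:port."""
    upstream = f"{upstream_host}:{upstream_port}"
    return NGINX_PHP_LOCATION.replace("@UPSTREAM@", upstream)


# Hypothetical private zone record that resolves to the NLB.
print(render_php_location("php-fpm.internal.example"))
```

Because nginx only knows the load balancer’s name, PHP/FPM tasks can come and go behind it without any change to this configuration.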

Because our tasks deploy to cluster instances in multiple availability zones, it is important to check this box on the Edit Load Balancer attributes dialog:

Enable cross-zone load balancing if your cluster instances are spread across availability zones.

The nginx ECS service registers with an Application Load Balancer target group, and the ALB handles TLS termination. AWS Certificate Manager provides us with automatically renewing TLS certs, and registering them with the HTTPS listener on the load balancer was a snap.

To deploy new Laravel app code, we build and push the new image, then trigger a new deployment on the existing ECS service. Everything else is handled for us from there.
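
The “trigger” step can be a single API call. We drive ours from Ruby, so treat the following as a hedged Python equivalent rather than our actual script: it assumes a boto3-style ECS client whose update_service call accepts forceNewDeployment, and the cluster and service names are made up. A stand-in client is included so the sketch runs without AWS credentials.

```python
def trigger_deploy(ecs_client, cluster, service):
    """Ask ECS to start a new deployment of an existing service.

    Returns the list of deployments ECS reports for the service.
    """
    response = ecs_client.update_service(
        cluster=cluster,
        service=service,
        forceNewDeployment=True,  # re-pull the image and roll the tasks
    )
    return response["service"]["deployments"]


class FakeEcsClient:
    """Stand-in for a real ECS client, used only to exercise the sketch."""

    def __init__(self):
        self.calls = []

    def update_service(self, **kwargs):
        self.calls.append(kwargs)
        return {"service": {"deployments": [{"status": "PRIMARY"}]}}


client = FakeEcsClient()
deployments = trigger_deploy(client, "example-php-fpm-cluster", "example-php-fpm")
print(deployments)
```

With boto3 installed, passing boto3.client("ecs") in place of the fake client should issue the real call. If you register a new task definition revision for each release instead of reusing an image tag, you would pass taskDefinition to update_service rather than forcing a redeploy.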

Summary

This concludes our post on how we left the cozy, reassuring world of Forge-managed services for Docker container-based infrastructure running on AWS ECS, garnering better homogeneity from developer laptop to production processes along the way. Our hope is that this story and snippets will help others looking at similar endeavors.

Stay tuned for more DevOps posts: how we use Rake and its task dependency model to build images efficiently, how we template config files with ERB to run homogeneous staging environments prior to production, and how we trigger service deploys using the AWS Ruby SDK.
