AWS is relentlessly improving Elastic Container Service

AWS ECS announcement at re:Invent 2014

I was in the audience at AWS re:Invent 2014 when Amazon introduced the EC2 Container Service, or ECS. The presenters showed a compelling visualization of how Docker containers could be used to coordinate different services in a larger application. The vision of what Amazon set out to achieve was on full display, but the initial release was limited: it went to a preview group only, and it had no user interface.

The general release wouldn’t happen for six months. You’d be forgiven if you hadn’t checked back since then. You’d be missing out, though, because since that time the service has grown into a powerful, full-featured orchestration environment.

Amazon has made improvements over the last two years across their entire platform to better enable customers to run containerized workloads without the heavy lifting. Advancements in Elastic Load Balancing, CloudWatch Logs, Elastic File System, and ECS itself have all contributed to making Amazon Web Services one of the best environments for Docker containers and microservices.

Container load balancing

When ECS was first announced, there was very little coordination with other AWS services, although deeper integration was promised. One of the first updates introduced Elastic Load Balancing (ELB) support. While the support was helpful, it was not ideal. Because an ELB listener forwarded traffic to one fixed port on each instance, users were unable to run multiple copies of a given service on a single EC2 instance.

Even worse, each service required its own load balancer. At a rate starting at $0.02/hour per load balancer, using ELB for all of your services would add up quickly.
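
To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the $0.02/hour starting rate quoted above and a 30-day month, and it ignores data transfer charges; the service counts are made up for illustration and none of this is current AWS pricing.

```python
# Rough monthly cost when every service needs its own load balancer.
# The rate and month length are illustrative assumptions, not AWS pricing.
HOURLY_RATE_USD = 0.02      # starting rate quoted in the article
HOURS_PER_MONTH = 24 * 30   # assume a 30-day month


def monthly_elb_cost(service_count: int) -> float:
    """Return the monthly cost in USD for one load balancer per service."""
    return round(service_count * HOURLY_RATE_USD * HOURS_PER_MONTH, 2)


if __name__ == "__main__":
    # Hypothetical fleet sizes, from a single service to a larger platform.
    for count in (1, 10, 50):
        print(f"{count:>3} services: ${monthly_elb_cost(count):,.2f} per month")
```

Even at the base rate, the bill grows linearly with the number of services, before any traffic is served.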

With the introduction of the Application Load Balancer (ALB) as a new option within ELB, Amazon has mostly solved these problems. ALB supports target groups that correspond to services running on any available port. It also supports routing requests by path.

With this change, a single ALB can economically balance traffic across a large number of services. It also allows multiple services to share a single listener port, and allows different health checks for each target group (or service).
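
To make path-based routing concrete, here is a small simulation of how one listener can fan requests out to several target groups. It is a sketch of the idea, not the ALB implementation: the `Rule` class, the `route` function, and the service names are all hypothetical, and Python's `fnmatchcase` is used as a stand-in for ALB's wildcard path patterns.

```python
from __future__ import annotations

from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    priority: int       # lower numbers are evaluated first
    path_pattern: str   # wildcard pattern such as "/orders/*"
    target_group: str   # hypothetical target group name


def route(path: str, rules: list[Rule], default_target_group: str) -> str:
    """Return the target group of the first matching rule, else the default."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if fnmatchcase(path, rule.path_pattern):
            return rule.target_group
    return default_target_group


# Two hypothetical services sharing one listener.
RULES = [
    Rule(priority=10, path_pattern="/orders/*", target_group="orders-service"),
    Rule(priority=20, path_pattern="/users/*", target_group="users-service"),
]

if __name__ == "__main__":
    for request_path in ("/orders/42", "/users/7/profile", "/"):
        print(request_path, "->", route(request_path, RULES, "web-frontend"))
```

Each target group can carry its own health check, which is why one load balancer is enough for many services.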

With the introduction of the Application Load Balancer, AWS massively improved ECS and relieved many companies of the burden of running a proxy service. ALB is still missing a few key features, like routing based on HTTP headers, but they are likely coming soon.


Security

When I was responsible for a large hosted cloud service and we were considering the move to the cloud, we often debated how security would be affected. Over time it has become clear that if you are willing to invest in it, security in the cloud can be better than what you are likely to achieve on premises.

One example of this is the announcement of IAM role support for ECS tasks. With this change, each task can be given its own IAM role, which limits the AWS services and resources its containers can access. With tight boundaries around each of your containers, you can achieve better security.

Also, if you assign your tasks to roles, you can very closely log what they are up to with the CloudTrail service. These updates and service announcements have improved the security of ECS and applications running on it significantly.
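
As an illustration of task-level roles, the sketch below builds the three documents involved as plain Python dictionaries: a trust policy that lets ECS tasks assume a role, a narrow permissions policy, and a task definition that references the role through `taskRoleArn`. The bucket name, role ARN, image, and helper function names are hypothetical, and the memory value is arbitrary.

```python
import json

# Trust policy allowing ECS tasks to assume the role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ecs-tasks.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}


def read_only_bucket_policy(bucket: str) -> dict:
    """Permissions policy granting read access to a single S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }],
    }


def task_definition(family: str, image: str, task_role_arn: str) -> dict:
    """Task definition whose containers run with the given IAM role."""
    return {
        "family": family,
        "taskRoleArn": task_role_arn,
        "containerDefinitions": [{
            "name": family,
            "image": image,
            "memory": 256,       # arbitrary value for illustration
            "essential": True,
        }],
    }


if __name__ == "__main__":
    # Hypothetical account ID, role name, bucket, and image.
    role_arn = "arn:aws:iam::123456789012:role/orders-task-role"
    print(json.dumps(read_only_bucket_policy("orders-reports"), indent=2))
    print(json.dumps(task_definition("orders", "example/orders:latest", role_arn), indent=2))
```

Because the role belongs to the task rather than the instance, two services on the same host can hold entirely different permissions.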

Container monitoring and logging

In the beginning, monitoring of services on ECS was limited to a few counters related to the service itself. After launch, though, Amazon started tracking additional metrics in CloudWatch. This update allowed operators to track memory and CPU consumption of services. It also enabled a closely related feature, service task autoscaling. Now containers can be started in seconds based on any metric in CloudWatch.
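
The decision behind metric-driven scaling can be sketched in a few lines. In ECS this is configured through CloudWatch alarms and scaling policies rather than written as code, so treat the function below as an illustration of the logic only; the function name, thresholds, and task limits are all assumptions.

```python
def desired_count(
    current: int,
    cpu_percent: float,
    *,
    scale_out_above: float = 75.0,   # hypothetical alarm threshold
    scale_in_below: float = 25.0,    # hypothetical alarm threshold
    min_tasks: int = 1,
    max_tasks: int = 10,
) -> int:
    """Return the new task count for a service given its average CPU usage."""
    if cpu_percent > scale_out_above:
        target = current + 1         # add a task when the service runs hot
    elif cpu_percent < scale_in_below:
        target = current - 1         # remove a task when it is mostly idle
    else:
        target = current             # inside the band: leave it alone
    # Never leave the configured bounds.
    return max(min_tasks, min(max_tasks, target))


if __name__ == "__main__":
    for cpu in (10.0, 50.0, 90.0):
        print(f"cpu={cpu:>5.1f}% -> desired tasks: {desired_count(4, cpu)}")
```

The gap between the two thresholds keeps the service from flapping between task counts when the metric hovers near a single cutoff.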

Another area lacking at launch was support for container logging. Users of ECS were on the hook for creating their own logging infrastructure. I spent a lot of time creating sidecar containers and libraries to log from ECS. Amazon provided guidance in the form of whitepapers showing how to integrate ECS with CloudWatch, but this was exactly the sort of heavy lifting that AWS usually ends up handling itself.

Not surprisingly, AWS announced support for CloudWatch Logs recently and removed that burden completely, vastly improving ECS yet again.
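
For a sense of how little is left to do, here is a sketch of a container definition that sends its output to CloudWatch Logs using the `awslogs` log driver. The helper name, log group, region, and image are hypothetical, and the memory value is arbitrary.

```python
import json


def container_with_cloudwatch_logs(name: str, image: str, log_group: str, region: str) -> dict:
    """Container definition that ships stdout/stderr to CloudWatch Logs."""
    return {
        "name": name,
        "image": image,
        "memory": 256,          # arbitrary value for illustration
        "essential": True,
        "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
                "awslogs-group": log_group,
                "awslogs-region": region,
                "awslogs-stream-prefix": name,
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical service, image, log group, and region.
    definition = container_with_cloudwatch_logs(
        "orders", "example/orders:latest", "/ecs/orders", "us-east-1"
    )
    print(json.dumps(definition, indent=2))
```

No sidecar and no logging library are involved; the container just writes to standard output.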

Storage

It was the same story with the storage layer. Containers requiring state were basically out of luck. It was possible to map containers to volumes, but the volumes had to exist on the instance prior to the task or service running. This ruled out using the ECS service scheduler and required EBS volumes to be attached to cluster instances before starting tasks.

The release of the Elastic File System (EFS) changed all of this. Now you can mount a shared EFS file system on your cluster instances and make it available to your running containers.
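
On the task side, the wiring looks roughly like the sketch below. It assumes the EFS file system has already been mounted on every cluster instance at `/mnt/efs`; that path, the volume name, the container path, and the helper function are all hypothetical.

```python
import json


def task_with_shared_volume(
    family: str,
    image: str,
    host_path: str = "/mnt/efs",      # assumed EFS mount point on each instance
    container_path: str = "/data",    # where the container sees the files
) -> dict:
    """Task definition that maps a host directory into the container."""
    return {
        "family": family,
        "volumes": [
            {"name": "shared-data", "host": {"sourcePath": host_path}},
        ],
        "containerDefinitions": [{
            "name": family,
            "image": image,
            "memory": 256,            # arbitrary value for illustration
            "essential": True,
            "mountPoints": [{
                "sourceVolume": "shared-data",
                "containerPath": container_path,
                "readOnly": False,
            }],
        }],
    }


if __name__ == "__main__":
    print(json.dumps(task_with_shared_volume("reports", "example/reports:latest"), indent=2))
```

Since every instance mounts the same file system at the same path, the scheduler is free to place the task anywhere in the cluster.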

Future areas of improvement

For all the areas that have improved, there are still a few that have a way to go yet. Among them are service discovery and networking.

Service discovery refers to the process that containers use to understand their environment and find their dependent services. Amazon suggests querying ECS itself to find services, which works, but the lookup is one more piece of software that must be written and maintained. ECS can only be queried so frequently before throttling becomes a factor, making it necessary to create cached frontend services, all of which increases the complexity of the solution.
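
The caching layer does not have to be elaborate. Here is a minimal sketch of a time-based cache in front of whatever function actually queries ECS; the class name, the `fetch` callable, the endpoint format, and the 30-second TTL are all assumptions for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachedServiceLookup:
    """Cache service endpoints for a short time to avoid API throttling."""

    def __init__(
        self,
        fetch: Callable[[str], List[str]],       # hypothetical function that queries ECS
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, List[str]]] = {}

    def endpoints(self, service: str) -> List[str]:
        now = self._clock()
        cached = self._cache.get(service)
        # Serve from the cache while the entry is still fresh.
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        # Otherwise hit the backing API once and remember the answer.
        result = self._fetch(service)
        self._cache[service] = (now, result)
        return result


if __name__ == "__main__":
    def fake_fetch(service: str) -> List[str]:
        print(f"querying backend for {service}")
        return ["10.0.0.5:32768"]    # made-up host:port pair

    lookup = CachedServiceLookup(fake_fetch)
    lookup.endpoints("orders")       # queries the backend
    lookup.endpoints("orders")       # served from the cache
```

The trade-off is staleness: a container that moved within the TTL window will be missed until the entry expires, which is part of why this feels like work the platform should do.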

Many container platforms use DNS as an underlying service discovery system. Perhaps Amazon will integrate with local VPC Route 53 DNS at some point. Traditional DNS lookups return IP addresses, but not ports, which leads to the next area for potential improvement, networking.

Container networking is another area needing improvement. Google and solutions architects at AWS suggest that each container should run on its own IP address. ECS doesn’t support assigning IP addresses to containers. Amazon recommends in some cases running overlay networks like Weave, but this is another area calling out for Amazon to simplify. Amazon recently added a configuration option to choose container networking types, and so we’re hopeful this will be extended to support an overlay network provided by AWS.

Relatedly, VPC does not yet support IPv6. One day when it does, granting all containers and instances their own public IP addresses will be feasible.

Conclusion

AWS has made huge improvements to ECS since launch. Most of these improvements came from updates to other AWS services and from brand-new services. ECS today is a capable service for running containers in production. If ECS meets your needs, consider using it. You will be able to run an ECS cluster and get started faster and at less expense than if you put together a Mesos or Kubernetes environment.

Have you used ECS in production? Let me know how it went.