Everything about AWS ECS (with hands-on)

anubhav jhalani
9 min read · Dec 1, 2022


This is the third article in the series Everything about AWS ECS, where I explain the next component, Service. For the other articles in this series, please click on the following links:

  1. ECS Overview and Task Definition
  2. Cluster
  3. Service
  4. Load Testing
  5. CI/CD Pipeline

3. Service

A Service is used to run and maintain a specified number of Tasks simultaneously. If one of your tasks fails or stops, the service scheduler launches another instance of your task definition to replace it. A Service can also scale the number of running tasks up and down automatically based on CloudWatch metrics, and it can be attached to a load balancer. Let's start with the first set of configurations of a Service. Open your cluster, click on the Services tab, and then click on Deploy:

Existing Cluster : You select a cluster in which you want to run your Service.

Compute Options : It gives you a choice in how your tasks are distributed across your cluster infrastructure.
With Capacity Provider Strategy, your tasks will be placed on the EC2 instances of a Capacity provider. In our case this provider is an EC2 AutoScaling Group.
With launch type, your tasks are launched directly on the Amazon EC2 instances that you have manually registered to your clusters.
Here I chose the Capacity Provider Strategy and then selected my AutoScaling Group Capacity Provider.

Base : The base value designates the minimum number of tasks that will run on the specified capacity provider.

Weight : The weight value is significant when multiple capacity providers are specified. It designates the relative share of the total number of tasks that will be launched on the specified capacity provider. For example, if you have a strategy that contains two capacity providers that both have a weight of 1, then once the base is satisfied, the tasks will be split evenly across the two capacity providers. Using that same logic, if you specify a weight of 1 for capacityProviderA and a weight of 4 for capacityProviderB, then for every one task that is run using capacityProviderA, four tasks would use capacityProviderB.
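
To make the base and weight arithmetic concrete, here is a minimal sketch of how such a strategy looks as the capacityProviderStrategy parameter of the ECS create_service API. The provider names are hypothetical placeholders, not the ones from this setup:

```python
# Hedged sketch of a two-provider strategy; "capacityProviderA" and
# "capacityProviderB" are hypothetical names.
capacity_provider_strategy = [
    # base=1: at least 1 task always runs on capacityProviderA
    {"capacityProvider": "capacityProviderA", "base": 1, "weight": 1},
    # weights 1:4: beyond the base, for every 1 task on A, 4 tasks run on B
    {"capacityProvider": "capacityProviderB", "base": 0, "weight": 4},
]
```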

Further configurations:

Application Type : Although we are creating a Service, this option still gives you the choice to run the tasks under a Service or as standalone tasks. I chose Service.

Task Definition : Here I specified my task definition and its revision, which the Service will use to run the tasks.

Service Name : A Service name which can be anything.

Service Type : This option determines how the Service scheduler places tasks in your cluster. There are two strategy types:
In the Daemon strategy, the scheduler places exactly one task on each active container instance that meets all of the task placement constraints specified in your cluster.
In the Replica strategy, the scheduler places and maintains the number of tasks that you specify across your EC2 Instances. I chose the Replica strategy.

Desired tasks : This is the number of tasks to run initially when the Service launches for the first time. It only applies when you use the Replica strategy. The value of Desired tasks later changes automatically based on a CloudWatch metric value that we define later.
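
All of the options so far map onto a single ECS API call. Below is a minimal boto3 sketch, assuming a cluster named blog-cluster, a task definition blog-task:1, and a capacity provider named blog-asg-provider (all hypothetical names standing in for your own resources):

```python
import boto3

ecs = boto3.client("ecs")

# Hedged sketch of the Service created through the console above.
response = ecs.create_service(
    cluster="blog-cluster",          # Existing Cluster
    serviceName="blog-service",      # Service Name (can be anything)
    taskDefinition="blog-task:1",    # Task Definition (family:revision)
    schedulingStrategy="REPLICA",    # Service Type ("DAEMON" is the other option)
    desiredCount=1,                  # Desired tasks
    capacityProviderStrategy=[       # Compute Options
        {"capacityProvider": "blog-asg-provider", "base": 0, "weight": 1}
    ],
)
```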

Further configurations:

Deployment Type : ECS supports the rolling update deployment type. In this deployment type, the Service scheduler replaces the currently running version of the container with the latest version.

Min running tasks : This parameter represents the minimum number of your service's tasks that must remain in the RUNNING state during a deployment. It is expressed as a percentage of Desired tasks, rounded up to the nearest integer. For example, if your service has a Desired tasks value of 4 and a Min running tasks value of 50%, then the service scheduler will keep at least 2 (50% of 4) tasks in the RUNNING state and kill the remaining tasks to replace them with upgraded tasks.

Max running tasks : This parameter represents the maximum number of your service's tasks that are allowed in the RUNNING or PENDING state during a deployment. It is expressed as a percentage of Desired tasks, rounded down to the nearest integer. For example, if your service has a Desired tasks value of 4 and a Max running tasks value of 200%, then the scheduler might start 4 additional upgraded tasks, which makes the total number of tasks 8 (200% of 4), before stopping the 4 older tasks.

I set Min running tasks to 100% and Max running tasks to 200%. This means that if the current number of running tasks is 2 and the number of Desired tasks is 4, then during a deployment the scheduler will first start 4 additional upgraded tasks and then kill the older 2 tasks. This satisfies both the Min running tasks and Max running tasks requirements.

Deployment failure detection : You can optionally use deployment circuit breaker logic on the Service, which causes the deployment to transition to a FAILED state if it can't reach a steady state. The Service doesn't reach a steady state if none of its tasks achieves the RUNNING state; each such failure causes the deployment circuit breaker to increase the failure count by one. When the failure count equals the threshold, the deployment is marked as FAILED.
Calculation of that threshold value is explained here.

Rollback on failures : This option means that when the circuit breaker detects a failure, it rolls back to the last completed deployment.
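
These deployment options correspond to the deploymentConfiguration parameter of the same create_service call. A hedged sketch with the values chosen above:

```python
# Sketch of the deployment settings: 100% / 200% plus the circuit breaker.
deployment_configuration = {
    "minimumHealthyPercent": 100,    # Min running tasks
    "maximumPercent": 200,           # Max running tasks
    "deploymentCircuitBreaker": {
        "enable": True,              # Deployment failure detection
        "rollback": True,            # Rollback on failures
    },
}
# Passed as: ecs.create_service(..., deploymentConfiguration=deployment_configuration)
```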

Further configurations:

Now we are connecting an Application Load Balancer to our Service to distribute the incoming HTTP requests evenly among the tasks. I am assuming that you understand the various configurations of an Application Load Balancer, so I will skip most of them.

Choose container to load balance : You choose the specific container of the tasks to which the load balancer should send requests. I chose blog-container 80:80 (host port : container port).

Listener : You create a new listener on the Application Load Balancer, which will listen for users' requests on HTTP port 80.
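
This container and listener mapping corresponds to the loadBalancers parameter of create_service. A hedged sketch, where the target group ARN is a hypothetical placeholder for the one the console creates:

```python
# Sketch: route the ALB target group's traffic to blog-container on port 80.
load_balancers = [
    {
        "targetGroupArn": "arn:aws:elasticloadbalancing:<region>:<account>:targetgroup/blog-tg/<id>",
        "containerName": "blog-container",
        "containerPort": 80,
    }
]
# Passed as: ecs.create_service(..., loadBalancers=load_balancers)
```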

Further Configuration:

Now we are going to enable AutoScaling of the Service. AutoScaling of the Service means increasing and decreasing the number of tasks running in the Cluster. This AutoScaling is different from the AutoScaling of EC2 Instances which we configured during Cluster creation. AutoScaling of EC2 Instances is based on the CloudWatch metric CapacityProviderReservation, whereas AutoScaling of the Service can be based on the CloudWatch metrics ECSServiceAverageCPUUtilization, ECSServiceAverageMemoryUtilization, or ALBRequestCountPerTarget.

Now it might be confusing how AutoScaling actually happens in ECS. Let me try to simplify it. First, the Service auto scales based on one of the above three metrics and changes the Desired tasks value; then the EC2 Instances might auto scale based on the CapacityProviderReservation metric, which is driven by the number of required tasks (Desired tasks) to run, as explained in the previous article. We will see it in action later and then the picture will be clearer.
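
As a rough illustration of that interplay, recall the formula from the previous article: CapacityProviderReservation = M / N × 100, where M is the number of instances needed to run all Desired tasks and N is the number of instances currently running. A tiny sketch of how a change in Desired tasks ripples into EC2 AutoScaling:

```python
def capacity_provider_reservation(instances_needed: int, instances_running: int) -> float:
    """CapacityProviderReservation = M / N * 100 (see the previous article)."""
    return instances_needed / instances_running * 100

# Service AutoScaling raises Desired tasks so that 2 instances are now needed,
# while only 1 is running: the metric goes to 200, above the target of 100,
# so the EC2 AutoScaling Group scales out.
print(capacity_provider_reservation(2, 1))  # 200.0
```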

Minimum number of tasks : The lower boundary to which service auto scaling can adjust the value of Desired tasks in the service.

Maximum number of tasks : The upper boundary to which service auto scaling can adjust the value of Desired tasks in the service.

Scaling Policy Type : Just like the AutoScaling of EC2 Instances, here we also have the Target Tracking policy. With target tracking scaling policies, you select a metric and set a target value. Amazon ECS creates and manages the CloudWatch alarms that start the scaling policy and calculates the scaling adjustment based on the target value. The scaling policy adds or removes tasks as required to keep the metric at, or close to, the specified target value.

Policy Name : You choose a policy name which can be anything.

ECS Service Metric : I chose ECSServiceAverageCPUUtilization. It is calculated as the percentage of CPU used by all the tasks of a service on a cluster, compared to the CPU that is specified in the service's task definition. It is explained in detail here.

Target Value : I chose 70. This means Service AutoScaling scales the number of tasks to keep the value of the metric ECSServiceAverageCPUUtilization at or near 70.

Scale-out cooldown period : This is the number of seconds to wait for a previous scale-out activity to take effect. It is the same as the cooldown period of an EC2 AutoScaling Group. I left it blank and let it use the default value.

Scale-in cooldown period : This is the number of seconds to wait for a previous scale-in activity to take effect. It is the same as the cooldown period of an EC2 AutoScaling Group. I left it blank as well and let it use the default value.

Turn-off scale in : If this is checked, the Service will not scale back in after it has scaled out.
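
Under the hood, the console registers the Service with Application Auto Scaling and attaches a target tracking policy to it. A hedged boto3 sketch of the equivalent calls (cluster, service, and policy names are hypothetical):

```python
import boto3

aas = boto3.client("application-autoscaling")

# Register the Service's Desired tasks count as the scalable dimension,
# bounded by the minimum and maximum number of tasks.
aas.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId="service/blog-cluster/blog-service",
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=1,   # Minimum number of tasks
    MaxCapacity=4,   # Maximum number of tasks
)

# Target tracking policy that keeps average CPU utilization near 70%.
aas.put_scaling_policy(
    PolicyName="blog-cpu-target-tracking",
    ServiceNamespace="ecs",
    ResourceId="service/blog-cluster/blog-service",
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleOutCooldown": 300,   # blank in the console = default value
        "ScaleInCooldown": 300,    # blank in the console = default value
        "DisableScaleIn": False,   # the "Turn-off scale in" checkbox
    },
)
```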

Further Configuration:

Placement templates : This lets you decide how tasks are placed on, or terminated from, the instances within your cluster. I chose AZ balanced spread.
This template spreads tasks across Availability Zones and, within each Availability Zone, spreads tasks across instances during scale-out. When a scale-in takes place, Amazon ECS selects tasks to terminate that maintain the balance across Availability Zones; within an Availability Zone, tasks are selected at random. More details about the other placement templates are mentioned here.
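
For reference, the AZ balanced spread template corresponds to the following placementStrategy on create_service: spread across Availability Zones first, then across instances within each zone:

```python
# Sketch: "AZ balanced spread" as a placement strategy.
placement_strategy = [
    {"type": "spread", "field": "attribute:ecs.availability-zone"},
    {"type": "spread", "field": "instanceId"},
]
# Passed as: ecs.create_service(..., placementStrategy=placement_strategy)
```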

Now click on Deploy and it will create the Service. This will look like the following under the Services tab:

This service has started with 1 task because we defined Desired tasks as 1. This task is running on the existing EC2 Instance created by the Cluster because that instance has more CPU and memory available than the task requires.

Now click on the encircled link and it will open the description of the Service:

You will find a link to the load balancer attached to this Service, which does the health checks based on the configuration defined above. You can click on the load balancer's name and check the configurations there.

Now you will have to make some changes manually in the Security Groups of the Load Balancer and of the EC2 Instance registered as its target.

Note: First, change the Inbound rules of the EC2 Instance's security group: allow TCP port 80 and choose the security group of the Load Balancer as the source. Then change the Inbound rules of the Load Balancer's security group: allow TCP port 80 and choose 0.0.0.0/0 as the source. Unfortunately, the ECS console doesn't let you configure these security groups during Service creation, which is why we need to configure them manually after they have been created. Now open your browser, type in the Load Balancer's DNS name, and you should be able to see the Nginx page.
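
If you prefer to script these two inbound rules instead of clicking through the EC2 console, here is a hedged boto3 sketch; the two security group IDs are hypothetical placeholders for the ones in your account:

```python
import boto3

ec2 = boto3.client("ec2")

INSTANCE_SG = "sg-instance-placeholder"  # hypothetical: the EC2 Instance's security group
ALB_SG = "sg-alb-placeholder"            # hypothetical: the Load Balancer's security group

# 1) Instance security group: allow TCP 80 with the ALB's security group as source.
ec2.authorize_security_group_ingress(
    GroupId=INSTANCE_SG,
    IpPermissions=[{
        "IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
        "UserIdGroupPairs": [{"GroupId": ALB_SG}],
    }],
)

# 2) Load balancer security group: allow TCP 80 from 0.0.0.0/0.
ec2.authorize_security_group_ingress(
    GroupId=ALB_SG,
    IpPermissions=[{
        "IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }],
)
```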

Coming back to the Service, the interesting thing to notice here is that on the Logs tab you will see some logs from the container, because we enabled Monitoring and Logging in the task definition.

Some of these logs are generated by the health checks done by the load balancer on the path /.

Now go back to the cluster and click on the Tasks tab, and you will see a task running there. If you click on the encircled Container Instance, you will find the ID of the EC2 Instance on which the task is running.

Click on the encircled Instance ID, connect to that instance via Session Manager, and run the command docker ps. This time you will see your Nginx container running alongside the container of the ECS Container Agent.

One final thing remains: the 2 CloudWatch Alarms automatically created by Service AutoScaling. Remember that 2 CloudWatch Alarms were also created by EC2 AutoScaling, so we now have a total of 4 CloudWatch Alarms.

So far we have defined a Task Definition, run a cluster with an EC2 Instance, and deployed a Service to run a single-container Nginx task on that EC2 Instance. Everything ran successfully and we are able to access the Nginx webpage from our browser using the load balancer's DNS name.

In the next article we will do load testing on the EC2 Instances of the Cluster and will see EC2 AutoScaling and Service AutoScaling in action.
