When inbound requests surge, we have to watch the ELB's CloudWatch metrics closely. The metrics are documented in MonitoringLoadBalancerWithCW.
The key metrics are
- RequestCount
- Latency
- HTTPCode_ELB_5XX
- HTTPCode_Backend_5XX
- SurgeQueueLength
Typically, you will see a linear relationship between the 'RequestCount' and 'Latency' metrics: when the load increases, the latency increases correspondingly. With default settings, the ELB idle timeout of 60 seconds is triggered when the latency exceeds that timeout.
The Latency metric is the time the ELB has to wait for a response from the instance to which it has handed the request. If the instances take longer to respond (with HTTP 200, 4xx or 5xx codes), we will see increased latency.
The HTTPCode_ELB_5XX metric counts the occasions on which the ELB itself failed to handle an incoming request and sent an HTTP 5xx error code directly back to the client.
The HTTPCode_Backend_5XX metric counts the HTTP 5xx responses returned by the backend instances, for example when an instance fails to get a valid response from the end service.
The SurgeQueueLength metric is the number of requests the ELB has queued up while waiting for a healthy instance to become available.
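To pull the five metrics described above programmatically, something like the following could be used. It is only a sketch: it assumes a Classic ELB (the `AWS/ELB` namespace with the `LoadBalancerName` dimension), and the function names, the default region and the choice of statistic per metric are my own assumptions, not something prescribed by AWS. The actual call needs boto3 and AWS credentials.

```python
from datetime import datetime, timedelta, timezone

# Statistic assumed to be the most useful one for each metric in this sketch.
KEY_METRICS = {
    "RequestCount": "Sum",
    "Latency": "Average",
    "HTTPCode_ELB_5XX": "Sum",
    "HTTPCode_Backend_5XX": "Sum",
    "SurgeQueueLength": "Maximum",
}


def build_queries(load_balancer_name, minutes=60, period=60, now=None):
    """Return one get_metric_statistics kwargs dict per key metric."""
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    return [
        {
            "Namespace": "AWS/ELB",
            "MetricName": metric,
            "Dimensions": [
                {"Name": "LoadBalancerName", "Value": load_balancer_name}
            ],
            "StartTime": start,
            "EndTime": end,
            "Period": period,  # seconds per datapoint
            "Statistics": [statistic],
        }
        for metric, statistic in KEY_METRICS.items()
    ]


def fetch_metrics(load_balancer_name, region="us-east-1"):
    """Query CloudWatch for each key metric (needs boto3 and credentials)."""
    import boto3  # imported lazily so build_queries works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    results = {}
    for query in build_queries(load_balancer_name):
        response = cloudwatch.get_metric_statistics(**query)
        statistic = query["Statistics"][0]
        # Datapoints are not guaranteed to arrive in time order.
        points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
        results[query["MetricName"]] = [
            (p["Timestamp"], p[statistic]) for p in points
        ]
    return results
```

Calling `fetch_metrics("my-elb")` (a hypothetical load balancer name) would return a dict keyed by metric name, each holding a time-ordered list of (timestamp, value) pairs for the last hour.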
[Figure: CloudWatch graph of the RequestCount and Latency metrics showing the linear relationship]
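As a numeric counterpart to the graph, the linear relationship between RequestCount and Latency can be checked with a Pearson correlation coefficient: a value close to 1.0 means the two metrics move together. This is a minimal sketch; the `pearson` helper and the sample values are hypothetical and are not real measurements.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical per-minute samples, made up for illustration only.
request_count = [1000, 2000, 3000, 4000, 5000]
latency_seconds = [0.10, 0.20, 0.30, 0.40, 0.50]

r = pearson(request_count, latency_seconds)
print(f"correlation between RequestCount and Latency: {r:.3f}")
```

If the coefficient drops well below 1.0 during a surge, latency is no longer tracking load linearly, which is a hint to look at the 5XX and SurgeQueueLength metrics as well.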
NOTE - If all the instances behind the ELB are in the same Availability Zone, you may want to disable the "cross zone load balancing" feature of the ELB to reduce the performance overhead by a small amount.
You can also enable access logs and access log collection as per the AWS documentation links below:
- http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/enable-access-logs.html
- http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/access-log-collection.html
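Both settings mentioned above, cross-zone load balancing and access logs, are attributes of the load balancer, so they can also be changed from a script. The sketch below assumes a Classic ELB and boto3's `elb` client; the helper names, the bucket name and the default region are hypothetical, and the S3 bucket must already exist with a policy that lets ELB write to it.

```python
def build_attributes(cross_zone_enabled, log_bucket=None, log_prefix="",
                     emit_interval=60):
    """Build the LoadBalancerAttributes dict for a Classic ELB."""
    # ELB publishes access logs every 5 or every 60 minutes.
    if emit_interval not in (5, 60):
        raise ValueError("emit_interval must be 5 or 60 (minutes)")
    attributes = {"CrossZoneLoadBalancing": {"Enabled": cross_zone_enabled}}
    if log_bucket is not None:
        attributes["AccessLog"] = {
            "Enabled": True,
            "S3BucketName": log_bucket,
            "S3BucketPrefix": log_prefix,
            "EmitInterval": emit_interval,
        }
    return attributes


def apply_attributes(load_balancer_name, attributes, region="us-east-1"):
    """Apply the attributes (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so build_attributes works without boto3

    elb = boto3.client("elb", region_name=region)
    return elb.modify_load_balancer_attributes(
        LoadBalancerName=load_balancer_name,
        LoadBalancerAttributes=attributes,
    )
```

For example, `apply_attributes("my-elb", build_attributes(False, log_bucket="my-elb-logs", emit_interval=5))` would disable cross-zone load balancing and publish access logs to the hypothetical `my-elb-logs` bucket every 5 minutes.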