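At last week's AWS Summit San Francisco, Lambda finally gained support for SNS.
Of the many services announced, this one is my personal favorite! With this feature, AWS services become easier to link together, and it looks like new reactive architectures can now be built one after another.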
Alarm Details:
- Name: StatusCheckFailed-Alarm-for-i-xxxxxxxx
- Description: Instance i-xxxxxxxx has failed
- State Change: OK -> ALARM
- Reason for State Change: Threshold Crossed: 2 datapoints were greater than or equal to the threshold (1.0). The most recent datapoints: [1.0, 1.0].
- Timestamp: Friday 17 April, 2015 03:13:09 UTC
- AWS Account: 123456789012
Threshold:
- The alarm is in the ALARM state when the metric is GreaterThanOrEqualToThreshold 1.0 for 60 seconds.
Monitored Metric:
- MetricNamespace: AWS/EC2
- MetricName: StatusCheckFailed
- Dimensions: [InstanceId = i-xxxxxxxx]
- Period: 60 seconds
- Statistic: Maximum
- Unit: Count
Service: AWS Auto Scaling
Time: 2015-04-17T03:13:59.367Z
RequestId: efa97137-fa15-4aa4-9c8c-5241961a2d0e
Event: autoscaling:EC2_INSTANCE_TERMINATE
AccountId: 123456789012
AutoScalingGroupName: as-sg
AutoScalingGroupARN: arn:aws:autoscaling:ap-northeast-1:123456789012:autoScalingGroup:c395c157-3a7e-4d56-287b-5ad9b26eb464:autoScalingGroupName/as-sg
ActivityId: efa97137-fa15-4aa4-9c8c-5241961a2d0e
Description: Terminating EC2 instance: i-xxxxxxxx
Cause: At 2015-04-17T03:13:36Z an instance was taken out of service in response to a user health-check.
StartTime: 2015-04-17T03:13:36.342Z
EndTime: 2015-04-17T03:13:59.367Z
StatusCode: InProgress
StatusMessage:
Progress: 50
EC2InstanceId: i-xxxxxxxx
Details: {"Availability Zone":"ap-northeast-1a","Subnet ID":"subnet-bbbbbbbb"}
Normally the Cause reads "an instance was taken out of service in response to a EC2 health check indicating it has been terminated or stopped." Here it reads "an instance was taken out of service in response to a user health-check." instead, which shows that the action was triggered before Auto Scaling's own EC2 Health Check kicked in.
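As a rough illustration of how a Lambda function subscribed to the alarm's SNS topic could produce a "user health-check" Cause like this, here is a minimal Python sketch. It is not the function used in this post. It assumes the SNS message is the JSON form of the CloudWatch alarm with `NewStateValue` and `Trigger.Dimensions` fields (entries keyed by lowercase `name` and `value`), and it assumes the instance is reported through the Auto Scaling `SetInstanceHealth` API. The helper names `extract_instance_id` and `mark_unhealthy` are made up for this example.

```python
import json


def extract_instance_id(sns_event):
    """Return the InstanceId dimension of an ALARM notification, or None."""
    # Assumption: the SNS record carries the alarm as a JSON string.
    message = json.loads(sns_event["Records"][0]["Sns"]["Message"])
    if message.get("NewStateValue") != "ALARM":
        return None  # ignore OK / INSUFFICIENT_DATA transitions
    for dim in message.get("Trigger", {}).get("Dimensions", []):
        if dim.get("name") == "InstanceId":
            return dim.get("value")
    return None


def mark_unhealthy(sns_event, autoscaling_client):
    """Report the alarmed instance as Unhealthy to Auto Scaling."""
    instance_id = extract_instance_id(sns_event)
    if instance_id is None:
        return None
    autoscaling_client.set_instance_health(
        InstanceId=instance_id,
        HealthStatus="Unhealthy",
        ShouldRespectGracePeriod=False,  # act right away, even inside the grace period
    )
    return instance_id


def lambda_handler(event, context):
    # Imported lazily so the helpers above can be exercised without AWS access.
    import boto3
    return mark_unhealthy(event, boto3.client("autoscaling"))
```

Passing the client in as a parameter keeps the parsing and the API call separate, so the logic can be checked locally with a stub before the function is wired to a real SNS topic.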
The total Auto Healing time, from the moment the failure occurred until the instance was replaced and reached InService, came to a little over 6 minutes.
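In some cases EC2 Auto Recovery is enough, but it is triggered only by StatusCheckFailed_System, which reflects failures on the AWS side; StatusCheckFailed_Instance is not a trigger. It also comes with a few restrictions, such as being limited to certain instance types and to VPC.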