You are on-call for an infrastructure service that has a large number of dependent systems. You receive an alert indicating that the service is failing to serve most of its requests and that all of its dependent systems, with hundreds of thousands of users, are affected. As part of your Site Reliability Engineering (SRE) incident management protocol, you declare yourself Incident Commander (IC) and pull in two experienced people from your team as Operations Lead (OL) and Communications Lead (CL). What should you do next?