Master the principles and practices of Site Reliability Engineering as pioneered at Google and adopted by modern engineering organizations. This intermediate course covers SLOs and error budgets, incident management, observability, toil elimination, and capacity planning, so you can run production services that are reliable, scalable, and operationally sustainable. By the end, you will be able to design and operate an SRE program that balances feature velocity with reliability commitments.
By Marcus Reid
Expert instructor with hands-on industry experience in DevOps.
This course includes