Chaos Engineering Practices
What is chaos engineering and how would you implement it safely in a production environment?
Answer out loud first, then check yourself against the model answer.
Also worth your time on this topic
Running Your First Chaos Engineering Experiment with Litmus
A hands-on walkthrough of installing LitmusChaos on Kubernetes, killing pods on purpose, and watching whether your app actually recovers. Real YAML, real output, no theory.
SLOs, SLIs, and Error Budgets: A Practical Implementation Guide
A step-by-step checklist for defining service level objectives, picking the right service level indicators, and using error budgets to make better decisions about reliability vs. feature velocity.
45-90 minutes
SLO vs SLI vs SLA Differences
Your team just launched a new API service. Your manager asks you to set up SLOs for it. Can you walk me through what SLOs, SLIs, and SLAs are, and how they relate to each other?
junior