I'm writing the post-mortem of the 24h outage of matrix.org, and I'm learning so much from our SRE team.
I think I'm going to make "reading public post-mortems" a new hobby of mine. What are good sources to find interesting ones?
@thibaultamartin I don't know anything about public post-mortems in the information processing industry. What I do know is that civil aviation analyses critical events quite thoroughly and in a standardised way.
@thibaultamartin I know @norootcause blogs about post-mortems
That sounds like a valuable habit: reading post-mortems can teach a lot about resilience and problem-solving. You might also explore Google's Site Reliability Engineering book, which discusses post-mortem culture and includes an example post-mortem.
@thibaultamartin
I also enjoy reading post-mortems. You always learn something from them.
https://github.com/danluu/post-mortems is a good source, in my opinion.
@sailreal you're the second person to suggest it in less than five minutes, which signals a high-quality source 😁