
Site Reliability Engineering: How Google Runs Production Systems by Betsy Beyer, Chris Jones, Jennifer Petoff and Niall Richard Murphy
Rating: ★★★★★
Date Finished: February 6th, 2018
Reading Time: A month

“Perfect algorithms may not have perfect implementations.”

And perfect books may not have perfect writers. Site Reliability Engineering is an essay collection that can be rickety at times but is steadfast in its central thesis. Google can claim credit for inventing Site Reliability Engineering and, in this book, a bunch of noteworthy engineers share their wisdom from the trenches.

When it comes to software architecture and product development, I’ve found delight in reading about how startups’ products are built because the stories are digestible. It’s possible for a founder, lead engineer, or technical writer to lay down the blueprint of a small-scale product and even get into the nuts and bolts. When it comes to large tech companies, this is impossible from a technical point of view and improbable from a compliance standpoint.

This is beside the purpose of the book, but collections like this one help bridge the gap between one’s imagination and the inner workings of tech giants. There are plenty of (good!) books that tell you all about how Google the business works, but this one happens to be the best insight into how the engineering side operates. Sure, you have to connect some dots and bring with you some experience, but the result is priceless: you start to feel like you get it.

The essays are almost all useful. If you haven’t spent at least an internship’s worth of time in the workforce, you should probably table this one until you have a bit more experience. I would have enjoyed this book as an undergraduate, no doubt, but most of it wouldn’t have clicked. The Practices section, really the meat of the book, is where the uninitiated might struggle. When I emerged on the other side, I had a list of at least twenty topics that I needed to explore in more detail if I were to become truly great at what I do.

I highly recommend this book to anyone on the SRE/DevOps spectrum as well as those trying to understand large-scale tech companies as a whole.
