Charity Majors @mipsytipsy CTO @honeycombio, ex-Parse, Facebook, Linden Lab; cowrote Database Reliability Engineering; loves whiskey, rainbows. I test in production and so do you. 🌈🖤 Feb. 18, 2019

okay okay okay hold up

your on call week should not be LESS productive, it should be DIFFERENTLY productive. instead of cranking out the same old same old, you have carte blanche to switch gears and this is not an opportunity one should waste!

- those 200 paging alerts you want to delete/consolidate
- that refactor that makes you hate the world
- those UX notes on deploys
- that backup script that gives you cold sweats
- the proposed on call training process you want to write up
- come on, what else

On call week is sacred. On call week is devoted to the product called "Your Infrastructure".

The point of having two lanes and routing everything into the "degraded but ok, address eventually" lane is not to make on call week feel like every other week. It's so you can ship Infra projects!
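A rough sketch of what two lanes could look like in code. This assumes the other lane is "page someone now", and the `user_impacting` flag is a made-up criterion for illustration; your real rule for what deserves a page will be different.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    name: str
    user_impacting: bool  # hypothetical field, stands in for your own paging rule


def lane(alert: Alert) -> str:
    """Pick a lane: page only when users are hurting, otherwise queue it."""
    return "page_now" if alert.user_impacting else "address_eventually"


def split_lanes(alerts):
    """Group alert names by lane so the quiet lane becomes a work queue."""
    lanes = {"page_now": [], "address_eventually": []}
    for alert in alerts:
        lanes[lane(alert)].append(alert.name)
    return lanes


# Example alerts are invented for the sketch.
alerts = [
    Alert("checkout error rate above 5%", user_impacting=True),
    Alert("disk 80% full on one replica", user_impacting=False),
    Alert("nightly backup ran long", user_impacting=False),
]
lanes = split_lanes(alerts)
print(lanes)
```

Everything that lands in `address_eventually` is the backlog on call week gets to chew through, instead of a page at 3am.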

This is one reason why the ideal on call rotation is 5-7 bodies, IMO. Enough folks to have primary/secondary pairs, and a constant 1/5-1/7 of your focus dedicated to the infra and process itself.

Frequent enough to stay sharp, enough of a break to recharge and ship product.
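To make the 1/5 arithmetic concrete, here is a minimal sketch of a weekly rotation with primary/secondary pairs. The team names are placeholders, and having next week's primary serve as this week's secondary is one assumed convention, not something the thread prescribes.

```python
from collections import Counter


def weekly_rotation(people, weeks):
    """Return a (primary, secondary) pair per week.

    Assumption: the secondary is whoever is primary the following week.
    """
    n = len(people)
    return [(people[w % n], people[(w + 1) % n]) for w in range(weeks)]


team = ["ana", "bo", "cy", "dee", "eli"]  # placeholder names, 5-person rotation
rotation = weekly_rotation(team, weeks=10)

# Count how many weeks each person spends as primary.
primary_weeks = Counter(primary for primary, _ in rotation)
share = primary_weeks["ana"] / len(rotation)
print(rotation[:3])
print(f"ana is primary {share:.0%} of the time")
```

With five people each person is primary one week in five; with seven, one week in seven. That fraction is the slice of the team's time going to infra and process.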

On call being the contingency stack that it is, no rule can be absolute. But if possible, forbid oncall from doing product work. Just bake it into all your roadmaps and expectations.

They can't do two jobs, and infra needs its time and attention too.

Here are a few neat common consequences:

1) after you firefight your way out of the hole, your diverse and talented team tends to fan out and develop expertise in distinct technical areas, thru opportunism and affinity. Someone will be your expert in db, build pipe, $lang, etc

2) once the alerts are cleaned up and the coast is clear, if there's nothing pressing, they turn to their own corner. The db expert might be load testing a new version and writing the orchestration code for the upgrade.

3) whenever the project gets meaty enough, it gets turned over to the group. Like the db rolling upgrade may take weeks, so it becomes part of everyone's shifts. Everyone needs to know how to use and modify orchestration code.

Anyway. Having on call week not be shitty, and devoting it to investment work, is a great way to signal that you actually value this work.

If your rotations are too noisy to focus, consider a two week rotation: one heads up, one heads down. Yes, it's expensive. ¯\_(ツ)_/¯

Not only can you make it not awful, people *look forward to it*. People opt-in.

A break in routine, permission to fix what bugs them, encouragement to look around and exert ownership? And the next Friday off scot free? Folks fucking ❤️love❤️ on call ☺️

