Charity Majors @mipsytipsy CTO @honeycombio; co-wrote Database Reliability Engineering; loves whiskey, rainbows. I test in production and so do you. 🌈🖤Black Lives Matter🖤 Jul. 08, 2020

tired: agents
wired: libraries

less facetiously ... anything agent-based will be severely limited in its ability to contribute to observability.

agents and other third-party observers are a monitoring pattern. true observability comes from within.

i'm not trying to be a hard-line absolutist here. there are MANY situations where the observability instrumentation ideal is impossible, and we end up having to approximate it using a variety of tools, including agents, logs... even tcpdump.

something is better than nothing, and nobody is going to go back and fully instrument their entire stack for each tool. nobody.

but the ideal is that your instrumentation should shine a light so you can understand what each request experiences, from the inside out.

so the closer you get to your instrumentation being a native part of your code, the more powerful and precise your telemetry will be.
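to make "native part of your code" concrete, here is a minimal sketch, not any vendor's actual API. it assumes one wide structured event per request, and every name in it (`request_event`, `handle_checkout`, the `sink` callable, the field names) is hypothetical. the point is only that the handler itself can attach fields an outside observer would never see.

```python
import json
import time
from contextlib import contextmanager


@contextmanager
def request_event(sink, **fields):
    """Collect one wide event for a request and emit it when the request ends.

    `sink` is a hypothetical callable that receives the JSON-encoded event.
    """
    event = dict(fields)
    start = time.monotonic()
    try:
        yield event  # the handler adds fields from inside the request
    except Exception as exc:
        event["error"] = type(exc).__name__  # record the failure, then re-raise
        raise
    finally:
        event["duration_ms"] = round((time.monotonic() - start) * 1000, 3)
        sink(json.dumps(event, sort_keys=True))


def handle_checkout(user_id, cart, sink=print):
    """Hypothetical handler: `cart` maps item name to price."""
    with request_event(sink, service="checkout", user_id=user_id) as ev:
        ev["cart_items"] = len(cart)   # context only the code itself knows
        total = sum(cart.values())
        ev["cart_total"] = total
        return total
```

calling `handle_checkout("u1", {"book": 12, "pen": 3})` would print a single JSON event carrying the user, the cart size, the total, and the duration. an agent watching the process from outside could report the duration at best.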

this is why observability is so effective at aligning eng pain with user pain, which is why it is so key to transforming sociotechnical systems.

monitoring comes in many flavors, i know, but the third-party POV -- entity x checking up on entity y -- was very baked into it from the beginning.

and third-party perspective produces a very different flavor of empathy and urgency than the first-party perspective does.

it's a subtle point, but it matters.

yes, sure, you can use honeycomb with log tailers, agents, etc etc -- it works fine! great! -- but it's worth keeping the reason for your instrumentation in mind over the longer run, and bending in that direction.

a very, very effective pattern for rolling this out in established organizations seems to be this:

1) look for the pain. start with the pain. whatever services are paging you or in crisis and/or active development, add observability there.

2) teach on call to debug o11y-first, meaning any time you get a ticket or an alert, the first thing you do is add telemetry around that issue.

3) develop new services o11y-first. this is a faster, easier, and more sure-footed way to ship code, so it only takes a small push.

4) fast-forward a few months, and you will look up and realize over half your stuff is observability-ified. now pull together a plan to either systematically instrument the rest, or tee the output of your old tools into o11y, or just cruise on, as your budget and staffing allow.
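the "tee the output of your old tools" option in step 4 could look something like this rough sketch. it assumes a made-up legacy log format of `timestamp LEVEL service message`, and `log_line_to_event`, `tee`, and both sinks are hypothetical names, not part of any real tool. the old destination keeps getting the raw line untouched, while a copy goes to the new store as a structured event.

```python
import re

# assumed legacy format: "<timestamp> <LEVEL> <service> <free-text message>"
LINE = re.compile(r"(?P<ts>\S+) (?P<level>[A-Z]+) (?P<service>\S+) (?P<message>.*)")


def log_line_to_event(line):
    """Turn one legacy log line into a dict; keep unparseable lines as raw text."""
    text = line.strip()
    match = LINE.match(text)
    if match is None:
        return {"raw": text, "parsed": False}
    event = match.groupdict()
    event["parsed"] = True
    return event


def tee(lines, legacy_sink, event_sink):
    """Send each raw line to the old destination and a structured copy to the new one."""
    for line in lines:
        legacy_sink(line)                     # old tooling keeps working unchanged
        event_sink(log_line_to_event(line))   # new tooling gets structured events
```

this is the approximation, not the ideal: the events only contain whatever the old log line happened to say, so it covers the gap until the service is instrumented from the inside.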

