Free o11y tip of the day:
If you use nginx or similar at your edge, wrap all calls out to other services and dbs with a function that logs duration to a header. Then you can swiftly identify *any* source of latency or errors using only nginx/access.log. https://www.honeycomb.io/blog/tell-me-more-nginx/
You emit all the header variables containing service hop times in your log, which serves as a shockingly effective poor man's distributed tracing/service map.
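A minimal sketch of the "wrap every call" idea, in Python. This is not the code we ran at Parse; the `HopTimer` name, the `X-Hop-...-Ms` header format, and the hop names are all made up for illustration. The one nginx fact it leans on is that response headers from an upstream are available as `$upstream_http_<name>` variables, which you can put in a `log_format`.

```python
import time
from contextlib import contextmanager


class HopTimer:
    """Collects per-hop durations for one request (hypothetical helper)."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the timer can be tested deterministically.
        self._clock = clock
        self.durations_ms = {}

    @contextmanager
    def hop(self, name):
        """Time the wrapped block and add it to the running total for `name`."""
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the call raised, so slow failures show up too.
            elapsed_ms = (self._clock() - start) * 1000.0
            self.durations_ms[name] = self.durations_ms.get(name, 0.0) + elapsed_ms

    def headers(self, prefix="X-Hop-"):
        """Render the collected durations as response headers (assumed naming scheme)."""
        return {
            f"{prefix}{name}-Ms": f"{ms:.1f}"
            for name, ms in self.durations_ms.items()
        }


if __name__ == "__main__":
    timer = HopTimer()
    with timer.hop("db"):
        time.sleep(0.01)  # stand-in for a real database call
    with timer.hop("auth"):
        time.sleep(0.005)  # stand-in for a call to another service
    # Attach these to the response so the edge proxy can log them.
    print(timer.headers())
```

Attach the resulting headers to every response, and the edge log ends up with one column per hop.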
You'll want to feed it into a tool that lets you slice&dice; you can set this up on @honeycombio free tier in <15 min.
The hardest part of distributed systems is usually not debugging the code; it's finding where in your system the code you need to debug is.
How many hours have you wasted trying to find where the errors are coming from, or debugging the wrong service?
At Parse, we used the nginx trick with FB Scuba; it dropped our time-to-find-what-to-debug from hours to seconds, reliably.
(At Honeycomb, of course, we don't have this problem 😉)
It's actually better than tracing, in some ways. It finds problems like "service y is 25% slower across the board", or "service u is slower, but only for requests that issue db read queries to one specific replica in shard z (or writes to all shards with primary in region x)".
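To make the "slice&dice" part concrete, here is a rough Python sketch of the kind of breakdown that surfaces those problems: group log records by any field and compare mean hop time per group. It assumes the access log has already been parsed into dicts; the field names (`replica`, `db_ms`) are hypothetical, and a real tool would do this interactively over far more data.

```python
from collections import defaultdict
from statistics import mean


def mean_by(records, key, value):
    """Return the mean of `value` for each distinct `key` across parsed log records."""
    groups = defaultdict(list)
    for record in records:
        # Skip lines that never touched this hop or lack the grouping field.
        if key in record and value in record:
            groups[record[key]].append(float(record[value]))
    return {k: mean(v) for k, v in groups.items()}


if __name__ == "__main__":
    # Made-up sample records, standing in for parsed access.log lines.
    sample = [
        {"replica": "a", "db_ms": "10"},
        {"replica": "a", "db_ms": "30"},
        {"replica": "b", "db_ms": "200"},
    ]
    print(mean_by(sample, key="replica", value="db_ms"))
```

Swap `replica` for shard, region, endpoint, or build id and the same grouping answers a different question, which is why wide log lines with lots of fields pay off.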
You can follow @mipsytipsy.