I say this with love, as a diehard lover of string processing tools, but: just look at this example. THE WAY YOU ARE DOING IT NOW IS THE HARD WAY.
All the author wants is to group by IP and order by count. That should be a single straightforward question. Instead it takes... this lengthy manual.
And why have we been learning these complicated workarounds and hacks and elaborate cargo cults?
Because monitoring tools can't break down by a high-cardinality dimension like IP, and they never stored the connective tissue of the event, only a scattering of disconnected metrics.
Structure your data.
Events not metrics.
Computations not strings.
High cardinality is not a nice-to-have.
(Neither is high dimensionality.)
Flexible schemas and wide, rich rows.
And now you've got observability.
(Plus or minus a few arguably lesser details, like, I might argue that a columnar store is required so you aren't stuck having to choose indexes and thus which questions you may efficiently ask.)
Oh, and absolutely no preaggregation at write time. 🚫🚫🚫 just say no🚫🚫🚫
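To make "group by IP and order by count" concrete, here is a minimal sketch, assuming each request was stored as one structured event (a plain dict here). The field names and sample values are made up for illustration, and an in-memory list stands in for whatever event store you actually use.

```python
from collections import Counter

# Hypothetical structured events: one wide row per request, kept raw
# (nothing preaggregated at write time). All values are invented.
events = [
    {"ip": "10.0.0.1", "path": "/login", "status": 200, "duration_ms": 12},
    {"ip": "10.0.0.2", "path": "/login", "status": 401, "duration_ms": 8},
    {"ip": "10.0.0.1", "path": "/cart", "status": 200, "duration_ms": 30},
    {"ip": "10.0.0.3", "path": "/login", "status": 200, "duration_ms": 15},
    {"ip": "10.0.0.1", "path": "/cart", "status": 500, "duration_ms": 95},
    {"ip": "10.0.0.2", "path": "/cart", "status": 200, "duration_ms": 22},
]


def count_by(rows, field):
    """Group rows by any field and order by count, descending."""
    return Counter(row[field] for row in rows).most_common()


# The question from the thread: group by IP, order by count.
by_ip = count_by(events, "ip")

# Same rows, different question, no new parsing or new metric needed.
by_status = count_by(events, "status")

# Filter first, then group: which IPs hit server errors?
errors_by_ip = count_by([e for e in events if e["status"] >= 500], "ip")

print(by_ip)
```

Because the raw events are kept whole, any field (or combination of fields) can become the grouping key after the fact. A counter keyed only by status code, written at ingest time, could never answer the per-IP question.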
You can follow @mipsytipsy.