I say this with love, as a diehard lover of string processing tools, but: just look at this example. THE WAY YOU ARE DOING IT NOW IS THE HARD WAY.
All the author wants is to group by IP and order by count. A single straightforward question, or... this lengthy manual.
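For contrast, here is a minimal sketch of that same question asked of structured events instead of log strings. The events and their field names (`ip`, `path`, `status`, `duration_ms`) and the `count_by` helper are made up for illustration; they are not from the example being discussed.

```python
from collections import Counter

# Hypothetical structured events: one wide row per request.
# Field names and values are assumptions for illustration only.
events = [
    {"ip": "10.0.0.1", "path": "/login", "status": 200, "duration_ms": 12},
    {"ip": "10.0.0.2", "path": "/login", "status": 500, "duration_ms": 340},
    {"ip": "10.0.0.1", "path": "/cart", "status": 200, "duration_ms": 48},
    {"ip": "10.0.0.3", "path": "/", "status": 200, "duration_ms": 7},
    {"ip": "10.0.0.1", "path": "/", "status": 200, "duration_ms": 9},
    {"ip": "10.0.0.2", "path": "/cart", "status": 200, "duration_ms": 51},
]


def count_by(rows, field):
    """Group rows by one field; return (value, count) pairs, highest count first."""
    return Counter(row[field] for row in rows).most_common()


# "Group by IP and order by count" is one call, with no string parsing.
top_ips = count_by(events, "ip")
print(top_ips)
```

Swap `"ip"` for `"path"` or `"status"` and the same helper answers a different question, which is the point: the question is a computation over fields, not a pipeline of text munging.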
And why have we been learning these complicated workarounds and hacks and elaborate cargo cults?
Because monitoring tools can't break down by a high-cardinality dimension like IP, and they never stored the connective tissue of the event, only a scattering of disconnected metrics.
Structure your data.
Events not metrics.
Computations not strings.
High cardinality is not a nice-to-have.
(Neither is high dimensionality.)
Flexible schemas and wide, rich rows.
And now you've got observability.
(Plus or minus a few arguably lesser details, like, I might argue that a columnar store is required so you aren't stuck having to choose indexes and thus which questions you may efficiently ask.)
Oh, and absolutely no preaggregation at write time. 🚫🚫🚫 just say no 🚫🚫🚫
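A rough sketch of why, using made-up request data (the `requests` list and its fields are assumptions, not any particular tool's format): once you roll events up into a counter at write time, the only questions you can ever ask are the ones you picked in advance.

```python
from collections import Counter

# Hypothetical raw request events; fields are illustrative assumptions.
requests = [
    {"ip": "10.0.0.1", "status": 500},
    {"ip": "10.0.0.1", "status": 500},
    {"ip": "10.0.0.2", "status": 200},
    {"ip": "10.0.0.3", "status": 500},
]

# Preaggregating at write time: only a per-status count survives.
# The IP that produced each error is gone for good.
preaggregated = Counter(r["status"] for r in requests)

# Keeping the raw events: a question nobody planned for
# ("which IPs are seeing the 500s?") is still answerable at read time.
errors_by_ip = Counter(
    r["ip"] for r in requests if r["status"] == 500
).most_common()

print(dict(preaggregated))
print(errors_by_ip)
```

The preaggregated counter can tell you there were three 500s. Only the raw events can tell you two of them came from the same IP.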
You can follow @mipsytipsy.