👏Today is the day👏
Workers ✨automatic✨ tracing is now in open beta!
✅Enable in seconds – no code changes required
🔎View and query trace data directly in the Cloudflare dashboard
📦Export traces (and logs!) to any external destination with an OTel endpoint
I've been working on this for nearly a year, so I'm super excited to finally ship the first release of Automatic Tracing. Starting today, anyone with Observability enabled will automatically get traces generated for their Workers. Those traces show up in the Cloudflare dashboard and can be sent to third-party observability providers. We've had internal teams and select customers using this for a while, and they've made good performance gains through a better understanding of what's happening in their Workers.
But today's release is only the first of many. Several things didn't make it into this first release: context propagation for distributed tracing, user-created spans, span events, and instrumentation for some CF bindings. All of these gaps are on the tracing roadmap and will be released as they're ready. In the meantime, there's still a lot of useful data for anyone with Workers in production.
@johnsqfrench Four different responses from four different councillors who all shrugged at the mayor's attempts to get you to agree and clarify direction to staff. No shadows, but also bigger units, and farther from the river, don't displace tenants, and don't give up any parking or go taller.
There have been some good posts on what "wide events" instrumentation is but not as many on how to go about it, what attributes you should add, or how to work with OpenTelemetry
I put everything I've learned in the last few years into one guide https://t.co/SPZ9GdtyRj
@calvinalkan re: middlewares. My experience is that if your middlewares aren't performing any I/O (as auth or caching would) and are just doing some computation, they're very unlikely to have interesting timings. If they are doing I/O, that would already be captured in spans. YMMV
@calvinalkan Why only add a global middleware timing? Vs a timing attribute for each? The data encoded in each span here is three values: a name, a timestamp, and a duration
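A minimal sketch of the "timing attribute for each" idea, in plain TypeScript with no tracing library. The `Middleware` type, the `runWithTimings` helper, and the `middleware.<name>.duration_ms` attribute names are all hypothetical, made up for illustration. Each middleware's duration lands as an attribute on the one wide event instead of becoming its own span.

```typescript
// Hypothetical shapes for illustration; this is not an OpenTelemetry API.
type Attributes = Record<string, string | number | boolean>;

type Middleware = {
  name: string;
  run: (attrs: Attributes) => void;
};

// Runs each middleware in order and records its duration as an attribute
// on the single wide event, rather than creating one child span each.
// `now` is injectable so the timing logic can be tested with a fake clock.
function runWithTimings(
  middlewares: Middleware[],
  attrs: Attributes,
  now: () => number = () => Date.now(),
): Attributes {
  const start = now();
  for (const mw of middlewares) {
    const mwStart = now();
    mw.run(attrs);
    attrs[`middleware.${mw.name}.duration_ms`] = now() - mwStart;
  }
  // The global timing is still there, alongside the per-middleware ones.
  attrs["middleware.total.duration_ms"] = now() - start;
  return attrs;
}
```

Since the timings sit next to attributes like a user ID or region on the same event, they can be grouped and filtered together in one query.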
I've implemented verbose traces on a platform team, but no one adopted them. They do seem like a good idea!
@calvinalkan What additional information are child spans giving you in this scenario?
Ultimately you do have to look at your code to debug. Tracing helps you skip to "I know it happens in this section under these circumstances". It's not meant to be equivalent to a debugger.
@calvinalkan You can break across dimensions and see if there are correlations. Honeycomb's bubbleup can do this for you. "What do these slow requests have in common?"
@calvinalkan Ex: if there is a correlation between a specific user (or maybe region... or...) and long middleware execution, you're going to struggle to see that if they are their own spans.
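To make "break across dimensions" concrete, here is a rough sketch of the kind of query a backend runs for you, written as plain TypeScript over an in-memory array. The `WideEvent` type, the `meanBy` helper, and the attribute names in the usage note are hypothetical. This is not how Honeycomb's bubbleup works internally, just a simple group-by.

```typescript
// Hypothetical wide-event shape: one flat record of attributes per request.
type WideEvent = Record<string, string | number>;

// Groups events by one attribute (the dimension) and averages a numeric
// attribute (the metric) within each group.
function meanBy(
  events: WideEvent[],
  dimension: string,
  metric: string,
): Map<string, number> {
  const sums = new Map<string, { total: number; count: number }>();
  for (const ev of events) {
    const value = ev[metric];
    if (typeof value !== "number") continue; // skip events missing the metric
    const key = String(ev[dimension]); // a missing dimension groups as "undefined"
    const entry = sums.get(key) ?? { total: 0, count: 0 };
    entry.total += value;
    entry.count += 1;
    sums.set(key, entry);
  }
  const means = new Map<string, number>();
  sums.forEach((entry, key) => means.set(key, entry.total / entry.count));
  return means;
}
```

Calling `meanBy(events, "user.id", "middleware.auth.duration_ms")` would surface a user whose requests are consistently slow in that middleware. This only works because the timing and the user ID live on the same event.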
@calvinalkan Wide events + tracing can work really well together! But I find people tend to get lost in the tracing bits and waterfall diagrams before they learn to query across spans, which is where a lot of the leverage is
@calvinalkan That said, sometimes it's useful to wrap a chunk of important functionality in its own span w/ its own attributes, so this is more of a guideline
@calvinalkan @mipsytipsy Having one “wide” event doesn’t mean it’s the only event you emit for a request! You can still emit child spans and see your tracing view. When you put attributes on the “main” span you’re designing the schema you want to query against