Originally posted as this twitter thread.

0/ The easier part of Continuous Delivery (“CD”) is, well, “continuously delivering software.”

The harder part is doing it reliably.

This is a thread about the critical differences between what we’ll refer to as “local CD” and “global CD,” and how observability fits in. 👇

1/ Let’s begin by restating the conventional wisdom about how to do “Continuous Delivery” for a single (micro)service:

i) <CD run starts>

ii) Qualify release in pre-prod

iii) Deploy to prod

iv) If the deployed service is unstable, roll back the deploy

Safe, right? Not really.
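For concreteness, steps i) through iv) can be sketched as a tiny pipeline function. This is only an illustration of the conventional recipe as listed: `run_local_cd` and its four callbacks are hypothetical names, not part of any real CD tool.

```python
from typing import Callable


def run_local_cd(
    qualify: Callable[[], bool],
    deploy: Callable[[], None],
    is_stable: Callable[[], bool],
    roll_back: Callable[[], None],
) -> str:
    """Hypothetical single-service CD run (i): returns the outcome as a string."""
    # ii) Qualify the release in pre-prod; stop here if it fails.
    if not qualify():
        return "rejected"
    # iii) Deploy to prod.
    deploy()
    # iv) Roll back if the deployed service itself looks unstable.
    if not is_stable():
        roll_back()
        return "rolled_back"
    return "deployed"


# Example run: the release qualifies, but the service is unstable after deploy.
events: list[str] = []
outcome = run_local_cd(
    qualify=lambda: True,
    deploy=lambda: events.append("deploy"),
    is_stable=lambda: False,
    roll_back=lambda: events.append("roll_back"),
)
```

Note that the only signal consulted in step iv) is the health of the service that was just deployed.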

2/ The above…

Originally posted as this twitter thread.

0/ When large eng orgs rely on metrics for both monitoring *and* observability, they struggle with cardinality.

This is a thread about “the two drivers of cardinality.” And which one of those we should kill. :)


1/ Okay, first off: “what is cardinality, anyway?” And why is it such a big deal for metrics?

“Cardinality” is a mathematical term: it’s *the number of elements in a set*… boring! So why tf does anybody care??

Well, because people think they need it, then suddenly, “$$$$$$$.”
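A rough way to see where the “$$$$$$$” comes from: if we assume each distinct combination of label values on a metric becomes its own time series, then the set whose elements we are counting is the set of those combinations. The sketch below computes that upper bound; the label names and values are made up for illustration, and real metrics systems may count differently.

```python
def series_cardinality(label_values: dict[str, set[str]]) -> int:
    """Upper bound on distinct time series for one metric:
    the product of the number of distinct values per label."""
    count = 1
    for values in label_values.values():
        count *= len(values)
    return count


# Hypothetical labels on a single request-count metric.
labels = {
    "service": {"checkout", "search"},
    "region": {"us-east", "us-west", "eu"},
    "status": {"2xx", "4xx", "5xx"},
}
base = series_cardinality(labels)  # 2 * 3 * 3

# Add one high-cardinality label (10,000 hypothetical customers).
labels["customer_id"] = {f"c{i}" for i in range(10_000)}
with_customer = series_cardinality(labels)  # base * 10_000
```

Under these assumptions, a single extra label multiplies the series count by the number of values it can take.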

2/ When a developer inserts a (custom) metric to…

Originally posted as this twitter thread.

0/ I’m tired of hearing about observability replacing monitoring. It’s not going to, and that’s because it shouldn’t.

Observability will not _replace_ monitoring, it will _augment_ monitoring.

Here’s a thread about observability, and how monitoring can evolve to fit in:👇

1/ Let’s start with the diagram (above) illustrating the anatomy of observability. There are three layers:

I. (Open)Telemetry: acquire high-quality data with minimal effort
II. Storage: “Stats over time” and “Transactions over time”
III. Benefits: *solve actual problems*

2/ The direction for “Telemetry” is simple: @opentelemetry. This is the (only) place where “the three…

Originally posted as this twitter thread.

0/ Deep systems have come to the fore in recent years, largely due to the industry-wide migration to microservices.

But weren’t monoliths “deep”, too? Well, yes and no.

And this is all related to tracing, observability, and the slow death of APM.


1/ First, let’s start with monoliths. Of course they’ve been around for a while, and it’s where most of us started. There is plenty of depth and complexity from a monolithic-codebase standpoint, but operationally it’s just one big — and often brittle — binary.

Originally posted as this twitter thread.

1/ APM is dying — and that’s ok.

What happened? And why?

2/ In APM’s heyday (think “New Relic and AppDynamics circa 2015”), the value prop was straightforward: “Just add this one magic agent and you’ll never need to wonder why your monolithic app is broken!”

But then things changed.

3a/ *Systems got deep:* APM was designed for monoliths — where development revolved around a single app server. Monoliths slowed down dev velocity, so we broke them into layer upon layer of services.

Originally posted as this twitter thread.

0/ This is a thread about my experiences building both the Distributed Tracing and Metrics infra at Google.

And, particularly, my regrets. :)

Here goes: 👇

1/ Dapper certainly did some fancy tricks, and I’m sure it still does. If it’s possible to fall in love with an idea or a piece of technology, that’s what happened with me and Dapper. It wasn’t just new data, it was a new *type* of data — and lots of it. So. Much. Fun. …

2/ … And yet: early on, *hardly anybody actually used it.*


Originally posted as this twitter thread.

1/ First things first: Metrics, Logs, and Traces are not “the three pillars of observability.”

They are just the raw materials — the *telemetry* — and we must reframe our discussion of observability around use cases and problems-to-solve.


2/ The conventional wisdom looks like this… Observability is this cool 6-syllable word that you know you want because it’s trendier than monitoring. And you get it (somehow) by purchasing Logs, Metrics, and Tracing.

Originally posted as this twitter thread.

0/ Fundamentally, there are only two types of “things worth observing” when it comes to production systems:

1) Resources
2) Transactions

The tricky (and interesting) part is that they’re entirely codependent. This is a thread about that tricky/interesting part…

1/ But first, some definitions.

Transactions: these are the things that traverse your system and (hopefully) “do something.” The classic example would be an end-user request that propagates across networks and process boundaries.

2/ “Transactions” can be described at wildly different granularities: actions in a mobile app, HTTP reqs, function calls, CPU instructions, etc. …

Originally posted as this twitter thread.

0/ Like most organizational innovations, DevOps is powerful because it allows people to be more productive by crossing fewer org boundaries.

The first set of boundaries is the org chart itself.

The second set of boundaries is less obvious, yet just as important.

1/ So, about that first set of org boundaries: by segmenting Dev and Ops roles, then by taking things a step further and creating separate orgs for each, we guaranteed that software deployment would always require human beings to wait for each other. Not good.


Originally posted as this twitter thread.

0/ If you or someone you love uses Kafka in production, I’m sure there’s been some emotional toil when a single producer floods a topic and creates a cascading failure. This is a thread about how monitoring and observability can make that far less painful.


1/ At a certain level, Kafka is just like any other resource in your system: e.g., your database’s CPUs, your NICs, or your RAM reservations. All resources are finite, and when they participate in transactions, there is a little less of each available than when they don’t.


Ben Sigelman

Co-founder and CEO at LightStep, Co-creator of @OpenTelemetry and @OpenTracing, built Dapper (Google’s tracing system).
