How People Approach Observe

Compared to the Honeycomb baseline.

Just as Honeycomb is “a great way to store and retrieve telemetry”, someone noticed that Snowflake was a great way to store and retrieve telemetry too. Observe was born out of that idea: one datastore with everything poured into it and shaped on the way through. No silos is the whole pitch.

In this post I’m using the persona frame from the baseline post and walking the same three people through Observe.

Disclaimer up front!

I work at Honeycomb, four-plus years now, and it’s deep in my bones to believe in things like “columnar data storage” and “high cardinality” data. Observe has these things!

Normally I point out how “monitoring tools” show a chart of pre-aggregated events and that you can’t really drill down on those since there’s no underlying data. Saying “Honeycomb is an analytics tool for operators” doesn’t differentiate here because Snowflake is an analytics tool for anyone.

Are these resources actually a dashboard?

Observe calls itself data engineering for observability, and that’s not marketing fluff, it’s the actual shape of the thing. Telemetry lands as “observations,” and you use OPAL — their pipe-based query language — to shape those observations into Datasets. A Dataset has a real schema: named, typed columns, time semantics, and relationships to other Datasets. There are three kinds. Events have one timestamp, like a log line. Intervals have a start and an end, which is how a span in a trace gets modeled. And Resources are stateful things that exist over time — a host, a pod, a user, a shopping cart.
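If the three kinds feel abstract, here is a minimal Python sketch of the distinction. To be clear, this is my own illustration and not Observe's schema or API: the class names mirror the terms above, but every field name (`body`, `valid_from`, `valid_to`, and so on) is an assumption made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One timestamp, like a log line."""
    timestamp: float
    body: str


@dataclass(frozen=True)
class Interval:
    """A start and an end, like a span in a trace."""
    start: float
    end: float
    name: str

    @property
    def duration(self) -> float:
        return self.end - self.start


@dataclass(frozen=True)
class Resource:
    """A stateful thing that exists over a stretch of time."""
    key: str
    valid_from: float
    valid_to: Optional[float] = None  # None means "still exists"

    def exists_at(self, t: float) -> bool:
        return self.valid_from <= t and (self.valid_to is None or t < self.valid_to)


# Hypothetical data: a pod, a span it served, and a log line it emitted.
pod = Resource(key="pod/cart-7f9c", valid_from=100.0, valid_to=500.0)
span = Interval(start=120.0, end=120.25, name="GET /cart")
log = Event(timestamp=120.1, body="cart lookup failed")

print(span.duration)                 # length of the span
print(pod.exists_at(log.timestamp))  # was the pod alive when the log was written?
```

The point of the third class is the time range: a Resource is something you can ask "did this exist at time t?", which is what lets events and intervals hang off it.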

That Resource concept is the tell. Observe doesn’t just want your events, it wants a model of your world, with the events hung off of it. The Datasets link together into a graph you can navigate by lineage, and the payoff is that once you’ve shaped a thing, exploring around it is genuinely slick — click from a failing pod to the deployment that owns it to the logs it emitted, because somebody taught the system that those things are related.

Observe is expecting your questions to repeat enough that shaping pays off. You do the modeling work once, and then the same investigation gets cheaper every time someone runs it. Honeycomb expects the opposite: that you’ll always be asking a question nobody anticipated. If that’s true, it’s pointless to force people to pre-shape the data before they’re allowed to look at it. Send wide events, query immediately, no modeling step in the way.

There is a defense for “I have these 3 pods and I want to know what happened to them” observability. Honeycomb’s take, that “once you have an answer you can act on it and why would you ever need to ask that question again?”, is simply too dogmatic. We made fun of the very concept of dashboards for a decade before caving.

“Send everything” doesn’t really mean “Keep everything”

Observe’s pitch is “keep everything, no sampling.” Snowflake really will swallow the volume — their biggest tenant runs past 200 TB a day. But at terabytes an hour nobody keeps every byte; you only get to choose how you shed it.

Honeycomb leans into sampling, specifically tail-based sampling with Refinery. Only a weighted slice of the most important data leaves your environment. Each kept event carries its sample rate as a weight, so counts and percentiles can be corrected at query time to closely estimate their true values.

Observe ships Drop Filters and suggests trimming up to half your ingest to stay on budget. Their agent ships with both the probabilistic sampler and the tail sampling processor, but neither attaches a weight to the kept data to offset what was dropped.
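The arithmetic behind "weighted" is small enough to show. This is a sketch of the general idea only, not Refinery's or Observe's implementation: `SampledEvent`, `estimated_count`, and `weighted_percentile` are names I made up, and the numbers are invented. Each kept event stands in for `sample_rate` original events, so boring fast requests kept at 1-in-100 count for 100 each, while rare slow ones kept at 1-in-1 count for 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampledEvent:
    duration_ms: float
    sample_rate: int  # this kept event represents `sample_rate` originals


def estimated_count(events):
    """Estimate the pre-sampling event count by summing the weights."""
    return sum(e.sample_rate for e in events)


def weighted_percentile(events, q):
    """Nearest-rank percentile where each event counts `sample_rate` times."""
    ordered = sorted(events, key=lambda e: e.duration_ms)
    threshold = q * sum(e.sample_rate for e in ordered)
    running = 0
    for e in ordered:
        running += e.sample_rate
        if running >= threshold:
            return e.duration_ms
    raise ValueError("no events")


# Invented data: two fast requests kept at 1-in-100, two slow ones kept at 1-in-1.
kept = [
    SampledEvent(20.0, 100),
    SampledEvent(25.0, 100),
    SampledEvent(900.0, 1),
    SampledEvent(1200.0, 1),
]

# Same events with the weights thrown away, as if sampling left no trace.
unweighted = [SampledEvent(e.duration_ms, 1) for e in kept]

print(estimated_count(kept), len(kept))       # 202 estimated vs 4 stored
print(weighted_percentile(kept, 0.99))        # 25.0
print(weighted_percentile(unweighted, 0.99))  # 1200.0
```

Without the weights, the four stored rows look like a service where half the traffic is painfully slow. With them, the slow requests shrink back to the roughly 1% of traffic they actually were.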

At some scale, “keep everything” falls apart, even for Snowflake, and it’s time to sample.

Query semantics are all heuristics and proxies

OPAL — Observe Processing and Analysis Language — is pipe-based. You write one verb per line and each verb’s output feeds the next: filter to cut down, make_col to derive a field, statsby to group and aggregate, timechart to bucket over time. If you’ve written Splunk’s SPL or piped your way through a shell one-liner, the mental model is familiar. If you haven’t, it’s a real language with a real ramp, and Observe knows it — they’ve bolted on AI assist that’ll write OPAL from a natural-language comment, which is the kind of thing you only build when the underlying tool needs a translator.
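For anyone who hasn't used a pipe-based language, here is the shape of the idea in Python. This is emphatically not OPAL syntax, and I'm not claiming these functions behave like the real verbs: I borrowed the verb names from the paragraph above and wrote toy stand-ins (`filter_rows`, `make_col`, `statsby`, `pipeline`) over a made-up list of log rows.

```python
from collections import defaultdict
from functools import reduce


def filter_rows(pred):
    """Stage that keeps only rows matching the predicate."""
    return lambda rows: [r for r in rows if pred(r)]


def make_col(name, fn):
    """Stage that derives a new column from each row."""
    return lambda rows: [{**r, name: fn(r)} for r in rows]


def statsby(key, agg_name, agg_fn):
    """Stage that groups rows by `key` and aggregates each group."""
    def stage(rows):
        groups = defaultdict(list)
        for r in rows:
            groups[r[key]].append(r)
        return [{key: k, agg_name: agg_fn(g)} for k, g in sorted(groups.items())]
    return stage


def pipeline(rows, *stages):
    """Feed each stage's output into the next, like one verb per line."""
    return reduce(lambda acc, stage: stage(acc), stages, rows)


# Invented rows for illustration.
logs = [
    {"service": "cart", "status": 500, "duration_ns": 2_000_000},
    {"service": "cart", "status": 200, "duration_ns": 1_000_000},
    {"service": "auth", "status": 503, "duration_ns": 3_000_000},
    {"service": "cart", "status": 502, "duration_ns": 6_000_000},
]

error_latency = pipeline(
    logs,
    filter_rows(lambda r: r["status"] >= 500),
    make_col("duration_ms", lambda r: r["duration_ns"] / 1_000_000),
    statsby("service", "avg_ms", lambda g: sum(r["duration_ms"] for r in g) / len(g)),
)
print(error_latency)
```

Each stage only knows about the rows handed to it, which is why the model is easy to read top to bottom and why it takes practice to write: you have to think in intermediate tables.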

At Honeycomb, we just assume everyone knows SQL so the UI has “Select” and “Where” and “Having” sorts of fields. (The API still calls them “Visualize” and “Breakdown” — same borrowed vocabulary, one layer down.) They don’t work much like SQL. They are magically converted into column store set activities or scans or whatever. It’s fine.

I’d love to rename “Group By” to something like “MORE LINES PLZ” but designers do designs.

So the question isn’t “is OPAL good.” It’s whether the people who need answers from this system are the kind who’ll invest in a query language, or the kind who open a ticket for the observability team to address.

Getting data in, and the second conversation

Observe accepts open OTLP and forces no proprietary SDK on you. Stand up a stock Collector or set plain SDK environment variables, point them at Observe’s endpoint, and data flows. That’s wonderful.
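The "plain SDK environment variables" route looks roughly like this. The variable names are the standard OpenTelemetry ones; everything else is an assumption on my part. The endpoint is a placeholder, the token is fake, `otlp_env` is a helper I invented, and the Bearer-style Authorization header is my guess at the auth scheme, so check the vendor's docs for the real values.

```python
from urllib.parse import quote


def otlp_env(endpoint: str, token: str, service_name: str) -> dict:
    """Build the standard OTel SDK environment for an OTLP/HTTP backend."""
    return {
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoint,
        # Header values are percent-encoded key=value pairs.
        # "Bearer <token>" is an assumed scheme, not confirmed for any vendor.
        "OTEL_EXPORTER_OTLP_HEADERS": "Authorization=" + quote(f"Bearer {token}"),
    }


# Placeholder endpoint and token, for illustration only.
env = otlp_env("https://collector.example.invalid", "abc123", "cart")

for key, value in env.items():
    print(f"export {key}={value!r}")
```

Nothing in that output is vendor-specific except the endpoint and the credential, which is the whole appeal: swap two values and the same instrumented service reports somewhere else.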

But “the Observe Agent is a distribution of the OpenTelemetry Collector” is the line I’d push back on hardest, because it’s dressed up as portability and it’s the opposite. There are two ways to use OpenTelemetry. One is OTel-as-flexibility: you own a neutral pipeline, you instrument once, and you can tee that data to Observe and Honeycomb and some third thing on a Tuesday because nothing downstream is special. The other is OTel-as-capture: the vendor ships a Collector distro, pre-configured, with proprietary processors that stamp on the attributes their platform wants, and the whole thing funnels into them. It speaks OTel on the wire and it’s still a one-way road. Observe’s agent is the second kind. It emits OTel-shaped data, into Observe, shaped for Observe — not a pipeline you’d also run to somebody else.
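OTel-as-flexibility is easier to picture as a config. Below is a sketch that builds a stock Collector configuration with one OTLP receiver teeing traces to two backends. The component names (`otlp`, `batch`, `otlphttp`) and the `${env:...}` substitution are standard Collector features, and the Honeycomb endpoint and `x-honeycomb-team` header are the real ones. The Observe endpoint is a placeholder and its Bearer header is my assumption; `tee_config` is a helper I made up. I print JSON because the Collector reads YAML and JSON is a subset of it.

```python
import json


def tee_config(backends: dict) -> dict:
    """Build a Collector config that fans one trace pipeline out to every backend."""
    exporters = {
        f"otlphttp/{name}": {"endpoint": b["endpoint"], "headers": b["headers"]}
        for name, b in backends.items()
    }
    return {
        "receivers": {"otlp": {"protocols": {"grpc": {}, "http": {}}}},
        "processors": {"batch": {}},
        "exporters": exporters,
        "service": {
            "pipelines": {
                "traces": {
                    "receivers": ["otlp"],
                    "processors": ["batch"],
                    "exporters": list(exporters),
                }
            }
        },
    }


config = tee_config({
    "honeycomb": {
        "endpoint": "https://api.honeycomb.io",
        "headers": {"x-honeycomb-team": "${env:HONEYCOMB_API_KEY}"},
    },
    "observe": {
        # Placeholder endpoint and assumed auth header, for illustration only.
        "endpoint": "https://collector.example.invalid",
        "headers": {"Authorization": "Bearer ${env:OBSERVE_TOKEN}"},
    },
})

print(json.dumps(config, indent=2))
```

Adding "some third thing on a Tuesday" is one more entry in that dictionary. That is the property a pre-shaped vendor distro trades away.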

It’s only flexible if you bring your own pipeline, and the turnkey pitch — drop our agent, add our Apps, get Datasets and dashboards for free — is exactly the path that trades it away. Take the easy on-ramp and you end up captured, the same as you would with any vendor’s agent, just with a more open-looking format on the wire. The teams who stay portable are the ones who were going to run their own Collector regardless, and those teams could point it anywhere. The agent doesn’t give you that; it quietly assumes you won’t want it.

And the whiteboard version of this argument loses in the room more than I’d like. At large, mature companies, the choice between hand-rolling OpenTelemetry Collector configurations and dropping in an Observe agent that already does the sensible things — logs, host metrics, the Kubernetes furniture — is on its own enough to swing the decision. We make the case I just made to you: it’s just OTel, you can do all of this yourself, and you even keep it portable. It doesn’t always land. Plenty of teams don’t want portable; they want it working immediately so they can turn off some other vendor. Capture isn’t something a vendor sneaks past you — buyers reach for it on purpose, because good defaults beat homework, and “you could send this anywhere” is only worth something to a team that was ever going to.

After the data’s flowing, there’s the modeling conversation: which Datasets, which Resources, how they link, what gets accelerated. Their prebuilt Apps for AWS, Kubernetes, and the usual suspects will auto-create and auto-link Datasets so you’re not starting from a blank page — and that softens the cold start. But the durable value, the part the whole product is built to deliver, comes from shaping your own world into the model. That’s data-engineering work, and it’s ongoing, not a one-time setup.

For a team that already thinks in pipelines — they’ve got dbt, they’ve got a Kafka topic they’re proud of — the second conversation is a feature. They were going to model the data anyway; Observe just gave them a place to do it that’s wired into the telemetry. For a team that wanted to look at a trace and go home, it’s a second tax on top of the first.

So how does an Ops person approach Observe?

This is the persona Observe has to work hardest to win, and it’s the same reason it’s hard for us. The ops person wants a dashboard that already knows what they care about. Honeycomb makes them ask a question. Observe makes them ask a question and it’d really prefer they shaped a Dataset first.

The prebuilt Apps blunt this — drop in the Kubernetes App and you get Datasets and dashboards without modeling them yourself, which is the most Datadog-like thing Observe does and the smartest thing it does for this persona. So the cold start is better than the worldview would suggest. But the moment the ops person needs something the App didn’t ship, they’re in the modeling layer, and that’s a steeper wall than Honeycomb’s query builder for someone who just wanted the four golden signals on a screen.

I genuinely don’t know how the day-two experience holds up here, and I’m not going to fake it. The worry I’d want answered before betting on it: does the worksheet-per-investigation pattern stay tidy, or does it sprawl into a thousand half-finished explorations nobody can find six months later? The first week of any tool flatters it. Over time I expect to hear from people migrating off Observe, unless it’s the perfect product. If I never do, I’ll update this section in 2028.

There’s also a latency question that matters more for this persona than any other. Observe’s interactive query target is in the low single-digit seconds; Honeycomb’s column store is built to come back sub-second on raw events. For an analyst that’s noise. For someone clicking through a live incident, the difference between “instant” and “five seconds” is the difference between staying in flow and losing your train of thought.

So how does a developer approach Observe?

There is no query language in Honeycomb and we get people asking for it every few months. These developers who long for query languages are probably going to have a good time in Observe. They already think in pipelines, and they take to OPAL — if your daily reality is transforms and joins, a pipe-based query language reads like home.

The developer who just wants to look at a trace has a steeper ramp. Observe has a real Trace Explorer now — waterfall view, flame chart, and an aggregate mode that lets you summarize across many traces to spot a regression by service version or pod, which is a useful thing to have. But traces in Observe are Intervals — a Dataset type sitting on a general-purpose data platform. In Honeycomb the developer’s instinct is “click into my service, click a trace, see what the hell happened,” and the tool is built around that reflex. In Observe the trace is one powerful view among many, reached through the data model rather than the next layer in an onion you’re already peeling.

Where Honeycomb still has the edge for this person is the un-modeled question. The developer hunting one weird trace at 4pm on a Thursday doesn’t have a Dataset for the thing that just broke, because nobody knew it was going to break. Honeycomb’s bet pays off precisely here: send wide events with whatever attributes you thought to add, and ask the question that didn’t exist yesterday, right now, without shaping anything first. That’s the moment Observe’s modeling layer is most likely to be in the way — during an active incident, there’s no time to do data engineering.

To be fair to where the moat is draining: Observe’s AI assist writing OPAL, and the broader industry push to make “ask an unanticipated question” something an agent does for you, both chip at this advantage. The gap that’s wide today — un-modeled exploration being awkward in a model-first tool — narrows every time the machine gets better at writing the model for you on the fly.

So how does the mysterious third group approach Observe?

This is where Observe might genuinely be strongest, and it’s also the group I’m farthest from.

My catch-all third group — engineering leads, data people, anyone who didn’t write the code and doesn’t get paged by it — includes a kind of person Observe seems built for: the analyst who wants to shape data and ask new questions of it. The Snowflake-style mental model is familiar to data people in a way Honeycomb’s “wide events” framing isn’t. Datasets, schemas, lineage, a graph of related things — that’s the vocabulary they already speak. Where Honeycomb asks a data person to unlearn the warehouse and think in raw events, Observe hands them a warehouse-shaped tool and says “go.” For that persona, I think Observe is a more natural fit than Honeycomb.

The Resource model is the thing I keep coming back to for this group. Tying telemetry to a model of your actual architecture (this pod belongs to this service, which belongs to this team) is exactly what an engineering lead trying to understand “how does it all fit together” wants. Honeycomb requires OpenTelemetry resource attribute wrangling or pipeline transforms to really build out this taxonomy. Observe makes it a first-class object, which may be fragile in some environments.

Who are you actually betting on now?

As of February 2026, Observe is a Snowflake company. Snowflake bought it for a reported billion dollars, the largest acquisition in their history. Observe was built on Snowflake’s database from the beginning, so in one sense the database vendor just agreed with the data-lake thesis hard enough to buy the company that proved it — that’s a real endorsement of the worldview. What it means for you as a buyer is more double-edged. The independent vendor is gone, and the roadmap now answers to Snowflake’s priorities, which may or may not be your priorities. If your read on Observe was “scrappy data-lake startup picking off Datadog accounts,” update it: you’d be buying into a Snowflake product strategy now, with whatever that implies for pricing, integration, and which features get attention.

The AI angle is where they’re spending their voice lately: an “AI SRE” agent for incident investigation and an “o11y.ai” assistant for instrumentation and querying, the latter shipping an MCP server that plugs into Claude Code and friends. Honeycomb has taken a similar tack over the past six months, building out AI capabilities of its own. These are still emerging, so I’ll have to compare them in the future.

Who am I to tell you what to do?! Oh yeah. I’m Mike.

For exploration of questions you didn’t see coming — the developer chasing a trace nobody modeled, the un-anticipated business question during an incident — Honeycomb’s “don’t make them pre-shape” bet still wins, and the modeling step is the reason. For a shop whose investigations genuinely repeat, run by people who think in datasets and are happy to shape their world once and reuse it, Observe handles it.

The deciding factor isn’t a feature on either side. It’s how shape-able your org honestly believes its own workflows are — and most orgs contain both kinds of work, in the same building, on the same data.

The first step, if you want one

Since the capabilities mostly line up, I think the two big questions to ask are:

  1. Is the data volume high enough that sampling is damaging it?
  2. Do users need a lot of help to get oriented and query the data?

For the first question, Honeycomb’s tail-based sampler, Refinery, is really good. I spend a lot of time introducing people to how it works and helping them get it optimized.

For the second question, MCPs and LLMs may save the day. Either way, if you are using OpenTelemetry, point it at every vendor once a year to see how everyone’s doing.

How many other observability tools you got?

Also, I hate to even mention it, but there are more contestants:

  • Datadog — frictionless agents and a bill split across six budgets nobody owns.

And if you wandered in mid-series, the baseline that started it is where the three personas come from.