Observability versus Monitoring: Which is Better for DevOps?

Kiran Kamity


Passionate serial Silicon Valley entrepreneur. Head of product at Cisco Cloud BU. Founder/CEO at ContainerX (acquired by Cisco). Founder/VP at RingCube (acquired by Citrix). TEDx speaker. Loves nature, travel and food.
Observability vs Monitoring

The more customer meetings we have to discuss our pre-production observability tool, the more we see confusion between the words ‘observability’ and ‘monitoring’. Apparently, many engineering leaders and developers use these words interchangeably. Is there a difference between them? If yes, which one is right for my project? In this blog, we geek out discussing observability versus monitoring with some non-geeky examples. And don’t forget, you can see the difference immediately for yourself by using DeepFactor for free. Register now and read on. 

 

Let’s dive in. Imagine you have two kids – Alice and Bob. 

 

Alice is the dream child everyone wants to have. She is highly responsible. She keeps you updated constantly on what she is doing and where she is at, so you will never have to worry as a parent. “Dad, I just finished dinner.”; “Dad, I am doing my homework now.”; “Dad, I am playing in the front yard with my friend, Devon.” etc… And if you need more detail, you can just ask her, and she will immediately tell you. “Alice, are you doing your math homework now?” Alice instantly replies, “Dad, yes, I just finished math homework and am now doing English homework.” By sending you an ‘event stream’ of her ‘current state’ and responding to your ‘queries regarding her state’, Alice is making herself ‘observable’. Observability, therefore, is her innate quality. 

 

Bob, on the other hand, does not tell you what he’s up to. He may not be misbehaving, but there is no way of knowing unless you constantly check in with him. Also, he will not respond to you even if you specifically ask him. You, as a parent, will have to go watch for yourself – you must stop what you’re doing and walk to his room to check what homework he is working on, then later stop what you’re doing again and walk to the front yard to see who he is playing with, and so on. Hence, it’s the parent’s job here to ‘monitor’ where Bob is and what he is up to, every few minutes. Since Bob is not ‘observable’, you will have to ‘monitor’ him.  

 

Now let’s bring this concept to your cloud applications. Observability is the property of the application. If you’ve made your app ‘tell you’ what it is doing by sending you a stream of ‘events’, then you’ve just made your app observable. And that’s exactly what DeepFactor enables. If, on the other hand, you have installed various tools/sidecars/kernel modules, etc. to watch what your app is doing from the outside, you are likely monitoring your app.
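For the geeks, here is a minimal Python sketch of the Alice-and-Bob difference. Everything in it (the `Event` type, the `ObservableApp` and `OpaqueApp` classes, the `poll` helper) is a made-up illustration, not DeepFactor's API or any real tool's. It only shows that an app which pushes its own events gives you the full history, while an outside check-in only sees whatever is true at that moment.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Event:
    source: str
    kind: str
    detail: str


class ObservableApp:
    """Alice: reports every state change and answers queries (hypothetical)."""

    def __init__(self, name: str, sink: Callable[[Event], None]) -> None:
        self.name = name
        self._sink = sink
        self._state = "idle"

    def do(self, activity: str) -> None:
        self._state = activity
        # The app itself reports what it is doing; nobody has to ask.
        self._sink(Event(self.name, "state_change", activity))

    def query_state(self) -> str:
        return self._state


class OpaqueApp:
    """Bob: does the same work but reports nothing (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._state = "idle"

    def do(self, activity: str) -> None:
        self._state = activity


def poll(app: OpaqueApp) -> Event:
    """An outside monitor: it has to go and look, and sees one moment only."""
    # Reading _state stands in for peeking at the app from the outside.
    return Event(app.name, "sample", app._state)


events: List[Event] = []
alice = ObservableApp("alice", events.append)
bob = OpaqueApp("bob")

for activity in ["dinner", "math homework", "english homework"]:
    alice.do(activity)
    bob.do(activity)

samples = [poll(bob)]  # a single check-in, after everything has happened

print([e.detail for e in events])   # every activity Alice reported
print([e.detail for e in samples])  # only what Bob was doing when checked
```

Polling Bob more often would narrow the gap, but the parent (or the monitoring tool) pays for every extra check-in.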

 

Both observing and monitoring an application can be used to gather security, performance, compliance or other behavioral insights. There are numerous event streams one can gather from a running application – system calls, library calls, L3 network activity, HTTP(S), third-party APIs, configuration, dependencies, cloud service usage, and many more. But how you gather the event streams, and which types of events you gather, determine how many tools you need, which use cases you can cover, and the quality of the insights you get.

 

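WHICH IS BETTER: OBSERVABILITY OR MONITORING?

For each use case below, the observability view is given first, followed by the monitoring view.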

For ease of use

Observability: You only need one, maybe two good observability tools that instrument an app while it’s running and gather all relevant event streams right from the source – the app itself. And because such a tool is easy to use, you’ll see faster adoption than if your team had to learn numerous monitoring tools.

Monitoring: You need many monitoring tools to gather comprehensive telemetry from apps – one for system calls, one for network, one for HTTP, and more.

And some event streams will likely be missing. For example, your process’s memory behavior, or calls it makes that do not translate into system calls, cannot be easily tracked with an external monitoring tool.
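To make the point about calls that never reach the kernel concrete, here is a small Python sketch. The `traced` decorator, the `CALL_LOG` list, and the `checksum` function are all hypothetical names invented for this example. It assumes the simplest possible form of in-app instrumentation, wrapping a function so each call is recorded from inside the process.

```python
import functools
from typing import Any, Callable, List, Tuple

# Hypothetical in-memory event stream for this sketch.
CALL_LOG: List[Tuple[str, tuple]] = []


def traced(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Record each call to fn from inside the process, then run fn."""

    @functools.wraps(fn)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        CALL_LOG.append((fn.__name__, args))
        return fn(*args, **kwargs)

    return wrapper


@traced
def checksum(data: bytes) -> int:
    # Pure in-memory arithmetic: this work typically issues no system call,
    # so a tool that only watches system calls has nothing to see here.
    return sum(data) % 256


result = checksum(b"abc")
print(result, CALL_LOG)
```

The call shows up in `CALL_LOG` only because the recording happens inside the app, which is the kind of event stream an outside sensor tends to miss.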

 

For Dev/QA engineers & DevOps pipelines

Observability: For this use case, the focus is primarily on detecting issues with what the app is doing rather than what the hosts are doing. As a result, monitoring feeds from tools that watch host metrics are less important.

App observability tools are better suited here, not only because they provide a comprehensive set of event streams about what is happening ‘inside’ the app, but also because of their ease of use. One simple solution. No need to install and manage multiple agents.

Plus, observability is easy to integrate into Continuous Integration (CI) pipelines.

Monitoring: For this use case, monitoring is less ideal because of the need to install and manage multiple tools and agents for each type of event stream. And procuring all these tools and putting them into the CI pipeline can be cumbersome.

Plus, what is happening on the host or network is less relevant for developers or DevOps teams than the code changes that went into the app and the issues they may cause.

The primary question is: what was checked into my GitHub repository, and what ‘improper’ behavior did that cause in my application?
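As a rough illustration of what a CI check on app behavior could look like, here is a Python sketch of a pipeline gate. It assumes an observability tool that writes one JSON event per line with `commit`, `kind`, and `severity` fields. That format, the severity levels, and the sample events are all invented for this example and do not describe DeepFactor's or any other tool's actual output.

```python
import json
from typing import Dict, Iterable, List

# Assumed severity scale for this sketch; real tools define their own.
SEVERITY_ORDER: Dict[str, int] = {"low": 0, "medium": 1, "high": 2}


def failing_events(lines: Iterable[str], threshold: str = "high") -> List[dict]:
    """Return the events at or above the threshold severity."""
    limit = SEVERITY_ORDER[threshold]
    failures = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the report
        event = json.loads(line)
        if SEVERITY_ORDER[event["severity"]] >= limit:
            failures.append(event)
    return failures


# Made-up sample report for a single commit.
report = [
    '{"commit": "abc123", "kind": "insecure_tls_call", "severity": "high"}',
    '{"commit": "abc123", "kind": "slow_query", "severity": "low"}',
]

blocked = failing_events(report)
exit_code = 1 if blocked else 0  # a CI step would exit with this code
for event in blocked:
    print(f"{event['commit']}: {event['kind']} ({event['severity']})")
```

A CI step that exits non-zero fails the build, so the developer who made the check-in sees the behavior it caused before the change ships.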

For Kubernetes and containerized applications

Observability: If you pick a language-specific observability tool (such as the traditional IAST tools), it may be cumbersome to go into each container and instrument it.

More modern observability tools, such as DeepFactor, enable easy insertion at the Kubernetes level so you don’t have to muck with each container/service. Check out tools like DeepFactor.io for security and compliance and Datadog.com for performance.

Monitoring: Modern monitoring tools are moving in the direction of supporting Kubernetes. But then again, you must pick many tools because each of them specializes in gathering separate event streams from your applications via sensors in the host/network.

For cost reasons

Observability: Since you need significantly fewer observability tools, you will save time and money using an observability solution versus numerous monitoring tools.

Monitoring: Using numerous monitoring tools will require more time, resources, and budget.

For Machine Learning (ML) use cases

Observability: When it comes to ML, the greater the number of data sources, the better the insight. For this use case, observability tools could be combined with monitoring tools in order to gather an even more comprehensive set of data sources and event streams that could then be fed into your ML engines.

Monitoring: When combined with your application’s observability feeds (such as DeepFactor), multiple data streams coming in from monitoring tools (Prometheus feeds coming in from your hosts, Fluentd logs, VPC Flow logs, Zipkin, or other traces) can be fed into your ML engines for more comprehensive insight for your SRE or SecOps teams.
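One simple way to picture combining feeds is to merge them into a single timeline before handing them to an ML engine. The Python sketch below does that with the standard library's `heapq.merge`. The `Record` type, the feed names, and the sample records are assumptions made up for illustration; real feeds would first need to be parsed from whatever format each tool emits.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    timestamp: float
    source: str
    message: str


def merge_feeds(*feeds: Iterable[Record]) -> Iterator[Record]:
    """Merge feeds into one timeline; each feed must already be time-ordered."""
    return heapq.merge(*feeds, key=lambda record: record.timestamp)


# Made-up sample feeds, each sorted by timestamp.
app_feed = [
    Record(1.0, "app-observability", "login handler called"),
    Record(4.0, "app-observability", "outbound HTTPS request"),
]
host_feed = [
    Record(2.0, "host-metrics", "cpu usage sample"),
    Record(5.0, "host-metrics", "memory usage sample"),
]
log_feed = [
    Record(3.0, "logs", "request completed"),
]

timeline = list(merge_feeds(app_feed, host_feed, log_feed))
for record in timeline:
    print(record.timestamp, record.source, record.message)
```

Because `heapq.merge` is lazy, the same approach works on long-running streams without loading everything into memory first, as long as each individual feed arrives in time order.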

 

To summarize, both observability and monitoring are useful techniques. Deciding between the two, or using a combination of both, depends on your use cases, resources, and budget. Our recommendations at DeepFactor.io are the following:

 

  1. For your engineering organization, developers, QA engineers, or your DevOps pipeline, when looking for visibility into your application’s security, performance, compliance or other behavior, observability solutions will give you faster deployment, better adoption due to ease of use, and likely better economics.

  2. If you have more modern/containerized apps/Kubernetes deployments, observability solutions are a better fit since many are being built for the container-first world.

  3. For ML use cases where you are better served with richer data sets, having a combination of data sources from both observability and monitoring tools feeding into your logging platforms like ELK or Splunk will give you the most comprehensive insight.

 

Experience easy-to-use pre-production observability at no cost. Register now and begin using DeepFactor immediately!