Serverless Blog
operations & observability

Best tools for serverless observability

written by Andrea Passwater

We admit it. In the serverless realm, getting the observability you need can be really frustrating.

In his series on serverless observability, Yan Cui has stated the challenges, and the reasons behind them, incredibly well.

But there is hope.

There is a constant onslaught of new tools, new features, and loud voices demanding change. At this point, we’re truly on the cusp of serverless observability being not just passable, but great.

In this post, we are compiling resources that you can use to get top-notch insight into your functions. We will update this as new information becomes available, so it can serve as an observability tools guide for you, the intrepid serverless developer.

Read on for the best tools and best practices.

The tools

AWS CloudWatch

CloudWatch is the native AWS logging tool. It’s primarily for logging, monitoring, and alerts.

Benefits:

  • Tracing & profiling to investigate performance and cold starts
  • Monitoring and error logs
  • Customizable alerts
  • For Lambda users, works out of the box
  • A lot of people use it, which means there are a lot of plugins and other resources widely available

Drawbacks:

  • Metrics have up to one minute delay (not real-time)
  • No customizable events
  • Will probably need to use a separate log aggregator for centralized logging

Metrics: CloudWatch comes with easy Lambda metrics; no setup.

Logs: Logs from your Lambda function, plus general status logs, are sent directly to CloudWatch Logs.
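Because whatever a Lambda function writes to stdout ends up in CloudWatch Logs, one common habit is to log one JSON object per line so the entries are easier to filter later. Here is a minimal sketch of that idea. The `log_event` helper, the handler, and the field names are all hypothetical, not part of any AWS API.

```python
import json
import time


def log_event(level, message, **fields):
    """Build one JSON log line and print it (Lambda forwards stdout to CloudWatch Logs)."""
    record = {"level": level, "message": message, "timestamp": time.time()}
    record.update(fields)
    line = json.dumps(record, sort_keys=True)
    print(line)
    return line


def handler(event, context):
    """Hypothetical Lambda handler that logs its start and finish as structured JSON."""
    started = time.time()
    user_id = event.get("userId")
    log_event("INFO", "invocation started", userId=user_id)
    result = {"statusCode": 200, "body": "ok"}
    duration_ms = (time.time() - started) * 1000
    log_event("INFO", "invocation finished", userId=user_id, durationMs=round(duration_ms, 2))
    return result
```

Keeping every entry in the same shape makes it less painful to hand the logs to a separate aggregator later on.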


AWS X-Ray

X-Ray is a distributed tracing system you can use for debugging across various AWS systems. Its usage is not mutually exclusive with another tool, like IOpipe or CloudWatch, and most people use X-Ray in conjunction with another monitoring tool.
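To give a feel for how a trace gets correlated across systems: X-Ray passes a trace header between services, made up of `key=value` pairs separated by semicolons. The sketch below only parses a header of that shape with the standard library. It does not use the X-Ray SDK, and the `parse_trace_header` helper is a hypothetical name for illustration.

```python
def parse_trace_header(header):
    """Split a trace header of the form 'Root=...;Parent=...;Sampled=...' into a dict."""
    parts = {}
    for item in header.split(";"):
        item = item.strip()
        if not item or "=" not in item:
            continue  # skip empty or malformed segments
        key, value = item.split("=", 1)
        parts[key] = value
    return parts


# Sample value in the documented header format.
example = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1"
trace = parse_trace_header(example)
is_sampled = trace.get("Sampled") == "1"
```

Logging the `Root` value next to your own log lines is one way to line up traces with the output of whichever monitoring tool you pair X-Ray with.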


Dashbird

Ever used the native CloudWatch interface? Not always touted as the most user-friendly UI. Dashbird sits on top of CloudWatch and provides a more navigable user experience, plus a few additional features.

Benefits:

  • Tracing & profiling to investigate performance and cold starts
  • Monitoring and error logs for debugging your serverless functions
  • Doesn’t require additional code to implement
  • Customizable alerts
  • Lambda cost-analysis (per-function basis)

Drawbacks:

  • Metrics have up to one minute delay (not real-time)

Performance metrics: includes extras like Lambda cost analysis.
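If you want a rough idea of what a per-function cost analysis involves, the arithmetic can be sketched as below. This is not Dashbird's method, just a back-of-the-envelope estimate. The two price constants and the 100ms billing increment are assumptions, so check the current AWS pricing page before relying on them.

```python
import math

# Assumed prices in USD; verify against the current AWS Lambda pricing page.
PRICE_PER_GB_SECOND = 0.0000166667
PRICE_PER_MILLION_REQUESTS = 0.20


def estimate_cost(invocations, avg_duration_ms, memory_mb, billing_increment_ms=100):
    """Estimate the cost of one function, ignoring any free tier."""
    # Assumption: each invocation is billed rounded up to the next increment.
    billed_ms = math.ceil(avg_duration_ms / billing_increment_ms) * billing_increment_ms
    gb_seconds = invocations * (billed_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return compute_cost + request_cost


# One million invocations averaging 120ms on a 512MB function.
monthly_cost = estimate_cost(1_000_000, 120, 512)
```

Working from the average duration is itself an approximation, since rounding happens per invocation, but it is usually close enough to spot which functions dominate the bill.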

Architecture metrics: track account-level stats across your entire architecture (individual microservice views also available).


IOpipe

IOpipe works with AWS Lambda functions written in Node.js, Python, and Java. It provides tracing, profiling, monitoring, alerts, and real-time metrics.

Benefits:

  • Tracing & profiling to investigate performance and cold starts
  • Monitoring & customizable events for granular error logs and debugging your serverless functions
  • Real-time metrics
  • Customizable alerts
  • Really easy to install and get running

Drawbacks:

  • You have to use a wrapper for each function, which can result in performance delays (about 20ms)
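To make the wrapper trade-off concrete, here is a generic sketch of the pattern. It is not the IOpipe API. The `monitored` decorator and the `report` callback are hypothetical names, and a real agent would send the data over the network rather than append it to a list.

```python
import functools
import time


def monitored(report):
    """Decorator factory: wrap a Lambda handler and pass timing data to `report`."""
    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event, context):
            started = time.perf_counter()
            error = None
            try:
                return handler(event, context)
            except Exception as exc:
                error = repr(exc)
                raise
            finally:
                duration_ms = (time.perf_counter() - started) * 1000
                # Reporting runs inside the invocation, which is where the extra latency comes from.
                report({"function": handler.__name__, "durationMs": duration_ms, "error": error})
        return wrapper
    return decorator


collected = []  # stand-in for a monitoring backend


@monitored(collected.append)
def handler(event, context):
    return {"statusCode": 200}
```

Every wrapped function pays for whatever `report` does before the invocation can finish, so the cost of the wrapper depends mostly on how the data is shipped.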

Real-time metrics: Monitor invocations, duration, memory usage, and errors in one place.

Search functionality: You can add multiple “rules” to find invocations that match. For example, a rule can look for long-running invocations over 100ms, and you can also search for errors, cold starts, or even custom metric values (e.g., “userId” = 1234).
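Rule-based search of this kind boils down to filtering invocation records against a list of conditions. The sketch below shows the idea on made-up data. The record fields and the `(field, operator, value)` rule format are assumptions for illustration, not IOpipe's actual query syntax.

```python
import operator

OPERATORS = {">": operator.gt, "<": operator.lt, "==": operator.eq}


def matches(invocation, rules):
    """Return True when the invocation satisfies every (field, op, value) rule."""
    for field, op, value in rules:
        if field not in invocation:
            return False
        if not OPERATORS[op](invocation[field], value):
            return False
    return True


def search(invocations, rules):
    """Keep only the invocations that match all rules."""
    return [inv for inv in invocations if matches(inv, rules)]


# Hypothetical invocation records.
invocations = [
    {"id": "a", "durationMs": 42, "coldStart": False, "userId": 1234},
    {"id": "b", "durationMs": 180, "coldStart": True, "userId": 1234},
    {"id": "c", "durationMs": 250, "coldStart": False, "userId": 99},
]

slow = search(invocations, [("durationMs", ">", 100)])
slow_for_user = search(invocations, [("durationMs", ">", 100), ("userId", "==", 1234)])
```

Stacking rules narrows the result set, which is what makes custom metric values like a user ID useful for debugging a single customer's slow requests.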


Thundra

Thundra has not yet hit general availability, but you can sign up for beta access here.

Much like IOpipe, it promises to provide tracing, profiling, monitoring, alerts, and metrics.

Thundra will differ from IOpipe in a couple of ways. They plan to focus on Java rather than Node.js or Python. They are also attempting to avoid latency by keeping data-sending separate from the Lambda function itself. Instead, they’ll first write their metrics to logs, and an out-of-band log processor will send those metrics to the Thundra backend.
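The log-then-process approach can be sketched in a few lines. This is only an illustration of the general pattern, not Thundra's implementation. The `METRIC ` prefix, the function names, and the in-memory "backend" are all hypothetical.

```python
import json

METRIC_PREFIX = "METRIC "  # hypothetical marker that identifies metric lines in the logs


def emit_metric(name, value, write=print):
    """Inside the function: write the metric to the log instead of making a network call."""
    write(METRIC_PREFIX + json.dumps({"name": name, "value": value}))


def process_log_lines(lines, send):
    """Out of band: scan log lines, forward the metrics, and return how many were sent."""
    sent = 0
    for line in lines:
        if not line.startswith(METRIC_PREFIX):
            continue  # ordinary application log line
        send(json.loads(line[len(METRIC_PREFIX):]))
        sent += 1
    return sent


# Simulate one invocation's log output and a separate processing step.
log_lines = []
emit_metric("durationMs", 87, write=log_lines.append)
log_lines.append("plain application log line")

backend = []  # stand-in for the monitoring backend
count = process_log_lines(log_lines, backend.append)
```

The function itself only pays for a log write, while the slower step of shipping data happens somewhere that does not hold up the invocation.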


OpenTracing

OpenTracing is a vendor-neutral open standard for distributed tracing that is supported by the CNCF. Libraries are available in 9 languages: Go, JavaScript, Java, Python, Ruby, PHP, Objective-C, C++, and C#.

Note that this is a standard, and not a tool. You’ll have to set up your own collector and interface, or you can use a paid tool such as LightStep.

Benefits:

  • You can use it with any cloud provider, not just AWS

Drawbacks:

  • Takes some set-up


Did we miss anything?

Feel free to leave comments, and/or submit a PR against this post to leave us suggestions.

About Andrea Passwater

Andrea leads growth marketing at Serverless.


Serverless, Inc. © 2018