Startup Resources: Monitoring Tools

Learn how to keep track of all your server metrics, business metrics, and alerts. See the "Software Delivery" chapter in Part II, Technologies, for more info.

  1. Logging Tools
  2. Availability Metrics
  3. Business Metrics
  4. Application Metrics
  5. Process Monitoring
  6. Code Metrics
  7. Server Metrics
  8. Alerting
  9. Further Reading

These startup resources are based on the book Hello, Startup: A Programmer's Guide to Building Products, Technologies, and Teams by Yevgeniy Brikman. These resources are a work in progress. They are also open source, so you can add your contributions by submitting a pull request to the Hello, Startup GitHub Repository. To see how these resources fit into the bigger picture, check out The Startup Checklist, which is a comprehensive collection of everything you need to do to launch a startup.

Logging Tools

Logging is your first layer of monitoring. Make sure you understand log levels, log formats, and log aggregation.
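
As a rough illustration of log levels and log formats, here is a minimal sketch using Python's standard logging module. The JsonFormatter class, the build_logger helper, and the "checkout" logger name are hypothetical names chosen for this example; one JSON object per line is just one common format that aggregation tools can parse, not a requirement of any tool listed here.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def build_logger(name: str, stream, level: int = logging.INFO) -> logging.Logger:
    """Hypothetical helper: a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(level)        # records below this level are dropped
    logger.propagate = False      # keep output out of the root logger
    logger.handlers.clear()       # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


# An in-memory buffer stands in for a log file or a shipper's input.
buffer = io.StringIO()
log = build_logger("checkout", buffer)
log.debug("cart contents loaded")  # below INFO, so it is not written
log.info("order placed")
log.error("payment failed")
lines = buffer.getvalue().splitlines()
```

In a real deployment the stream would typically be a file or stdout that a log aggregation service collects; the in-memory buffer is only there to keep the sketch self-contained.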

Apache Logging Services

http://logging.apache.org/

log4j, log4php, log4net, log4cxx, etc.


logstash

http://logstash.net/

logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use (for searching, for example). Speaking of searching, logstash comes with a web interface for searching and drilling into all of your logs. It is fully free and fully open source.


Loggly

https://www.loggly.com/

Cloud Log Management Service




Sumo Logic

http://www.sumologic.com/

Next Generation Log Management & Analytics



Application Metrics

Tools to monitor what your application code is doing, both on the server side (QPS, latency, throughput, error counts) and on the client side (load time, payload size, crashes).
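
To make the server-side metrics concrete, here is a minimal sketch of how QPS, error counts, and a latency percentile could be computed from recorded requests. The RequestMetrics class and its method names are hypothetical and not taken from any tool listed here; real monitoring tools collect and aggregate these numbers for you.

```python
from dataclasses import dataclass, field


@dataclass
class RequestMetrics:
    """Hypothetical in-memory collector for one reporting window."""

    latencies_ms: list = field(default_factory=list)
    errors: int = 0

    def record(self, latency_ms: float, ok: bool = True) -> None:
        """Record one finished request and whether it succeeded."""
        self.latencies_ms.append(latency_ms)
        if not ok:
            self.errors += 1

    def summary(self, window_seconds: float) -> dict:
        """Summarize the window: queries per second, errors, p95 latency."""
        count = len(self.latencies_ms)
        p95 = None
        if count:
            ordered = sorted(self.latencies_ms)
            # Nearest-rank percentile: rank = ceil(0.95 * count), in integer math.
            rank = (95 * count + 99) // 100
            p95 = ordered[rank - 1]
        return {
            "qps": count / window_seconds,
            "error_count": self.errors,
            "p95_latency_ms": p95,
        }


# Simulate 100 requests over a 10-second window; every 20th request fails.
metrics = RequestMetrics()
for i in range(1, 101):
    metrics.record(latency_ms=float(i), ok=(i % 20 != 0))
report = metrics.summary(window_seconds=10.0)
```

A percentile such as p95 is often more informative than an average, because a few very slow requests can hide behind a healthy-looking mean.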








boomerang

http://www.lognormal.com/boomerang/doc/

boomerang is an open source piece of JavaScript that you add to your web pages, where it measures the performance of your website from your end user's point of view. It can send this data back to your server for further analysis. With boomerang, you find out exactly how fast your users think your site is.


CoScale

http://www.coscale.com/

CoScale provides full stack web performance monitoring, combining server and application metrics, page load times, and custom metrics and events. CoScale simplifies monitoring and troubleshooting with automated anomaly detection and contextual insights, so you can act proactively on performance changes that impact your business.


Further Reading

More reading on monitoring and metrics.

Agility Requires Safety

https://www.ybrikman.com/writing/2016/02/14/agility-requires-safety/

To go faster in a car, you need not only a powerful engine, but also safety mechanisms like brakes, air bags, and seat belts. This is a talk that discusses the safety mechanisms that allow you to build software faster.