18. Logging, Monitoring and Observability

Logging, Monitoring, and Observability Overview

Introduction to Key Concepts

  • Logging, monitoring, and observability are significant topics that warrant individual discussions due to their complexity.
  • These practices exist on a spectrum; there are no strict rules governing their implementation across different organizations.
  • It's important not to feel overwhelmed by the various terms and tools associated with these practices in the industry.

Definitions of Key Terms

  • The speaker emphasizes the need for clarity on what logging, monitoring, and observability entail without delving too deeply into theory.
  • Code examples will be included to illustrate how these practices relate closely to application development.

Importance of Logging, Monitoring, and Observability

  • In modern distributed environments where applications run across multiple servers globally, tracking system performance is crucial.
  • Effective logging helps maintain oversight of events occurring within applications and infrastructure.

Understanding Logging

What is Logging?

  • Logging involves recording all significant events in an application’s lifecycle for future reference.
  • Important metadata such as user ID and request latency should accompany logged events for better context during analysis.
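
As a rough illustration of attaching metadata to a logged event, the sketch below uses Go's standard `log/slog` package. The field names `user_id` and `latency_ms`, and the helpers `logRequest` and `renderLog`, are assumptions made for this example and are not taken from the video.

```go
package main

import (
	"bytes"
	"encoding/json"
	"log/slog"
	"os"
	"time"
)

// logRequest records one event together with contextual metadata.
// The field names are illustrative assumptions.
func logRequest(logger *slog.Logger, userID string, latency time.Duration) {
	logger.Info("request completed",
		slog.String("user_id", userID),
		slog.Int64("latency_ms", latency.Milliseconds()),
	)
}

// renderLog writes the event to an in-memory buffer and decodes the
// resulting JSON line, so the attached fields can be inspected.
func renderLog(userID string, latencyMs int64) map[string]any {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logRequest(logger, userID, time.Duration(latencyMs)*time.Millisecond)

	var fields map[string]any
	_ = json.Unmarshal(buf.Bytes(), &fields)
	return fields
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	logRequest(logger, "user-42", 120*time.Millisecond)
}
```

Each log line then carries the event message plus the metadata needed to filter by user or spot slow requests later.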

Understanding Monitoring

What is Monitoring?

  • Monitoring refers to keeping track of the state of backend applications and their components (e.g., CPU usage, memory consumption).
  • It provides near real-time data about system performance but may have slight delays (10–15 seconds).
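
To make "state of the application" concrete, here is a minimal sketch that samples a few process-level numbers with Go's standard `runtime` package. The `Snapshot` type and `takeSnapshot` function are hypothetical names; a real monitoring agent would sample values like these on an interval and ship them to a backend.

```go
package main

import (
	"fmt"
	"runtime"
)

// Snapshot holds a few process-level health numbers sampled at one instant.
type Snapshot struct {
	HeapAllocBytes uint64 // bytes of allocated heap objects
	Goroutines     int    // goroutines currently alive
	NumGC          uint32 // completed garbage-collection cycles
}

// takeSnapshot reads the current values from the Go runtime.
func takeSnapshot() Snapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return Snapshot{
		HeapAllocBytes: m.HeapAlloc,
		Goroutines:     runtime.NumGoroutine(),
		NumGC:          m.NumGC,
	}
}

func main() {
	s := takeSnapshot()
	fmt.Printf("heap=%d bytes goroutines=%d gc_cycles=%d\n",
		s.HeapAllocBytes, s.Goroutines, s.NumGC)
}
```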

Understanding Observability

What is Observability?

  • Observability encompasses multiple practices essential for understanding system behavior through three main pillars: logs, metrics, and traces.
  • Logs document important events; metrics provide quantitative insights related to monitoring; traces help track requests through systems.

Understanding Observability in Modern Applications

Components of Observability

  • The key components of a backend application include the handler layer, service layer, validation layer, repository layer, and database layer. Traces help track requests through these components.
  • A trace is defined as a transaction that encompasses all involved components during the execution of a request.
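
The idea of a trace spanning every layer can be sketched with a hand-rolled recorder carried in a `context.Context`. This is only an illustration of the concept: the `Trace`, `Span`, `record`, and `handleRequest` names are hypothetical, the spans are kept as a flat list, and real tracers (OpenTelemetry, New Relic) nest spans and export them to a backend.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Span records how long one layer spent on the request.
type Span struct {
	Layer    string
	Duration time.Duration
}

// Trace collects the spans that belong to a single request.
type Trace struct {
	ID    string
	Spans []Span
}

type traceKey struct{}

// withTrace attaches a trace to the context so every layer can add to it.
func withTrace(ctx context.Context, tr *Trace) context.Context {
	return context.WithValue(ctx, traceKey{}, tr)
}

// record times fn and appends a span for the named layer.
func record(ctx context.Context, layer string, fn func()) {
	start := time.Now()
	fn()
	if tr, ok := ctx.Value(traceKey{}).(*Trace); ok {
		tr.Spans = append(tr.Spans, Span{Layer: layer, Duration: time.Since(start)})
	}
}

// handleRequest walks one request through the layers named in the text.
// The layer bodies are empty placeholders.
func handleRequest(traceID string) *Trace {
	tr := &Trace{ID: traceID}
	ctx := withTrace(context.Background(), tr)
	layers := []string{"handler", "service", "validation", "repository", "database"}
	for _, layer := range layers {
		record(ctx, layer, func() {})
	}
	return tr
}

func main() {
	tr := handleRequest("req-1")
	for _, s := range tr.Spans {
		fmt.Printf("trace=%s layer=%s took=%s\n", tr.ID, s.Layer, s.Duration)
	}
}
```

Because every span shares the same trace ID, a failure in one layer can be tied back to the request that triggered it.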

Evolution from Monitoring to Observability

  • Traditional monitoring practices primarily focused on error detection but only indicated that an issue existed without providing details on the nature of the problem.
  • The shift towards observability allows for not just identifying issues but also understanding their specifics when proper logging, metrics, and traces are implemented.

Key Practices: Logging, Monitoring, and Observability

  • Logging involves recording significant events throughout an application's lifecycle (e.g., user logins or database queries), serving as a detailed journal for debugging.
  • Monitoring entails collecting data in near real time to assess system health and performance, and tracking patterns and trends over time.
  • Observability refers to determining an application's internal state by analyzing external outputs; it relies on logs, metrics, and traces working together.

Interplay Between Logs, Metrics, and Traces

  • Logs provide insights into what happened within the application; monitoring yields metrics that reveal performance patterns; observability utilizes traces to show component interactions.
  • Traces are crucial for understanding how different components interact during transactions in production systems with established logging and monitoring practices.

Practical Application of Metrics in Error Handling

  • Alerts can be configured based on specific parameters (e.g., error rates exceeding 80%), prompting notifications via platforms like Slack when issues arise.
  • Metrics encompass various parameters such as request counts or failure rates. They can be historical or real-time data points essential for diagnosing problems effectively.
  • Configurable metrics allow teams to focus on critical data points relevant to their applications' performance and health monitoring strategies.
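
An alert rule of this kind boils down to computing a metric over a window and comparing it to a threshold. The sketch below assumes a hypothetical `Counters` type and prints a message where a real setup would notify a channel such as Slack; the 80% threshold mirrors the example in the text.

```go
package main

import "fmt"

// Counters holds request totals for one time window.
type Counters struct {
	Requests int
	Failures int
}

// ErrorRate returns the fraction of requests that failed (0 when idle).
func (c Counters) ErrorRate() float64 {
	if c.Requests == 0 {
		return 0
	}
	return float64(c.Failures) / float64(c.Requests)
}

// shouldAlert reports whether the error rate exceeds the threshold.
func shouldAlert(c Counters, threshold float64) bool {
	return c.ErrorRate() > threshold
}

func main() {
	window := Counters{Requests: 200, Failures: 170}
	if shouldAlert(window, 0.80) {
		// A real system would send a notification here instead of printing.
		fmt.Printf("ALERT: error rate %.0f%% exceeds threshold\n", window.ErrorRate()*100)
	}
}
```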

Tools for Implementing Observability

  • Various tools exist for implementing observability practices, including open-source options such as Prometheus (metrics collection) and Grafana (dashboards and visualization). New Relic serves as a comprehensive solution for those who prefer not to configure multiple open-source tools.

Logging, Monitoring, and Observability in Backend Systems

Overview of Logging Tools

  • Grafana is an open-source monitoring tool, while New Relic is proprietary software. Both can surface the logs behind a metric, such as a high error rate.
  • Users can navigate from logs to traces, allowing them to see the request flow through various functions until it fails at a specific point.

Benefits of Implementing Logging and Monitoring

  • The integration of logging, monitoring, and observability enables quick debugging by pinpointing where issues occur in backend systems.

Introduction to the Sevalla Platform

  • Sevalla is introduced as a platform-as-a-service provider that allows deployment of full-stack applications and databases, similar to Netlify or Heroku.
  • Users can deploy observability tools like Grafana or Prometheus using Docker containers connected via an internal network.

Deployment Features of Sevalla

  • Seamless integration with GitHub allows automatic deployments upon pushing changes to the main branch using a GitHub bot.
  • Supports multiple build options, including Nixpacks for over 20 languages and compatibility with Heroku buildpacks for smooth migration.

Team Collaboration Features

  • Preview deployments enable team members to test new features instantly via unique domains generated for pull requests (PR).
  • This feature enhances productivity by allowing quick feedback on changes before merging PRs.

Cost Efficiency and Infrastructure

  • Applications run on Google’s infrastructure with Cloudflare's edge network, providing cost-effective bandwidth compared to competitors like Vercel.

Understanding Logging Levels

Importance of Logging Levels

  • Logging levels categorize log events by severity, which makes them especially important in production systems.

Common Logging Levels Explained

  • Debug: Used during development for detailed troubleshooting; typically disabled in production due to verbosity.
  • Info: Logs general application operations or successful events (e.g., creating a task in a todo app).

Warning Level Usage

  • Warning: Indicates a non-critical issue where an operation did not succeed but the failure does not warrant an error status (e.g., failed user authentication).

Logging Levels and Practices in Application Development

Understanding Logging Levels

  • Warning Level: Used for non-critical issues that are not errors but should be noted, such as password-related warnings.
  • Fatal Level: Indicates a serious issue that causes the application to stop or restart, highlighting critical bugs that need immediate attention.
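
The effect of a minimum log level can be sketched with Go's standard `log/slog`, which provides debug, info, warn, and error levels (it has no built-in fatal level, so that one is left out here). The `emittedLevels` helper and the sample messages are assumptions for this example.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// emittedLevels logs one event at each level and returns the levels that
// pass the logger's minimum-level filter.
func emittedLevels(debugEnabled bool) []string {
	minLevel := slog.LevelInfo
	if debugEnabled {
		minLevel = slog.LevelDebug
	}
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, &slog.HandlerOptions{Level: minLevel}))

	logger.Debug("parsed request payload")      // verbose troubleshooting detail
	logger.Info("task created")                 // normal, successful operation
	logger.Warn("login failed: wrong password") // noteworthy, not an error
	logger.Error("database query failed")       // an operation failed

	// Read back each JSON line and collect its "level" field.
	var levels []string
	sc := bufio.NewScanner(&buf)
	for sc.Scan() {
		var rec struct {
			Level string `json:"level"`
		}
		if err := json.Unmarshal(sc.Bytes(), &rec); err == nil {
			levels = append(levels, rec.Level)
		}
	}
	return levels
}

func main() {
	fmt.Println(emittedLevels(false)) // debug events are filtered out
	fmt.Println(emittedLevels(true))  // all four events are kept
}
```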

Structured vs Unstructured Logging

  • Console Logs: During development, logs are displayed in the console for easy readability and troubleshooting. They are formatted attractively to help developers spot issues quickly.
  • Structured Logging: In production, logging is done in JSON format to facilitate parsing by log management tools. This format includes detailed parameters like error status and messages.

Importance of Log Management Tools

  • Log Parsing Efficiency: Production systems require structured logs (like JSON) for efficient parsing by tools such as ELK stack or Grafana, which extract valuable information without errors.
  • Development vs Production Needs: Unstructured logging is preferred during development for ease of use, while structured logging is essential in production environments for effective data extraction.
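
One way to get readable console logs in development and JSON logs in production is to pick the handler from the environment. The sketch below uses `log/slog` from the Go standard library; the `"production"` string, the `newLogger` and `render` helpers, and the sample fields are assumptions, and the logging library used in the video may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
)

// newLogger returns a key=value text logger for development and a JSON
// logger for production, so log tools can parse production output.
func newLogger(env string, w io.Writer) *slog.Logger {
	if env == "production" {
		return slog.New(slog.NewJSONHandler(w, nil))
	}
	return slog.New(slog.NewTextHandler(w, nil))
}

// render logs one sample error event and returns the raw log line.
func render(env string) string {
	var buf bytes.Buffer
	newLogger(env, &buf).Error("database query failed",
		slog.Int("status", 500),
		slog.String("error", "connection refused"),
	)
	return buf.String()
}

func main() {
	fmt.Print(render("development")) // human-readable key=value line
	fmt.Print(render("production"))  // one JSON object per line
}
```

The event and its fields are identical in both modes; only the encoding changes.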

Implementing Logging Practices

  • Practical Demonstration: The video will showcase a Go-based to-do application implementing best practices in logging, monitoring, and observability using New Relic.
  • Tool Overview: New Relic provides a comprehensive solution for logging and monitoring needs; it simplifies integration compared to managing multiple open-source tools like Prometheus and Grafana.

Configuration Insights

  • Configuration Complexity: Setting up logging and monitoring can be complicated; proprietary solutions like New Relic may be more suitable for teams with limited resources.
  • Log Level Functionality: A function checks whether the application runs in development or production mode to set the appropriate log level: info for production and debug for local development.
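
A function along those lines might look like the following sketch. The `levelFor` name, the `APP_ENV` variable, and the `"production"` value are hypothetical; the video's actual function and configuration keys are not shown in these notes.

```go
package main

import (
	"log/slog"
	"os"
)

// levelFor maps the runtime environment to a minimum log level:
// info in production, debug everywhere else.
func levelFor(env string) slog.Level {
	if env == "production" {
		return slog.LevelInfo
	}
	return slog.LevelDebug
}

func main() {
	env := os.Getenv("APP_ENV") // assumed variable name
	opts := &slog.HandlerOptions{Level: levelFor(env)}
	logger := slog.New(slog.NewTextHandler(os.Stdout, opts))

	logger.Debug("visible only outside production")
	logger.Info("visible in every environment", "env", env)
}
```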

Logging Strategies in Development and Production

Structured vs. Unstructured Logging

  • The default logging format is set to console for development environments, but can be changed to JSON for production.
  • JSON logs are beneficial for production due to their structured nature, though they lack readability compared to development logs.
  • Development logs provide clearer insights into application events, such as database connections and server startups.

Monitoring with Middleware

  • New Relic middleware is used to instrument requests by wrapping the entire application, enhancing observability.
  • Key concepts in observability include instrumentation (measuring function attributes) and OpenTelemetry (a standard providing tools and SDKs).
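
To show what "wrapping the application" means, here is a minimal hand-written instrumentation middleware using only `net/http`. It is not New Relic's middleware or API; the `instrument`, `RequestRecord`, and `simulate` names are hypothetical, and the wrapped handler simply answers 401 to imitate a request without a token.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// RequestRecord is what the middleware captures for each request.
type RequestRecord struct {
	Method   string
	Path     string
	Status   int
	Duration time.Duration
}

// statusRecorder remembers the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// instrument wraps next and reports one RequestRecord per request.
func instrument(next http.Handler, report func(RequestRecord)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		start := time.Now()
		next.ServeHTTP(rec, req)
		report(RequestRecord{
			Method:   req.Method,
			Path:     req.URL.Path,
			Status:   rec.status,
			Duration: time.Since(start),
		})
	})
}

// simulate sends one in-memory request through an instrumented handler.
func simulate(method, path string) RequestRecord {
	var got RequestRecord
	app := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusUnauthorized)
	})
	h := instrument(app, func(r RequestRecord) { got = r })
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(method, path, nil))
	return got
}

func main() {
	r := simulate("GET", "/todos")
	fmt.Printf("%s %s -> %d in %s\n", r.Method, r.Path, r.Status, r.Duration)
}
```

Because the wrapper sits in front of every route, handlers need no changes to be measured.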

Instrumentation Practices

  • OpenTelemetry supports various programming languages, offering APIs and SDKs for effective application instrumentation regardless of the language used.
  • Integrating an OpenTelemetry Collector allows better control over request instrumentation even when using proprietary tools like New Relic.

Workflow of Creating a To-Do Item

  • The createToDo function begins by extracting transaction data from context via enhanced tracing middleware.
  • A new transaction is created with parameters such as service name, environment type, IP address, user agent, request ID, user email, and tenant ID.

Logging Events During To-Do Creation

  • Important events are logged throughout the process; this includes creating a new to-do item along with its title and any priority passed in the payload.
  • Error handling involves logging errors at an error level while successful operations log debug information about the created to-do's ID.

Finalizing Logs for Business Events

  • A business event log captures essential metadata related to the created to-do item including its ID, title, category ID, and priority.
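
The logging steps of this workflow can be approximated as below. This is a simplified sketch rather than the video's code: the `Todo` type, the validation rule, the hard-coded ID, and the field names are assumptions, and the New Relic transaction handling is not modeled.

```go
package main

import (
	"bytes"
	"errors"
	"log/slog"
	"os"
	"strings"
)

// Todo is a simplified stand-in for the application's to-do model.
type Todo struct {
	ID         int
	Title      string
	CategoryID int
	Priority   string
}

var errEmptyTitle = errors.New("title is required")

// createTodo logs the attempt, logs failures at error level, logs the new
// ID at debug level, and finishes with a business event carrying metadata.
func createTodo(logger *slog.Logger, title, priority string, categoryID int) (Todo, error) {
	logger.Info("creating todo", "title", title, "priority", priority)

	if strings.TrimSpace(title) == "" {
		logger.Error("failed to create todo", "error", errEmptyTitle)
		return Todo{}, errEmptyTitle
	}

	// A real repository would assign the ID; it is hard-coded here.
	todo := Todo{ID: 1, Title: title, CategoryID: categoryID, Priority: priority}
	logger.Debug("todo created", "todo_id", todo.ID)

	logger.Info("business event",
		"event", "todo_created",
		"todo_id", todo.ID,
		"title", todo.Title,
		"category_id", todo.CategoryID,
		"priority", todo.Priority,
	)
	return todo, nil
}

// countLogLines runs createTodo with a debug-level JSON logger and returns
// how many log lines were written.
func countLogLines(title string) int {
	var buf bytes.Buffer
	opts := &slog.HandlerOptions{Level: slog.LevelDebug}
	_, _ = createTodo(slog.New(slog.NewJSONHandler(&buf, opts)), title, "high", 3)
	return bytes.Count(buf.Bytes(), []byte("\n"))
}

func main() {
	opts := &slog.HandlerOptions{Level: slog.LevelDebug}
	logger := slog.New(slog.NewJSONHandler(os.Stdout, opts))
	_, _ = createTodo(logger, "write docs", "high", 3)
}
```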

Logging, Monitoring, and Observability in New Relic

Overview of New Relic Dashboard

  • The speaker demonstrates how to view logs in JSON format on the New Relic dashboard, emphasizing the importance of setting the server value to production for accurate monitoring.
  • The application’s error rates and transaction times are displayed; however, no data is shown due to inactivity over the last 30 minutes. A request is triggered to test API functionality.

Error Logging and Metrics

  • An unauthorized error occurs when testing an API without a token. The speaker checks if this error is logged in the dashboard.
  • Upon refreshing the errors section, unauthorized errors are visible along with related logs, highlighting how logging aids in debugging.

Understanding Metrics

  • Key metrics such as average transaction time and throughput are discussed. These metrics provide quantifiable insights into system performance.
  • Detailed information about unauthorized errors is available through logs, including application name, environment details, error code, method used (GET), and API route.

Tracing Transactions

  • Logs are connected to traces that provide deeper insights into transactions. This connection helps track performance issues effectively.
  • The dashboard allows users to analyze specific transactions by clicking on them for detailed data regarding error rates and response times.

System Performance Insights

  • Information about backend application performance includes garbage collection time and memory usage statistics (3 MB), which contribute to understanding overall system health.

Implementing Observability Practices

  • Observability practices should be implemented on a spectrum; complete observability is challenging but achievable with various tools.
  • Open-source tools like Grafana and Prometheus can be utilized alongside proprietary software like New Relic or Datadog for comprehensive monitoring solutions.

Conclusion on Workflow Integration

  • Effective logging, monitoring, and observability require collaboration between developers and infrastructure teams (DevOps). Proper implementation leads to a holistic view of service states across applications.

Video description

Logging, monitoring, and observability are crucial for understanding the behavior of software systems. They provide insights into system performance, identify potential issues, and aid in debugging and optimization. And we will discuss all that in this video.