18. Logging, Monitoring and Observability

Name: 18. Logging, Monitoring and Observability
Uploaded: 2025-07-26T21:55:09.000Z
Duration: 1 h 18 min 51 s

Logging, Monitoring, and Observability Overview

Introduction to Key Concepts

Logging, monitoring, and observability are significant topics that warrant individual discussions due to their complexity.

These practices exist on a spectrum; there are no strict rules governing their implementation across different organizations.

It's important not to feel overwhelmed by the various terms and tools associated with these practices in the industry.

Definitions of Key Terms

The speaker emphasizes the need for clarity on what logging, monitoring, and observability entail without delving too deeply into theory.

Code examples will be included to illustrate how these practices relate closely to application development.

Importance of Logging, Monitoring, and Observability

In modern distributed environments where applications run across multiple servers globally, tracking system performance is crucial.

Effective logging helps maintain oversight of events occurring within applications and infrastructure.

Understanding Logging

What is Logging?

Logging involves recording all significant events in an application’s lifecycle for future reference.

Important metadata such as user ID and request latency should accompany logged events for better context during analysis.
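
As a minimal sketch of attaching metadata to a logged event, the following uses Go's standard log/slog package. The function name loginEvent and the field names user_id and latency_ms are assumptions chosen for illustration, not names taken from the video.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// loginEvent logs one "user logged in" event as JSON and returns the
// decoded record so the attached metadata can be inspected.
// The field names user_id and latency_ms are illustrative.
func loginEvent(userID string, latencyMS int64) map[string]any {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))

	logger.Info("user logged in",
		slog.String("user_id", userID),
		slog.Int64("latency_ms", latencyMS),
	)

	record := map[string]any{}
	if err := json.Unmarshal(buf.Bytes(), &record); err != nil {
		panic(err) // not expected: the handler writes one JSON object per record
	}
	return record
}

func main() {
	record := loginEvent("user-42", 120)
	fmt.Println(record["msg"], record["user_id"], record["latency_ms"])
}
```

In a real service the logger would write to standard output or a log shipper rather than an in-memory buffer; the buffer is used here only so the record can be read back.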

Understanding Monitoring

What is Monitoring?

Monitoring refers to keeping track of the state of backend applications and their components (e.g., CPU usage, memory consumption).

It provides near real-time data about system performance but may have slight delays (10–15 seconds).

Understanding Observability

What is Observability?

Observability encompasses multiple practices essential for understanding system behavior through three main pillars: logs, metrics, and traces.

Logs document important events; metrics provide quantitative insights related to monitoring; traces help track requests through systems.

Understanding Observability in Modern Applications

Components of Observability

The key components of a backend application include the handler layer, service layer, validation layer, repository layer, and database layer. Traces help track requests through these components.

A trace is defined as a transaction that encompasses all involved components during the execution of a request.
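
To make the idea of a trace concrete, here is a hand-rolled sketch that records one span per layer for a single request. It is not how New Relic or OpenTelemetry implement tracing; the types span and trace and the helpers withTrace, record, and traceRequest are hypothetical, and the recorder is not safe for concurrent use.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// span records how long one component spent on a request.
type span struct {
	Name     string
	Duration time.Duration
}

// trace collects the spans that belong to a single request.
type trace struct {
	ID    string
	Spans []span
}

type traceKey struct{}

// withTrace attaches a new, empty trace to the context.
func withTrace(ctx context.Context, id string) (context.Context, *trace) {
	t := &trace{ID: id}
	return context.WithValue(ctx, traceKey{}, t), t
}

// record times fn and appends a span to the trace carried by ctx, if any.
// Spans are appended when fn returns, so inner layers appear first.
func record(ctx context.Context, name string, fn func()) {
	start := time.Now()
	fn()
	if t, ok := ctx.Value(traceKey{}).(*trace); ok {
		t.Spans = append(t.Spans, span{Name: name, Duration: time.Since(start)})
	}
}

// handleRequest walks a request through the layers named in the text.
func handleRequest(ctx context.Context) {
	record(ctx, "handler", func() {
		record(ctx, "validation", func() {})
		record(ctx, "service", func() {
			record(ctx, "repository", func() {
				record(ctx, "database", func() {})
			})
		})
	})
}

// traceRequest runs one request and returns the trace it produced.
func traceRequest(id string) *trace {
	ctx, t := withTrace(context.Background(), id)
	handleRequest(ctx)
	return t
}

func main() {
	t := traceRequest("req-1")
	for _, s := range t.Spans {
		fmt.Printf("trace=%s span=%s took=%s\n", t.ID, s.Name, s.Duration)
	}
}
```

Every span shares the same trace ID, which is what lets a tool reassemble the full path of one request across components.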

Evolution from Monitoring to Observability

Traditional monitoring practices primarily focused on error detection but only indicated that an issue existed without providing details on the nature of the problem.

The shift towards observability allows for not just identifying issues but also understanding their specifics when proper logging, metrics, and traces are implemented.

Key Practices: Logging, Monitoring, and Observability

Logging involves recording significant events throughout an application's lifecycle (e.g., user logins or database queries), serving as a detailed journal for debugging.

Monitoring entails near real-time data collection to assess system health and performance over time while tracking patterns and trends.

Observability refers to determining an application's internal state by analyzing external outputs; it relies on logs, metrics, and traces working together.

Interplay Between Logs, Metrics, and Traces

Logs provide insights into what happened within the application; monitoring yields metrics that reveal performance patterns; observability utilizes traces to show component interactions.

Traces show how components interact during a single transaction; they are most useful in production systems that already have logging and monitoring in place.

Practical Application of Metrics in Error Handling

Alerts can be configured based on specific parameters (e.g., error rates exceeding 80%), prompting notifications via platforms like Slack when issues arise.

Metrics encompass various parameters such as request counts or failure rates. They can be historical or real-time data points essential for diagnosing problems effectively.

Configurable metrics allow teams to focus on critical data points relevant to their applications' performance and health monitoring strategies.
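
The alerting rule described above can be sketched as a simple threshold check on a failure-rate metric. The window type and the shouldAlert function are hypothetical names; a real setup would define the rule in the monitoring tool and send the notification (for example to Slack) from there.

```go
package main

import "fmt"

// window holds request counters for one evaluation interval.
type window struct {
	Requests int
	Failures int
}

// errorRate returns the fraction of failed requests, or 0 when idle.
func (w window) errorRate() float64 {
	if w.Requests == 0 {
		return 0
	}
	return float64(w.Failures) / float64(w.Requests)
}

// shouldAlert reports whether the error rate exceeds threshold,
// where threshold is a fraction between 0 and 1.
func shouldAlert(w window, threshold float64) bool {
	return w.errorRate() > threshold
}

func main() {
	w := window{Requests: 200, Failures: 170}
	if shouldAlert(w, 0.80) {
		// A real system would notify a channel such as Slack here.
		fmt.Printf("ALERT: error rate %.0f%% exceeds 80%%\n", w.errorRate()*100)
	}
}
```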

Tools for Implementing Observability

Various tools exist for implementing observability practices, including open-source options like Prometheus (metrics collection) and Grafana (dashboards and visualization). New Relic serves as a comprehensive solution for those preferring not to configure multiple open-source tools.

Logging, Monitoring, and Observability in Backend Systems

Overview of Logging Tools

Grafana is an open-source monitoring tool, while New Relic is proprietary software. Both can surface the logs related to a metric, such as a high error rate.

Users can navigate from logs to traces, allowing them to see the request flow through various functions until it fails at a specific point.

Benefits of Implementing Logging and Monitoring

The integration of logging, monitoring, and observability enables quick debugging by pinpointing where issues occur in backend systems.

Introduction to Sevalla Platform

Sevalla is introduced as a platform-as-a-service provider that allows deployment of full-stack applications and databases, similar to Netlify or Heroku.

Users can deploy observability tools like Grafana or Prometheus using Docker containers connected via an internal network.

Deployment Features of Sevalla

Seamless integration with GitHub allows automatic deployments upon pushing changes to the main branch using a GitHub bot.

Supports multiple build options, including Nixpacks for over 20 languages and compatibility with Heroku buildpacks for smooth migration.

Team Collaboration Features

Preview deployments enable team members to test new features instantly via unique domains generated for pull requests (PRs).

This feature enhances productivity by allowing quick feedback on changes before merging PRs.

Cost Efficiency and Infrastructure

Applications run on Google’s infrastructure with Cloudflare's edge network, providing cost-effective bandwidth compared to competitors like Vercel.

Understanding Logging Levels

Importance of Logging Levels

Logging levels categorize log events by severity, which is especially important in production systems.

Common Logging Levels Explained

Debug: Used during development for detailed troubleshooting; typically disabled in production due to verbosity.

Info: Logs general application operations or successful events (e.g., creating a task in a todo app).

Warning Level Usage

Warning: Indicates non-critical issues that are not successful but do not warrant an error status (e.g., failed user authentication).

Logging Levels and Practices in Application Development

Understanding Logging Levels

Warning Level: Used for non-critical issues that are not errors but should be noted, for example a login attempt that fails because of a wrong password.

Fatal Level: Indicates a serious issue that causes the application to stop or restart, highlighting critical bugs that need immediate attention.
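
The effect of a minimum log level can be sketched with log/slog, which defines Debug, Info, Warn, and Error. It has no built-in fatal level, so applications that want one typically log at error level and then exit. The function emitAll and the sample messages are assumptions for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// emitAll logs one message at each level through a logger whose minimum
// level is minLevel ("debug", "info", "warn" or "error") and returns the
// lines that were actually written.
func emitAll(minLevel string) []string {
	var lvl slog.Level
	if err := lvl.UnmarshalText([]byte(minLevel)); err != nil {
		panic(err) // unknown level name
	}

	var buf bytes.Buffer
	handler := slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: lvl})
	logger := slog.New(handler)

	logger.Debug("loaded 12 rows from cache")
	logger.Info("task created")
	logger.Warn("login failed: wrong password")
	logger.Error("database query failed")

	out := strings.TrimSpace(buf.String())
	if out == "" {
		return nil
	}
	return strings.Split(out, "\n")
}

func main() {
	for _, level := range []string{"debug", "info", "error"} {
		fmt.Printf("min level %s -> %d lines written\n", level, len(emitAll(level)))
	}
}
```

Raising the minimum level is how debug output is silenced in production without removing the logging calls from the code.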

Structured vs Unstructured Logging

Console Logs: During development, logs are displayed in the console for easy readability and troubleshooting. They are pretty-printed for humans to help developers spot issues quickly.

Structured Logging: In production, logging is done in JSON format to facilitate parsing by log management tools. This format includes detailed parameters like error status and messages.

Importance of Log Management Tools

Log Parsing Efficiency: Production systems require structured logs (like JSON) so that tools such as the ELK stack or Grafana can parse them efficiently and extract fields reliably.

Development vs Production Needs: Unstructured logging is preferred during development for ease of use, while structured logging is essential in production environments for effective data extraction.
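
The difference between the two formats can be sketched with the two handlers that ship with log/slog. The helper names newLogger and render and the format strings "console" and "json" are assumptions; the application in the video may use a different logging library.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
)

// newLogger returns a JSON logger when format is "json" and a
// human-readable key=value logger otherwise.
func newLogger(w io.Writer, format string) *slog.Logger {
	if format == "json" {
		return slog.New(slog.NewJSONHandler(w, nil))
	}
	return slog.New(slog.NewTextHandler(w, nil))
}

// render logs the same event in the given format and returns the output.
func render(format string) string {
	var buf bytes.Buffer
	newLogger(&buf, format).Info("server started", slog.Int("port", 8080))
	return buf.String()
}

func main() {
	fmt.Print(render("console")) // one key=value line, easy to scan
	fmt.Print(render("json"))    // one JSON object, easy to parse
}
```

The logging call is identical in both cases; only the handler decides how the record is serialized, which is why the format can be switched per environment.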

Implementing Logging Practices

Practical Demonstration: The video will showcase a Go-based to-do application implementing best practices in logging, monitoring, and observability using New Relic.

Tool Overview: New Relic provides a comprehensive solution for logging and monitoring needs; it simplifies integration compared to managing multiple open-source tools like Prometheus and Grafana.

Configuration Insights

Configuration Complexity: Setting up logging and monitoring can be complicated; proprietary solutions like New Relic may be more suitable for teams with limited resources.

Log Level Functionality: A function checks whether the application runs in development or production mode and sets the log level accordingly: info level for production and debug level for local development.
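
A function of this kind might look like the sketch below. The name levelFor and the environment string "production" are assumptions; the video's code may read the environment from a config struct or use a different logging library.

```go
package main

import (
	"fmt"
	"log/slog"
	"os"
)

// levelFor returns the minimum log level for an environment name:
// info in production, debug everywhere else.
func levelFor(env string) slog.Level {
	if env == "production" {
		return slog.LevelInfo
	}
	return slog.LevelDebug
}

func main() {
	for _, env := range []string{"development", "production"} {
		fmt.Printf("%s -> %s\n", env, levelFor(env))
	}

	// Build a logger for production: the debug record is dropped.
	opts := &slog.HandlerOptions{Level: levelFor("production")}
	logger := slog.New(slog.NewTextHandler(os.Stdout, opts))
	logger.Debug("hidden in production")
	logger.Info("visible in production")
}
```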

Logging Strategies in Development and Production

Structured vs. Unstructured Logging

The default logging format is set to console for development environments, but can be changed to JSON for production.

JSON logs are beneficial for production due to their structured nature, though they lack readability compared to development logs.

Development logs provide clearer insights into application events, such as database connections and server startups.

Monitoring with Middleware

New Relic middleware is used to instrument requests by wrapping the entire application, enhancing observability.
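
The wrapping pattern can be sketched with plain net/http. This is a standard-library stand-in, not the New Relic agent's API: the names requestStats, statusRecorder, instrument, and probe are hypothetical, and a vendor middleware would send the measurements to its backend instead of a callback.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// requestStats is what the middleware records for each request.
type requestStats struct {
	Method   string
	Path     string
	Status   int
	Duration time.Duration
}

// statusRecorder captures the status code written by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// instrument wraps next and passes the stats of every request to report.
func instrument(next http.Handler, report func(requestStats)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		start := time.Now()
		next.ServeHTTP(rec, r)
		report(requestStats{
			Method:   r.Method,
			Path:     r.URL.Path,
			Status:   rec.status,
			Duration: time.Since(start),
		})
	})
}

// probe sends one in-memory request through an instrumented handler that
// always answers with the given status, and returns what was recorded.
func probe(status int) requestStats {
	var got requestStats
	inner := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
	})
	h := instrument(inner, func(s requestStats) { got = s })

	req := httptest.NewRequest(http.MethodGet, "/todos", nil)
	h.ServeHTTP(httptest.NewRecorder(), req)
	return got
}

func main() {
	s := probe(http.StatusUnauthorized)
	fmt.Println(s.Method, s.Path, s.Status)
}
```

Because the middleware wraps the outermost handler, every route is measured without touching individual handlers.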

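Key concepts in observability include instrumentation (adding code that measures attributes of a function or request, such as its duration and outcome) and OpenTelemetry (an open, vendor-neutral standard that provides APIs, SDKs, and tools for collecting telemetry).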

Instrumentation Practices

OpenTelemetry supports various programming languages, offering APIs and SDKs for effective application instrumentation regardless of the language used.

Integration of OpenTelemetry collectors allows better control over request instrumentation even when using proprietary tools like New Relic.

Workflow of Creating a To-Do Item

The createToDo function begins by extracting transaction data from context via enhanced tracing middleware.

A new transaction is created with parameters such as service name, environment type, IP address, user agent, request ID, user email, and tenant ID.

Logging Events During To-Do Creation

Important events are logged throughout the process; this includes creating a new to-do item along with its title and any priority passed in the payload.

Error handling involves logging errors at an error level while successful operations log debug information about the created to-do's ID.

Finalizing Logs for Business Events

A business event log captures essential metadata related to the created to-do item including its ID, title, category ID, and priority.
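
The logging steps in this workflow can be sketched as follows. This is an illustration under stated assumptions, not the video's code: it uses log/slog instead of the New Relic transaction API, and the names requestInfo, createTodo, run, and all field names and sample values are hypothetical.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"log/slog"
)

// requestInfo stands in for the data a tracing middleware puts in the context.
type requestInfo struct {
	RequestID string
	UserEmail string
}

type ctxKey struct{}

type todo struct {
	ID, Title, CategoryID, Priority string
}

// createTodo logs the start of the operation, any error, a debug record
// with the new ID, and a final business event carrying the item's metadata.
func createTodo(ctx context.Context, logger *slog.Logger, title, priority string) (todo, error) {
	info, _ := ctx.Value(ctxKey{}).(requestInfo)
	l := logger.With("request_id", info.RequestID, "user_email", info.UserEmail)

	l.Info("creating todo", "title", title, "priority", priority)
	if title == "" {
		err := errors.New("title is required")
		l.Error("failed to create todo", "error", err)
		return todo{}, err
	}

	// A real repository would assign the ID; these values are placeholders.
	t := todo{ID: "todo-1", Title: title, CategoryID: "cat-1", Priority: priority}
	l.Debug("todo created", "todo_id", t.ID)
	l.Info("business event", "event", "todo_created", "todo_id", t.ID,
		"title", t.Title, "category_id", t.CategoryID, "priority", t.Priority)
	return t, nil
}

// run calls createTodo with a debug-level JSON logger and returns the
// message of every record it produced, in order.
func run(title string) ([]string, error) {
	var buf bytes.Buffer
	opts := &slog.HandlerOptions{Level: slog.LevelDebug}
	logger := slog.New(slog.NewJSONHandler(&buf, opts))
	ctx := context.WithValue(context.Background(), ctxKey{},
		requestInfo{RequestID: "req-1", UserEmail: "a@example.com"})

	_, err := createTodo(ctx, logger, title, "high")

	var msgs []string
	dec := json.NewDecoder(&buf)
	for dec.More() {
		var rec struct {
			Msg string `json:"msg"`
		}
		if derr := dec.Decode(&rec); derr != nil {
			return msgs, derr
		}
		msgs = append(msgs, rec.Msg)
	}
	return msgs, err
}

func main() {
	msgs, err := run("Buy milk")
	fmt.Println(msgs, err)
}
```

Binding the request ID and user email once with logger.With means every record in the operation carries them, which is what allows logs to be correlated with a single request later.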

Logging, Monitoring, and Observability in New Relic

Overview of New Relic Dashboard

The speaker demonstrates how to view logs in JSON format on the New Relic dashboard, emphasizing the importance of setting the server value to production for accurate monitoring.

The application’s error rates and transaction times are displayed; however, no data is shown due to inactivity over the last 30 minutes. A request is triggered to test API functionality.

Error Logging and Metrics

An unauthorized error occurs when testing an API without a token. The speaker checks if this error is logged in the dashboard.

Upon refreshing the errors section, unauthorized errors are visible along with related logs, highlighting how logging aids in debugging.

Understanding Metrics

Key metrics such as average transaction time and throughput are discussed. These metrics provide quantifiable insights into system performance.

Detailed information about unauthorized errors is available through logs, including application name, environment details, error code, method used (GET), and API route.

Tracing Transactions

Logs are connected to traces that provide deeper insights into transactions. This connection helps track performance issues effectively.

The dashboard allows users to analyze specific transactions by clicking on them for detailed data regarding error rates and response times.

System Performance Insights

Information about backend application performance includes garbage collection time and memory usage statistics (3 MB), which contribute to understanding overall system health.

Implementing Observability Practices

Observability practices should be implemented on a spectrum; complete observability is challenging but achievable with various tools.

Open-source tools like Grafana and Prometheus can be utilized alongside proprietary software like New Relic or Datadog for comprehensive monitoring solutions.

Conclusion on Workflow Integration

Effective logging, monitoring, and observability require collaboration between developers and infrastructure teams (DevOps). Proper implementation leads to a holistic view of service states across applications.