A Practical Guide to SNMP Monitoring with Dynatrace
Practical Guide to SNMP Monitoring
Introduction to SNMP and Its Importance
- The session introduces a practical guide to SNMP monitoring, emphasizing its significance as a data source for users monitoring infrastructure with Dynatrace.
- David Mass, an expert from Dynatrace, provides insights into how SNMP works and its relevance in infrastructure monitoring.
Overview of SNMP
- David outlines the two main topics: an overview of SNMP and building an SNMP extension using the Dynatrace framework.
- Simple Network Management Protocol (SNMP) originated in the 1980s for network monitoring and device management.
Key Concepts: MIBs and OIDs
- A Management Information Base (MIB) is a file that describes a set of Object Identifiers (OIDs), which can be polled or controlled on devices.
- Standard MIBs are governed by the IETF; vendor-specific MIBs exist for particular devices such as VPN gateways or firewalls.
Working with OIDs
- Examples of standard OIDs include system description, uptime, device name, contact information, and location details.
- The default port for managing network devices via SNMP is 161; traps use port 162. Traps send information proactively to a trap receiver.
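As a small illustration of the OIDs and ports mentioned above, here is a minimal Python sketch. The numeric OIDs are the standard MIB-II system group objects; the helper names `parse_oid` and `is_under` are hypothetical and only show how OIDs form a tree.

```python
# Scalar objects from the standard MIB-II "system" group.
# The trailing ".0" marks the single instance of a scalar object.
SYSTEM_GROUP = {
    "sysDescr": "1.3.6.1.2.1.1.1.0",
    "sysUpTime": "1.3.6.1.2.1.1.3.0",
    "sysContact": "1.3.6.1.2.1.1.4.0",
    "sysName": "1.3.6.1.2.1.1.5.0",
    "sysLocation": "1.3.6.1.2.1.1.6.0",
}

SNMP_POLL_PORT = 161  # devices listen here for polling requests
SNMP_TRAP_PORT = 162  # trap receivers listen here


def parse_oid(oid: str) -> tuple[int, ...]:
    """Turn a dotted OID string into a tuple of integers."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def is_under(oid: str, subtree: str) -> bool:
    """Return True if `oid` lies inside the `subtree` branch of the OID tree."""
    oid_parts, subtree_parts = parse_oid(oid), parse_oid(subtree)
    return oid_parts[: len(subtree_parts)] == subtree_parts
```

For example, `is_under(SYSTEM_GROUP["sysName"], "1.3.6.1.2.1.1")` is true, while an OID under the vendor branch `1.3.6.1.4.1` is not part of the system group.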
Practical Applications of SNMP
- David compares traps to webhooks in modern technology, highlighting their role in proactive communication between systems.
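The push model behind traps can be sketched with plain UDP sockets. This is only an illustration of "the device sends without being asked": the payload below is a made-up byte string, not a real encoded SNMP trap, and the function names are hypothetical. A real trap receiver would bind to port 162, which typically needs elevated privileges, so the sketch uses a free loopback port instead.

```python
import socket


def open_receiver(host: str = "127.0.0.1", port: int = 0) -> socket.socket:
    """Bind a UDP socket. Port 0 asks the OS for any free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_one_datagram(sock: socket.socket, timeout: float = 2.0) -> tuple[bytes, tuple]:
    """Wait for one UDP datagram and return (payload, sender address)."""
    sock.settimeout(timeout)
    payload, sender = sock.recvfrom(65535)
    return payload, sender


if __name__ == "__main__":
    receiver = open_receiver()
    port = receiver.getsockname()[1]
    # Stand-in for a device pushing a notification without being polled.
    device = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    device.sendto(b"placeholder notification", ("127.0.0.1", port))
    print(receive_one_datagram(receiver)[0])
    device.close()
    receiver.close()
```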
Building Extensions with Dynatrace
- To create an extension with the Extensions Framework 2.0 (EF2), users define OIDs either numerically or by name; the data source handles the translation.
- The Dynatrace ActiveGate performs the polling for all SNMP extensions and supports both SNMP v2c and v3.
Existing Extensions Available
- Several pre-built SNMP extensions are available on the Dynatrace Hub, including popular ones for F5 networks and generic Cisco devices.
Setting Up SNMP Extensions
Overview of Dynatrace Hub and Required Setup
- The Dynatrace Hub can be accessed at dynatrace.com/hub, where users can search for extensions and download the extension files to understand which metrics are being pulled.
- An ActiveGate is required for setting up extensions, specifically an Environment ActiveGate that has network access to the monitored devices. Users need to know each device's IP address or domain name and port (default is 161).
SNMP Configuration Details
- Depending on the SNMP version the device supports, users need either a community string (v2c) or SNMPv3 authentication details, which include security options such as a username and password.
- The process of writing an extension involves using VS Code with a specific add-on that assists in schema validation and managing the extension lifecycle.
Building an SNMP Extension
- To build an SNMP extension, users initialize their workspace in VS Code after making sure the necessary add-on is installed and connected to a Dynatrace environment.
- Users start from a blank directory structure for their custom extension. The workflow includes signing the extension zip file so that it can be validated on both the cluster and the agent side.
Utilizing VS Code Add-On Features
- The VS Code add-on simplifies development by providing templates and boilerplate code; it allows users to fill in details easily while benefiting from code completion features.
- When naming custom extensions, it’s important to prefix with "custom:"; this helps distinguish them from standard extensions.
Structuring YAML Files for Metrics
- In building out YAML files, groups are created within which different properties or attributes can be defined. This organization aids in capturing various metrics effectively.
- Dimensions added to a group will also apply to any subgroups within it. Standard OIDs provide essential information about devices such as description, contact info, and location.
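The inheritance of dimensions from a group to its subgroups can be pictured as a dictionary merge. The sketch below is not the extension schema itself; the dimension keys are hypothetical, and letting a subgroup value win on a key clash is an assumption made for the example.

```python
def effective_dimensions(group_dims: dict[str, str], subgroup_dims: dict[str, str]) -> dict[str, str]:
    """Combine group-level dimensions with a subgroup's own dimensions.

    Assumption for this sketch: on a key clash the subgroup value wins.
    """
    merged = dict(group_dims)      # every subgroup starts with the group's dimensions
    merged.update(subgroup_dims)   # then adds (or overrides with) its own
    return merged


# Hypothetical dimension keys mapped to standard system-group OIDs.
GROUP_DIMS = {
    "device.description": "1.3.6.1.2.1.1.1.0",
    "device.contact": "1.3.6.1.2.1.1.4.0",
    "device.location": "1.3.6.1.2.1.1.6.0",
}
INTERFACE_DIMS = {"interface.index": "1.3.6.1.2.1.2.2.1.1"}

interface_dimensions = effective_dimensions(GROUP_DIMS, INTERFACE_DIMS)
```

Here `interface_dimensions` carries the three device-level dimensions plus the interface index, which is the effect described above.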
Validating OIDs with VS Code Add-On
- The basic structure of defining metrics or dimensions in YAML requires specifying OIDs correctly; doing so enables automatic retrieval of descriptions and data related to those OIDs.
- The VS Code add-on validates whether specified OIDs are standardized or valid by querying online databases, reducing errors during development.
SNMP Metrics and Data Capture
Overview of SNMP Metrics
- The discussion begins with the importance of validating Object Identifiers (OIDs) from a Management Information Base (MIB) file to ensure accurate data capture.
Defining Metrics
- A focus on defining metrics is introduced, specifically capturing device uptime as a critical metric for monitoring device status and potential restarts.
- The metric is named "SNMP device uptime," emphasizing the need to verify that the OID used for this metric is correct.
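To make the uptime metric concrete: the standard uptime OID (sysUpTime, 1.3.6.1.2.1.1.3.0) returns TimeTicks, which count hundredths of a second. The following sketch converts a raw value and flags a possible restart; the function names are hypothetical, and treating any decrease as a restart is a simplification, because the 32-bit value also wraps after roughly 497 days.

```python
from datetime import timedelta

SYS_UPTIME_OID = "1.3.6.1.2.1.1.3.0"


def timeticks_to_timedelta(ticks: int) -> timedelta:
    """Convert SNMP TimeTicks (hundredths of a second) into a timedelta."""
    return timedelta(milliseconds=ticks * 10)


def looks_like_restart(previous_ticks: int, current_ticks: int) -> bool:
    """A lower uptime than last poll suggests a restart (or a counter wrap)."""
    return current_ticks < previous_ticks
```

For instance, a reading of 8,640,000 ticks corresponds to exactly one day of uptime.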
Types of Metrics
- Two types of metrics are discussed: gauge metrics, which report the current value as-is (like uptime), and count metrics, which report the change between polls (deltas).
- Count metrics are particularly useful for monotonic counters, where only the increase matters; understanding deltas helps in interpreting error counts accurately.
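The difference between the two metric types can be shown with a few lines of Python. A gauge would report each raw reading unchanged, while a count reports the delta between consecutive readings. This is a sketch of the idea only, not of how the data source computes it; in particular, treating a drop in the counter as a reset is an assumption of this example.

```python
def counter_deltas(samples: list[int]) -> list[int]:
    """Convert successive readings of a monotonic counter into per-interval deltas.

    A reading lower than its predecessor is treated as a reset (for example a
    device restart), so the new reading itself is taken as the delta.
    """
    deltas = []
    for previous, current in zip(samples, samples[1:]):
        deltas.append(current - previous if current >= previous else current)
    return deltas


# Raw error-counter readings from six consecutive polls (made-up numbers).
readings = [100, 105, 105, 112, 3, 9]
print(counter_deltas(readings))  # [5, 0, 7, 3, 6]
```

The raw readings only say that the counter reached 112 at some point; the deltas show in which polling intervals the errors actually occurred.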
Interval Property in SNMP
- The interval property for polling OIDs is explained. By default, ActiveGate polls every minute but can be adjusted to different intervals based on user needs.
Capturing Table Data
- Transitioning to more complex data capture, the conversation highlights how SNMP tables function similarly to SQL database tables with multiple indexes and associated columns.
- Unlike single-instance OIDs previously discussed, table-based metrics can have multiple entries per index, necessitating careful configuration for accurate data collection.
Interface Metrics Configuration
- When capturing interface metrics from a table, it’s essential to specify that the data source should collect all entries within that table rather than just specific OIDs.
- Each interface must be treated as a unique data point by including its index in the dimension settings during configuration.
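To see why the index matters, consider what a walk of the standard interface table returns: a flat list of OIDs of the form `<column>.<index>` under 1.3.6.1.2.1.2.2.1. The sketch below regroups such a flat result into one row per interface. The function name and the sample values are made up for illustration; the column numbers are those of the standard ifTable.

```python
from collections import defaultdict

IF_ENTRY = "1.3.6.1.2.1.2.2.1"  # parent of every column in the interface table
COLUMNS = {"2": "ifDescr", "14": "ifInErrors", "20": "ifOutErrors"}


def rows_by_index(walk: dict[str, object]) -> dict[str, dict[str, object]]:
    """Group flat walk results (OID -> value) into one row per interface index."""
    rows = defaultdict(dict)
    prefix = IF_ENTRY + "."
    for oid, value in walk.items():
        if not oid.startswith(prefix):
            continue  # not part of the interface table
        column, _, index = oid[len(prefix):].partition(".")
        if column in COLUMNS and index:
            rows[index][COLUMNS[column]] = value
    return dict(rows)


sample_walk = {
    "1.3.6.1.2.1.2.2.1.2.1": "eth0",
    "1.3.6.1.2.1.2.2.1.2.2": "eth1",
    "1.3.6.1.2.1.2.2.1.14.1": 4,
    "1.3.6.1.2.1.2.2.1.14.2": 0,
}
table = rows_by_index(sample_walk)
```

Each key of `table` is an interface index, so every interface ends up as its own data point rather than being collapsed into a single value.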
Understanding Indexes and Descriptions
- The significance of using an index OID is emphasized; while indexes are typically integers without inherent meaning, they help track different interfaces effectively.
- Additional information such as interface descriptions provides human-readable names for interfaces. Admin and operational statuses offer insights into whether interfaces are functioning correctly or if issues exist.
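Admin and operational status are returned as integers, so they need a lookup to become readable. The value tables below are the ones defined for the standard interface table; the `interface_health` classification is a hypothetical rule added for illustration, not something the extension does by itself.

```python
IF_ADMIN_STATUS = {1: "up", 2: "down", 3: "testing"}
IF_OPER_STATUS = {
    1: "up", 2: "down", 3: "testing", 4: "unknown",
    5: "dormant", 6: "notPresent", 7: "lowerLayerDown",
}


def interface_health(admin: int, oper: int) -> str:
    """Classify an interface from its admin (desired) and oper (actual) status."""
    admin_name = IF_ADMIN_STATUS.get(admin, "unknown")
    oper_name = IF_OPER_STATUS.get(oper, "unknown")
    if admin_name == "down":
        return "disabled"   # switched off on purpose
    if admin_name == "up" and oper_name == "up":
        return "healthy"
    if admin_name == "up":
        return "problem"    # should be up, but is not
    return "unknown"
```

An interface that is administratively up but operationally down is the case worth alerting on, which is why both values are captured as dimensions.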
Understanding SNMP Metrics and Configuration
Capturing Error Metrics
- The discussion begins with the importance of capturing specific metrics, focusing on error counters to detect when interfaces send or receive erroneous packets.
- Incoming and outgoing error counters are highlighted as continuously increasing until a device restart occurs, emphasizing the need for tracking these metrics effectively.
Router Metrics Overview
- An extension applied to routers captures general information such as contact name, location, and uptime metrics.
- The SNMP table structure is explained, detailing that it contains multiple rows for each network interface along with relevant dimensions like index, description, admin status, and operational status.
Configuring Additional Metrics
- There’s flexibility in adding various other metrics such as incoming/outgoing packets or bytes based on specific device requirements.
- Once all desired metrics are set up, the VS Code add-on is used to build the extension, which is then uploaded to the Dynatrace tenant for activation.
Extension Activation Process
- After uploading the extension to the tenant, users can search for it and confirm its active version before proceeding with configuration.
- During configuration setup, users select the ActiveGate group where they want the extension to run and make sure the necessary device information is entered correctly.
Feature Sets in Configuration
- The YAML configuration allows defining feature sets that enable toggling specific metrics on or off per configuration. This provides customization based on different devices' needs.
- Users can choose whether to capture detailed interface metrics or stick with essential uptime data depending on their monitoring requirements.
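Conceptually, a feature set is a label on each metric, and a monitoring configuration lists which labels are switched on. A minimal sketch of that filtering, with hypothetical metric keys and feature-set names that are not taken from the session:

```python
# Hypothetical metrics, each tagged with the feature set it belongs to.
METRICS = [
    {"key": "example.device.uptime", "featureSet": "basic"},
    {"key": "example.if.in.errors", "featureSet": "interfaces"},
    {"key": "example.if.out.errors", "featureSet": "interfaces"},
]


def enabled_metrics(metrics: list[dict], enabled_sets: set[str]) -> list[str]:
    """Return the keys of metrics whose feature set is switched on."""
    return [m["key"] for m in metrics if m["featureSet"] in enabled_sets]


uptime_only = enabled_metrics(METRICS, {"basic"})
everything = enabled_metrics(METRICS, {"basic", "interfaces"})
```

With only `basic` enabled, a device reports uptime alone; enabling `interfaces` as well adds the per-interface error counters.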
Troubleshooting Extension Issues
- An error may appear right after the monitoring configuration is assigned, caused by a problem downloading the extension from the Dynatrace cluster. This does not necessarily require troubleshooting, since queries can still succeed despite the initial error.
SNMP Extension Development and Troubleshooting
Overview of SNMP Data Retrieval
- The process of refreshing the status may take time, but it will eventually update to show the correct information.
- Using the Data Explorer, users can search for "SNMP device uptime" to view the metric along with the system dimensions added during setup.
- Interface metrics can be verified, showing both global dimensions and specific group details like error counts for incoming and outgoing traffic.
Integration with Other Tools
- Metrics are also available in Grail, allowing users to create dashboards using the new metric explorer feature.
- The similarity between SNMP and other data sources is highlighted; while the data source is SNMP instead of SQL or JMX, the underlying concepts remain consistent.
MIB File Support
- Users can import vendor-specific MIB files through the VS Code add-on for better validation and usage of OIDs.
- MIB files can be included within extensions or placed directly on ActiveGates for name resolution and status translation purposes.
Configuration Best Practices
- Proper configuration involves ensuring that all ActiveGates in the group have network access to the devices, to avoid issues when capturing OIDs.
- If problems arise after setting up extensions, checking firewall rules and verifying device responses with tools like snmpwalk is recommended.
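A quick way to verify that a device responds is to run snmpwalk from the ActiveGate host. The sketch below only builds the argument list for the Net-SNMP `snmpwalk` tool, assuming it is installed; the function names, host, and credentials are placeholders, and the v3 variant assumes SHA authentication with AES privacy, which may not match a given device.

```python
def snmpwalk_v2c(host: str, community: str, oid: str, port: int = 161) -> list[str]:
    """Build an snmpwalk command line for an SNMP v2c device."""
    return ["snmpwalk", "-v", "2c", "-c", community, f"{host}:{port}", oid]


def snmpwalk_v3(host: str, user: str, auth_password: str, priv_password: str,
                oid: str, port: int = 161) -> list[str]:
    """Build an snmpwalk command line for SNMP v3 with authentication and privacy."""
    return [
        "snmpwalk", "-v", "3", "-l", "authPriv",
        "-u", user,
        "-a", "SHA", "-A", auth_password,
        "-x", "AES", "-X", priv_password,
        f"{host}:{port}", oid,
    ]


# Placeholder address from the documentation range and a placeholder community.
command = snmpwalk_v2c("192.0.2.10", "public", "1.3.6.1.2.1.1")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, capture_output=True, text=True)`. If the walk times out from the ActiveGate host, the problem is more likely network access or credentials than the extension itself.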
Advanced Configuration Options
- The SNMP executable has a built-in health check mode that provides options for running commands related to OID retrieval.
- Advanced settings allow customization such as timeout adjustments and maximum repetitions returned from queries, which helps optimize data requests based on device capabilities.
SNMP Extensions and Data Retrieval Insights
Advanced Settings for Bulk Retrieval
- The final option discussed is the maximum OIDs (Object Identifiers) per query, which helps optimize data retrieval intervals and load from network devices.
- Emphasis on the importance of advanced settings in bulk retrieval, highlighting their role in efficient data gathering.
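The effect of a "maximum OIDs per query" setting can be pictured as splitting the list of OIDs into batches, one batch per request. This is a sketch of the idea only; the function name is hypothetical and the actual batching inside the SNMP data source is not described in the session.

```python
def batch_oids(oids: list[str], max_per_query: int) -> list[list[str]]:
    """Split a list of OIDs into batches of at most `max_per_query` entries."""
    if max_per_query < 1:
        raise ValueError("max_per_query must be at least 1")
    return [oids[i:i + max_per_query] for i in range(0, len(oids), max_per_query)]


wanted = [
    "1.3.6.1.2.1.1.1.0",
    "1.3.6.1.2.1.1.3.0",
    "1.3.6.1.2.1.1.4.0",
    "1.3.6.1.2.1.1.5.0",
    "1.3.6.1.2.1.1.6.0",
]
batches = batch_oids(wanted, 2)  # three requests: 2 + 2 + 1 OIDs
```

A smaller batch size means more requests but smaller responses, which can help with devices that struggle with large queries; a larger one reduces round trips.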
Creating SNMP Extensions
- A question raised about the potential for a point-and-click interface to create SNMP extensions by defining OIDs directly in the UI without needing an extension.
- Confirmation that there are plans for a user-friendly application on the Dynatrace platform to facilitate building extensions through a more intuitive interface.
Consistency Across Data Sources
- Clarification that defining SNMP data sources and related concepts (alerts, screens, etc.) follows similar principles as seen with JMX and SQL extensions.
- Discussion on how topology can be defined within the extension YAML, including entity creation from metrics and establishing relationships between them.
Resources and Documentation
- Encouragement to explore additional videos and documentation linked in the description for further insights into SNMP integration.
Reflection on Technology Evolution
- Personal reflection on past experiences with SNMP in load testing analytics tools, noting its evolution into a first-class data source within modern frameworks like Dynatrace.