Making AI Work for Hybrid Observability
     

    Various forms of AI, including generative AI, anomaly detection, predictive analytics, and reinforcement learning, hold exciting potential for improving the speed and effectiveness of operational management solutions. But AI does not replace the basic requirements for observability platforms; instead, it builds upon them. TechTarget’s Enterprise Strategy Group believes that four key pillars must be addressed to set the proper foundation for effective application of AI: hybrid observability, intelligent data collection, auto-discovery, and dependency mapping (see Figure 1). All four are required to provide AI with the data needed to generate actionable insights and alerts, automated root cause analysis (RCA), accurate predictions, and contextualized responses to questions coming from IT users, developers, security engineers, or the line of business.

     
    Figure 1. The Four Pillars of Comprehensive Observability

    The four pillars of observability create the foundation for optimal AI effectiveness in the following ways:

     

    Hybrid observability. Normalizing and unifying telemetry data across all infrastructure layers gives AI a complete view of the system by eliminating data silos and blind spots.

     

    Intelligent data collection. Deep and consistent telemetry collection ensures that AI has high-quality, detailed, application-aware data for learning and inferencing. 

     

    Auto-discovery. Automatically detecting and monitoring new or changed components ensures that AI is always analyzing the most up-to-date environment across both east-west and north-south dimensions. 

     

    Dependency mapping. By continuously mapping relationships and dependencies across components, AI can better “understand” the blast radius and cascading effects of incidents, including user experience and service quality impacts.
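
    To make the dependency-mapping pillar concrete, the following is a minimal sketch, not any vendor's implementation, of how a "blast radius" could be derived from a dependency map. It assumes the map is available as a simple dictionary of "component depends on these components"; the function name blast_radius and the component names are hypothetical and used only for illustration.

```python
from collections import deque


def blast_radius(depends_on, failed):
    """Return every component that directly or transitively depends on `failed`.

    `depends_on` maps a component name to the components it depends on.
    This data shape is an assumption made for the sketch.
    """
    # Invert the edges so we can ask "who depends on X?"
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    # Breadth-first walk upward from the failed component.
    impacted = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)

    # The failed component is the cause, not part of its own blast radius.
    impacted.discard(failed)
    return impacted


# Hypothetical topology, for illustration only.
depends_on = {
    "checkout-ui": ["checkout-api"],
    "checkout-api": ["payments-db", "auth-service"],
    "auth-service": ["users-db"],
    "reporting": ["payments-db"],
}

print(sorted(blast_radius(depends_on, "payments-db")))
```

    In this hypothetical topology, a failure in payments-db reaches checkout-api and reporting directly and checkout-ui indirectly, which is the kind of cascading, user-facing impact a continuously updated dependency map lets an AI system reason about.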

     

    If these needs are met, AI becomes a powerful accelerator for accurately correlating data and events, improving RCA convergence, reducing time to restoration, and offering realistic paths to proactively protecting application performance and user experience. Done properly, AI can help teams address their most problematic and burdensome tasks, as identified by Enterprise Strategy Group research (see Figure 2).¹

     
    Figure 2. The Five Most Burdensome IT Operations Tasks

    1. Source: Enterprise Strategy Group Research Report, Generative AI in IT Operations: Fueling the Next Wave of Modernization, September 2024.