AWS DevOps Agent Webhook Integration for Automated Cost Optimization: A Practical POC

Objective

The objective of this Proof of Concept (POC) is to demonstrate how AWS DevOps Agent can be integrated with a Webhook-based event flow to support cost optimization use cases. 

This POC focuses on detecting potential cost-saving opportunities in AWS infrastructure and sending those events to AWS DevOps Agent for investigation and recommendations. 

AWS DevOps Agent – Region Limitations

AWS DevOps Agent is not available in all AWS regions. It is supported only in the following regions:

  • United States (N. Virginia)
  • United States (Oregon)
  • Asia Pacific (Sydney)
  • Asia Pacific (Tokyo)
  • Europe (Frankfurt)
  • Europe (Ireland)
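
A deployment script can check the target region before any setup work begins. The sketch below maps the regions listed above to their standard AWS region codes. The set mirrors only the list in this article and may become outdated, and the helper name `is_supported_region` is hypothetical.

```python
# Region codes for the regions named in this article. This is NOT a live or
# authoritative list; confirm current availability in the AWS documentation.
SUPPORTED_REGIONS = {
    "us-east-1": "United States (N. Virginia)",
    "us-west-2": "United States (Oregon)",
    "ap-southeast-2": "Asia Pacific (Sydney)",
    "ap-northeast-1": "Asia Pacific (Tokyo)",
    "eu-central-1": "Europe (Frankfurt)",
    "eu-west-1": "Europe (Ireland)",
}


def is_supported_region(region_code: str) -> bool:
    """Return True if the region code is in the list from this article."""
    return region_code.strip().lower() in SUPPORTED_REGIONS
```

For example, `is_supported_region("us-east-1")` is true, while a region outside the list, such as `ap-south-1`, is rejected.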

Use Case

The selected use case for this POC is: 

Cost Optimization Assistant 

This assistant helps identify and reduce unnecessary cloud costs by: 

  • detecting underutilized or idle resources 
  • identifying EC2 instances with low usage 
  • triggering DevOps Agent investigations through webhook events 
  • enabling cost optimization recommendations for the current AWS setup 

Approach Overview

There are two possible approaches for implementing the AWS DevOps Agent integration: 

  1. Webhook-Based Integration using AWS Lambda (Current Approach): In this approach, AWS Lambda functions collect signals from AWS services and send them to the DevOps Agent as webhook events, which trigger investigations. This is the approach implemented in this POC.
  2. MCP-Based Integration (Future Scope): The second approach uses an MCP (Model Context Protocol) server for integration. This would require building and exposing MCP tools, typically in Python, and connecting them to the DevOps Agent. This approach is planned for further exploration.

POC Focus 

For this POC, the primary focus is on demonstrating the Webhook-Based Integration using AWS Lambda, as it provides a simpler and more direct way to enable communication between the DevOps Agent and AWS services. 

Solution Overview

In this POC, a simple and scalable architecture is implemented using: 

  • AWS Lambda as the event generation layer 
  • Webhook as the communication layer 
  • AWS DevOps Agent as the analysis and recommendation engine 

AWS Lambda monitors AWS resources (such as EC2 instances) and detects cost-related signals like low CPU utilization. When such a condition is identified, Lambda sends a structured event to the DevOps Agent using a webhook. 

The DevOps Agent then investigates the infrastructure context and provides recommendations for cost optimization. 

Flow Diagram

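[Figure: AWS DevOps Agent flow diagram. AWS Lambda (detects idle EC2 instances) → webhook event → AWS DevOps Agent (investigation and recommendations)]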

Prerequisites

Before starting this POC, the following prerequisites were required: 

  • AWS account access 
  • AWS DevOps Agent enabled 
  • Agent Space created successfully 
  • Primary AWS account connected 
  • Topology mapping completed 
  • Webhook configuration access available 
  • Basic understanding of AWS infrastructure resources 

POC Implementation Steps

Step 1: Create Agent Space 

  • Navigate to AWS DevOps Agent 
  • Create a new Agent Space 
  • Provide name and configure access 

Result: Agent Space created successfully 

Step 2: Connect AWS Account

  • Add AWS account as Primary Cloud Source
  • Verify status as Valid

Result: DevOps Agent gains access to infrastructure

Step 3: Verify Topology Mapping

Once the account was connected, DevOps Agent completed topology mapping and discovered relationships between infrastructure resources.

  • Go to DevOps Agent
  • Click Action
  • Open the web app link
  • Click Operator Access

Step 4: Open DevOps Agent Web App

The DevOps Agent Web App was opened to interact with the agent in chat mode.

Using the web interface, infrastructure queries were submitted to validate whether the agent could understand the environment.

Example prompt used:

“What is in my topology?”

Step 5: Configure Webhook

  • Go to Webhooks section
  • Click Add Webhook
  • Follow steps:
    • Review schema
    • Enable HMAC security
    • Generate URL and Secret
  • Save the URL and the secret key; both are needed later as Lambda environment variables

Step 6: Create Lambda Function

  • Go to AWS Lambda
  • Create function (Python runtime)
  • Attach IAM role with required permissions

Permissions required:

  • EC2 (DescribeInstances)
  • CloudWatch (GetMetricStatistics)
  • Compute Optimizer (GetEC2InstanceRecommendations)
  • Logs
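
The permissions above can be expressed as an IAM policy document. The following is a minimal sketch, assuming read-only access on all resources and the standard CloudWatch Logs actions for Lambda logging. The function name `build_lambda_policy` and the `Sid` values are illustrative, and Compute Optimizer may need further read permissions depending on the account setup.

```python
import json


def build_lambda_policy() -> dict:
    """Build an IAM policy document covering the permissions listed above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read-only calls made by the Lambda function.
                "Sid": "ReadCostSignals",
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeInstances",
                    "cloudwatch:GetMetricStatistics",
                    "compute-optimizer:GetEC2InstanceRecommendations",
                ],
                "Resource": "*",
            },
            {
                # Standard permissions for writing Lambda logs.
                "Sid": "WriteLogs",
                "Effect": "Allow",
                "Action": [
                    "logs:CreateLogGroup",
                    "logs:CreateLogStream",
                    "logs:PutLogEvents",
                ],
                "Resource": "arn:aws:logs:*:*:*",
            },
        ],
    }


if __name__ == "__main__":
    # Print JSON that can be pasted into the IAM console as an inline policy.
    print(json.dumps(build_lambda_policy(), indent=2))
```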

Step 7: Add Lambda Logic

The Lambda function performs the following steps:

  • Fetch running EC2 instances
  • Calculate the average CPU utilization of each instance over the last 7 days
  • Identify idle instances (average CPU below 5%)
  • Fetch Compute Optimizer recommendations
  • Send the webhook event
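
The idle check at the centre of these steps can be isolated as a small pure function, which makes it easy to test without AWS access. This is a sketch only: it assumes datapoints shaped like the CloudWatch `GetMetricStatistics` response (dictionaries with an `Average` key), and the helper names `average_cpu` and `is_idle` are hypothetical.

```python
from typing import Optional


def average_cpu(datapoints: list) -> Optional[float]:
    """Mean of the 'Average' values, or None when there is no data."""
    if not datapoints:
        return None
    return sum(dp["Average"] for dp in datapoints) / len(datapoints)


def is_idle(datapoints: list, cpu_threshold: float = 5.0) -> bool:
    """True only when data exists and the mean CPU is below the threshold."""
    avg = average_cpu(datapoints)
    return avg is not None and avg < cpu_threshold
```

In this sketch, an instance with no datapoints is treated as "unknown" rather than as 0% CPU, so a newly launched instance is not reported as idle.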

Step 8: Configure Environment Variables

Add:

  • WEBHOOK_URL
  • WEBHOOK_SECRET
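
A missing or mistyped variable otherwise surfaces as a bare `KeyError` at import time. The sketch below validates both variables and reports a clearer error; the helper name `load_webhook_config` is hypothetical, and the HTTPS check is an assumption about the generated webhook URL.

```python
import os
from typing import Mapping, Tuple

REQUIRED_VARS = ("WEBHOOK_URL", "WEBHOOK_SECRET")


def load_webhook_config(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (url, secret), raising a descriptive error if either is unusable."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    url = env["WEBHOOK_URL"]
    # Assumption: the webhook URL generated in Step 5 uses HTTPS.
    if not url.startswith("https://"):
        raise RuntimeError("WEBHOOK_URL must start with https://")

    return url, env["WEBHOOK_SECRET"]
```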

Step 9: Test Lambda

  • Run a test with an empty event ({})
  • Verify:
    • Lambda execution success
    • CloudWatch logs
    • DevOps Agent receives event
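
A successful Lambda execution does not by itself prove that the webhook accepted the event, because the handler in this article returns `statusCode` 200 in every case and reports the webhook's HTTP status inside the response body. A small check like the following can help when reading test results. It is a sketch that assumes the return shape of that handler, and the helper name `webhook_delivered` is hypothetical.

```python
import json


def webhook_delivered(lambda_result: dict) -> bool:
    """True if the webhook call recorded in the Lambda result returned 2xx."""
    body = json.loads(lambda_result["body"])
    status = body["webhook_result"]["status_code"]
    return 200 <= status < 300
```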

The Lambda function code used in this POC is shown below:

import json
import boto3
import os
import hmac
import hashlib
import base64
from datetime import datetime, timedelta, timezone
import urllib3

ec2 = boto3.client("ec2")
cw = boto3.client("cloudwatch")
co = boto3.client("compute-optimizer")

http = urllib3.PoolManager(
    timeout=urllib3.util.Timeout(connect=5.0, read=10.0)
)

WEBHOOK_URL = os.environ["WEBHOOK_URL"]
WEBHOOK_SECRET = os.environ["WEBHOOK_SECRET"]


def get_idle_instances(days=7, cpu_threshold=5.0):
    # Paginate so that accounts with many instances are fully covered.
    paginator = ec2.get_paginator("describe_instances")
    pages = paginator.paginate(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )

    instances = []
    for page in pages:
        for reservation in page["Reservations"]:
            instances.extend(reservation["Instances"])

    end_time = datetime.now(timezone.utc)
    start_time = end_time - timedelta(days=days)

    idle = []

    for instance in instances:
        instance_id = instance["InstanceId"]
        instance_type = instance.get("InstanceType", "unknown")

        metric = cw.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
            StartTime=start_time,
            EndTime=end_time,
            Period=86400,
            Statistics=["Average"]
        )

        datapoints = metric.get("Datapoints", [])

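        # No datapoints (for example, a just-launched instance) is not
        # evidence of idleness, so skip the instance instead of assuming 0%.
        if not datapoints:
            continue

        avg_cpu = sum(x["Average"] for x in datapoints) / len(datapoints)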

        if avg_cpu < cpu_threshold:
            idle.append({
                "instance_id": instance_id,
                "instance_type": instance_type,
                "avg_cpu": round(avg_cpu, 2)
            })

    return idle


def get_compute_optimizer_data():
    try:
        response = co.get_ec2_instance_recommendations()
        recommendations = []

        for item in response.get("instanceRecommendations", []):
            recs = []

            for opt in item.get("recommendationOptions", [])[:3]:
                recs.append({
                    "instance_type": opt.get("instanceType"),
                    "performance_risk": opt.get("performanceRisk")
                })

            recommendations.append({
                "instance_arn": item.get("instanceArn"),
                "current_instance_type": item.get("currentInstanceType"),
                "finding": item.get("finding"),
                "recommendations": recs
            })

        return recommendations

    except Exception as e:
        return [{"error": str(e)}]


def sign_payload(secret, body, timestamp):
    message = f"{timestamp}:{body}".encode("utf-8")

    digest = hmac.new(
        secret.encode("utf-8"),
        message,
        hashlib.sha256
    ).digest()

    return base64.b64encode(digest).decode("utf-8")


def send_to_devops_agent(payload):
    body = json.dumps(payload, separators=(",", ":"))

    timestamp = datetime.now(timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%S.000Z"
    )

    signature = sign_payload(
        WEBHOOK_SECRET,
        body,
        timestamp
    )

    headers = {
        "Content-Type": "application/json",
        "x-amzn-event-timestamp": timestamp,
        "x-amzn-event-signature": signature
    }

    # Log only non-sensitive details; keep the webhook URL and the
    # signature out of CloudWatch Logs.
    print("Timestamp:", timestamp)
    print("Payload body:", body)

    response = http.request(
        "POST",
        WEBHOOK_URL,
        body=body.encode("utf-8"),
        headers=headers
    )

    print("Webhook response status:", response.status)

    return {
        "status_code": response.status,
        "response": response.data.decode(
            "utf-8",
            errors="ignore"
        )
    }


def lambda_handler(event, context):

    idle_instances = get_idle_instances(
        days=7,
        cpu_threshold=5.0
    )

    optimizer_data = get_compute_optimizer_data()

    incident_id = (
        f"cost-"
        f"{datetime.now(timezone.utc).strftime('%Y%m%d%H%M%S')}"
    )

    payload = {
        "eventType": "incident",
        "incidentId": incident_id,
        "action": "created",
        "priority": "MEDIUM",
        "title": "Potential AWS cost optimization opportunities detected",
        "description": "Lambda found underutilized EC2 instances and collected Compute Optimizer recommendations.",
        "timestamp": datetime.now(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%S.000Z"
        ),
        "service": "cost-optimizer",
        "data": {
            "idle_instances": idle_instances,
            "compute_optimizer": optimizer_data
        }
    }

    webhook_result = send_to_devops_agent(payload)

    return {
        "statusCode": 200,
        "body": json.dumps({
            "idle_instances_found": len(idle_instances),
            "webhook_result": webhook_result,
            "payload_sent": payload
        })
    }
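
The `get_compute_optimizer_data` function above reads a single response from Compute Optimizer. In larger accounts the results may span several pages, linked by a `nextToken` value. The following is a minimal sketch of following that token; the helper name `iter_ec2_recommendations` is hypothetical, and the client is passed in as a parameter so the function can be tested with a stub.

```python
def iter_ec2_recommendations(co_client):
    """Yield every EC2 instance recommendation, following nextToken pages."""
    token = None
    while True:
        kwargs = {"nextToken": token} if token else {}
        response = co_client.get_ec2_instance_recommendations(**kwargs)
        yield from response.get("instanceRecommendations", [])

        token = response.get("nextToken")
        if not token:
            break
```

With the client from the listing, the call would look like `list(iter_ec2_recommendations(co))`.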

 

Step 10: Schedule Automation

  • Use EventBridge
  • Create rule (daily or periodic trigger)
  • Attach Lambda
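
The same schedule can also be created with boto3 instead of the console. The sketch below assumes a daily trigger; the rule name, statement ID, and target ID are illustrative values, and the function ARN must be supplied by the caller. The clients are passed in as parameters (for example `boto3.client("events")` and `boto3.client("lambda")`).

```python
def schedule_daily(events_client, lambda_client, function_arn,
                   rule_name="cost-optimizer-daily"):
    """Create a daily EventBridge rule that invokes the given Lambda function."""
    # 1. Create (or update) the scheduled rule.
    rule = events_client.put_rule(
        Name=rule_name,
        ScheduleExpression="rate(1 day)",
        State="ENABLED",
    )

    # 2. Allow EventBridge to invoke the function from this rule.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule["RuleArn"],
    )

    # 3. Attach the function as the rule's target.
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "cost-optimizer-lambda", "Arn": function_arn}],
    )
    return rule["RuleArn"]
```

Note that `add_permission` fails if a statement with the same ID already exists, so a rerun of this sketch would need to handle that case.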

Sample Webhook Event

A sample event for the cost optimization use case is shown below:

{
  "eventType": "incident",
  "incidentId": "cost-001",
  "action": "created",
  "priority": "MEDIUM",
  "title": "Idle EC2 detected",
  "description": "Some EC2 instances are underutilized and may be candidates for cost optimization.",
  "timestamp": "2026-04-14T10:00:00Z",
  "service": "cost-optimizer",
  "data": {
    "instances": [
      "i-0123456789abcdef0",
      "i-0fedcba9876543210"
    ]
  }
}
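
An event of this shape can be built and sanity-checked in code before it is sent. The following is a sketch under stated assumptions: the field list simply mirrors the sample above, and which fields the DevOps Agent webhook schema actually requires should be confirmed in the "Review schema" step. The helper names `build_cost_event` and `missing_fields` are hypothetical.

```python
from datetime import datetime, timezone

# Assumption: every top-level field in the sample event is expected.
EXPECTED_FIELDS = (
    "eventType", "incidentId", "action", "priority", "title",
    "description", "timestamp", "service", "data",
)


def build_cost_event(incident_id, instance_ids, now=None):
    """Build an event shaped like the sample in this article."""
    now = now or datetime.now(timezone.utc)
    return {
        "eventType": "incident",
        "incidentId": incident_id,
        "action": "created",
        "priority": "MEDIUM",
        "title": "Idle EC2 detected",
        "description": "Some EC2 instances are underutilized and may be "
                       "candidates for cost optimization.",
        "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "service": "cost-optimizer",
        "data": {"instances": list(instance_ids)},
    }


def missing_fields(event):
    """Return the expected top-level fields that are absent from the event."""
    return [name for name in EXPECTED_FIELDS if name not in event]
```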

POC Outcome

This POC successfully demonstrates:

  • DevOps Agent setup and topology mapping
  • Cost optimization analysis via DevOps Agent
  • Webhook integration for event-driven architecture
  • Lambda-based cost signal generation

Benefits

  • Beginner-friendly implementation
  • No need for MCP server initially
  • Scalable architecture
  • Supports automation using Lambda and EventBridge
  • Enables proactive cost optimization

Limitations

  • Lambda automation is basic in the current POC
  • Advanced optimization logic is not fully implemented
  • Region availability is limited
  • Requires additional integrations for full automation

Future Enhancements

  • Advanced cost analysis logic in Lambda
  • Real-time CloudWatch integration
  • SNS / Slack alerts
  • Auto-remediation workflows
  • Multi-region support

Conclusion

This POC demonstrates a practical and scalable approach to cost optimization using:

  • AWS DevOps Agent for intelligent analysis
  • AWS Lambda for automation
  • Webhook for event-driven communication

It provides a strong foundation for building a fully automated cost optimization system in AWS.
