70% Faster Deployments and $350K Storage Savings for Leading OTT Platform - OpsTree Global

The client is a leading Indian OTT platform offering over 100,000 hours of content, from Hollywood blockbusters and regional favorites to live coverage of cricket and other major sports events, attracting millions of viewers globally.

The Problem Statement

The client's manual infrastructure configurations constrained scalability, and its monitoring failed during high traffic. The client also struggled with the cost of storing 1TB of daily logs, revenue losses from payment gateway issues, and delayed issue detection that degraded the user experience.

Challenges

The client had difficulty scaling their infrastructure to handle sudden spikes in user traffic, particularly during live events such as cricket leagues.

Relying on manual configurations made it difficult for the client to efficiently manage their infrastructure and rapidly scale as needed.

Redis on-prem faced performance issues when delivering content globally, leading to slow load times and buffering.

Existing monitoring systems failed under high user traffic, making it difficult to detect and resolve issues in real time.

Integrating multiple payment gateways left the platform vulnerable: a failure in one gateway could impact revenue across the platform.

Developers were generating over 1TB of logs daily, driving up storage costs without making incident resolution any more effective.

Solutions

To overcome these challenges, the client required a scalable DevSecOps framework to manage traffic surges, enhance system performance, and streamline operational processes.

A custom solution was developed to scale infrastructure and applications dynamically, enabling smooth performance during high-traffic events for 33 million users.

Transitioned the platform from manual configurations to Infrastructure-as-Code (IaC), automating infrastructure management and improving operational efficiency.

Migrated from Redis on-prem to Redis Cloud, ensuring fast, buffer-free content delivery to a global audience.

Implemented monitoring with Last9 and time-series database (TSDB) metrics, providing real-time issue detection and better performance tracking.

Strengthened security using Google Cloud Armor, Akamai Site Shield, and quarterly key rotations, and established strategies to mitigate the impact of payment gateway failures.

Refactored microservices in Golang, optimized logging, and enhanced incident management with automated escalations and real-time tracking via PagerDuty, Jira, and Slack.

Outcomes

Reduced storage costs by 80%, saving $350,000 annually through optimized storage solutions and reduced log management expenses.

Cut deployment time by 70% by implementing IaC and automated workflows.

Scaled infrastructure to manage 900 billion metrics per month by enhancing the system’s capacity for comprehensive data management.

Migrated over 1PB of data to new infrastructure with zero downtime, keeping services continuously available.
