Case study

Redge Technologies streamlines OTT analytics on AWS

Discover how Redge, a leader in media streaming solutions, built a high-performance, serverless data ingestion API with Chaos Gears.


Industry: Over-the-top media
Size: Enterprise
Key focus: Data engineering
TVP VOD — Poland's largest TV network publishes its VOD catalogue via Redge Media

Opportunity: The ever-increasing demand for video

The OTT market is experiencing unprecedented growth. It is projected to reach $434.5 billion by 2027, with a compound annual growth rate (CAGR) of 16.5%. This surge reflects a global shift in consumer preferences, as viewers demand personalized, on-demand entertainment experiences.

Delivering secure, scalable streaming solutions, advanced analytics, and a deep understanding of user behavior is the mission of Redge Technologies.

As a global leader in media streaming solutions, Redge Technologies has been shaping the OTT landscape since 2007. A prominent technology provider, Redge Technologies helps broadcasters and content distributors deliver a superior user experience, increase viewership, and reduce churn.

Their flagship solution, Redge Media, a managed TV platform, powers major players like Go3, Player.pl, Play NOW, and Canal+, serving over 10 million users daily across Europe. However, achieving this hinges on performing in-depth data analysis and turning raw information into actionable insights.

Businesses generate more data than ever before, and the success of an OTT channel is intrinsically tied to the quality and relevance of its data. Recognizing this, Redge Technologies built a data lake on AWS for one of their customers, storing vast datasets from multiple partners, products, and services in a centralized, easily accessible repository.

The data lake was production-ready and met the customer's expectations, but the data ingestion process posed some challenges. Each system pushed data differently, with no standardization. Maintaining a unique pipeline for each data provider offered no reusability and created redundant code, leaving the process far from optimized.

Together with the Redge team, we identified that the system's visibility was limited, leaving questions unanswered about who was pushing data, when, and where.

Security was another concern. External systems directly accessing Amazon S3 complicated efforts to establish a robust security perimeter around the data lake.

Our common mission was clear: tackle these challenges head-on by introducing streamlined processes, a reusable architecture, and improved visibility and security measures.

Solution: An efficient data lake in service of optimal media delivery

To address the complexities of designing a secure and effective data ingestion service, Redge Technologies collaborated with Chaos Gears — a trusted AWS partner specializing in modernizing data platforms, cloud-native architectures, and advanced analytics.

The journey began with interactive workshops alongside the customer team to assess existing data producers and anticipate future ones. Four different systems were already pushing data, with additional ones poised to integrate with the newly created data lake in the coming year, underscoring the need for a robust, future-proof solution.

During our analysis, it became evident that data producers operated with vastly different patterns. Some systems transmitted data every 15 minutes in packages of less than 10 MB, while others sent massive payloads of tens of gigabytes only once per day. The data itself spanned diverse domains, encompassing datasets on OTT subscribers, marketing campaigns, product details, video metadata, and more.

Considering these factors, it was clear that a single, standardized approach to ingesting data from all producers was needed. Relying on unique pipelines for each data provider would not only hinder scalability but also introduce inefficiencies that would compound over time. This also highlighted the need for a solution capable of handling both high-frequency, small-scale transmissions and infrequent, large-scale data flows with equal efficiency.

Turning the vision into reality

Through meticulous analysis, we proposed a unified mechanism: the Data Ingestion Service, a single API designed to standardize data ingestion. This fully serverless application empowers every data producer to request permission from the customer before pushing new datasets into the lake, ensuring a streamlined and controlled process.

The application validates each producer's authorization to push data. Upon approval, it automatically selects the appropriate Amazon S3 bucket, applies the correct partitioning scheme, and maps the upload to the expected data package. The service then generates a presigned URL for secure upload and awaits the producer's confirmation of completion, marking each data package as successfully ingested.
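For illustration, the core of this flow can be sketched as a short AWS Lambda handler in Python with boto3. The table name, bucket, key layout, and field names below are assumptions made for the example, not the production implementation.

```python
# Hypothetical sketch of the "request upload" step: authorize the producer,
# record a pending upload in DynamoDB, and hand back a presigned S3 URL.
import json
import os
import uuid
from datetime import datetime, timezone

import boto3

dynamodb = boto3.resource("dynamodb")
s3 = boto3.client("s3")
uploads = dynamodb.Table(os.environ.get("UPLOADS_TABLE", "ingestion-uploads"))
BUCKET = os.environ.get("LANDING_BUCKET", "example-data-lake-landing")


def handler(event, context):
    body = json.loads(event["body"])
    # Producer is identified by its API Gateway API key (proxy integration context)
    producer_id = event["requestContext"]["identity"]["apiKeyId"]
    dataset = body["dataset"]  # e.g. "subscribers" or "video-metadata"

    upload_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc)
    # Illustrative partitioning scheme: dataset name plus ingestion date
    key = f"{dataset}/year={now:%Y}/month={now:%m}/day={now:%d}/{upload_id}.data"

    # Record the pending upload so a later confirmation call can mark it ingested
    uploads.put_item(Item={
        "upload_id": upload_id,
        "producer_id": producer_id,
        "dataset": dataset,
        "s3_key": key,
        "status": "PENDING",
        "requested_at": now.isoformat(),
    })

    # The producer uploads directly to S3 with this URL,
    # so large files never pass through the API itself
    url = s3.generate_presigned_url(
        "put_object",
        Params={"Bucket": BUCKET, "Key": key},
        ExpiresIn=3600,
    )
    return {
        "statusCode": 200,
        "body": json.dumps({"upload_id": upload_id, "upload_url": url}),
    }
```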

The Data Ingestion Service tracks producer activity, collecting detailed metrics on data imports, frequency, and the volume of files pushed into the lake. This gives the customer unparalleled visibility into the ingestion process, empowering them with actionable insights to monitor and optimize their data pipeline.
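As a rough illustration of how such per-producer metrics might be emitted to Amazon CloudWatch from the confirmation step (the namespace and metric names below are assumptions, not the actual implementation):

```python
# Hypothetical sketch: publish per-producer ingestion metrics to CloudWatch
# once an upload has been confirmed as complete.
import boto3

cloudwatch = boto3.client("cloudwatch")


def record_ingestion_metrics(producer_id: str, dataset: str, size_bytes: int) -> None:
    """Emit file-count and volume metrics for one confirmed upload."""
    dimensions = [
        {"Name": "Producer", "Value": producer_id},
        {"Name": "Dataset", "Value": dataset},
    ]
    cloudwatch.put_metric_data(
        Namespace="DataIngestionService",  # assumed namespace
        MetricData=[
            {"MetricName": "IngestedFiles", "Dimensions": dimensions,
             "Value": 1.0, "Unit": "Count"},
            {"MetricName": "IngestedBytes", "Dimensions": dimensions,
             "Value": float(size_bytes), "Unit": "Bytes"},
        ],
    )
```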

Transitioning producers to integrate with the Data Ingestion Service required some adjustments, and not all could adapt immediately. For those unable to switch quickly, Redge Technologies leveraged the system's flexibility, building custom mechanisms to periodically retrieve data from these producers and seamlessly push it through the application. This ensured uninterrupted operations while maintaining the integrity and efficiency of the data ingestion process.
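From the producer's side, whether calling the API natively or through one of these bridge mechanisms, the flow reduces to three steps: request permission, upload via the presigned URL, and confirm completion. A minimal sketch, assuming hypothetical endpoint paths and payload fields:

```python
# Hypothetical producer-side client for the Data Ingestion Service.
# Endpoint paths, payload fields, and response shape are illustrative assumptions;
# x-api-key is the standard Amazon API Gateway API-key header.
import requests

API_URL = "https://api.example.com/ingestion"   # placeholder endpoint
API_KEY = "producer-specific-api-key"           # issued per producer via API Gateway


def push_file(dataset: str, local_path: str) -> None:
    headers = {"x-api-key": API_KEY}

    # 1. Ask the service for permission and a presigned upload URL
    resp = requests.post(f"{API_URL}/uploads", json={"dataset": dataset},
                         headers=headers, timeout=30)
    resp.raise_for_status()
    upload = resp.json()

    # 2. Upload the file directly to Amazon S3 using the presigned URL
    with open(local_path, "rb") as fh:
        requests.put(upload["upload_url"], data=fh, timeout=600).raise_for_status()

    # 3. Confirm completion so the service marks the package as ingested
    requests.post(f"{API_URL}/uploads/{upload['upload_id']}/complete",
                  headers=headers, timeout=30).raise_for_status()
```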

Given that the application itself incurred minimal traffic (the actual file uploads were done through native S3 APIs), it was designed as a fully serverless solution. This approach ensured seamless scalability to meet demand fluctuations while significantly reducing the Total Cost of Ownership (TCO), embodying efficiency and cost-effectiveness in every aspect of its architecture.

Data ingestion using AWS services

To efficiently meet our objectives, our teams used a suite of native AWS services. AWS Lambda served as the main compute layer, seamlessly handling tasks such as generating presigned S3 URLs and executing the business logic. Amazon DynamoDB efficiently tracked the state of every data upload across all partners. Amazon API Gateway served as the backbone of the API, with granular API keys ensuring security and streamlining development.

Amazon S3 stores the data in the right location, while AWS Glue crawlers keep it easily discoverable for downstream processes. With Amazon Athena, data engineers can seamlessly run queries, making it easy to analyze data whenever needed. Amazon CloudWatch delivers robust monitoring with metrics, alerts, and dashboards, providing a comprehensive overview of the entire process. Security remained paramount, with AWS KMS used at every touchpoint to safeguard data throughout the project.
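For example, once the Glue crawlers have catalogued the ingested data, an engineer could run an ad-hoc query through Athena; the database, table, and result-location names below are placeholders:

```python
# Hypothetical ad-hoc Athena query over tables catalogued by the Glue crawlers.
import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="""
        SELECT dataset, COUNT(*) AS files
        FROM ingestion_log          -- placeholder table name
        GROUP BY dataset
    """,
    QueryExecutionContext={"Database": "data_lake"},                      # placeholder database
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},  # placeholder bucket
)
print("Started query:", response["QueryExecutionId"])
```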

The project unfolded in carefully planned phases. We began by collaboratively designing the API schema and defining the expected communication flow between systems, engaging closely with the data producers. This inclusive approach ensured all stakeholders had input into the final solution design. Once the technical blueprint was agreed upon, we transitioned into the implementation phase, turning the vision into reality.

After reviewing the first version of the Data Ingestion Service, addressing feedback, and eliminating rough edges, we deployed a UAT (User Acceptance Testing) environment, allowing data producers to adapt their systems to the new solution at a pace that aligned with their timelines.

With the application ready, we deployed it to the production environment. Gradually, each data producer transitioned to using the API for data ingestion. Once all producers were onboarded, legacy methods, such as direct access to Amazon S3, were systematically revoked, ensuring the API became the single, unified solution for all data ingestion activities.

Outcome: Streamlined data ingestion in the cloud

The project’s primary goal was achieved: Redge Technologies now has complete control over the data ingestion process. As the data lake owner, they dictate the terms of ingestion and ensure all data producers adhere to the same API standard for pushing data into the lake. This shift from reactive to proactive management has streamlined operations and reinforced the integrity of the data ecosystem.

Redge Technologies now benefits from complete visibility into data ingestion, monitoring the volume and frequency of data each producer contributes in real time. Onboarding a new producer is effortless: no additional setup is required beyond generating an API key. The simple, standardized API ensures seamless integration, while revoking access is equally straightforward: just disable the producer's API key. The result is a streamlined process that reinforces control, scalability, and security across their data operations.
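Conceptually, onboarding and offboarding reduce to API Gateway key management. A sketch of what that could look like with boto3, assuming a shared usage plan whose id is a placeholder:

```python
# Hypothetical onboarding/offboarding of a data producer via API Gateway API keys.
import boto3

apigw = boto3.client("apigateway")
USAGE_PLAN_ID = "example-usage-plan-id"  # placeholder


def onboard_producer(name: str) -> str:
    """Create an API key for a new producer and attach it to the shared usage plan."""
    key = apigw.create_api_key(name=name, enabled=True)
    apigw.create_usage_plan_key(usagePlanId=USAGE_PLAN_ID,
                                keyId=key["id"], keyType="API_KEY")
    return key["id"]


def revoke_producer(key_id: str) -> None:
    """Disable the producer's key; further calls to the ingestion API are rejected."""
    apigw.update_api_key(
        apiKey=key_id,
        patchOperations=[{"op": "replace", "path": "/enabled", "value": "false"}],
    )
```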

One of the significant benefits is the reallocation of human resources. With a standardized ingestion process, the data engineering team can now dedicate their efforts to data analytics and deriving insights, rather than being bogged down by the repetitive, time-consuming work of maintaining disparate data pipelines. This shift not only enhances efficiency but also amplifies business impact by focusing on strategic outcomes.

True to their innovative spirit, Redge Technologies is already looking ahead. They plan to enhance the API to accommodate streaming data producers, ensuring they stay ahead of emerging trends and continue to provide a scalable, future-proof solution.

In today’s fast-paced landscape, the ability to harness data effectively is a key differentiator, especially in fast-evolving markets like OTT, where agility and innovation define success.

A well-designed data ingestion API is crucial for a data lake with multiple data producers. It ensures that the data lake owner maintains complete ownership and transparency during the ingestion process.

Rafał Mituła, Cloud Data Architect, Chaos Gears

In the world of streaming platforms, keeping viewers engaged and in front of the screen is critical for success. Achieving this requires the ability to analyze massive volumes of data in real time, enabling platforms to deliver personalized, compelling content experiences. This isn't achievable without transforming chaotic data influxes into structured, efficient pipelines during data ingestion.

