Case study

Advancing medical research with MLOps

Learn how an innovative, HIPAA-compliant machine learning platform supports a healthcare leader in advancing global medical research.


Industry Healthcare
Size Enterprise
Key focus MLOps

Opportunity Innovating with machine learning

The Chaos Gears team was contacted by a Swiss leader in the global healthcare industry. Active in both research and pharma, the company recognized the opportunity that machine learning offers its scientific teams, an approach still rare and innovative in healthcare.

Drawn to the idea of supporting their research and innovation with machine learning, the client wanted to create a secure, internal ML platform based on open source technologies. The project was initially meant to support around 20 of their data science teams. These teams had compiled a vast resource of healthcare research data, and the project’s goal was simple: to create a solution that would allow them to benefit from this data and further global medical research through machine learning.

The ML platform’s objective was to speed up their experiments, automate repeatable workflows, allow them to safely host machine learning models in a high-scale production environment, and make the entire process observable, monitored, and rock-solid.

What they needed was someone competent to make it happen. Never before had they attempted an ML implementation in a production environment, and they had no Kubernetes or MLOps professionals on their team. Additionally, they needed a solution compliant with their high internal security standards and HIPAA regulations.

The timeframe for the project was also limited — six months, from day one to final delivery.

Lack of expertise blocks innovation and agility

The project analyzed in this machine learning case study was purely technical. The client’s plan to create a machine learning platform was research-based and didn’t stem from business needs.

However, they faced a major security challenge.

Our client already had their own Docker images that created various machine learning models and could serve as a baseline for the future platform. However, these were far from compliant with security standards or best practices for building such solutions. To prevent backdoor attacks and dangerous data leaks, the new platform had to meet these standards and ensure the strongest security at all levels.

On top of this, multiple problems stemmed from the underdeveloped state of their current solution, the lack of safe and reliable integration with AWS infrastructure components, and the general immaturity of MLOps techniques at the time.

All that combined created pressure to deliver the project quickly. Rapid development was essential for success.

Solution Delivering a machine learning pipeline within a flexible collaboration model

The way Chaos Gears teams work aligned well with what the client expected. They looked for a highly flexible collaboration model, quick development and an easy way to take over the project. And that was exactly what we delivered to them.

The combination of these factors made us a perfect partner for the project.

Bespoke process based on proven methodology

Two Chaos Gears specialists joined the client’s team to help the project succeed. The combined development team consisted of ten people, including a Product Owner.

The tasks weren’t all of the same nature. Here’s what we had to do:

  • Introduce the potential benefits of the platform to the company’s business departments,
  • Create the proof of concept,
  • Design the initial architecture,
  • Suggest the best way to implement the solution,
  • Devise a plan for adapting to potential changes in the concept,
  • Cover the early development.

The actual development of the ML platform involved four stages over six months:

  1. Research and discovery,
  2. Early development (setting milestones and implementing the environment),
  3. Delivery of the platform in the initial form,
  4. User acceptance testing by employees involved in and related to the project.

All work and actions in each stage followed the best practices used for similar projects and AWS environments.

Empowering science with machine learning

Without focusing on the technical details of this machine learning case study, we will outline the basics of our approach.

Machine learning in healthcare is a unique space that hasn’t been fully explored yet, and there are only a few widely adopted open source ML platforms. That’s why it was essential to base the technological approach on deep research and our MLOps market knowledge. We went with Kubeflow, a Kubernetes-native solution that provided all the functionality our client needed.

When it came to serving models, we had to make a crucial choice. KServe integrates natively and nicely with Kubeflow, as both were created by the same community. However, Seldon offered more sophisticated functionality, and that is what we picked.
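As an illustration, a model served by Seldon Core exposes a REST endpoint that accepts an ndarray payload. The sketch below shows how such a request could be built and its response parsed; the URL, namespace, and deployment name are hypothetical placeholders, not the client’s actual values.

```python
import json

# Hypothetical endpoint following Seldon Core's REST protocol layout:
# /seldon/<namespace>/<deployment>/api/v1.0/predictions
SELDON_URL = "http://gateway.example.internal/seldon/ml-prod/demo-model/api/v1.0/predictions"

def build_payload(rows):
    """Wrap feature rows in Seldon's ndarray request format."""
    return {"data": {"ndarray": rows}}

def parse_predictions(response_body: str):
    """Extract the prediction rows from a Seldon JSON response body."""
    body = json.loads(response_body)
    return body["data"]["ndarray"]
```

Keeping payload construction and response parsing separate from the HTTP call makes the serving contract easy to unit-test without a live cluster.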

Data science teams performed their experiments and research inside Amazon SageMaker, which was not as fully fledged at the time as it is today. However, our client enjoyed the elastic compute power it provided, along with its hosted Jupyter notebooks.
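To illustrate how such a hosted notebook could be provisioned programmatically, here is a minimal sketch using boto3’s SageMaker client. The instance type, volume size, and role ARN are placeholder assumptions, not details from the client’s setup.

```python
def notebook_request(name: str,
                     instance_type: str = "ml.t3.medium",
                     role_arn: str = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"):
    """Build the parameters for SageMaker's create_notebook_instance call.
    The role ARN and sizes here are illustrative placeholders."""
    return {
        "NotebookInstanceName": name,
        "InstanceType": instance_type,
        "RoleArn": role_arn,
        "VolumeSizeInGB": 50,  # assumed storage size for experiment data
    }

def launch_notebook(params):
    """Submit the request; needs AWS credentials, so boto3 is imported lazily."""
    import boto3
    client = boto3.client("sagemaker")
    return client.create_notebook_instance(**params)
```

Separating parameter construction from the API call keeps the request shape testable without AWS credentials.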

We then had to integrate all the technologies used - Kubeflow, Seldon and Amazon SageMaker. Thanks to our planning during the initial stage of the project, we were certain that they all would function well together. And indeed they did.

Additionally, we chose a blue/green deployment methodology. There were several instances of the MLOps platform, allowing us to rapidly introduce changes to only a subset of the client’s teams. The entire platform ran on Kubernetes, Istio and Grafana Loki, giving us real-time access to all the crucial metrics and logs we needed.
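Conceptually, a blue/green rollout sends a fixed share of traffic to the new (“green”) instance while the rest stays on the stable (“blue”) one. In Istio, such a split is declared on a VirtualService; the sketch below mimics the idea as a deterministic weighted router in Python, with purely illustrative names.

```python
import hashlib

def route(team_id: str, green_weight: int) -> str:
    """Deterministically map a team to the 'blue' (stable) or 'green' (new)
    platform instance, mimicking a weighted traffic split.
    green_weight is a percentage in [0, 100]."""
    # Hash the team id into a stable bucket from 0 to 99.
    bucket = int(hashlib.sha256(team_id.encode()).hexdigest(), 16) % 100
    return "green" if bucket < green_weight else "blue"
```

Hashing the team id (rather than picking randomly per request) keeps each team pinned to one instance, which matters when only a subset of teams should see the new version.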

Outcome Reliably lower time to value with a cloud-native MLOps pipeline

Despite the many challenges of this project and its limited timeframe, it turned out to be a complete success. Chaos Gears specialists ensured the results met the project criteria and the delivery was completed on schedule, milestone by milestone.

We provided the client with the necessary foundations and competencies without which the platform couldn’t exist. Thanks to Chaos Gears' MLOps expertise, the client received a reliable, cloud-native and production-ready platform.

The solution finally enabled the client’s data science teams to run their machine learning models in production in a structured, repeatable, safe and secure manner, and to serve their models to the outside world. The platform initially supported 20 teams, but was designed to easily scale up whenever a new data science team was added.

Currently, the client uses their own resources to further develop the solution. Benefiting from the experience they gained working with us, they no longer need external contractors.

The machine learning platform developed together with Chaos Gears continues to support our client in advancing global medical research.

Core tech

Kubeflow, Seldon, Amazon SageMaker, Kubernetes, Istio, Grafana Loki

We'd love to help you too

Every successful project is unique — as will be yours. Get in touch.