MLOps Architectural Models: An Advanced Guide to MLOps in Practice

Learn about MLOps architectural challenges and ways to manage complexity through this end-to-end guide for MLOps app development with a financial institution use case.

By Daniela Kolarova · DZone Core · Mar. 30, 24 · Tutorial


Editor's Note: The following is an article written for and published in DZone's 2024 Trend Report, Enterprise AI: The Emerging Landscape of Knowledge Engineering.


AI continues to transform businesses, but it also confronts enterprises with new digital transformation and organizational challenges. Based on a 2023 Forbes report, those challenges can be summarized as follows:

  • Companies whose analytical tech stacks are built around analytical/batch workloads need to start adapting to real-time data processing (Forbes). This change affects not only the way data is collected but also calls for new data processing and data analytics architectural models.
  • AI regulations need to be considered as part of AI/ML architectural models. According to Forbes, "Gartner predicts that by 2025, regulations will force companies to focus on AI ethics, transparency, and privacy." Hence, those platforms will need to comply with upcoming standards.
  • Specialized AI teams must be built, and they should be capable of not only building and maintaining AI platforms but also collaborating with other teams to support models' lifecycles through those platforms.

The answer to these new challenges seems to be MLOps, or machine learning operations. MLOps builds on top of DevOps and DataOps to facilitate the delivery of machine learning (ML) applications and to better manage the complexity of ML systems. The goal of this article is to provide a systematic overview of MLOps architectural challenges and demonstrate ways to manage that complexity.

MLOps Application: Setting Up the Use Case

For this article, our example use case is a financial institution that has been conducting macroeconomic forecasting and investment risk management for years. Currently, the forecasting process is based on partially manual loading and postprocessing of external macroeconomic data, followed by statistical modeling using various tools and scripts based on personal preferences.

However, according to the institution's management, this process is no longer acceptable due to recently announced banking regulations and security requirements. In addition, the delivery of calculated results is too slow and too costly compared to competitors in the market. Investment in a new digital solution requires a good understanding of the complexity and the expected cost, and it should start with gathering requirements and subsequently building a minimum viable product.

Requirements Gathering

For solution architects, the design process starts with a specification of problems that the new architecture needs to solve — for example:

  • Manual data collection is slow, error-prone, and requires a lot of effort
  • Real-time data processing is not part of the current data loading approach
  • There is no data versioning and, hence, reproducibility is not supported over time
  • The model's code is triggered manually on local machines and constantly updated without versioning
  • Data and code sharing via a common platform is completely missing
  • The forecasting process is not represented as a business process; the steps are distributed and unsynchronized, and most of them require manual effort
  • Experiments with the data and models are not reproducible and not auditable
  • Scalability is not supported in case of increased memory consumption or CPU-heavy operations
  • Monitoring and auditing of the whole process are currently not supported

The following diagram demonstrates the four main components of the new architecture: monitoring and auditing platform, model deployment platform, model development platform, and data management platform.

Figure 1. MLOps architecture diagram

Platform Design Decisions

The two main strategies to consider when designing an MLOps platform are:

  1. Developing from scratch vs. selecting a platform
  2. Choosing between a cloud-based, on-premises, or hybrid model

Developing From Scratch vs. Choosing a Fully Packaged MLOps Platform

Building an MLOps platform from scratch is the most flexible solution. It would make it possible to address any future needs of the company without depending on other companies and service providers. It would be a good choice if the company already has the required specialists and trained teams to design and build an ML platform.

A prepackaged solution would be a good option to model a standard ML process that does not need many customizations. One option would even be to buy a pretrained model (e.g., model as a service), if available on the market, and build only the data loading, monitoring, and tracking modules around it. The disadvantage of this type of solution is that if new features are needed, it might be hard to have them added in time.

Buying a platform as a black box often requires building additional components around it. An important criterion when choosing a platform is therefore whether it can be extended or customized.

Cloud-Based, On-Premises, or Hybrid Deployment Model

Cloud-based solutions are already on the market, with popular options provided by AWS, Google, and Azure. Where there are no strict data privacy requirements or regulations, cloud-based solutions are a good choice due to the practically unlimited infrastructure resources for model training and model serving. An on-premises solution would be acceptable for very strict security requirements or if the infrastructure is already available within the company. The hybrid solution is an option for companies that have already built part of the systems but want to extend them with additional services, e.g., buying a pretrained model and integrating it with locally stored data or incorporating it into an existing business process model.

MLOps Architecture in Practice

The financial institution from our use case does not have enough specialists to build a professional MLOps platform from scratch, but it also does not want to invest in an end-to-end managed MLOps platform due to regulations and additional financial restrictions. The institution's architectural board has decided to adopt an open-source approach and buy tools only when needed. The architectural concept centers on minimalistic components and a composable system, built around microservices that cover nonfunctional requirements like scalability and availability. Striving for maximal simplicity, the following decisions were made for the system components.

Data Management Platform

The data collection process will be fully automated. There will be a separate data loading component for each data source due to the heterogeneity of external data providers. The database choice is crucial when it comes to writing real-time data and reading large amounts of data. Due to the time-series nature of the macroeconomic data and the institution's existing relational database specialists, the team chose the open-source database TimescaleDB.

The possibility to provide a standard SQL-based API, perform data analytics, and conduct data transformations using standard relational database GUI clients will decrease the time to deliver a first prototype of the platform. Data versions and transformations can be tracked and saved as separate versions or tables.
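As an illustration, a minimal sketch of this data layer is shown below, assuming TimescaleDB with the psycopg2 driver; the connection string, table, and column names are hypothetical placeholders rather than the institution's actual schema.

```python
# Minimal sketch of the data management layer, assuming TimescaleDB and psycopg2;
# all names (database, table, columns) are illustrative assumptions.
import psycopg2

conn = psycopg2.connect("dbname=macro_forecasting user=mlops")  # hypothetical DSN
with conn, conn.cursor() as cur:
    # Raw observations from an external provider, keyed by time and indicator.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS macro_observations (
            observed_at TIMESTAMPTZ NOT NULL,
            indicator   TEXT        NOT NULL,
            value       DOUBLE PRECISION,
            source      TEXT,
            version     INT         NOT NULL DEFAULT 1
        );
    """)
    # Turn the table into a TimescaleDB hypertable partitioned by time.
    cur.execute(
        "SELECT create_hypertable('macro_observations', 'observed_at', if_not_exists => TRUE);"
    )
    # A transformation can be materialized into its own versioned table,
    # keeping each snapshot reproducible and auditable.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS macro_observations_v2 AS
        SELECT observed_at, indicator, value
        FROM macro_observations
        WHERE version = 2;
    """)
conn.close()
```

Materializing each transformation into its own versioned table keeps every snapshot queryable with plain SQL, which supports the reproducibility and auditability requirements listed earlier.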

Model Development Platform

The model development process consists of four steps:

  1. Data reading and transformation
  2. Model training
  3. Model serialization
  4. Model packaging

Once the model is trained, the parametrized and trained instance is usually stored as a packaged artifact. The most common solution for code storage and versioning is Git. Furthermore, the financial institution is already equipped with a solution like GitHub, which provides functionality to define pipelines for building, packaging, and publishing the code. The architecture of Git-based systems usually relies on a set of distributed worker machines executing the pipelines. Those pipelines will also be used to train the model as part of the minimalistic MLOps architectural prototype.
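The four steps above can be expressed as a short, pipeline-friendly training script. The following is a minimal sketch assuming pandas, scikit-learn, and joblib; the query, feature layout, target indicator, and artifact path are illustrative assumptions, not the institution's actual setup.

```python
# Minimal sketch of steps 1-4: read data, train, serialize, and leave packaging
# to the CI pipeline. Library choices and all names/paths are assumptions.
from pathlib import Path

import joblib
import pandas as pd
from sklearn.linear_model import Ridge

# 1. Data reading and transformation (hypothetical query and connection URI)
df = pd.read_sql(
    "SELECT observed_at, indicator, value FROM macro_observations",
    "postgresql://mlops@localhost/macro_forecasting",
)
features = df.pivot(index="observed_at", columns="indicator", values="value").dropna()
X = features.drop(columns=["gdp_growth"])   # hypothetical target indicator
y = features["gdp_growth"]

# 2. Model training
model = Ridge(alpha=1.0).fit(X, y)

# 3. Model serialization
Path("artifacts").mkdir(exist_ok=True)
joblib.dump(model, "artifacts/macro_model-1.0.0.joblib")

# 4. Model packaging is delegated to the Git-based pipeline, which versions and
#    publishes the artifact produced above.
```

Because the script is plain Python with no hidden state, it can run unchanged on the distributed worker machines that execute the Git-based pipelines.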

After training a model, the next step is to store it in a model repository as a released and versioned artifact. Storing the model as a binary file in a database, on a shared file system, or even in an artifacts repository are all acceptable options at that stage. Later, a model registry or a blob storage service could be incorporated into the pipeline. A model API microservice will expose the model's functionality for macroeconomic projections.
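Such a model API microservice could look like the minimal sketch below, assuming FastAPI and the serialized joblib artifact from the previous step; the endpoint, payload schema, and file path are hypothetical.

```python
# Minimal sketch of the model API microservice, assuming FastAPI and the joblib
# artifact produced by the training step; all names are illustrative.
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Macro Projection Service")
model = joblib.load("artifacts/macro_model-1.0.0.joblib")

class ProjectionRequest(BaseModel):
    # One row of indicator values, keyed by indicator name (hypothetical schema).
    indicators: dict[str, float]

@app.post("/projections")
def project(req: ProjectionRequest) -> dict:
    # Feature order must match the training layout; simplified for the sketch.
    prediction = model.predict([list(req.indicators.values())])[0]
    return {"projection": float(prediction)}

# Run with: uvicorn model_api:app --host 0.0.0.0 --port 8000
```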

Model Deployment Platform

The decision to keep the MLOps prototype as simple as possible applies to the deployment phase as well. The deployment model is based on a microservices architecture. Each model can be deployed as a stateless service in a Docker container and scaled on demand. That principle applies to the data loading components, too. Once that first deployment step is achieved and the dependencies of all the microservices are clarified, a workflow engine might be needed to orchestrate the established business processes.
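As a sketch of the stateless-container principle, the snippet below starts one instance of the model API using the Docker SDK for Python; the image name, port, and environment variable are assumptions. In practice, a compose file or an orchestrator would typically schedule and scale these containers rather than ad hoc API calls.

```python
# Minimal sketch of running the model API as a stateless container via the
# Docker SDK for Python (docker-py); image name and settings are hypothetical.
import docker

client = docker.from_env()
container = client.containers.run(
    "registry.internal/macro-model-api:1.0.0",   # hypothetical image
    detach=True,
    ports={"8000/tcp": 8000},
    restart_policy={"Name": "always"},
    environment={"MODEL_PATH": "/models/macro_model-1.0.0.joblib"},
)
print("started container", container.short_id)
```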

Model Monitoring and Auditing Platform

Traditional microservices architectures are already equipped with tools for gathering, storing, and monitoring log data. Tools like Prometheus, Kibana, and Elasticsearch are flexible enough to produce specific auditing and performance reports.
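For example, the model API can expose request counts and latencies to Prometheus with the prometheus_client library, as in the minimal sketch below; the metric names and port are illustrative assumptions.

```python
# Minimal sketch of exposing Prometheus metrics from a model service using the
# prometheus_client library; metric names and the scrape port are assumptions.
import time
from prometheus_client import Counter, Histogram, start_http_server

PROJECTION_REQUESTS = Counter(
    "projection_requests_total", "Number of projection requests served"
)
PROJECTION_LATENCY = Histogram(
    "projection_latency_seconds", "Latency of projection requests"
)

def handle_projection(payload):
    PROJECTION_REQUESTS.inc()
    start = time.perf_counter()
    result = {"projection": 0.0}  # placeholder for the real model call
    PROJECTION_LATENCY.observe(time.perf_counter() - start)
    return result

if __name__ == "__main__":
    start_http_server(9100)  # Prometheus scrapes metrics from this port
    while True:
        time.sleep(60)
```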

Open-Source MLOps Platforms

A minimalistic MLOps architecture is a good start for the initial digital transformation of a company. However, keeping track of available MLOps tools in parallel is crucial for the next design phase. The following table provides a summary of some of the most popular open-source tools.

Table 1. Open-source MLOps tools for initial digital transformations

Tool | Description | Functional Areas
Kubeflow | Makes deployments of ML workflows on Kubernetes simple, portable, and scalable | Tracking and versioning, pipeline orchestration, and model deployment
MLflow | An open-source platform for managing the end-to-end ML lifecycle | Tracking and versioning
BentoML | An open standard and SDK for AI apps and inference pipelines; provides features like auto-generation of API servers, REST APIs, gRPC, and long-running inference jobs; also auto-generates Docker container images | Tracking and versioning, pipeline orchestration, model development, and model deployment
TensorFlow Extended (TFX) | A production-ready platform designed for deploying and managing ML pipelines; includes components for data validation, transformation, model analysis, and serving | Model development, pipeline orchestration, and model deployment
Apache Airflow, Apache Beam | Flexible frameworks for defining and scheduling complex workflows, data workflows in particular, including ML | Pipeline orchestration
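If MLflow is later adopted for tracking and versioning, each run of the training script shown earlier could be recorded as in the minimal sketch below; the experiment name, parameters, metric, and artifact path are illustrative.

```python
# Minimal sketch of experiment tracking with MLflow, one of the tools from
# Table 1; all logged names and values are illustrative assumptions.
import mlflow

mlflow.set_experiment("macro-forecasting")

with mlflow.start_run():
    mlflow.log_param("model_type", "ridge")
    mlflow.log_param("alpha", 1.0)
    mlflow.log_metric("validation_rmse", 0.42)  # hypothetical score
    mlflow.log_artifact("artifacts/macro_model-1.0.0.joblib")
```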

Summary

MLOps is often called DevOps for machine learning, and it is essentially a set of architectural patterns for ML applications. However, despite the similarities with many well-known architectures, the MLOps approach brings some new challenges for MLOps architects. On the one hand, the focus must be on the compatibility and composition of MLOps services. On the other hand, AI regulations will force existing systems and services to constantly adapt to new regulatory rules and standards. I suspect that as the MLOps field continues to evolve, a new type of service providing AI ethics and regulatory analytics will soon become the focus of businesses in the ML domain.

This is an excerpt from DZone's 2024 Trend Report, Enterprise AI: The Emerging Landscape of Knowledge Engineering.



Opinions expressed by DZone contributors are their own.
