The term "MLOps" combines "machine learning" with the DevOps practices of software development. MLOps is a set of practices for deploying and maintaining machine learning models in production. It sits at the intersection of data science, software engineering, and DevOps, and it requires data scientists, machine learning engineers, data engineers, and software engineers to work together to put models into production. AWS MLOps is the practice of managing and integrating machine learning pipelines on AWS machine learning services.
Advantages / Strengths
MLOps aims to unify the release cycle of machine learning models with that of software applications.
MLOps enables automated testing of machine learning artifacts, such as data validation, ML model testing, and ML model integration testing (a minimal sketch follows this list).
MLOps enables the application of agile principles to machine learning projects.
MLOps treats machine learning models, and the datasets used to build them, as first-class citizens within CI/CD systems.
MLOps reduces technical debt across machine learning models.
MLOps must be a language-, framework-, platform-, and infrastructure-agnostic practice.
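To illustrate the automated-testing point above, here is a minimal sketch of a data-validation check that could run as a CI/CD stage; the column names, thresholds, and file name are all hypothetical:

    import pandas as pd

    def validate_training_data(df: pd.DataFrame) -> None:
        # Hypothetical schema and quality checks for a CI/CD data-validation stage.
        expected = {"feature_a", "feature_b", "label"}          # assumed schema
        assert expected.issubset(df.columns), "missing required columns"
        assert df["label"].isin([0, 1]).all(), "labels must be binary"
        assert df["feature_a"].notna().mean() > 0.95, "too many missing values"

    if __name__ == "__main__":
        validate_training_data(pd.read_csv("train.csv"))        # assumed file name
        print("data validation passed")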
Model Artifacts
A model artifact is the collection of files produced by a training job that is required for model deployment. In the AWS SageMaker API, model artifacts are the output of training a model and typically consist of trained parameters, a model definition that describes how to compute inferences, and other metadata.
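For example, the S3 location of a training job's model artifact can be retrieved with boto3; the training job name below is a placeholder:

    import boto3

    sm = boto3.client("sagemaker")
    # Describe a completed training job ("my-training-job" is a placeholder name).
    response = sm.describe_training_job(TrainingJobName="my-training-job")
    # The artifact is an archive in S3, e.g. s3://bucket/prefix/model.tar.gz
    print(response["ModelArtifacts"]["S3ModelArtifacts"])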
Tracking
Besides generating the model artifacts, MLOps needs to track the code that builds them, the data they were trained and tested on, and how all three are related. This lineage makes it possible to reproduce results and to deliver solutions to customers through automation across the stages of application development.
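One lightweight way to capture that lineage, sketched below with hypothetical paths and field names, is to record the code version, a fingerprint of the training data, and the resulting artifact in a single metadata record:

    import hashlib
    import json
    import subprocess

    def lineage_record(data_path: str, artifact_s3_uri: str) -> dict:
        # Tie together the code commit, the training data, and the artifact.
        commit = subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()
        with open(data_path, "rb") as f:
            data_hash = hashlib.sha256(f.read()).hexdigest()
        return {"code_commit": commit, "data_sha256": data_hash, "artifact": artifact_s3_uri}

    print(json.dumps(lineage_record("train.csv", "s3://my-bucket/model.tar.gz"), indent=2))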
Tools
The tools used in MLOps depend on the nature of the project. Amazon launched Amazon SageMaker in November 2017. Amazon SageMaker is a fully managed machine learning platform that makes it easy for data scientists and developers to build and train machine learning models quickly and then deploy them into production. Amazon SageMaker provides MLOps tooling to help developers automate and standardize processes throughout the ML lifecycle, which increases productivity when training, testing, troubleshooting, deploying, and governing ML models.
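As a rough sketch of that train-and-deploy lifecycle with the SageMaker Python SDK (the container image, IAM role, and S3 paths are placeholders):

    import sagemaker
    from sagemaker.estimator import Estimator

    session = sagemaker.Session()
    estimator = Estimator(
        image_uri="<training-image-uri>",                         # placeholder image
        role="arn:aws:iam::123456789012:role/SageMakerRole",      # placeholder role
        instance_count=1,
        instance_type="ml.m5.xlarge",
        output_path="s3://my-bucket/output",                      # placeholder bucket
        sagemaker_session=session,
    )
    estimator.fit({"train": "s3://my-bucket/train"})   # launches a training job
    predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")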
SageMaker Features
SageMaker JumpStart - contains built-in algorithms and pre-built machine learning (ML) solutions that can be deployed with just a few clicks.
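Assuming a recent version of the SageMaker Python SDK, deploying a JumpStart model can look like the following sketch; the model_id is a placeholder for an identifier from the JumpStart catalog:

    from sagemaker.jumpstart.model import JumpStartModel

    # model_id is a placeholder; it must match an entry in the JumpStart catalog.
    model = JumpStartModel(model_id="<jumpstart-model-id>")
    predictor = model.deploy()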
SageMaker Studio - provides an integrated machine-learning environment for building, training, deploying, and analyzing models. SageMaker Studio provides all the tools to facilitate a good workflow, from data preparation to experimentation to production, making implementing MLOps convenient.
SageMaker Ground Truth Plus - provides data labeling as a managed service, delivering labeled datasets that are ready to use in any project. It uses an expert workforce to deliver quality, saving time and reducing costs, all without having to build labeling applications or manage the workforce.
SageMaker Model Building Pipelines - a tool for building machine learning pipelines that benefits from direct SageMaker integration. This integration makes it straightforward to create a pipeline and set up SageMaker Projects for orchestration, which improves MLOps practices throughout the development process.
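A minimal pipeline sketch, reusing the hypothetical estimator from the earlier example (the role ARN and S3 path remain placeholders):

    from sagemaker.workflow.pipeline import Pipeline
    from sagemaker.workflow.steps import TrainingStep

    # Wrap the training job in a pipeline step.
    train_step = TrainingStep(
        name="TrainModel",
        estimator=estimator,                          # estimator from the earlier sketch
        inputs={"train": "s3://my-bucket/train"},     # placeholder S3 path
    )
    pipeline = Pipeline(name="my-mlops-pipeline", steps=[train_step])
    pipeline.upsert(role_arn="arn:aws:iam::123456789012:role/SageMakerRole")  # placeholder
    execution = pipeline.start()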
SageMaker Debugger - a tool for detecting and diagnosing problems in training jobs. It does not only detect issues; it also profiles the training job and sends alerts when bugs or anomalies are found, helping make ML models more robust. Using the captured metrics and tensors, it can help identify the root cause of a problem, and it makes inspecting training parameters during the training process very easy.
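For instance, a built-in Debugger rule can be attached to a training job like this; the estimator configuration is assumed from the earlier sketch:

    from sagemaker.debugger import Rule, rule_configs
    from sagemaker.estimator import Estimator

    # A built-in rule that fires when the training loss stops decreasing.
    rules = [Rule.sagemaker(rule_configs.loss_not_decreasing())]

    estimator = Estimator(
        image_uri="<training-image-uri>",                     # placeholder image
        role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
        instance_count=1,
        instance_type="ml.m5.xlarge",
        rules=rules,   # Debugger evaluates this rule while the job runs
    )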
SageMaker Model Monitor - monitors and analyzes models in production to detect unwanted scenarios such as data drift and deviations in model quality. It provides the ability to set up continuous monitoring in real time, batch monitoring, and on-schedule monitoring for asynchronous batch transform jobs, and it can send alerts that notify when there is a deviation in model quality. It complements the Debugger: rather than flagging errors during training, it flags issues in production models and data.
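A sketch of setting up a monitoring baseline with the SDK; the role and S3 paths are placeholders, and a recurring schedule can then be created with create_monitoring_schedule:

    from sagemaker.model_monitor import DefaultModelMonitor
    from sagemaker.model_monitor.dataset_format import DatasetFormat

    monitor = DefaultModelMonitor(
        role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
        instance_count=1,
        instance_type="ml.m5.xlarge",
    )
    # Compute baseline statistics and constraints from the training data;
    # production traffic is later compared against this baseline to detect drift.
    monitor.suggest_baseline(
        baseline_dataset="s3://my-bucket/train.csv",          # placeholder path
        dataset_format=DatasetFormat.csv(header=True),
        output_s3_uri="s3://my-bucket/baseline",              # placeholder path
    )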
SageMaker Processing - deals with data processing workloads such as feature engineering, data validation, model evaluation, and model interpretation. It provides a simplified, well-managed experience for running these workloads, and its APIs can be used both during the experimentation phase and after the code is deployed, which makes it a very good fit for MLOps.
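A minimal Processing sketch using the built-in scikit-learn container; the script name, role, and S3 paths are placeholders:

    from sagemaker.sklearn.processing import SKLearnProcessor
    from sagemaker.processing import ProcessingInput, ProcessingOutput

    processor = SKLearnProcessor(
        framework_version="0.23-1",                           # assumed container version
        role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder role
        instance_count=1,
        instance_type="ml.m5.xlarge",
    )
    # Run a (hypothetical) feature-engineering script against raw data in S3.
    processor.run(
        code="preprocess.py",                                 # placeholder script
        inputs=[ProcessingInput(source="s3://my-bucket/raw",
                                destination="/opt/ml/processing/input")],
        outputs=[ProcessingOutput(source="/opt/ml/processing/output",
                                  destination="s3://my-bucket/processed")],
    )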
Practical Implementations
Students enrolled in any AI-related course from Carnegie Training Institute have access to practical and working implementation guidelines.