

AWS SageMaker Projects - ML deployment strategy

 

Generally, any project is set up to solve a specific business problem. A large project may contain several smaller projects, each with a definite objective and a clearly defined performance index. Common problems faced by companies across many industries include, but are not limited to, customer retention, profit margin growth, customer attrition, and product development. Artificial intelligence is being adopted across industries as a proven way to address these problems and deliver a return on investment. Suppose you have a data lake and a machine learning pipeline: what is the best strategy for delivering these AI-driven solutions to market? SageMaker Projects is one of the strongest options.

What 

SageMaker Projects helps organizations set up and standardize developer environments for data scientists and CI/CD systems for MLOps engineers. Projects also help organizations set up dependency management, code repository management, build reproducibility, and artifact sharing.

SageMaker Projects

SageMaker Projects builds on SageMaker Pipelines by providing several MLOps templates that automate model building and deployment pipelines using continuous integration and continuous delivery (CI/CD).

With SageMaker Projects, MLOps engineers and organization admins can define their own templates or use SageMaker-provided templates. The SageMaker-provided templates bootstrap the ML workflow with source version control, automated ML pipelines, and a set of code to quickly start iterating over ML use cases.

You can provision SageMaker projects from the AWS Service Catalog using custom or SageMaker-provided templates.
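Provisioning a project from a Service Catalog template can also be done programmatically. The following is a minimal sketch, assuming the boto3 SDK and the SageMaker `create_project` call; the project name and the product and provisioning artifact IDs are hypothetical placeholders, not values from this article. The request is built by a separate function so it can be inspected without AWS credentials.

```python
from typing import Any, Dict


def build_create_project_request(
    project_name: str,
    product_id: str,
    provisioning_artifact_id: str,
    description: str = "",
) -> Dict[str, Any]:
    """Assemble keyword arguments for the SageMaker CreateProject API."""
    request: Dict[str, Any] = {
        "ProjectName": project_name,
        "ServiceCatalogProvisioningDetails": {
            "ProductId": product_id,
            "ProvisioningArtifactId": provisioning_artifact_id,
        },
    }
    if description:
        request["ProjectDescription"] = description
    return request


def create_project(request: Dict[str, Any]) -> str:
    """Submit the request. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the builder above works without the AWS SDK

    response = boto3.client("sagemaker").create_project(**request)
    return response["ProjectArn"]


# Hypothetical IDs: look up the real ones in your Service Catalog portfolio.
example_request = build_create_project_request(
    project_name="churn-model",
    product_id="prod-examplexxxxxxx",
    provisioning_artifact_id="pa-examplexxxxxxx",
    description="Customer attrition model",
)
```

Calling `create_project(example_request)` in an account where the template is available would start provisioning; the same project can also be created from the Studio UI.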

When to use SageMaker Projects

Some of the justifications for using SageMaker Projects are:

  1. A team of data scientists and ML engineers sharing code needs a more scalable way to maintain code consistency and strict version control.
  2. The organization needs a set of standards and practices that provide security and governance for its AWS environment.
  3. The organization wants first-party templates so that it can get started quickly with ML workflows and CI/CD.
  4. Teams need templates for projects that use AWS-native CI/CD services, such as AWS CodeBuild, AWS CodePipeline, and AWS CodeCommit.
  5. Teams need to create projects that use third-party tools, such as Jenkins and GitHub.
  6. The organization needs tight control over the MLOps resources that it provisions and manages. That responsibility includes tasks such as configuring IAM roles and policies, enforcing resource tags, enforcing encryption, and decoupling resources across multiple accounts. SageMaker Projects supports all of these tasks through custom template offerings, in which organizations use AWS CloudFormation templates to define the resources needed for an ML workflow.
  7. Data scientists need to choose a template to bootstrap and preconfigure their ML workflow. Custom templates are created as Service Catalog products, and you can provision them in the Studio UI under Organization Templates. Service Catalog is a service that helps organizations create and manage catalogs of products that are approved for use on AWS.
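To make the custom-template idea concrete, here is a hedged sketch of a very small CloudFormation template, written as a Python dictionary, together with a simple governance check for encryption. It assumes that custom project templates accept `SageMakerProjectName` and `SageMakerProjectId` parameters, as the AWS documentation describes; confirm this against the current docs. The bucket resource, the `team` tag, and the helper function are hypothetical illustrations.

```python
import json
from typing import Any, Dict, List

# Hypothetical custom project template with one encrypted, tagged S3 bucket.
template: Dict[str, Any] = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch of a custom SageMaker project template",
    "Parameters": {
        # Assumed to be supplied by SageMaker when the project is provisioned.
        "SageMakerProjectName": {"Type": "String"},
        "SageMakerProjectId": {"Type": "String"},
    },
    "Resources": {
        "ArtifactBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {
                "BucketEncryption": {
                    "ServerSideEncryptionConfiguration": [
                        {"ServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}
                    ]
                },
                # Illustrative organization tag; the key and value are made up.
                "Tags": [{"Key": "team", "Value": "ml-platform"}],
            },
        }
    },
}


def unencrypted_buckets(cfn_template: Dict[str, Any]) -> List[str]:
    """Return logical IDs of S3 buckets that do not declare default encryption."""
    return [
        logical_id
        for logical_id, resource in cfn_template["Resources"].items()
        if resource["Type"] == "AWS::S3::Bucket"
        and "BucketEncryption" not in resource.get("Properties", {})
    ]


# Serialize to JSON, the form in which the template could be uploaded.
template_json = json.dumps(template, indent=2)
```

A check like `unencrypted_buckets` could run in a review step before a template is published as a Service Catalog product; a real template would also define repositories, pipelines, and IAM roles.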

SageMaker Project Collaboration

SageMaker Projects can help you manage your Git repositories so that you can collaborate more efficiently across teams, ensure code consistency, and support CI/CD. SageMaker Projects can help you with the following tasks:

  1. Organize all entities of the ML lifecycle under one project.
  2. Establish a single-click approach to set up standard ML infrastructure for model training and deployment that incorporates best practices.
  3. Create and share templates for ML infrastructure to serve multiple use cases.
  4. Leverage SageMaker-provided pre-built templates to quickly start focusing on model building, or create custom templates with organization-specific resources and guidelines.
  5. Integrate with tools of your choice by extending the project templates. For an example, see Create a SageMaker Project to integrate with GitLab and GitLab Pipelines.

Typical SageMaker Project

A typical SageMaker project consists of:

  1. One or more repositories with sample code to build and deploy ML solutions. You can clone the code and modify it to suit your needs. You own this code and can take advantage of the version-controlled repositories for your tasks.
  2. A SageMaker pipeline that defines steps for data preparation, training, model evaluation, and model deployment.
  3. A CodePipeline or Jenkins pipeline that runs your SageMaker pipeline every time you check in a new version of the code.
  4. A model group that contains model versions. Every time you approve the resulting model version from a SageMaker pipeline run, you can deploy it to a SageMaker endpoint.
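The approval step in the last item can be illustrated with a short sketch that picks the newest approved model version from a model group. It assumes the boto3 `list_model_packages` call and the `ModelPackageSummaryList` shape of its response; the ARNs and dates in the sample data are made up for illustration.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional


def latest_approved_model(summaries: List[Dict[str, Any]]) -> Optional[str]:
    """Return the ARN of the most recently created approved model version."""
    approved = [s for s in summaries if s.get("ModelApprovalStatus") == "Approved"]
    if not approved:
        return None
    newest = max(approved, key=lambda s: s["CreationTime"])
    return newest["ModelPackageArn"]


def fetch_summaries(model_package_group: str) -> List[Dict[str, Any]]:
    """Fetch one page of versions. Needs boto3 and credentials; not called here."""
    import boto3  # lazy import keeps the selection logic usable offline

    response = boto3.client("sagemaker").list_model_packages(
        ModelPackageGroupName=model_package_group
    )
    return response["ModelPackageSummaryList"]


# Made-up sample shaped like the API response, for illustration only.
arn_prefix = "arn:aws:sagemaker:us-east-1:111122223333:model-package/churn/"
sample = [
    {"ModelPackageArn": arn_prefix + "1", "ModelApprovalStatus": "Approved",
     "CreationTime": datetime(2023, 1, 5)},
    {"ModelPackageArn": arn_prefix + "2", "ModelApprovalStatus": "PendingManualApproval",
     "CreationTime": datetime(2023, 2, 1)},
    {"ModelPackageArn": arn_prefix + "3", "ModelApprovalStatus": "Approved",
     "CreationTime": datetime(2023, 1, 20)},
]
chosen = latest_approved_model(sample)
```

Here version 2 is newer but still pending approval, so the sketch selects version 3. In the SageMaker-provided templates this selection and the deployment are handled by the generated CI/CD pipeline rather than by hand-written code.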

SageMaker Project Identity

Each SageMaker project has a unique name and ID that are applied as tags to all of the SageMaker and AWS resources created in the project. With the name and ID, you can view all of the entities associated with your project. These include:

  • Pipelines
  • Registered Models
  • Deployed Models
  • Datasets
  • Service Catalog products
  • CodePipeline and Jenkins pipelines
  • CodeCommit and third-party Git repositories
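Because the project name and ID are applied as tags, the entities listed above can be looked up by tag. The sketch below assumes the tag keys `sagemaker:project-name` and `sagemaker:project-id` and the Resource Groups Tagging API `get_resources` call in boto3; verify the tag keys in your own account. The sample ARNs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def project_tag_filters(project_name: str, project_id: str) -> List[Dict[str, object]]:
    """Build tag filters that match resources belonging to one project."""
    return [
        {"Key": "sagemaker:project-name", "Values": [project_name]},
        {"Key": "sagemaker:project-id", "Values": [project_id]},
    ]


def group_arns_by_service(arns: List[str]) -> Dict[str, List[str]]:
    """Group ARNs by their service field (arn:partition:service:...)."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for arn in arns:
        grouped[arn.split(":")[2]].append(arn)
    return dict(grouped)


def find_project_resources(project_name: str, project_id: str) -> List[str]:
    """Query AWS for tagged resources. Needs boto3 and credentials; not called here."""
    import boto3  # lazy import so the helpers above run without the AWS SDK

    client = boto3.client("resourcegroupstaggingapi")
    arns: List[str] = []
    for page in client.get_paginator("get_resources").paginate(
        TagFilters=project_tag_filters(project_name, project_id)
    ):
        arns.extend(m["ResourceARN"] for m in page["ResourceTagMappingList"])
    return arns


# Hypothetical ARNs standing in for a real query result.
sample_arns = [
    "arn:aws:sagemaker:us-east-1:111122223333:pipeline/churn",
    "arn:aws:codecommit:us-east-1:111122223333:churn-repo",
    "arn:aws:sagemaker:us-east-1:111122223333:model-package-group/churn",
]
grouped = group_arns_by_service(sample_arns)
```

Grouping by service gives a quick overview of what a project owns, for example which resources are SageMaker entities and which are repositories.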

Practical Implementations

  • Students enrolling in any AI-related course at Carnegie Training Institute have access to practical, working implementation guidelines.

Sources

  1. AWS Documentation
