Kubeflow on AWS Training Course
Kubeflow is a framework for running Machine Learning workloads on Kubernetes. TensorFlow is a machine learning library and Kubernetes is an orchestration platform for managing containerized applications.
This instructor-led, live training (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to an AWS EC2 server.
By the end of this training, participants will be able to:
- Install and configure Kubernetes, Kubeflow and other needed software on AWS.
- Use EKS (Elastic Kubernetes Service) to simplify the work of initializing a Kubernetes cluster on AWS.
- Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
- Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
- Leverage other AWS managed services to extend an ML application.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange.
Course Outline
Introduction
- Kubeflow on AWS vs on-premise vs on other public cloud providers
Overview of Kubeflow Features and Architecture
Activating an AWS Account
Preparing and Launching GPU-enabled AWS Instances
Setting up User Roles and Permissions
Preparing the Build Environment
Selecting a TensorFlow Model and Dataset
Packaging Code and Frameworks into a Docker Image
Setting up a Kubernetes Cluster Using EKS
Staging the Training and Validation Data
Configuring Kubeflow Pipelines
Launching a Training Job using Kubeflow in EKS
Visualizing the Training Job in Runtime
Cleaning up After the Job Completes
Troubleshooting
Summary and Conclusion
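As a rough illustration of the "Launching a Training Job using Kubeflow in EKS" step, the sketch below builds a TFJob manifest of the kind handled by the Kubeflow training operator, with one GPU requested per worker. It is a minimal example rather than course material: the helper `build_tfjob`, the job name `mnist-train`, and the image `example.com/mnist:latest` are hypothetical placeholders, and it assumes the training operator and the NVIDIA device plugin are already installed on the cluster.

```python
import json


def build_tfjob(name, image, workers=2, gpus_per_worker=1):
    """Return a TFJob manifest as a plain dict (hypothetical helper)."""
    container = {
        # The training operator looks for a container named "tensorflow".
        "name": "tensorflow",
        "image": image,  # placeholder image; replace with your own
        "resources": {"limits": {"nvidia.com/gpu": gpus_per_worker}},
    }
    return {
        "apiVersion": "kubeflow.org/v1",
        "kind": "TFJob",
        "metadata": {"name": name},
        "spec": {
            "tfReplicaSpecs": {
                "Worker": {
                    "replicas": workers,
                    "restartPolicy": "OnFailure",
                    "template": {"spec": {"containers": [container]}},
                }
            }
        },
    }


# Assumed example values: two workers, one GPU each.
manifest = build_tfjob("mnist-train", "example.com/mnist:latest", workers=2)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output could be saved to a file and submitted with `kubectl apply -f`, assuming the cluster has GPU nodes available.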
Requirements
- An understanding of machine learning concepts.
- Knowledge of cloud computing concepts.
- A general understanding of containers (Docker) and orchestration (Kubernetes).
- Some Python programming experience is helpful.
- Experience working with a command line.
Audience
- Data science engineers.
- DevOps engineers interested in machine learning model deployment.
- Infrastructure engineers interested in machine learning model deployment.
- Software engineers wishing to integrate and deploy machine learning features with their application.
Open Training Courses require 5+ participants.
Testimonials (3)
the ML ecosystem not only MLFlow but Optuna, hyperops, docker , docker-compose
Guillaume GAUTIER - OLEA MEDICAL
Course - MLflow
All good, nothing to improve
Ievgen Vinchyk - GE Medical Systems Polska Sp. Z O.O.
Course - AWS Lambda for Developers
IOT applications
Palaniswamy Suresh Kumar - Makers' Academy
Course - Industrial Training IoT (Internet of Things) with Raspberry PI and AWS IoT Core 「4 Hours Remote」
Related Courses
Advanced Amazon Web Services (AWS) CloudFormation
7 Hours
This instructor-led, live training in Brazil (online or onsite) is aimed at cloud engineers and developers who wish to use CloudFormation to manage infrastructure resources within the AWS ecosystem.
By the end of this training, participants will be able to:
- Implement CloudFormation templates to automate infrastructure management.
- Integrate existing AWS resources into CloudFormation.
- Use StackSets to manage stacks across multiple accounts and regions.
DeepSeek: Advanced Model Optimization and Deployment
14 Hours
This instructor-led, live training in Brazil (online or onsite) is aimed at advanced-level AI engineers and data scientists with intermediate-to-advanced experience who wish to enhance DeepSeek model performance, minimize latency, and deploy AI solutions efficiently using modern MLOps practices.
By the end of this training, participants will be able to:
- Optimize DeepSeek models for efficiency, accuracy, and scalability.
- Implement best practices for MLOps and model versioning.
- Deploy DeepSeek models on cloud and on-premise infrastructure.
- Monitor, maintain, and scale AI solutions effectively.
Amazon DynamoDB for Developers
14 Hours
This instructor-led, live training in Brazil (online or onsite) is aimed at developers who wish to integrate a DynamoDB NoSQL database into a web application hosted on AWS.
By the end of this training, participants will be able to:
- Set up the necessary development environment to start integrating data into DynamoDB.
- Integrate DynamoDB into web applications and mobile applications.
- Move data in AWS with AWS services.
- Implement operations with AWS DAX.
AWS IoT Core
14 Hours
This instructor-led, live training in Brazil (onsite or remote) is aimed at engineers who wish to deploy and manage IoT devices on AWS.
By the end of this training, participants will be able to build an IoT platform that includes the deployment and management of a backend, gateway, and devices on top of AWS.
Amazon Web Services (AWS) IoT Greengrass
21 Hours
This instructor-led, live training in Brazil (online or onsite) is aimed at developers who wish to install, configure, and manage AWS IoT Greengrass capabilities to create applications for various devices.
By the end of this training, participants will be able to use AWS IoT Greengrass to build, deploy, manage, secure, and monitor applications on intelligent devices.
AWS Lambda for Developers
14 Hours
This instructor-led, live training in Brazil (onsite or remote) is aimed at developers who wish to use AWS Lambda to build and deploy services and applications to the cloud, without needing to worry about provisioning the execution environment (servers, VMs and containers, availability, scalability, storage, etc.).
By the end of this training, participants will be able to:
- Configure AWS Lambda to execute a function.
- Understand FaaS (Functions as a Service) and the advantages of serverless development.
- Build, upload and execute AWS Lambda functions.
- Integrate Lambda functions with different event sources.
- Package, deploy, monitor and troubleshoot Lambda based applications.
AWS CloudFormation
7 Hours
This instructor-led, live training in Brazil (online or onsite) is aimed at engineers who wish to use AWS CloudFormation to automate the process of managing AWS cloud infrastructure.
By the end of this training, participants will be able to:
- Enable AWS services to get started managing infrastructure.
- Understand and apply the principle of "infrastructure as code".
- Improve quality and lower the costs of deploying infrastructure.
- Write AWS CloudFormation Templates using YAML.
Mastering DevOps with AWS Cloud9
21 Hours
This instructor-led, live training in Brazil (online or onsite) is aimed at advanced-level professionals who wish to deepen their understanding of DevOps practices and streamline development processes using AWS Cloud9.
By the end of this training, participants will be able to:
- Set up and configure AWS Cloud9 for DevOps workflows.
- Implement continuous integration and continuous delivery (CI/CD) pipelines.
- Automate testing, monitoring, and deployment processes using AWS Cloud9.
- Integrate AWS services such as Lambda, EC2, and S3 into DevOps workflows.
- Utilize source control systems like GitHub or GitLab within AWS Cloud9.
Developing Serverless Applications on AWS Cloud9
14 Hours
This instructor-led, live training in Brazil (online or onsite) is aimed at intermediate-level professionals who wish to learn how to effectively build, deploy, and maintain serverless applications on AWS Cloud9 and AWS Lambda.
By the end of this training, participants will be able to:
- Understand the fundamentals of serverless architecture.
- Set up AWS Cloud9 for serverless application development.
- Develop, test, and deploy serverless applications using AWS Lambda.
- Integrate AWS Lambda with other AWS services such as API Gateway and S3.
- Optimize serverless applications for performance and cost efficiency.
Industrial Training IoT (Internet of Things) with Raspberry PI and AWS IoT Core 「4 Hours Remote」
4 Hours
Summary:
- Basics of IoT architecture and functions
- “Things”, “Sensors”, Internet and the mapping between business functions of IoT
- Essentials of all IoT software components: hardware, firmware, middleware, cloud and mobile app
- IoT functions: fleet manager, data visualization, SaaS-based FM and DV, alert/alarm, sensor onboarding, “thing” onboarding, geo-fencing
- Basics of IoT device communication with cloud with MQTT.
- Connecting IoT devices to AWS with MQTT (AWS IoT Core).
- Connecting AWS IoT Core with an AWS Lambda function for computation and data storage.
- Connecting Raspberry Pi with AWS IoT Core and simple data communication.
- Alerts and events
- Sensor calibration
Industrial Training IoT (Internet of Things) with Raspberry PI and AWS IoT Core 「8 Hours Remote」
8 Hours
Summary:
- Basics of IoT architecture and functions
- “Things”, “Sensors”, Internet and the mapping between business functions of IoT
- Essentials of all IoT software components: hardware, firmware, middleware, cloud and mobile app
- IoT functions: fleet manager, data visualization, SaaS-based FM and DV, alert/alarm, sensor onboarding, “thing” onboarding, geo-fencing
- Basics of IoT device communication with cloud with MQTT.
- Connecting IoT devices to AWS with MQTT (AWS IoT Core).
- Connecting AWS IoT Core with an AWS Lambda function for computation and data storage using DynamoDB.
- Connecting Raspberry Pi with AWS IoT Core and simple data communication.
- Hands-on practice with Raspberry Pi and AWS IoT Core to build a smart device.
- Sensor data visualization and communication with web interface.
Kubeflow on Azure
28 Hours
This instructor-led, live training in Brazil (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to Azure cloud.
By the end of this training, participants will be able to:
- Install and configure Kubernetes, Kubeflow and other needed software on Azure.
- Use Azure Kubernetes Service (AKS) to simplify the work of initializing a Kubernetes cluster on Azure.
- Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
- Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
- Leverage other Azure managed services to extend an ML application.
MLflow
21 Hours
This instructor-led, live training (online or onsite) is aimed at data scientists who wish to go beyond building ML models and optimize the ML model creation, tracking, and deployment process.
By the end of this training, participants will be able to:
- Install and configure MLflow and related ML libraries and frameworks.
- Appreciate the importance of trackability, reproducibility and deployability of an ML model.
- Deploy ML models to different public clouds, platforms, or on-premise servers.
- Scale the ML deployment process to accommodate multiple users collaborating on a project.
- Set up a central registry to experiment with, reproduce, and deploy ML models.
MLOps: CI/CD for Machine Learning
35 Hours
This instructor-led, live training in Brazil (online or onsite) is aimed at engineers who wish to evaluate the approaches and tools available today to make an intelligent decision on the path forward in adopting MLOps within their organization.
By the end of this training, participants will be able to:
- Install and configure various MLOps frameworks and tools.
- Assemble the right kind of team with the right skills for constructing and supporting an MLOps system.
- Prepare, validate and version data for use by ML models.
- Understand the components of an ML Pipeline and the tools needed to build one.
- Experiment with different machine learning frameworks and servers for deploying to production.
- Operationalize the entire Machine Learning process so that it's reproducible and maintainable.
MLOps for Azure Machine Learning
14 Hours
This instructor-led, live training (online or onsite) is aimed at machine learning engineers who wish to use Azure Machine Learning and Azure DevOps to facilitate MLOps practices.
By the end of this training, participants will be able to:
- Build reproducible workflows and machine learning models.
- Manage the machine learning lifecycle.
- Track and report model version history, assets, and more.
- Deploy production ready machine learning models anywhere.