Gopi Vadlamudi

17 years of industry experience in DevOps, build and release engineering, and system tool development. Designed and implemented release engineering strategy, continuous integration, and continuous deployment for distributed development products across Yahoo Ads/Mail and the NVIDIA GPU Cloud computing and storage platform on BM/VM, using various tools and technologies. Currently leading the end-to-end CI/CD workflow setup for data center automation, from switch config updates to compute and storage platform provisioning on a dynamically provisioned k8s cluster and deployment of containerized microservices, using an ecosystem built around Jenkins.


Data Center automation at NVIDIA using Pipeline as Code

Date: 2:15 PM Saturday

Speakers: Gopi Vadlamudi

Have you ever thought of automating the end-to-end workflow for setting up a new data center with a single click? Or of cutting infrastructure setups that generally take months of effort down to hours, again with a single click? Examples of such setups include:

1. Setting up the DC network in a reliable and reproducible manner.
2. Automatic OS provisioning on blade servers.
3. A secure login mechanism for human and service accounts.
4. Configuring DC components using idempotent automation workflows.
5. Setting up a highly available internal private cloud or container orchestration platform, such as Kubernetes, on auto-provisioned infrastructure.
6. A very complex inventory state lifecycle management workflow.

To achieve reliable, reproducible, and idempotent automation for infrastructure setup, the NVIDIA DevOps team has been implementing *DC Automation Manager*, a framework built on an ecosystem of CI/CD tools.

In this presentation, we will talk about the design and automation used at NVIDIA GPU Cloud to set up a new DC of thousands of GPU and CPU blade servers from scratch, using Jenkins and GitOps for:

1. Streamlining the inventory life cycle
2. L2/L3 network setup
3. Secret management to secure human and automated interaction with all data center services
4. Node provisioning and OS configuration with dynamic inventory capabilities
5. Setting up container orchestration platforms on BM/Cloud
6. Bridging the gap between application engineering and operations engineering
