NVIDIA AI Infrastructure and NCA-AIIO Preparation [Udemy] [Ashish Prajapati]


Master NVIDIA AI Infrastructure & Pass NCA-AIIO
Your Guide to Understanding NVIDIA-Powered AI Infrastructure - From Fundamentals to Certification Success

Bestseller

What you'll learn


Comprehend GPU Architecture and Use Cases - Learn about GPU architecture and its role in accelerating AI workloads across various industries.

Navigate the NVIDIA Software Suite - Learn CUDA, GPU cores, DGX, NVLink, InfiniBand, DCGM, GPUDirect, and key tools for AI data center operations.

Prepare for NVIDIA NCA-AIIO Certification - Gain the knowledge and skills needed to pass the NVIDIA-Certified Associate: AI Infrastructure and Operations (NCA-AIIO) exam.

Course content

6 sections • 72 lectures • 4h 40m total length

Introduction
1 lecture • 2min

Certification Details
2 lectures • 5min

Module 1 - Fundamentals
5 lectures • 16min

Module 2 - Inside an AI-centric Data Center
13 lectures • 1hr 1min

Module 3 - NVIDIA Technology Stack
41 lectures • 2hr 34min

Module 4 - AI Workflows
10 lectures • 44min

Module 1 - Fundamentals

  • Drivers of AI evolution
  • AI use cases across industries
  • AI, ML, DL, Gen AI
  • Analogy for AI, ML, DL, Gen AI
  • Transformer Model
Module 2 - Inside an AI-centric Data Center
  • Inside an AI-centric Data Center
  • Power Usage Effectiveness (PUE)
  • The Compute Power
  • CPU and GPU
  • CPU vs. GPU - Architectural difference
  • Beyond Moore's law
  • Data Processing Unit (DPU)
  • Network inside an AI-centric Data Center
  • Network fabric
  • Ethernet vs. InfiniBand
  • Converged Ethernet (CE)
  • Storage inside an AI-centric Data Center
  • Cloud vs. On-Prem
Module 3 - NVIDIA Technology Stack
  • NVIDIA: Powering AI GPU Innovation
  • NVIDIA Technology Stack
  • Layer 1 - Physical Layer
  • GPU on a Graphics Card
  • DGX Platform
  • DGX SuperPOD
  • ConnectX
  • BlueField DPUs
  • NVIDIA Reference Architectures
  • Understanding GPU Cores
  • Comparing GPU Cores
  • NVIDIA DGX Platform - Timeline
  • DGX Platform - Deployment Options
  • DGX A100 vs. H100
  • Layer 2: Data Movement and I/O Acceleration
  • NVLink
  • InfiniBand
  • InfiniBand vs. Ethernet
  • DMA and RDMA
  • GPUDirect RDMA
  • GPUDirect Storage
  • Quick Comparison
  • Layer 3: OS, Driver and Virtualization
  • GPU Drivers
  • GPU Virtualization
  • vGPU vs. MIG - Part 1
  • vGPU vs. MIG - Part 2
  • Layer 4: Core Libraries
  • Compute Unified Device Architecture (CUDA)
  • Installing CUDA
  • NVIDIA Collective Communications Library (NCCL)
  • NVLink, NVSwitch, PCIe, RDMA vs. NCCL
  • Layer 5: Monitoring and Management
  • NVIDIA-SMI
  • Data Center GPU Manager (DCGM)
  • Base Command Manager
  • Which one to use?
  • Layer 6: Applications & Vertical Solutions
  • Summary
  • NVIDIA AI Enterprise
  • NVIDIA AI Factory
Module 4 - AI Workflows
  • AI Workflows
  • ML Frameworks
  • The NVIDIA differentiator
  • Model Training vs. Model Inference
  • Job Scheduling vs. Container Orchestration
  • Slurm vs. Kubernetes
  • NVIDIA Integration
  • ML Ops - Analogy
  • Why ML Ops?
  • NVIDIA Tools supporting ML Ops
Requirements
No prior AI infrastructure experience is required; this course is suitable for beginners. Basic understanding of IT concepts, data centers, or enterprise computing is helpful but not mandatory.

Familiarity with general IT hardware and networking concepts is useful.

Description

Embark on a transformative journey into the world of AI infrastructure with this comprehensive course designed to prepare you for the NVIDIA-Certified Associate: AI Infrastructure and Operations (NCA-AIIO) certification. Whether you're an IT professional, system administrator, or DevOps engineer, this course equips you with the foundational knowledge and practical skills needed to manage and optimize AI workloads in data center environments.

What You'll Learn:

AI Fundamentals: Understand the core concepts of Artificial Intelligence, Machine Learning, and Deep Learning, and their applications in modern computing.

NVIDIA Hardware & Software: Gain proficiency in NVIDIA's data center GPUs, including the A100, H100, and B200, and explore essential software tools like CUDA, DCGM, and the NGC Catalog.

Infrastructure Design: Learn about data center components, interconnect and networking technologies such as NVLink and InfiniBand, and how to design scalable AI infrastructure.

AI Operations: Master the deployment, monitoring, and optimization of AI workloads in an enterprise data center, utilizing tools like DCGM, Slurm, and Kubernetes.

Exam Preparation: Prepare thoroughly for the NCA-AIIO exam with detailed study guides, practice questions, and real-world scenarios. Gain a clear understanding of the exam objectives, learn tips to maximize your performance, and build confidence to pass the certification on your first attempt, validating your expertise in AI infrastructure operations.

Who this course is for:

This course is for IT professionals, beginners, and anyone preparing for the NVIDIA NCA-AIIO certification. Learn AI infrastructure, NVIDIA GPUs, software, and data center operations from the ground up.