Hitesh Sahu
AI & Machine Learning

NVIDIA Super POD

Personal / Open Source

Ongoing

Creator / Maintainer

AI Infrastructure & LLM

Tech Stack
Kubernetes
NVIDIA GPU Operator
DCGM
Triton Inference Server
SLURM
Terraform

Summary

A self-provisioned GPU cluster on AWS with GPU observability through DCGM metrics in Prometheus/Grafana and HPC-style job scheduling with SLURM, serving multiple models concurrently through Triton Inference Server.


What I Built
  • Provisioned a GPU cluster on AWS using Terraform (g4dn Spot instances) for cost-efficient compute.

  • Deployed Kubernetes with the NVIDIA GPU Operator and DCGM exporter feeding Prometheus/Grafana dashboards.

  • Configured Triton Inference Server for multi-model concurrent serving.

  • Set up SLURM/enroot for HPC-style job scheduling on the cluster.
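
As an illustration of the observability step above, here is a minimal sketch of reading per-GPU utilization from the text that a DCGM exporter serves to Prometheus. It is not the project's actual code. The function name `gpu_utilization`, the sample payload, and the hostname `node-a` are hypothetical. `DCGM_FI_DEV_GPU_UTIL` is a standard DCGM field name, but whether the project's dashboards use this exact metric is an assumption. The parser handles only the simple `name{labels} value` line shape shown in the sample.

```python
import re
from typing import Dict

# One sample line of the Prometheus text exposition format:
#   metric_name{label="value",...} number
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def gpu_utilization(metrics_text: str,
                    metric: str = "DCGM_FI_DEV_GPU_UTIL") -> Dict[str, float]:
    """Return {gpu index: value} for one metric in an exporter payload."""
    result: Dict[str, float] = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        result[labels.get("gpu", "unknown")] = float(match.group("value"))
    return result


# Hypothetical payload; real exporter output carries more labels and metrics.
SAMPLE = """\
# HELP DCGM_FI_DEV_GPU_UTIL GPU utilization (in %).
# TYPE DCGM_FI_DEV_GPU_UTIL gauge
DCGM_FI_DEV_GPU_UTIL{gpu="0",modelName="Tesla T4",Hostname="node-a"} 37
DCGM_FI_DEV_GPU_UTIL{gpu="1",modelName="Tesla T4",Hostname="node-a"} 92
DCGM_FI_DEV_FB_USED{gpu="0",modelName="Tesla T4",Hostname="node-a"} 2048
"""

if __name__ == "__main__":
    print(gpu_utilization(SAMPLE))
```

In a running cluster, Prometheus scrapes and stores these samples itself. A helper like this is mainly useful for spot-checking an exporter endpoint by hand before wiring up Grafana panels.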

Made with NextJS by Hitesh Sahu | © 2026 All rights reserved.