Project Overview

Pioneered the development of a prototype web app for monitoring NPU cards used in AI inference. The platform is the first of its kind in Korea to integrate multiple NPU chip brands, setting a new standard in the AI chip market.

Key Responsibilities

  • Project Management: Led the entire project lifecycle, ensuring seamless communication and issue resolution with Vietnamese designers and developers.
  • Dashboard and Architecture Design: Crafted intuitive dashboard screens and architected data structures to manage users, clusters, servers, NPUs, storage, and inference endpoints.
  • API Development: Created NPU inference endpoint APIs using Kubernetes and Istio to streamline AI inference workflows.
  • Object Storage Integration: Implemented features for user-specific Object Storage (MinIO) creation and deletion to manage inference data effectively.
  • Time-Series Data Collection: Deployed InfluxDB for capturing and analyzing NPU utilization metrics over time.
  • User Interface Development: Built a Streamlit-based UI for NPU inference tasks, enabling video uploads for object detection and returning detailed inference results.

Achievements

  • Market Innovation: Developed the first Prometheus-compatible exporter, CLI tool, and unified monitoring dashboard in Korea’s emerging NPU and AI chip market.
  • Efficiency Gains: Reduced the project timeline by one-third through optimized collaboration with the Vietnamese branch.
  • Cloud Inference Demo Service: Delivered a Kubernetes cluster-based NPU inference demo platform, showcasing real-time AI capabilities to stakeholders.

Key Learnings and Insights

This project provided deep technical exposure to NPU monitoring, AI inference infrastructure, and multi-vendor AI chip integration, reinforcing my expertise in observability, system architecture, and cloud-based inference services.

  1. Building a Unified Monitoring Platform for Various AI Chips
    • Designed a multi-brand NPU monitoring system, gaining a comprehensive understanding of how different AI chips handle and export performance data.
    • Learned vendor-specific logging, telemetry, and data extraction mechanisms, allowing seamless integration into a unified dashboard.
  2. Understanding Prometheus and Time-Series Data Collection
    • Developed custom Prometheus exporters for various NPU chips, ensuring real-time performance tracking.
    • Gained a deeper understanding of Prometheus’s data collection architecture and applied similar principles to build custom monitoring tools.
    • Integrated InfluxDB for historical data analysis, enabling time-series insights into NPU utilization and inference workloads.
  3. Enhancing Expertise in Kubernetes and Cloud-Based AI Inference
    • Designed Kubernetes-based inference endpoint APIs, applying Istio and service mesh concepts to streamline AI inference workflows.
    • Implemented user-specific MinIO storage solutions, allowing efficient management of inference results and AI model artifacts.
    • Built a Streamlit-based UI, enhancing user interaction by enabling real-time video-based AI inference testing.
  4. Scaling AI Infrastructure and Tooling Capabilities
    • Gained proficiency in developing exporters, monitoring tools, and cloud-based inference services.
    • Strengthened DevOps and MLOps skills, ensuring scalable and high-performance AI infrastructure for real-world applications.
    • Managed a cross-functional team, collaborating with Vietnamese developers and designers to reduce the project timeline by one-third.
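
As an illustration of the exporter work described in item 2, here is a minimal sketch of how per-vendor NPU readings might be rendered in the Prometheus text exposition format. It is not the project's actual exporter: the metric name `npu_utilization_percent`, the `vendor` and `device` labels, and the `NpuSample` type are all hypothetical, and it assumes each vendor tool has already been queried for a utilization value.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class NpuSample:
    """One utilization reading (hypothetical shape, not the project's schema)."""
    vendor: str
    device: str
    utilization_percent: float


def escape_label(value: str) -> str:
    # The exposition format requires backslash, double quote, and newline
    # to be escaped inside label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(samples: Iterable[NpuSample]) -> str:
    """Render samples as a single gauge family in Prometheus text format."""
    lines = [
        "# HELP npu_utilization_percent NPU utilization reported by the vendor tool.",
        "# TYPE npu_utilization_percent gauge",
    ]
    for s in samples:
        labels = (
            f'vendor="{escape_label(s.vendor)}",'
            f'device="{escape_label(s.device)}"'
        )
        lines.append(f"npu_utilization_percent{{{labels}}} {s.utilization_percent}")
    # The format expects every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed example readings from two different NPU brands.
    print(render_metrics([
        NpuSample("vendor_a", "npu0", 37.5),
        NpuSample("vendor_b", "npu0", 82.0),
    ]), end="")
```

Normalizing every brand into one metric family with a `vendor` label is one way a single dashboard query could cover all chips; serving this text over HTTP for Prometheus to scrape is left out of the sketch.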

This project significantly expanded my expertise in AI infrastructure monitoring, multi-chip compatibility, and cloud-based AI inference, setting a strong foundation for scalable and efficient AI system management.
