Project Overview
Led the development of a prototype web app for monitoring NPU cards used in AI inference workloads. The platform is the first in Korea to integrate NPU chips from multiple brands into a single monitoring tool for the AI-chip market.
Key Responsibilities
- Project Management: Led the full project lifecycle, coordinating communication and issue resolution with designers and developers in Vietnam.
- Dashboard and Architecture Design: Crafted intuitive dashboard screens and architected data structures to manage users, clusters, servers, NPUs, storage, and inference endpoints.
- API Development: Created NPU inference endpoint APIs using Kubernetes and Istio to streamline AI inference workflows.
- Object Storage Integration: Implemented features for user-specific Object Storage (MinIO) creation and deletion to manage inference data effectively.
- Time-Series Data Collection: Deployed InfluxDB to capture and analyze NPU utilization metrics over time.
- User Interface Development: Built a Streamlit-based UI for NPU inference tasks, enabling video uploads for object detection and returning detailed inference results.
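As a rough illustration of the data structures mentioned above, the entities could be modeled along the following lines. This is a minimal sketch, not the project's actual schema: every class, field name, and the `npus_by_vendor` helper are hypothetical and chosen only to show how clusters, servers, NPUs, and inference endpoints might relate.

```python
from dataclasses import dataclass, field


@dataclass
class NPU:
    npu_id: str
    vendor: str  # chip brand (hypothetical field; the platform spans several brands)


@dataclass
class Server:
    hostname: str
    npus: list[NPU] = field(default_factory=list)


@dataclass
class Cluster:
    name: str
    servers: list[Server] = field(default_factory=list)

    def npus_by_vendor(self) -> dict[str, int]:
        """Count NPUs per vendor across every server in the cluster."""
        counts: dict[str, int] = {}
        for server in self.servers:
            for npu in server.npus:
                counts[npu.vendor] = counts.get(npu.vendor, 0) + 1
        return counts


@dataclass
class InferenceEndpoint:
    name: str
    owner: str    # user who created the endpoint
    cluster: str  # name of the cluster serving it
    bucket: str   # per-user object storage bucket holding inference data


# Example: one cluster with two servers and NPUs from two placeholder vendors.
cluster = Cluster(
    name="demo",
    servers=[
        Server("node-1", [NPU("0", "vendor_a"), NPU("1", "vendor_a")]),
        Server("node-2", [NPU("0", "vendor_b")]),
    ],
)
vendor_counts = cluster.npus_by_vendor()
```

Grouping NPUs under servers and servers under clusters keeps a per-vendor summary, the kind of figure a unified dashboard would show, down to a single traversal.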
Achievements
- Market Innovation: Developed the first Prometheus-compatible NPU exporter, CLI tool, and unified monitoring dashboard in Korea’s emerging NPU and AI-chip market.
- Efficiency Gains: Reduced the project timeline by one-third through optimized collaboration with the Vietnamese branch.
- Cloud Inference Demo Service: Delivered a Kubernetes cluster-based NPU inference demo platform, showcasing real-time AI capabilities to stakeholders.
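For context on what a Prometheus-compatible exporter produces, the sketch below renders a gauge in the Prometheus text exposition format using only the standard library. It is an illustrative assumption, not the project's exporter: the metric name `npu_utilization_percent`, the label names, and the `render_gauge` helper are all hypothetical.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline, as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, samples: list[tuple[dict[str, str], float]]) -> str:
    """Render one gauge metric with labeled samples in Prometheus text format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        # Sort labels so the output is stable between scrapes.
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric and labels, for illustration only.
text = render_gauge(
    "npu_utilization_percent",
    "Current NPU utilization.",
    [({"vendor": "vendor_a", "npu": "0"}, 42.5)],
)
print(text)
```

Putting the chip brand in a label rather than in the metric name would let a single dashboard query cover NPUs from every vendor.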