AI Workflow Platform Development (UI to Server way) for Client

Project Years: 2022-2023
Tags: MLOps, AI, Infra, Distributed System
Skills: Angular 14, Python, K8s, Java-Spring
  • This is an example image (the original is confidential).
[image]

"Front to Back" Approach Explanation

  • Reason for the "Front to Back" Description
    • Typically, MLOps pipelines built with tools like Kubeflow or Airflow are defined through SDKs or YAML files in development environments (e.g., notebooks, VSCode).
    • This project instead lets users design pipelines in a drag-and-drop UI: they configure the required inputs and outputs, mount them, and execute the pipeline directly from the interface.
    • The project built on my MLOps experience with Kubeflow (2020–2022). I was responsible for every aspect, including UI design, frontend development, backend engine development, and the Kubernetes-based infrastructure, and I also served as project manager.
 

Pipeline Types and Features

Users can drag and drop six types of components (Load Data, Container, Notebook, Job, Viewer, Serving) to create pipelines; arrows represent input and output paths. A rough sketch of the component model follows the list.
  1. Load Data
      • Downloads data from a URL, decompresses files based on their format, and mounts them for the next task.
  2. Container
      • Allows specifying images, resource specs, and ports.
      • Creates containers (pods) and selects shared volumes.
      • Provides SSH commands during pipeline execution, enabling users to manually advance to the next stage or let it proceed automatically.
  3. Notebook
      • Generates containers (pods) using specified notebook images and sets notebook start paths.
      • Offers URLs for user access during execution; users can manually advance to the next stage or allow automatic transitions.
  4. Job
      • Similar to a Kubernetes Job, it executes commands in the selected image and advances automatically upon completion.
      • Example use case: splitting code and model paths from a prior notebook task into multiple jobs for parallel training with different hyperparameters.
  5. Viewer
      • Provides TensorBoard functionality by specifying log paths. Logs are accessible regardless of the pipeline stage.
      • Can be paused without affecting the next stage.
  6. Serving
      • Uses BentoML for serving models. Users can specify resource configurations (e.g., GPU, CPU, memory).
      • Enables selection of frameworks (e.g., PyTorch, TensorFlow) and creation of pack.py, service.py, and bentofile.yaml directly in the UI (see the payload sketch after this list).
      • The backend decodes the base64-encoded files sent from the frontend to create containerized endpoints.
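As a rough illustration of the palette above, the six components and the arrows between them can be modeled as a discriminated union. This is a minimal sketch; the type and field names are hypothetical, not the platform's actual schema.

```typescript
// Hypothetical model of the six pipeline components (names are illustrative).
type Resources = { cpu: string; memory: string; gpu?: number };

type Component =
  | { kind: "loadData"; url: string; unpack: boolean }                 // download + decompress + mount
  | { kind: "container"; image: string; resources: Resources; ports: number[]; sharedVolumes: string[] }
  | { kind: "notebook"; image: string; startPath: string }             // exposes a URL while running
  | { kind: "job"; image: string; command: string[] }                  // auto-advances on completion
  | { kind: "viewer"; logPath: string }                                // TensorBoard over a log path
  | { kind: "serving"; framework: "pytorch" | "tensorflow"; resources: Resources;
      files: { name: string; contentBase64: string }[] };              // pack.py / service.py / bentofile.yaml

// An arrow wires a prior task's output path into the next task's input path.
interface Arrow { fromTask: string; fromPath: string; toTask: string; toPath: string }
```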
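The Serving payload mentioned above can be sketched as follows: the frontend base64-encodes the user-authored files before sending them to the backend engine, which decodes them when building the containerized endpoint. The function and field names here are assumptions, not the real API.

```typescript
// Sketch: frontend base64-encodes the UI-authored serving files before POSTing
// them to the backend engine. Payload shape is hypothetical.
function buildServingPayload(files: Record<string, string>) {
  return {
    files: Object.entries(files).map(([name, content]) => ({
      name,
      contentBase64: btoa(content), // backend decodes these into the image build context
    })),
  };
}

const payload = buildServingPayload({
  "pack.py": "import bentoml\n# ...",
  "service.py": "import bentoml\n# ...",
  "bentofile.yaml": "service: 'service.py:svc'\n",
});
```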
 

Flow (Arrow) Implementation

  • For each arrow, users specify From Task, From Path, To Task, and To Path.
  • Once connected, tasks automatically inherit configurations from prior tasks.
  • The volume manager dynamically creates volumes for missing paths using Kubernetes subPath (a mapping sketch follows the YAML example below).
Example JSON request to the backend engine:

```json
"volumes": [
  { "id": "kade0_NAS", "acl": "RW", "mountPath": "/scl_nas", "subPath": null },
  { "id": "kade0_NAS", "acl": "RO", "mountPath": "/con2/input", "subPath": "pipeline/7e2c8a99.../con1/output" }
]
```
  • Resulting pod YAML:

```yaml
volumeMounts:
  - mountPath: /mnt/.kade
    name: run-scripts
  - mountPath: /scl_nas
    name: kade0-nas
  - mountPath: /con2/input
    name: kade0-nas
    subPath: pipeline/7e2c8a99-d4a5-4f48-adc2-5e17d0f615ef/eab47b4b-dec3-4b22-a22e-73cd760fe17a/1/con1/output
  - mountPath: /con2/output
    name: kade0-nas
    subPath: pipeline/7e2c8a99-d4a5-4f48-adc2-5e17d0f615ef/2afb7f5f-bbab-4f20-a67e-cc5c58826815/1/con2/output
...
volumes:
  - hostPath:
      path: /nas/tmp/k8s/de0d/0f8c/15e5/4f32/b5de/d9ab/4179/6797/kade0-sf-7e2c-container-container-2
      type: DirectoryOrCreate
    name: run-scripts
  - name: kade0-nas
    persistentVolumeClaim:
      claimName: user-kade0-pvc
```
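As a sketch of the mapping above: given an arrow, the volume manager can derive the downstream task's read-only mount by composing a subPath that points at the upstream task's output. The function, its signature, and the pipeline/&lt;pipelineId&gt;/&lt;runId&gt;/&lt;step&gt;/... layout are inferred from the example paths, not the engine's actual code.

```typescript
// Hypothetical: derive the downstream task's read-only input mount from an arrow.
// subPath layout "pipeline/<pipelineId>/<runId>/<step>/<task><path>" is inferred
// from the YAML above; all names are illustrative.
interface Arrow { fromTask: string; fromPath: string; toTask: string; toPath: string }

function inputMountFor(arrow: Arrow, pipelineId: string, runId: string, step: number) {
  return {
    id: "kade0_NAS",                                   // shared NAS-backed volume
    acl: "RO" as const,                                // upstream output mounted read-only
    mountPath: `/${arrow.toTask}${arrow.toPath}`,      // e.g. /con2/input
    subPath: `pipeline/${pipelineId}/${runId}/${step}/${arrow.fromTask}${arrow.fromPath}`,
  };
}

// e.g. an arrow wiring con1's /output into con2's /input
const mount = inputMountFor(
  { fromTask: "con1", fromPath: "/output", toTask: "con2", toPath: "/input" },
  "7e2c8a99-d4a5-4f48-adc2-5e17d0f615ef", "eab47b4b-dec3-4b22-a22e-73cd760fe17a", 1,
);
```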
 

UI-to-JSON Code Generation

  • The drag-and-drop pipeline can be serialized to a JSON file for reloading or recovery (a serialization sketch follows).
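A minimal sketch of that serialization, assuming hypothetical node/edge structures (the real schema is not shown here):

```typescript
// Hypothetical canvas state: positioned component nodes plus arrows.
interface Node { id: string; kind: string; x: number; y: number; config: Record<string, unknown> }
interface Edge { fromTask: string; fromPath: string; toTask: string; toPath: string }
interface Canvas { nodes: Node[]; edges: Edge[] }

// Save: the whole pipeline becomes one JSON document.
const save = (canvas: Canvas): string => JSON.stringify(canvas, null, 2);

// Load: parse the document back into canvas state for reloading or recovery.
const load = (json: string): Canvas => JSON.parse(json) as Canvas;
```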
 

Undo and Rollback Support

  • Implemented undo for drag-and-drop and arrow-addition actions using a stack-based system, analogous to Ctrl+Z (a minimal sketch follows).
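A minimal sketch of the stack-based undo described above, using an apply/revert (command) pattern with illustrative names:

```typescript
// Stack-based undo: every canvas mutation pushes an action that knows its inverse.
interface UndoableAction {
  apply(): void;   // perform the edit (e.g., add a node or arrow)
  revert(): void;  // inverse edit (e.g., remove that node or arrow)
}

class UndoStack {
  private stack: UndoableAction[] = [];

  do(action: UndoableAction): void {
    action.apply();
    this.stack.push(action);
  }

  undo(): void {              // Ctrl+Z handler pops and reverts the latest action
    this.stack.pop()?.revert();
  }
}
```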
 
 

Key Learnings and Insights

This project provided valuable experience in building an intuitive UI for MLOps workflows, developing scalable platforms, and optimizing infrastructure for container-based AI pipelines.
  1. Enhancing User-Friendly Interfaces for MLOps
      • Designed a drag-and-drop UI for pipeline creation, making MLOps more accessible to non-expert users.
      • Integrated real-time configuration updates, ensuring seamless interactions between UI elements and backend services.
      • Learned best practices for designing complex workflows visually, improving usability without sacrificing flexibility.
  2. Building a Scalable and Extensible Platform
      • Developed a modular container-based structure, allowing flexible pipeline execution.
      • Implemented a dynamic volume management system using Kubernetes subPath, ensuring efficient data sharing across pipeline stages.
      • Explored how to maintain code versioning and documentation to support long-term scalability.
  3. Balancing Flexibility and Automation in AI Pipelines
      • Created JSON-based pipeline definitions, allowing users to save, reload, and modify workflows programmatically.
      • Designed undo and rollback mechanisms, providing a seamless user experience while maintaining pipeline integrity.
      • Optimized UI-to-Kubernetes interactions, ensuring smooth execution of AI workflows without excessive manual intervention.
This project deepened my understanding of how to design scalable, user-friendly, and infrastructure-efficient MLOps platforms. It reinforced the importance of intuitive UI/UX, flexible containerization strategies, and robust version control mechanisms in AI pipeline management.