kade.im
AIOps (AI-based Troubleshooting and Issue Resolution Platform)

Project Years: 2021
Tags: AI, Infra, DevOps
Skills: gitlab-ci, ReactJS, Docker, K8s

Project Overview

Developed a platform that recommends solutions for errors in running clusters, using machine learning models trained on a variety of error messages.
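
To make the idea of recommending a fix from an error message concrete, here is a minimal sketch. It is not the project's actual model: it swaps in a simple TF-IDF nearest-neighbour lookup, and the sample messages, fixes, function names, and the `min_score` threshold are all hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical examples of (error message, known fix); not data from the project.
KNOWN_ISSUES = [
    ("Back-off restarting failed container",
     "Check container logs and the pod's startup command."),
    ("Failed to pull image: ImagePullBackOff",
     "Verify the image name, tag, and registry credentials."),
    ("0/3 nodes are available: insufficient memory",
     "Lower memory requests or add node capacity."),
]


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens, dropping numbers and ids."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(issues):
    """Compute smoothed IDF weights and one TF-IDF vector per known message."""
    docs = [Counter(tokenize(message)) for message, _ in issues]
    doc_freq = Counter(token for doc in docs for token in doc)
    idf = {
        token: math.log((1 + len(docs)) / (1 + count)) + 1.0
        for token, count in doc_freq.items()
    }
    vectors = [{tok: tf * idf[tok] for tok, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(token, 0.0) for token, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(message, issues, idf, vectors, min_score=0.2):
    """Return the fix of the most similar known message, or None if nothing is close."""
    counts = Counter(tok for tok in tokenize(message) if tok in idf)
    query = {tok: tf * idf[tok] for tok, tf in counts.items()}
    scored = [(cosine(query, vec), fix) for vec, (_, fix) in zip(vectors, issues)]
    best_score, best_fix = max(scored, key=lambda pair: pair[0])
    return best_fix if best_score >= min_score else None
```

Calling `build_index(KNOWN_ISSUES)` once and then `recommend(...)` for each incoming error returns the stored fix for the closest known message, or `None` when no message is similar enough. A production system would presumably need a larger labelled corpus and a stronger model; the sketch only shows the shape of the lookup.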

Key Responsibilities

  • Built a CI/CD pipeline for backend and frontend services on Kubernetes using GitLab-CI.
  • Designed a drag-and-drop issue board frontend inspired by Jira's Kanban board.
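
The state change behind a drag-and-drop move on a Kanban-style board can be sketched as below. This is an assumption about how such a board might be modelled, not the project's code: the real frontend was written in ReactJS, and the `Board` type, `move_issue` function, and column names here are hypothetical. The logic is shown in Python only to keep it short and runnable.

```python
from dataclasses import dataclass, field


@dataclass
class Board:
    # Column name -> ordered list of issue ids (top of the column first).
    columns: dict = field(default_factory=dict)


def move_issue(board, issue_id, target_column, target_index):
    """Return a new Board with issue_id moved; the input board is left unchanged."""
    if target_column not in board.columns:
        raise KeyError(f"unknown column: {target_column}")
    if all(issue_id not in ids for ids in board.columns.values()):
        raise ValueError(f"unknown issue: {issue_id}")

    # Copy every column without the dragged issue, then insert it at the drop position.
    new_columns = {
        name: [i for i in ids if i != issue_id]
        for name, ids in board.columns.items()
    }
    destination = new_columns[target_column]
    index = max(0, min(target_index, len(destination)))  # clamp out-of-range drops
    destination.insert(index, issue_id)
    return Board(new_columns)
```

Returning a new board instead of mutating the old one mirrors the immutable state updates that React components typically rely on, so the same function shape would carry over to a reducer in the frontend.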

Achievements

  • Simplified the development-to-deployment process with CI/CD pipelines.
  • Delivered a user-friendly issue management board.
 

Key Learnings and Insights

This project was an early attempt to automate troubleshooting using machine learning models, developed before ChatGPT became widely available. It demonstrated the potential of AI in DevOps and system operations.
  1. Innovative Approach to Automated Issue Resolution
      • Built a system that suggested solutions for recurring cluster errors, reducing the need for manual research and troubleshooting.
      • Recognized the importance of structured error logging and ML-driven recommendation systems in DevOps environments.
  2. Pioneering AI-Assisted Troubleshooting Before LLMs Became Mainstream
      • At the time, automatically suggesting fixes for cluster errors was uncommon, and the feature cut down the manual effort involved in error handling.
      • LLMs such as ChatGPT now fill a similar role, but the approach still holds value for internal company use where data privacy and proprietary infrastructure constraints apply.
This project reinforced the value of AI-driven automation in DevOps and gave me early experience with AI-powered troubleshooting, which shaped my understanding of how modern AI tools can enhance operational workflows.