kade.im
Illegal Content Website Collection and Analysis

Project Years: 2023
Tags: AI, Infra, BigData, Distributed System
Skills: Docker, Python, Java, Celery, Selenium, Postgres

Project Overview

A large-scale project, carried out in collaboration with several public and private entities, to track and analyze illegal content websites.

Key Contributions

  • Conducted preliminary research on AI-based web scraping technologies.
  • Drafted proposal documents and presentation materials after analyzing requirements.
  • Developed a distributed data collection system using RabbitMQ and Python Celery.
  • Created algorithms for automatically tracking over 300 illegal website domains.
  • Designed APIs to check whether illegal websites appear in Google and Naver search results.
  • Proposed new UI features and designed database schemas (ERD).
  • Implemented PostgreSQL integration using Peewee ORM.
  • Refactored legacy service code, reducing code size by 80%.
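
To make the search-visibility item above more concrete, here is a minimal sketch of one way such a check could work. It is not the project's actual API: the function names `normalize_host` and `visibility`, the return shape, and the example domains are all hypothetical, and the result links are assumed to have been fetched from a search engine by some other component.

```python
from urllib.parse import urlsplit


def normalize_host(url_or_host: str) -> str:
    """Return a lowercase hostname without a leading 'www.' (hypothetical helper)."""
    text = url_or_host.strip().lower()
    # urlsplit only fills in the hostname when a scheme separator is present.
    if "://" not in text:
        text = "//" + text
    host = urlsplit(text).hostname or ""
    return host[4:] if host.startswith("www.") else host


def visibility(tracked_domain: str, result_urls: list[str]) -> dict:
    """Report whether tracked_domain (or one of its subdomains) is in result_urls.

    Assumption: result_urls is an ordered list of result links that another
    component already collected from a search engine; no network call is made.
    """
    target = normalize_host(tracked_domain)
    if target:
        for rank, url in enumerate(result_urls, start=1):
            host = normalize_host(url)
            # Match the domain itself or any subdomain, but not lookalike names.
            if host == target or host.endswith("." + target):
                return {"domain": target, "visible": True, "rank": rank}
    return {"domain": target, "visible": False, "rank": None}
```

Calling `visibility("bad-site.example", links)` with the ordered links from one results page returns the 1-based rank of the first matching link, or `rank` set to `None` when the domain does not appear. Keeping the fetch step separate from the matching step means the same check can be reused for any search engine.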
 

Achievements

  • Contributed to winning project bids and took a lead role in ensuring code quality for one of the two teams.