[Dr.Kube] Official Wiki
This page is the comprehensive documentation defining the vision, technical direction, and collaboration methods for Dr.Kube. It is the official guideline for seamless collaboration between the team and external contributors.
1. Project Overview
- Purpose: An autonomous AI agent that analyzes "dying messages" from failing clusters to diagnose root causes and deliver actionable remediation guidelines.
- Background / Introduction: Dr.Kube aims to reduce Mean Time to Recovery (MTTR), a persistent challenge when failures occur in Kubernetes environments. LangGraph-based reasoning loops identify the root cause, and the system delivers immediate remediation guidelines to operators through messaging platforms such as Slack. Commands run only after explicit operator approval (human-in-the-loop), which is the basis of the "Safe AIOps" ecosystem the project is building.
- Core Values:
  - Safe AIOps: safe automated recovery based on human-in-the-loop approval
  - GitOps First: all changes go through Git only; no direct cluster modifications
  - Observability: root cause analysis based on four signals (Metrics, Logs, Traces, Profiles)
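The Safe AIOps value above can be illustrated with a small approval gate. This is a minimal sketch and not Dr.Kube's actual implementation: the names `RemediationProposal`, `ApprovalGate`, and `Decision` are hypothetical, and the "execution" step only records the action instead of touching a cluster.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class RemediationProposal:
    """A fix suggested by the agent (hypothetical structure)."""
    alert_name: str
    root_cause: str
    action: str
    decision: Decision = Decision.PENDING


@dataclass
class ApprovalGate:
    """Runs a proposed action only after an operator has approved it."""
    executed: list[str] = field(default_factory=list)

    def review(self, proposal: RemediationProposal, approved: bool) -> RemediationProposal:
        # In a real system this decision would come from an operator,
        # for example through a Slack interaction (assumption).
        proposal.decision = Decision.APPROVED if approved else Decision.REJECTED
        return proposal

    def execute(self, proposal: RemediationProposal) -> bool:
        # Pending and rejected proposals are never executed.
        if proposal.decision is not Decision.APPROVED:
            return False
        # Placeholder for the real action: only record what would run.
        self.executed.append(proposal.action)
        return True
```

A proposal that has not been reviewed stays in `PENDING`, so calling `execute` on it returns `False` and nothing is recorded. Defaulting to "do nothing" keeps the agent safe even if the approval step is skipped by mistake.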
2. The Team
Roles and responsibilities of the team members.
|Name|ID|Role|SNS|Responsibilities|
|---|---|---|---|---|
|백종화|@jonghwa|Team Leader|LinkedIn|Roadmap & final decision-making|
|김태빈|@taebin|Member|LinkedIn|Member|
|박승규|@seunggyu|Member|LinkedIn|Member|
|์ ์ง์น|@jinseung|Member|LinkedIn|Member|
|์์ฌํ|@jaehoon|Member|LinkedIn|Member|
3. Tech Stack
- Language: Python 3.11+
- Framework: LangGraph, FastAPI
- LLM: Google Gemini Flash, Ollama (local fallback)
- Infra: Kubernetes (Kind), ArgoCD, Docker, Helm, Chaos Mesh
- Observability: Prometheus, Grafana, Loki, Tempo, Pyroscope, Alloy
- Security: SOPS + age, cert-manager (Let's Encrypt)
- Communication: Slack, GitHub Issues, Discord
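The stack lists Ollama as a local fallback for Gemini Flash. One way such a fallback could be wired is sketched below. This is an assumption about the pattern, not the project's code: the backends are plain callables, and `ask_with_fallback` and `AllBackendsFailed` are hypothetical names. Real Gemini or Ollama client calls would be wrapped in functions with the same `str -> str` shape.

```python
from collections.abc import Callable, Sequence

# A backend takes a prompt and returns the model's text answer.
LLMBackend = Callable[[str], str]


class AllBackendsFailed(RuntimeError):
    """Raised when every configured backend raised an error."""


def ask_with_fallback(
    prompt: str,
    backends: Sequence[tuple[str, LLMBackend]],
) -> tuple[str, str]:
    """Try each backend in order; return (backend_name, answer) from the first that succeeds."""
    errors: list[str] = []
    for name, backend in backends:
        try:
            return name, backend(prompt)
        except Exception as exc:  # any failure moves on to the next backend
            errors.append(f"{name}: {exc}")
    raise AllBackendsFailed("; ".join(errors))


# Stand-in backends for illustration only; they do not call any real API.
def fake_gemini(prompt: str) -> str:
    raise TimeoutError("remote API unreachable")


def fake_ollama(prompt: str) -> str:
    return f"local answer to: {prompt}"
```

Calling `ask_with_fallback("why is the pod crashing?", [("gemini", fake_gemini), ("ollama", fake_ollama)])` returns the answer from the second backend, along with its name so the caller can log which model produced the diagnosis.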
4. Roadmap
- Phase 1: Build the observability stack and complete the alert system
- Phase 2: Develop the LangGraph agent workflow and run chaos tests
- Phase 3: Human-in-the-loop (HITL) feedback loop and end-to-end (E2E) integration tests
5. How to Contribute
- Branch Strategy:
  - The main branch is the default; ArgoCD auto-syncs from it.
  - Create feature branches for work and merge via PR.
- Issues: Please use GitHub Issues for bug reports and feature requests.
- PRs: All pull requests require review by the Team Leader (@b100to) before merging.
- GitOps Principles:
  - No direct cluster modifications such as kubectl apply/patch; all changes go through Git only.
  - Edit values/*.yaml → Create PR → ArgoCD Sync
- Guide: Please refer to the CONTRIBUTING.md file.
- Discord (Official): [Dr.Kube Invite Link]
  - Official channel for real-time communication and technical support.
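The GitOps principle above (no direct kubectl apply/patch) could be enforced with a simple command check, for example in a pre-commit hook or in front of the agent. The sketch below is illustrative only: the function name `is_direct_cluster_change` is hypothetical, and the verb list is an assumed, non-exhaustive set of kubectl subcommands that change cluster state.

```python
import shlex

# Assumed, non-exhaustive set of kubectl subcommands that modify the cluster.
MUTATING_VERBS = {"apply", "patch", "delete", "edit", "replace", "scale", "create"}


def is_direct_cluster_change(command: str) -> bool:
    """Return True if the command looks like a kubectl call that mutates the cluster.

    The check is deliberately conservative: any token matching a mutating verb
    after `kubectl` flags the command, even if it is only a resource name.
    """
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "kubectl":
        return False
    return any(token in MUTATING_VERBS for token in tokens[1:])
```

With this check, `kubectl apply -f deploy.yaml` is flagged while `kubectl get pods -A` passes. Flagged changes would instead go through an edit to values/*.yaml, a PR, and an ArgoCD sync.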
6. Resources & Links
- GitHub Repository: dr-kube/dr-kube
- Docs: Architecture / Roadmap / Changelog