🚀 Operating Kubernetes webhooks: a case study in preserving developer experience while reducing risk

This post shares a write-up of hands-on experience using Admission Controllers to stop problems in areas such as the cluster and Pods before they happen. [Source] https://medium.com/@sridharcloud/kubernetes-admission-controllers-how-20-webhooks-saved-my-production-cluster-104d930f87dc

On operating Kubernetes Admission Webhooks

1) Overview

  • Purpose: present a standard for systematically adopting and operating Admission Webhooks in order to strengthen the stability, security, and compliance of Kubernetes clusters.
  • Audience: platform engineering teams, SRE, security teams, and the deployment pipeline owners in each service team.
  • Scope: the main webhooks, including OPA Gatekeeper, cert-manager, Datadog (monitoring injection), Azure Policy/Workload Identity, and Linkerd (Service Mesh), plus the policy rollout and optimization procedures.

2) Background and problem definition

  • Distributed teams (6 teams / 30 people), uneven Kubernetes proficiency, and the limits of after-the-fact control built on documentation, training, and reminders.
  • Representative incident types:
    • Resource exhaustion: Node NotReady caused by Pods with no memory limit set.
    • Operational gaps: monitoring blind spots caused by missing labels/annotations.
    • Security/compliance: deviations from organizational policy and excessive manual audit cost.
    • Certificate expiry: renewals missed under manual management → service outage.
  • Key insight: the approach has to move from "people remembering 47 rules" to "automatic enforcement at the admission stage".

3) Goals

  1. Block early: validate or correct bad configuration automatically at the admission stage, before it reaches the cluster.
  2. Standardize: keep operating standards consistent across teams and services.
  3. Visibility and reliability: apply baseline quality attributes such as monitoring, security, and certificates automatically.
  4. Preserve developer productivity: minimize friction through gradual rollout, exceptions, and tuning.
  5. Protect performance: keep the deployment delay added by a growing number of webhooks within an acceptable range (3-4 seconds).

4) Overview of adopted solutions

4.1 OPA Gatekeeper (Policy as Code)

  • Role: validating webhook that declaratively enforces restrictions and standards (Constraints).
  • Core policies: Resource limits, Security Context, Network Policy, and similar.
  • Lesson: enforcing too much at once can conflict with things like service mesh injection → start simple, then tighten gradually.

4.2 cert-manager (Certificate Lifecycle)

  • Role: validate and mutate Certificate resources (webhook) and renew certificates automatically, removing the risk of expiry.
  • Effect: prevents service outages caused by expired certificates (a "forgetfulness prevention" layer).
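
To make the "forgetfulness prevention" idea concrete, here is a minimal sketch of a Certificate resource of the kind the cert-manager webhook validates. The names (`example-tls`, `example-issuer`, `example.internal`) and the 90-day/15-day values are assumptions chosen for illustration, not settings from the original environment.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-tls          # hypothetical name
  namespace: example         # hypothetical namespace
spec:
  secretName: example-tls    # Secret where the issued key pair is stored
  duration: 2160h            # 90-day lifetime (assumed value)
  renewBefore: 360h          # start renewal 15 days before expiry (assumed value)
  dnsNames:
    - example.internal       # hypothetical hostname
  issuerRef:
    name: example-issuer     # hypothetical ClusterIssuer
    kind: ClusterIssuer
```

With `renewBefore` set, renewal no longer depends on anyone remembering a calendar entry.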

4.3 Datadog Mutating Webhook (Monitoring-by-Default)

  • Role: automatically inject monitoring labels/annotations into Pods.
  • Effect: monitoring coverage rose from 60% to 95% within one month. Operational quality comes from the power of defaults.

4.4 Azure Policy / Workload Identity

  • Role: automatically enforce organizational security policy (validating) and automate authentication to Azure services (mutating).
  • Effect: removes weeks of manual work preparing for security audits and reduces authentication errors.
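
As a rough sketch of how a workload opts in to the Workload Identity mutating webhook: the ServiceAccount carries the client ID annotation and the Pod carries the `azure.workload.identity/use` label. The resource names, image, and the all-zero client ID below are placeholders, not values from the original setup.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: example-app                    # hypothetical name
  namespace: example                   # hypothetical namespace
  annotations:
    # Placeholder client ID of the Azure identity to federate with
    azure.workload.identity/client-id: "00000000-0000-0000-0000-000000000000"
---
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  namespace: example
  labels:
    azure.workload.identity/use: "true"   # tells the webhook to mutate this Pod
spec:
  serviceAccountName: example-app
  containers:
    - name: app
      image: registry.example.com/example-app:1.0.0   # hypothetical image
```

The webhook then adds the federated token volume and the related environment variables, so application teams do not wire them by hand.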

4.5 Linkerd Webhook (Service Mesh)

  • Components: Proxy Injector, Tap Injector, Service Profile Validator, Policy Validator.
  • Effect: developers get the benefits of the mesh through automatic injection and validation without having to understand its complexity.
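
For instance, opting a whole namespace in to proxy injection takes a single annotation. The namespace name below is hypothetical.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: example                  # hypothetical namespace
  annotations:
    linkerd.io/inject: enabled   # Proxy Injector adds the sidecar to new Pods here
```

A workload that must stay out of the mesh can set `linkerd.io/inject: disabled` on its Pod template, which is one way to implement the exception paths discussed in the design principles.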

5) Policy design principles

  1. Minimal function, clear responsibility: state each webhook's purpose and its inputs/outputs (the fields it changes).
  2. Design exceptions first: define the exception paths up front, such as system namespaces, legacy apps, and sidecar injection.
  3. Tier the policies: classify and apply them in three tiers, Must-Have → Really-Should-Have → Nice-to-Have.
  4. Visibility: violation messages must be clear and actionable (include a fix example and a reference link).
  5. Observability: continuously track violations, latency, and failure rate on dashboards and as KPIs.

6) Phased rollout (standard 4-week procedure)

  • Week 1 – Announce: communicate the policy's purpose, impact scope, fix guide, and support channel.
  • Week 2 – Warning mode: warn-only (no blocking); emit logs and alerts only, and give teams time to fix things themselves.
  • Week 3 – Enforcement mode: turn on blocking. By this point most violations have been resolved.
  • Week 4 – Optimize: tune Constraints and webhooks based on the violation patterns and edge cases actually observed.
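
With Gatekeeper, the move from week 2 to week 3 can be a one-line change to a Constraint's `enforcementAction`. The sketch below assumes a ConstraintTemplate that defines the kind `K8sRequiredResources` (matching the template name used in Appendix A); the constraint name and the excluded namespaces are illustrative.

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredResources          # assumed kind, defined by the ConstraintTemplate
metadata:
  name: require-memory-limits       # hypothetical name
spec:
  # Week 2: "warn" returns a warning to the client but admits the request.
  # Week 3: change to "deny" to start blocking.
  enforcementAction: warn
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces:             # exception paths defined up front (assumed list)
      - kube-system
      - gatekeeper-system
```

Keeping the change this small means the switch to enforcement goes through the same PR and review flow as any other policy change.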

7) Performance optimization strategy

  • Namespace targeting: run each webhook only in the namespaces where it is relevant.
  • Resource filtering: apply each hook only to the resource types that need it.
  • Timeout tuning: reduce from 30s to 5s (adjusted to the characteristics of the service).
  • FailurePolicy: choose Fail-Open or Fail-Closed according to the service impact.
  • Effect: in an environment running 20+ webhooks, deployment delay dropped from 8-10 seconds to 3-4 seconds.
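
The four levers above map onto fields of a webhook configuration. The following is a sketch, not the configuration from the original cluster: the webhook name, backing Service, opt-in label, and cert-manager Certificate reference are all hypothetical.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-policy-webhook                  # hypothetical name
  annotations:
    # Lets cert-manager fill in clientConfig.caBundle (hypothetical Certificate)
    cert-manager.io/inject-ca-from: platform-system/example-policy-webhook-cert
webhooks:
  - name: policy.example.com                    # hypothetical webhook name
    admissionReviewVersions: ["v1"]
    sideEffects: None
    timeoutSeconds: 5                           # timeout tuning
    failurePolicy: Ignore                       # fail-open; use Fail for critical policies
    namespaceSelector:                          # namespace targeting
      matchLabels:
        policy.example.com/enforce: "true"      # hypothetical opt-in label
    rules:                                      # resource filtering
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
        scope: Namespaced
    clientConfig:
      service:
        name: example-policy-webhook            # hypothetical Service
        namespace: platform-system              # hypothetical namespace
        path: /validate
        port: 443
```

`failurePolicy: Ignore` trades enforcement for availability: if the webhook is down or slow, requests are admitted unchecked, so it suits label injection better than security validation.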

8) Priorities (webhook policy tiers)

Must-Have

  • Resource Limits Validation: block workloads with no CPU/Memory limits → prevents resource exhaustion.
  • Security Context Policies: enforce security baselines such as runAsNonRoot and readOnlyRootFilesystem.
  • Network Policy Validation: guarantee segmentation at the namespace/service level.
  • Certificate Management: secure availability by automating certificate issuance and renewal.
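
As a reference point, a Pod that would pass the first two Must-Have checks could look like the sketch below. The names, image, and resource values are illustrative assumptions; appropriate limits depend on the workload.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-app                 # hypothetical name
  namespace: example                # hypothetical namespace
spec:
  securityContext:
    runAsNonRoot: true              # Pod-level: refuse to start containers as root
  containers:
    - name: app
      image: registry.example.com/example-app:1.0.0   # hypothetical image
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:                     # the fields the resource-limit policy checks
          cpu: 500m
          memory: 256Mi
      securityContext:
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
```

Linking a manifest like this from the violation message gives developers a concrete fix example, in line with the visibility principle.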

Really-Should-Have

  • Label/Annotation Injection: automatically inject tags for monitoring, routing, backup, and similar.
  • Image Policy Enforcement: block vulnerable or unapproved images (based on registry, signature, and SBOM criteria).
  • Service Mesh Injection: standardize the mesh through automatic sidecar injection.
  • Backup Annotation Injection: keep data persistence and backup policy consistent.

Nice-to-Have

  • Cost Allocation Labels: automated cost tagging makes chargeback possible per team/service.
  • Compliance Validation: verify regulatory compliance (mandatory items per industry).
  • Custom Business Logic: organization- or domain-specific policies (approvals, labels, naming rules, and so on).

9) Operating standards and governance

  • Change management: policy changes go through PR, code review, and staging validation, then roll out gradually.
  • Exception approval: temporary exceptions must state an expiry date, a reason, and a remediation plan, and must have compensating controls applied.
  • Incident response: define in advance the criteria and procedure for switching to Fail-Open when a policy false positive blocks deployments.
  • Metrics/KPIs:
    • Deployment delay (p50/p95), webhook timeout and failure rate
    • Number of policy violations and lead time to resolve them
    • Monitoring coverage, number of certificate-expiry incidents
    • Image policy violation rate, network policy compliance rate

10) Risks and mitigations

  • Development friction from excessive blocking → mitigate with warning mode, guides, and advance notice.
  • Conflicts with legacy apps and sidecars → use Namespace/Label-based exceptions and limit the scope.
  • Performance degradation → optimize targeting, filtering, Timeout, and FailurePolicy.
  • Growing operational complexity → standardize with a policy catalog, Runbooks, and dashboards.

11) Appendix A: Gatekeeper example skeleton

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredresources
# For spec.crd, spec.targets, rego, and so on, see the example body in the organization's standard repo

Example error message (validation failure)

admission webhook "validation.gatekeeper.sh" denied the request:
[denied by k8srequiredresources] Container must have memory limits set
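
For readers without access to the organization's repo, a filled-in template might look roughly like the following. This is an illustrative sketch, not the organization's actual template: the kind name `K8sRequiredResources` and the Rego rule are assumptions, the rule only inspects `spec.containers` (not init containers), and it uses the classic Rego syntax that Gatekeeper has traditionally accepted.

```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredresources
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredResources   # assumed kind; the lowercased kind must equal metadata.name
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredresources

        # Report a violation for every container without a memory limit.
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.resources.limits.memory
          msg := sprintf("Container %v must have memory limits set", [container.name])
        }
```

Including the container name in the message keeps the denial actionable when a Pod has several containers.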

12) Appendix B: standard webhook catalog (sample)

  • cert-manager-webhook: Certificate validation/mutation/automatic renewal
  • gatekeeper-webhook: OPA policy validation
  • datadog-webhook: automatic injection of monitoring labels/annotations
  • azure-policy-validating-webhook-configuration: organizational policy validation
  • azure-wi-webhook-mutating-webhook-configuration: Workload Identity injection
  • linkerd-proxy-injector-webhook-config: sidecar proxy injection
  • linkerd-tap-injector-webhook-config: Tap debugging injection
  • linkerd-sp-validator-webhook-config: service profile validation
  • linkerd-policy-validator-webhook-config: authorization policy validation

13) Conclusion

Admission Webhooks are an automation layer that addresses both human error and human forgetfulness. Holding to four pillars (a simple start, clear exceptions, gradual enforcement, and performance optimization) makes it possible to achieve good developer experience and platform reliability at the same time.