Allyson de Paula aa2bcfce46 aula-07 and aula-08: HA Talos cluster on Hetzner with Autoscaler

aula-07: Creating a custom Talos image on Hetzner Cloud
- Uses Talos Factory to generate an ARM64/AMD64 image
- Includes extensions: qemu-guest-agent, hcloud

aula-08: Provisioning a Talos Kubernetes cluster with OpenTofu
- 3 control planes in HA (CAX11 ARM64)
- 1 worker node (CAX11 ARM64)
- Private network, Floating IP, Firewall
- Cluster Autoscaler for Hetzner (0-5 extra workers)
- Interactive setup with prerequisite validation
- Estimated cost: ~€18/month (base)

Also includes:
- .gitignore to ignore sensitive files
- CLAUDE.md with project instructions

2025-12-27 07:12:58 -03:00

4.7 KiB

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a workshop repository for teaching Docker and Kubernetes concepts, specifically focusing on container health checks and liveness probes. It contains a deliberately "buggy" Node.js app that hangs after a configurable number of requests to demonstrate how container orchestration handles unhealthy containers.

Repository Structure

aula-01/: Docker Compose lesson - basic container deployment with restart policies
aula-02/: Kubernetes lesson - deployment with liveness probes and ConfigMaps
aula-03/: Kubernetes lesson - high availability with replicas and readiness probes
aula-04/: Kubernetes lesson - NGINX Ingress with Keep Request (Lua) for zero-downtime
aula-05/: Kubernetes lesson - KEDA + Victoria Metrics for metrics-based auto-scaling
aula-06/: Kubernetes lesson - n8n deployment via Helm with Queue Mode (workers, webhooks, PostgreSQL, Redis)
aula-07/: Talos Linux - creating custom Talos image for Hetzner Cloud
aula-08/: OpenTofu - provisioning HA Talos Kubernetes cluster on Hetzner Cloud

Running the Examples

Aula 01 (Docker Compose)

cd aula-01
docker-compose up

The app runs on port 3000. After MAX_REQUESTS (default 3), the app stops responding.

Aula 02 (Kubernetes)

cd aula-02
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

Access via NodePort 30080. The liveness probe at /health will detect when the app hangs and restart the container.
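The restart decision can be modeled in a few lines. This is a minimal sketch, not the kubelet's code: it assumes a probe that times out counts as a failure and uses the Kubernetes default failureThreshold of 3. The values actually set in deployment.yaml may differ, and the function name restartTriggeredAt is hypothetical.

```javascript
// Model of the liveness decision: the container is restarted once
// `failureThreshold` probes in a row have failed (a timeout counts as a failure).
// probeResults: array of booleans, true = probe succeeded.
// Returns the index of the probe that triggers the restart, or -1 if none does.
function restartTriggeredAt(probeResults, failureThreshold = 3) {
  let consecutiveFailures = 0;
  for (let i = 0; i < probeResults.length; i++) {
    // Any success resets the streak; only consecutive failures count.
    consecutiveFailures = probeResults[i] ? 0 : consecutiveFailures + 1;
    if (consecutiveFailures >= failureThreshold) return i;
  }
  return -1;
}

// The app answers two probes, then hangs: the third failure triggers the restart.
console.log(restartTriggeredAt([true, true, false, false, false])); // 4
```

A single slow response does not restart the container in this model; only an unbroken run of failures does, which is why a hung app is detected a few probe periods after it stops responding.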

Aula 03 (Kubernetes - High Availability)

cd aula-03
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

Builds on Aula 02 with multiple replicas and a readiness probe. When one pod hangs, the others continue serving requests: the readiness probe takes the unhealthy pod out of the Service's endpoints once its checks fail, and the liveness probe restarts it.
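A rough model of why extra replicas plus readiness keep the app reachable is sketched here. It assumes a simple round-robin over ready pods, which is an illustration only (real Service load balancing depends on the kube-proxy mode); the pod names and the routeRequests function are hypothetical.

```javascript
// Model of a Service that only routes to pods currently marked ready.
// pods: array of { name, ready }. Returns the pod name chosen for each
// request, or null when no pod is ready.
function routeRequests(pods, requestCount) {
  const chosen = [];
  let next = 0;
  for (let i = 0; i < requestCount; i++) {
    const ready = pods.filter((p) => p.ready);
    if (ready.length === 0) {
      chosen.push(null); // nothing to route to: the request fails
      continue;
    }
    chosen.push(ready[next % ready.length].name);
    next += 1;
  }
  return chosen;
}

// Pod "b" is hung and failing its readiness probe; traffic skips it.
const pods = [
  { name: "a", ready: true },
  { name: "b", ready: false },
  { name: "c", ready: true },
];
console.log(routeRequests(pods, 4)); // [ 'a', 'c', 'a', 'c' ]
```

With a single replica the same model returns null for every request while the pod is unready, which is the gap that the additional replicas close.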

Aula 04 (Kubernetes - NGINX Ingress with Keep Request)

Requires NGINX Ingress Controller with Lua support.

cd aula-04
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ingress-nginx.yaml

Access via NGINX Ingress. The Keep Request pattern uses Lua to hold requests when backends are unavailable, waiting up to 99s for a pod to become ready instead of returning 503 immediately. This eliminates user-visible failures during pod restarts.
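The timing of the Keep Request pattern can be sketched as a small simulation. This is not the Lua code from ingress-nginx.yaml: it only models the decision "hold and retry until a backend is ready or the 99s budget runs out". The 1s retry interval and the name simulateKeepRequest are assumptions made for illustration.

```javascript
// Simulates holding one request while waiting for a backend.
// backendReadyAtMs: how long after the request arrives a pod becomes ready.
// Returns the status the client finally sees and how long it was held.
function simulateKeepRequest(
  backendReadyAtMs,
  { timeoutMs = 99_000, retryIntervalMs = 1_000 } = {}
) {
  for (let waitedMs = 0; waitedMs <= timeoutMs; waitedMs += retryIntervalMs) {
    // Re-check the backend on every retry tick.
    if (waitedMs >= backendReadyAtMs) return { status: 200, waitedMs };
  }
  // Budget exhausted: only now does the client get an error.
  return { status: 503, waitedMs: timeoutMs };
}

console.log(simulateKeepRequest(2_500));   // { status: 200, waitedMs: 3000 }
console.log(simulateKeepRequest(120_000)); // { status: 503, waitedMs: 99000 }
```

In this model a pod restart that completes within the budget costs the client latency rather than an error, which is the trade-off the pattern makes.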

Aula 05 (Kubernetes - KEDA Auto-scaling)

cd aula-05
./setup.sh

Installs Victoria Metrics (metrics collection), KEDA (event-driven autoscaling), and NGINX Ingress. The ScaledObject monitors metrics like unavailable pods and restart counts, automatically scaling the deployment from 5 to 30 replicas based on demand.
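The replica bounds can be illustrated with a simplified version of the autoscaling arithmetic. This sketch assumes an average-value style target (total metric divided by a per-replica target, rounded up) and ignores tolerances, cooldowns, and stabilization windows. The metric values and the name desiredReplicas are hypothetical; only the 5 and 30 bounds come from the text above.

```javascript
// Simplified scaling rule: enough replicas that each one handles at most
// `targetPerReplica` of the metric, clamped to the configured bounds.
function desiredReplicas(metricTotal, targetPerReplica, { min = 5, max = 30 } = {}) {
  const raw = Math.ceil(metricTotal / targetPerReplica);
  return Math.min(max, Math.max(min, raw));
}

console.log(desiredReplicas(0, 10));    // 5  (never below the minimum)
console.log(desiredReplicas(120, 10));  // 12
console.log(desiredReplicas(1000, 10)); // 30 (capped at the maximum)
```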

Aula 06 (Kubernetes - n8n via Helm)

cd aula-06
./setup.sh

Deploys the n8n workflow automation platform via Helm chart with a Queue Mode architecture: main node, workers (2-5 replicas with HPA), webhooks (1-3 replicas with HPA), PostgreSQL, and Redis. Access via http://n8n.localhost (requires NGINX Ingress).

Aula 07 (Talos Linux - Custom Image)

Follow the instructions in aula-07/README.md to create a custom Talos Linux image on Hetzner Cloud using Talos Factory. This is a prerequisite for Aula 08.

Aula 08 (OpenTofu - Talos Cluster on Hetzner Cloud)

cd aula-08
./setup.sh

Provisions a full HA Kubernetes cluster on Hetzner Cloud using OpenTofu:

3x Control Plane nodes (CAX11 ARM64)
1x Worker node (CAX11 ARM64)
Private network, Floating IP, Firewall
Cluster Autoscaler support (1-5 workers)
Estimated cost: ~€18/month (base), up to ~€33/month with max autoscaling

Prerequisites:

OpenTofu (brew install opentofu)
talosctl (brew install siderolabs/tap/talosctl)
kubectl
Hetzner Cloud API token
Talos image ID from Aula 07

Optional - Enable cluster autoscaling:

./install-autoscaler.sh

This installs the Kubernetes Cluster Autoscaler configured for Hetzner Cloud, automatically scaling workers from 1 to 5 based on pending pods.
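The two cost figures quoted above can be turned into a rough per-worker estimate. This sketch assumes the ~€33 figure corresponds to 5 workers in total and that cost grows linearly with worker count. The per-worker price is derived from those two numbers, not taken from Hetzner's price list, and the name estimateMonthlyCost is hypothetical.

```javascript
// Figures taken from the estimate above (approximate, EUR per month).
const BASE_MONTHLY_EUR = 18; // base cluster with 1 worker
const MAX_MONTHLY_EUR = 33;  // assumed to mean 5 workers in total
const MIN_WORKERS = 1;
const MAX_WORKERS = 5;

// Derived, not official: (33 - 18) / (5 - 1) = 3.75 EUR per extra worker.
const EUR_PER_EXTRA_WORKER =
  (MAX_MONTHLY_EUR - BASE_MONTHLY_EUR) / (MAX_WORKERS - MIN_WORKERS);

function estimateMonthlyCost(workers) {
  if (!Number.isInteger(workers) || workers < MIN_WORKERS || workers > MAX_WORKERS) {
    throw new RangeError(`workers must be an integer from ${MIN_WORKERS} to ${MAX_WORKERS}`);
  }
  return BASE_MONTHLY_EUR + (workers - MIN_WORKERS) * EUR_PER_EXTRA_WORKER;
}

console.log(estimateMonthlyCost(3)); // 25.5
```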

To destroy the infrastructure: ./cleanup.sh

App Behavior

The Node.js app (app.js) is intentionally designed to:

Accept requests normally until MAX_REQUESTS is reached
Stop responding (hang) after the limit, simulating a crashed but running process
Stop answering the /health endpoint as well once the app is "stuck"

This behavior demonstrates why process-level monitoring (restart: always) is insufficient and why application-level health checks (liveness probes) are necessary.
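The behavior described above could be reproduced with a handler along these lines. This is a hypothetical sketch, not the repository's app.js: the name createHandler is invented here, and it assumes that /health checks do not count toward MAX_REQUESTS.

```javascript
// Limit comes from the environment, defaulting to 3 as documented.
const MAX_REQUESTS = Number(process.env.MAX_REQUESTS ?? 3);

// Builds a request handler that goes silent after `maxRequests` requests.
function createHandler(maxRequests) {
  let served = 0;
  return function handle(req, res) {
    if (served >= maxRequests) {
      // "Stuck": never write a response. The process stays alive, so a
      // process-level restart policy sees nothing wrong, but every client
      // (including the /health probe) times out.
      return;
    }
    if (req.url === "/health") {
      res.end("ok"); // assumption: health checks are not counted
      return;
    }
    served += 1;
    res.end(`request ${served} of ${maxRequests}\n`);
  };
}

// To serve it on the documented port:
// require("node:http").createServer(createHandler(MAX_REQUESTS)).listen(3000);
```

Because the handler simply returns without ending the response, the connection stays open until the caller gives up, which is what a probe timeout detects.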

Environment Variables

MAX_REQUESTS: Number of requests before the app hangs (default: 3)
