Aula 08 - HA Kubernetes cluster:
- Interactive setup with OpenTofu for Talos on Hetzner
- CCM, CSI Driver, Cluster Autoscaler, Metrics Server
- NGINX Ingress with LoadBalancer (HTTP/HTTPS/SSH)
Aula 09 - n8n on Hetzner:
- Deploy via Helm with PostgreSQL and Redis
- Multi-tenant support with add-client.sh
- Integration with Hetzner CSI for persistent volumes
Aula 10 - GitLab on Hetzner:
- Agnostic setup: Cloudflare (trusted proxies) or Let's Encrypt
- Anti-affinity to spread webservice/sidekiq across different nodes
- Container Registry and SSH via TCP passthrough
- Documentation of the 422 error and its fix with trustedCIDRsForXForwardedFor
General improvements:
- READMEs updated with architecture and troubleshooting
- cleanup.sh scripts for all lessons
- CLAUDE.md updated with project context
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
This is a workshop repository for teaching Docker and Kubernetes concepts, specifically focusing on container health checks and liveness probes. It contains a deliberately "buggy" Node.js app that hangs after a configurable number of requests to demonstrate how container orchestration handles unhealthy containers.
Repository Structure
- aula-01/: Docker Compose lesson - basic container deployment with restart policies
- aula-02/: Kubernetes lesson - deployment with liveness probes and ConfigMaps
- aula-03/: Kubernetes lesson - high availability with replicas and readiness probes
- aula-04/: Kubernetes lesson - NGINX Ingress with Keep Request (Lua) for zero-downtime
- aula-05/: Kubernetes lesson - KEDA + Victoria Metrics for metrics-based auto-scaling
- aula-06/: Kubernetes lesson - n8n deployment via Helm (LOCAL environment - Docker Desktop, minikube, kind)
- aula-07/: Talos Linux - creating custom Talos image for Hetzner Cloud
- aula-08/: OpenTofu - provisioning HA Talos Kubernetes cluster on Hetzner Cloud with CCM and LoadBalancer
- aula-09/: Kubernetes lesson - n8n deployment via Helm (Hetzner Cloud with CSI Driver and multi-tenant support)
- aula-10/: Kubernetes lesson - GitLab deployment via Helm with Container Registry and SSH
Running the Examples
Aula 01 (Docker Compose)
cd aula-01
docker-compose up
The app runs on port 3000. After MAX_REQUESTS requests (default 3), the app stops responding.
Aula 02 (Kubernetes)
cd aula-02
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
Access via NodePort 30080. The liveness probe at /health will detect when the app hangs and restart the container.
Aula 03 (Kubernetes - High Availability)
cd aula-03
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
Builds on Aula 02 with multiple replicas and a readiness probe. When one pod hangs, the others continue serving requests. Once the readiness probe fails, the unhealthy pod is removed from the Service endpoints so it stops receiving traffic, while the liveness probe restarts its container.
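The split between the two probes can be modeled with a small simulation. This is a simplified sketch, not how the kubelet is implemented: it assumes both probes share one tick and one failure counter, and the threshold values (1 for readiness, 3 for liveness) are illustrative assumptions, not values taken from the lesson's deployment.yaml.

```javascript
// Simplified model: each entry in `results` is one probe tick (true = healthy).
// The thresholds are hypothetical defaults chosen for illustration only.
function simulateProbes(results, { readinessThreshold = 1, livenessThreshold = 3 } = {}) {
  let consecutiveFailures = 0;
  const events = [];
  results.forEach((ok, tick) => {
    consecutiveFailures = ok ? 0 : consecutiveFailures + 1;
    if (consecutiveFailures === readinessThreshold) {
      // Readiness reacts first: the pod stops receiving traffic.
      events.push({ tick, action: "remove-from-service" });
    }
    if (consecutiveFailures === livenessThreshold) {
      // Liveness reacts later: the container is restarted.
      events.push({ tick, action: "restart-container" });
    }
  });
  return events;
}

console.log(simulateProbes([true, false, false, false]));
```

Under these assumptions the pod is taken out of rotation on the first failed tick and restarted only after three consecutive failures, which is why the other replicas must absorb traffic in between.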
Aula 04 (Kubernetes - NGINX Ingress with Keep Request)
Requires NGINX Ingress Controller with Lua support.
cd aula-04
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ingress-nginx.yaml
Access via NGINX Ingress. The Keep Request pattern uses Lua to hold requests when backends are unavailable, waiting up to 99s for a pod to become ready instead of returning 503 immediately. This avoids user-visible failures during pod restarts, provided a pod becomes ready within that window.
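The hold-then-forward idea can be illustrated with a virtual-time model. The lesson implements this in Lua inside NGINX; the JavaScript below is only a sketch of the decision logic. The function name `keepRequest` and the 1-second poll interval are assumptions, and only the 99s limit comes from the text above.

```javascript
// Virtual-time model of holding a request until a backend is ready.
// readyAtSec: second at which a pod becomes ready (Infinity = never).
// pollSec is a hypothetical polling interval, not taken from the lesson.
function keepRequest(readyAtSec, maxWaitSec = 99, pollSec = 1) {
  for (let waited = 0; waited <= maxWaitSec; waited += pollSec) {
    if (waited >= readyAtSec) {
      // A backend became ready while the request was held: forward it.
      return { status: 200, waitedSec: waited };
    }
  }
  // No backend became ready within the window: give up with 503.
  return { status: 503, waitedSec: maxWaitSec };
}

console.log(keepRequest(12));  // pod ready after 12s: request succeeds late
console.log(keepRequest(150)); // pod ready too late: request fails after 99s
```

The trade-off is latency for availability: a client sees a slow response instead of an error, as long as recovery fits inside the wait window.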
Aula 05 (Kubernetes - KEDA Auto-scaling)
cd aula-05
./setup.sh
Installs Victoria Metrics (metrics collection), KEDA (event-driven autoscaling), and NGINX Ingress. The ScaledObject monitors metrics like unavailable pods and restart counts, automatically scaling the deployment from 5 to 30 replicas based on demand.
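As a rough illustration of how a metric value maps to a replica count within the 5 to 30 range, the sketch below applies the usual average-value rule (metric divided by a per-replica target, rounded up) and clamps the result. The function name and the target value of 10 are assumptions; the actual triggers and thresholds live in the lesson's ScaledObject.

```javascript
// Sketch of the replica calculation, clamped to the range named in the text.
// `targetPerReplica` is a hypothetical threshold, not the lesson's value.
function desiredReplicas(metricValue, targetPerReplica, min = 5, max = 30) {
  const raw = Math.ceil(metricValue / targetPerReplica);
  return Math.min(max, Math.max(min, raw));
}

console.log(desiredReplicas(0, 10));    // idle: stays at the minimum
console.log(desiredReplicas(120, 10));  // moderate load: scales proportionally
console.log(desiredReplicas(1000, 10)); // heavy load: capped at the maximum
```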
Aula 06 (Kubernetes - n8n via Helm - LOCAL)
cd aula-06
./setup.sh
Deploys n8n workflow automation platform via Helm chart in a LOCAL Kubernetes cluster (Docker Desktop, minikube, kind). Queue Mode architecture with main node, workers (2-5 replicas with HPA), webhooks (1-3 replicas with HPA), PostgreSQL, and Redis. Access via http://n8n.localhost (requires NGINX Ingress).
Aula 07 (Talos Linux - Custom Image)
Follow the instructions in aula-07/README.md to create a custom Talos Linux image on Hetzner Cloud using Talos Factory. This is a prerequisite for Aula 08.
Aula 08 (OpenTofu - Talos Cluster on Hetzner Cloud)
cd aula-08
./setup.sh
Provisions a full HA Kubernetes cluster on Hetzner Cloud using OpenTofu:
- 3x Control Plane nodes (CAX11 ARM64)
- 1x Worker node (CAX11 ARM64)
- Private network, Floating IP, Firewall
- Cluster Autoscaler support (1-5 workers)
- Estimated cost: ~€18/month (base), up to ~€33/month with max autoscaling
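The two cost figures can be cross-checked with simple arithmetic. Assuming cost grows linearly with node count (4 nodes at base, 8 nodes at maximum autoscaling) and everything else is a fixed overhead, the sketch below derives the per-node and fixed amounts implied by the estimates above. These are implied values from this document's rounded numbers, not Hetzner list prices.

```javascript
// Derives the per-node and fixed monthly cost implied by two estimates,
// under the assumption that total = fixed + perNode * nodes.
function impliedCosts(base, max) {
  const perNode = (max.total - base.total) / (max.nodes - base.nodes);
  const fixed = base.total - perNode * base.nodes;
  return { perNode, fixed };
}

// 3 control planes + 1 worker = 4 nodes; 3 control planes + 5 workers = 8 nodes.
console.log(impliedCosts({ nodes: 4, total: 18 }, { nodes: 8, total: 33 }));
```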
Prerequisites:
- OpenTofu (brew install opentofu)
- talosctl (brew install siderolabs/tap/talosctl)
- kubectl
- Hetzner Cloud API token
- Talos image ID from Aula 07
Optional - Enable cluster autoscaling:
./install-autoscaler.sh
This installs the Kubernetes Cluster Autoscaler configured for Hetzner Cloud, automatically scaling workers from 1 to 5 based on pending pods.
Optional - Install Hetzner Cloud Controller Manager and NGINX Ingress with LoadBalancer:
./install-ccm.sh
./install-nginx-ingress.sh
This enables automatic LoadBalancer provisioning and exposes HTTP/HTTPS/SSH via a single Hetzner LoadBalancer (~€5/month).
To destroy the infrastructure: ./cleanup.sh
Aula 09 (Kubernetes - n8n via Helm - Hetzner Cloud)
cd aula-09
export KUBECONFIG=/path/to/aula-08/kubeconfig
./setup.sh
Deploys n8n workflow automation platform via Helm chart on Hetzner Cloud. Installs Hetzner CSI Driver for persistent volumes (10Gi minimum). Includes multi-tenant support with add-client.sh script for provisioning clients in separate namespaces.
Prerequisites:
- Completed Aula 08 (Talos cluster on Hetzner)
- Hetzner Cloud API token
Aula 10 (Kubernetes - GitLab via Helm)
cd aula-10
./setup.sh
Deploys GitLab via official Helm chart with:
- Web UI at git.kube.quest
- Container Registry at registry.git.kube.quest
- SSH access via port 22 (TCP passthrough through NGINX)
- PostgreSQL, Redis, and MinIO for storage
- Resource requests of ~4GB to occupy one dedicated CAX11 worker
Prerequisites:
- Completed Aula 08 (Talos cluster)
- Hetzner CSI Driver (Aula 09)
- Hetzner CCM and NGINX Ingress with LoadBalancer (Aula 08)
- DNS configured pointing to LoadBalancer IP
To remove: ./cleanup.sh
App Behavior
The Node.js app (app.js) is intentionally designed to:
- Accept requests normally until MAX_REQUESTS is reached
- Stop responding (hang) after the limit, simulating a crashed but running process
- The /health endpoint also stops responding when the app is "stuck"
This behavior demonstrates why process-level monitoring (restart: always) is insufficient and why application-level health checks (liveness probes) are necessary.
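A minimal sketch of this behavior is shown below. It is not the repository's app.js: the `createHandler` name is hypothetical, and the choice to exclude /health calls from the request count is an assumption made so that probes alone do not trigger the hang.

```javascript
// Returns a Node.js-style request handler that answers normally until
// `maxRequests` regular requests were served, then never answers again.
function createHandler(maxRequests) {
  let served = 0;
  return function handler(req, res) {
    if (served >= maxRequests) {
      // Simulate a stuck process: leave the connection open and never reply.
      // This applies to /health too, so probes start timing out.
      return;
    }
    // Assumption: health checks do not count toward the limit.
    if (req.url !== "/health") served += 1;
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end(req.url === "/health" ? "ok\n" : `request ${served} of ${maxRequests}\n`);
  };
}
```

Wiring it to a server would look like `require("http").createServer(createHandler(3)).listen(3000)`. Because the process stays alive while hung, a restart policy never fires; only a probe that times out on /health can detect the failure.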
Environment Variables
MAX_REQUESTS: Number of requests before the app hangs (default: 3)