aula-07 and aula-08: HA Talos cluster on Hetzner with Autoscaler

aula-07: Custom Talos image creation on Hetzner Cloud
- Uses Talos Factory to generate an ARM64/AMD64 image
- Includes extensions: qemu-guest-agent, hcloud

aula-08: Talos Kubernetes cluster provisioning via OpenTofu
- 3 Control Planes in HA (CAX11 ARM64)
- 1 Worker Node (CAX11 ARM64)
- Private network, Floating IP, Firewall
- Cluster Autoscaler for Hetzner (0-5 extra workers)
- Interactive setup with prerequisite validation
- Estimated cost: ~€18/month (base)

Also includes:
- .gitignore to ignore sensitive files
- CLAUDE.md with project instructions
114
CLAUDE.md
Normal file
@@ -0,0 +1,114 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a workshop repository for teaching Docker and Kubernetes concepts, specifically focusing on container health checks and liveness probes. It contains a deliberately "buggy" Node.js app that hangs after a configurable number of requests to demonstrate how container orchestration handles unhealthy containers.

## Repository Structure

- **aula-01/**: Docker Compose lesson - basic container deployment with restart policies
- **aula-02/**: Kubernetes lesson - deployment with liveness probes and ConfigMaps
- **aula-03/**: Kubernetes lesson - high availability with replicas and readiness probes
- **aula-04/**: Kubernetes lesson - NGINX Ingress with Keep Request (Lua) for zero-downtime
- **aula-05/**: Kubernetes lesson - KEDA + Victoria Metrics for metrics-based auto-scaling
- **aula-06/**: Kubernetes lesson - n8n deployment via Helm with Queue Mode (workers, webhooks, PostgreSQL, Redis)
- **aula-07/**: Talos Linux - creating custom Talos image for Hetzner Cloud
- **aula-08/**: OpenTofu - provisioning HA Talos Kubernetes cluster on Hetzner Cloud

## Running the Examples

### Aula 01 (Docker Compose)
```bash
cd aula-01
docker-compose up
```
The app runs on port 3000. After `MAX_REQUESTS` requests (default 3), the app stops responding.

### Aula 02 (Kubernetes)
```bash
cd aula-02
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
```
Access via NodePort 30080. The liveness probe at `/health` will detect when the app hangs and restart the container.

### Aula 03 (Kubernetes - High Availability)
```bash
cd aula-03
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
```
Builds on Aula 02 with multiple replicas and a readiness probe. When one pod hangs, the others continue serving requests. The readiness probe removes an unhealthy pod from the Service endpoints once the probe starts failing, while the liveness probe restarts the pod's container.

### Aula 04 (Kubernetes - NGINX Ingress with Keep Request)
Requires NGINX Ingress Controller with Lua support.

```bash
cd aula-04
kubectl apply -f configmap.yaml
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml
kubectl apply -f ingress-nginx.yaml
```
Access via NGINX Ingress. The Keep Request pattern uses Lua to hold requests when backends are unavailable, waiting up to 99s for a pod to become ready instead of returning 503 immediately. This eliminates user-visible failures during pod restarts.
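
The hold-and-wait idea behind Keep Request can be sketched in a few lines. The lesson implements it in Lua inside NGINX; the JavaScript below is only an illustration of the control flow, and `holdUntilReady`, `isBackendReady`, and the polling interval are hypothetical names and values, not taken from the repository. Only the 99s limit comes from the text above.

```javascript
// Illustrative sketch of the "hold the request until a backend is ready" idea.
// Not the Lua code used in aula-04; all names here are hypothetical.
async function holdUntilReady(isBackendReady, { timeoutMs = 99_000, pollMs = 1_000 } = {}) {
  const deadline = Date.now() + timeoutMs;

  for (;;) {
    // A ready backend means the held request can be forwarded.
    if (await isBackendReady()) return true;

    // Out of time: the caller would answer 503 at this point.
    const remaining = deadline - Date.now();
    if (remaining <= 0) return false;

    // Wait before polling again, never sleeping past the deadline.
    await new Promise((resolve) => setTimeout(resolve, Math.min(pollMs, remaining)));
  }
}
```

A caller would forward the request when the promise resolves to `true` and return 503 only when it resolves to `false`, that is, after the full wait has elapsed.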

### Aula 05 (Kubernetes - KEDA Auto-scaling)
```bash
cd aula-05
./setup.sh
```
Installs Victoria Metrics (metrics collection), KEDA (event-driven autoscaling), and NGINX Ingress. The ScaledObject monitors metrics like unavailable pods and restart counts, automatically scaling the deployment from 5 to 30 replicas based on demand.
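
To make the 5 to 30 bounds concrete, here is a simplified model of how a replica count could be derived from a metric and then clamped. It assumes an average-value style target, where the total metric is divided by a per-replica target. The function name and parameters are hypothetical, and the actual triggers and thresholds in the lesson's ScaledObject are not shown here.

```javascript
// Simplified model of metric-based scaling with fixed bounds.
// The 5 and 30 limits come from the text; everything else is an assumption.
const MIN_REPLICAS = 5;
const MAX_REPLICAS = 30;

function desiredReplicas(metricTotal, targetPerReplica) {
  if (!(targetPerReplica > 0)) {
    throw new RangeError('targetPerReplica must be positive');
  }
  // Enough replicas so that each one carries at most the target share.
  const raw = Math.ceil(metricTotal / targetPerReplica);
  // Never go below the minimum or above the maximum.
  return Math.min(MAX_REPLICAS, Math.max(MIN_REPLICAS, raw));
}
```

With a per-replica target of 10, a metric total of 120 would ask for 12 replicas, while totals of 0 or 1000 would be clamped to 5 and 30 respectively.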

### Aula 06 (Kubernetes - n8n via Helm)
```bash
cd aula-06
./setup.sh
```
Deploys n8n workflow automation platform via Helm chart with Queue Mode architecture: main node, workers (2-5 replicas with HPA), webhooks (1-3 replicas with HPA), PostgreSQL, and Redis. Access via http://n8n.localhost (requires NGINX Ingress).

### Aula 07 (Talos Linux - Custom Image)
Follow the instructions in `aula-07/README.md` to create a custom Talos Linux image on Hetzner Cloud using Talos Factory. This is a prerequisite for Aula 08.

### Aula 08 (OpenTofu - Talos Cluster on Hetzner Cloud)
```bash
cd aula-08
./setup.sh
```
Provisions a full HA Kubernetes cluster on Hetzner Cloud using OpenTofu:
- 3x Control Plane nodes (CAX11 ARM64)
- 1x Worker node (CAX11 ARM64)
- Private network, Floating IP, Firewall
- Cluster Autoscaler support (1-5 workers)
- Estimated cost: ~€18/month (base), up to ~€33/month with max autoscaling

Prerequisites:
- OpenTofu (`brew install opentofu`)
- talosctl (`brew install siderolabs/tap/talosctl`)
- kubectl
- Hetzner Cloud API token
- Talos image ID from Aula 07

Optional - Enable cluster autoscaling:
```bash
./install-autoscaler.sh
```
This installs the Kubernetes Cluster Autoscaler configured for Hetzner Cloud, automatically scaling workers from 1 to 5 based on pending pods.
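
The cost figures quoted for this lesson (~€18/month with 1 worker, ~€33/month at 5 workers) imply a rough per-worker increment. The sketch below interpolates between those two points, assuming cost grows linearly with each additional worker. That linearity, the function name, and the constant names are assumptions for illustration, not Hetzner pricing data.

```javascript
// Figures quoted in the text: ~18 EUR/month at 1 worker, ~33 EUR/month at 5 workers.
const BASE_COST_EUR = 18;
const MAX_COST_EUR = 33;
const MIN_WORKERS = 1;
const MAX_WORKERS = 5;

// Assumption: every autoscaled worker adds the same amount to the bill.
const COST_PER_EXTRA_WORKER_EUR =
  (MAX_COST_EUR - BASE_COST_EUR) / (MAX_WORKERS - MIN_WORKERS);

function estimateMonthlyCostEur(workers) {
  if (!Number.isInteger(workers) || workers < MIN_WORKERS || workers > MAX_WORKERS) {
    throw new RangeError(`workers must be an integer between ${MIN_WORKERS} and ${MAX_WORKERS}`);
  }
  return BASE_COST_EUR + (workers - MIN_WORKERS) * COST_PER_EXTRA_WORKER_EUR;
}
```

Under these assumptions each extra worker adds about €3.75/month, so a cluster that has scaled to 3 workers would cost roughly €25.50/month.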

To destroy the infrastructure: `./cleanup.sh`

## App Behavior

The Node.js app (`app.js`) is intentionally designed to:
1. Accept requests normally until `MAX_REQUESTS` is reached
2. Stop responding (hang) after the limit, simulating a hung process that is still running
3. Stop answering the `/health` endpoint as well once the app is "stuck"

This behavior demonstrates why process-level monitoring (`restart: always`) is insufficient: the process never exits, so the restart policy never triggers. Application-level health checks (liveness probes) are needed to detect the hang.
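
The three behaviors listed above can be approximated with a small request handler. This is a hypothetical sketch and not the repository's `app.js`. The name `createBuggyHandler` is invented for illustration, and whether `/health` calls count toward the limit is an assumption (here they do not).

```javascript
// Hypothetical sketch; the repository's app.js may be structured differently.
// Returns a Node.js request handler that hangs after `maxRequests` requests.
function createBuggyHandler(maxRequests) {
  let served = 0; // regular requests answered so far

  return function handler(req, res) {
    if (served >= maxRequests) {
      // "Stuck" state: never call res.end(), so every caller,
      // including the /health probe, waits until its own timeout.
      return;
    }
    if (req.url === '/health') {
      // Assumption: health checks do not count toward the limit.
      res.writeHead(200, { 'Content-Type': 'text/plain' });
      res.end('ok\n');
      return;
    }
    served += 1;
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(`request ${served} of ${maxRequests}\n`);
  };
}
```

Passing the handler to Node's built-in server, for example `require('http').createServer(createBuggyHandler(3)).listen(3000)`, would give a process that stays alive but stops answering after three requests, which is the situation a liveness probe is meant to catch.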

## Environment Variables

- `MAX_REQUESTS`: Number of requests before the app hangs (default: 3)
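
One way such a variable could be read is sketched below. `readMaxRequests` is a hypothetical helper, not taken from `app.js`, and the choice to fall back to the default for missing, non-numeric, or non-positive values is an assumption.

```javascript
// Hypothetical helper: reads MAX_REQUESTS from an environment-like object.
// Falls back to the documented default of 3 when the value is unusable.
function readMaxRequests(env) {
  const parsed = Number.parseInt(env.MAX_REQUESTS ?? '', 10);
  // Assumption: only positive integers are accepted as a limit.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 3;
}
```

In a Node.js process this would typically be called as `readMaxRequests(process.env)`.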