#!/bin/bash

############################################################
# Lesson 08 - OpenTofu + Talos + Hetzner Cloud
# Provisions a Talos Kubernetes cluster in HA
############################################################

set -e

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Script directory
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
cd "$SCRIPT_DIR"

# Logging functions
log_info() {
    echo -e "${BLUE}[INFO]${NC} $1"
}

log_success() {
    echo -e "${GREEN}[OK]${NC} $1"
}

log_warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}

log_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

############################################################
# PREREQUISITE CHECKS
############################################################

echo ""
echo "============================================"
echo "  Lesson 08 - Talos Cluster via OpenTofu"
echo "============================================"
echo ""

log_info "Checking prerequisites..."

# Check OpenTofu ("tofu version" already prints the product name)
if ! command -v tofu &> /dev/null; then
    log_error "OpenTofu not found!"
    echo ""
    echo "Install OpenTofu:"
    echo "  brew install opentofu            # macOS"
    echo "  snap install --classic opentofu  # Linux"
    echo ""
    echo "More info: https://opentofu.org/docs/intro/install/"
    exit 1
fi
log_success "$(tofu version | head -1)"

# Check talosctl
if ! command -v talosctl &> /dev/null; then
    log_error "talosctl not found!"
    echo ""
    echo "Install talosctl:"
    echo "  brew install siderolabs/tap/talosctl       # macOS"
    echo "  curl -sL https://talos.dev/install | sh    # Linux"
    echo ""
    exit 1
fi
# The client version is reported on the "Tag:" line
TALOSCTL_VERSION=$(talosctl version --client 2>/dev/null | awk '/Tag:/ {print $2; exit}')
log_success "talosctl ${TALOSCTL_VERSION:-installed}"

# Check kubectl
if ! command -v kubectl &> /dev/null; then
    log_error "kubectl not found!"
    echo ""
    echo "Install kubectl:"
    echo "  brew install kubectl            # macOS"
    echo "  snap install kubectl --classic  # Linux"
    echo ""
    exit 1
fi
KUBECTL_VERSION=$(kubectl version --client -o yaml 2>/dev/null | awk '/gitVersion/ {print $2; exit}')
log_success "kubectl ${KUBECTL_VERSION:-installed}"

# Check Helm
if ! command -v helm &> /dev/null; then
    log_error "Helm not found!"
    echo ""
    echo "Install Helm:"
    echo "  brew install helm            # macOS"
    echo "  snap install helm --classic  # Linux"
    echo ""
    exit 1
fi
log_success "Helm $(helm version --short 2>/dev/null | head -1)"

# Check hcloud CLI (optional, but useful)
if command -v hcloud &> /dev/null; then
    log_success "hcloud CLI installed"
else
    log_warn "hcloud CLI not installed (optional)"
    echo "  To list images: brew install hcloud"
fi

echo ""

############################################################
# CREDENTIAL COLLECTION
############################################################

SKIP_CREDENTIALS=false

# Check whether terraform.tfvars already exists
if [ -f "terraform.tfvars" ]; then
    log_warn "terraform.tfvars already exists!"
    # Accepts "y" (yes) as well as the original "s" (sim)
    read -rp "Overwrite it? (y/N): " overwrite
    if [[ ! "$overwrite" =~ ^[YySs]$ ]]; then
        log_info "Using existing terraform.tfvars"
        SKIP_CREDENTIALS=true
    fi
fi

if [ "$SKIP_CREDENTIALS" != "true" ]; then
    echo "============================================"
    echo "  Credential Configuration"
    echo "============================================"
    echo ""

    # Hetzner token
    echo "1. Hetzner Cloud API token"
    echo "   Get it at: https://console.hetzner.cloud/projects/*/security/tokens"
    echo ""
    read -rsp "   Enter the token: " HCLOUD_TOKEN
    echo ""

    if [ -z "$HCLOUD_TOKEN" ]; then
        log_error "Token cannot be empty!"
        exit 1
    fi
    log_success "Token configured"
    echo ""

    # SSH key
    echo "2. Public SSH key"
    DEFAULT_SSH_KEY="$HOME/.ssh/id_rsa.pub"
    if [ -f "$DEFAULT_SSH_KEY" ]; then
        echo "   Found: $DEFAULT_SSH_KEY"
        read -rp "   Use this key? (Y/n): " use_default
        if [[ ! "$use_default" =~ ^[Nn]$ ]]; then
            SSH_PUBLIC_KEY=$(cat "$DEFAULT_SSH_KEY")
        fi
    fi

    if [ -z "$SSH_PUBLIC_KEY" ]; then
        read -rp "   Path to the public key: " ssh_path
        if [ -f "$ssh_path" ]; then
            SSH_PUBLIC_KEY=$(cat "$ssh_path")
        else
            log_error "File not found: $ssh_path"
            exit 1
        fi
    fi
    log_success "SSH key configured"
    echo ""

    # Talos image ID
    echo "3. Talos image ID (snapshot from aula-07)"
    echo "   To list: hcloud image list --type snapshot"
    echo ""
    read -rp "   Enter the image ID: " TALOS_IMAGE_ID

    if [ -z "$TALOS_IMAGE_ID" ]; then
        log_error "Image ID cannot be empty!"
        exit 1
    fi

    # Validate that it is a number
    if ! [[ "$TALOS_IMAGE_ID" =~ ^[0-9]+$ ]]; then
        log_error "ID must be a number!"
        exit 1
    fi
    log_success "Image ID: $TALOS_IMAGE_ID"
    echo ""

    # Cluster configuration
    echo "============================================"
    echo "  Cluster Configuration"
    echo "============================================"
    echo ""

    # HA cluster?
    echo "4. High Availability (HA) mode"
    echo ""
    echo "   HA     = 3 Control Planes (fault tolerance)"
    echo "   Single = 1 Control Plane (lower cost)"
    echo ""
    read -rp "   HA cluster? (Y/n): " enable_ha

    if [[ "$enable_ha" =~ ^[Nn]$ ]]; then
        ENABLE_HA="false"
        ENABLE_LB="false"
        log_info "Single mode: 1 Control Plane"
    else
        ENABLE_HA="true"
        log_success "HA mode: 3 Control Planes"
        echo ""

        # LoadBalancer?
        echo "5. LoadBalancer for the Control Plane"
        echo ""
        echo "   With LB:    real HA (any CP can go down)"
        echo "   Without LB: Floating IP (if CP-0 goes down, the cluster is unreachable)"
        echo ""
        echo "   The LoadBalancer is also used for:"
        echo "   - HTTP/HTTPS (NGINX Ingress)"
        echo "   - SSH (Gitea)"
        echo "   - Talos API"
        echo ""
        echo "   Additional cost: ~\$6/month"
        echo ""
        read -rp "   Use LoadBalancer? (Y/n): " enable_lb

        if [[ "$enable_lb" =~ ^[Nn]$ ]]; then
            ENABLE_LB="false"
            log_info "Using Floating IP (no real HA for the CP)"
        else
            ENABLE_LB="true"
            log_success "LoadBalancer enabled"
        fi
    fi
    echo ""

    # Create terraform.tfvars
    log_info "Creating terraform.tfvars..."
    cat > terraform.tfvars << EOF
# Generated automatically by setup.sh
# $(date)

hcloud_token        = "$HCLOUD_TOKEN"
ssh_public_key      = "$SSH_PUBLIC_KEY"
talos_image_id      = $TALOS_IMAGE_ID

environment         = "prod"
enable_ha           = $ENABLE_HA
enable_loadbalancer = $ENABLE_LB
EOF
    # The file contains the API token, so restrict it to the current user
    chmod 600 terraform.tfvars

    log_success "terraform.tfvars created"
fi

echo ""

############################################################
# OPENTOFU INITIALIZATION
############################################################

echo "============================================"
echo "  Initializing OpenTofu"
echo "============================================"
echo ""

log_info "Running tofu init..."
tofu init

log_success "OpenTofu initialized"
echo ""

############################################################
# PLANNING
############################################################

echo "============================================"
echo "  Planning Infrastructure"
echo "============================================"
echo ""

log_info "Running tofu plan..."
tofu plan -out=tfplan

echo ""
log_success "Plan created!"
echo ""

# Show a summary based on the configuration
echo "============================================"
echo "  Resources to be created:"
echo "============================================"
echo ""

# Read the configuration from tfvars (default to "true" when not found)
ENABLE_HA_CONFIG=$(grep 'enable_ha' terraform.tfvars 2>/dev/null | grep -Eo 'true|false' || echo "true")
ENABLE_LB_CONFIG=$(grep 'enable_loadbalancer' terraform.tfvars 2>/dev/null | grep -Eo 'true|false' || echo "true")

if [ "$ENABLE_HA_CONFIG" = "true" ]; then
    CP_COUNT=3
    echo "  - 3x CAX11 Control Planes  = 3 x \$4.59 = \$13.77"
else
    CP_COUNT=1
    echo "  - 1x CAX11 Control Plane   = 1 x \$4.59 = \$4.59"
fi

echo "  - 1x CAX11 Worker          = 1 x \$4.59 = \$4.59"

if [ "$ENABLE_LB_CONFIG" = "true" ]; then
    echo "  - 1x Load Balancer LB11    = \$5.99"
    echo "  - Network/Firewall/Placement = Free"
    LB_COST=5.99
else
    echo "  - 1x Floating IPv4         = \$3.29"
    echo "  - Network/Firewall/Placement = Free"
    LB_COST=3.29
fi

# awk instead of bc: bc is not among the checked prerequisites
TOTAL_COST=$(awk -v n="$CP_COUNT" -v lb="$LB_COST" 'BEGIN { printf "%.2f", (n + 1) * 4.59 + lb }')
echo ""
echo "  Estimated cost: ~\$${TOTAL_COST}/month"
echo ""

############################################################
# APPLY
############################################################

# Accepts "y" (yes) as well as the original "s" (sim)
read -rp "Apply the plan? (y/N): " apply

if [[ ! "$apply" =~ ^[YySs]$ ]]; then
    log_warn "Operation cancelled by the user"
    echo ""
    echo "To apply manually:"
    echo "  tofu apply tfplan"
    echo ""
    exit 0
fi

echo ""
log_info "Applying infrastructure..."
echo ""

tofu apply tfplan

echo ""
log_success "Infrastructure provisioned!"
echo ""

############################################################
# POST-DEPLOY CONFIGURATION
############################################################

echo "============================================"
echo "  Post-Deploy Configuration"
echo "============================================"
echo ""

# Wait for the cluster to become ready
log_info "Waiting for the Talos cluster to become ready..."
sleep 10

# Configure talosctl
if [ -f "talosconfig" ]; then
    log_info "Configuring talosctl..."
    export TALOSCONFIG="$SCRIPT_DIR/talosconfig"

    # Get the control plane IP
    CP_IP=$(tofu output -raw control_plane_ip 2>/dev/null || echo "")

    if [ -n "$CP_IP" ]; then
        log_info "Waiting for the Talos API at $CP_IP..."

        # Try the health check (may take a few minutes)
        for i in {1..30}; do
            if talosctl --talosconfig talosconfig -n "$CP_IP" health --wait-timeout 10s 2>/dev/null; then
                log_success "Talos cluster is healthy!"
                break
            fi
            echo -n "."
            sleep 10
        done
        echo ""
    fi
fi

# Configure kubectl
if [ -f "kubeconfig" ]; then
    log_info "Configuring kubectl..."
    export KUBECONFIG="$SCRIPT_DIR/kubeconfig"

    log_info "Waiting for nodes to become Ready..."
    for i in {1..30}; do
        # -w matches "Ready" as a whole word, so "NotReady" does not count
        if kubectl get nodes 2>/dev/null | grep -qw "Ready"; then
            log_success "Nodes ready!"
            kubectl get nodes
            break
        fi
        echo -n "."
        sleep 10
    done
    echo ""
fi

echo ""

############################################################
# CCM (Cloud Controller Manager) INSTALLATION
############################################################

echo "============================================"
echo "  Installing Hetzner Cloud Controller Manager"
echo "============================================"
echo ""

# Get the token from terraform.tfvars
HCLOUD_TOKEN=$(grep 'hcloud_token' terraform.tfvars | cut -d'"' -f2)
NETWORK_ID=$(tofu output -raw network_id 2>/dev/null || echo "")

if [ -z "$HCLOUD_TOKEN" ]; then
    log_error "Could not obtain HCLOUD_TOKEN!"
    exit 1
fi

# Create the secret for the CCM
log_info "Creating hcloud secret..."
# An array keeps each argument intact without relying on word splitting
SECRET_ARGS=(--from-literal=token="$HCLOUD_TOKEN")
if [ -n "$NETWORK_ID" ]; then
    SECRET_ARGS+=(--from-literal=network="$NETWORK_ID")
fi

kubectl create secret generic hcloud \
    "${SECRET_ARGS[@]}" \
    -n kube-system \
    --dry-run=client -o yaml | kubectl apply -f -

log_success "Secret created"

# Install the CCM via Helm
log_info "Installing CCM via Helm..."
helm repo add hcloud https://charts.hetzner.cloud 2>/dev/null || true
helm repo update hcloud

# An array keeps each argument intact without relying on word splitting
HELM_ARGS=(--set networking.enabled=true)
HELM_ARGS+=(--set networking.clusterCIDR=10.244.0.0/16)
if [ -n "$NETWORK_ID" ]; then
    HELM_ARGS+=(--set "networking.network.id=$NETWORK_ID")
fi

helm upgrade --install hccm hcloud/hcloud-cloud-controller-manager \
    -n kube-system \
    "${HELM_ARGS[@]}" \
    --wait

log_success "CCM installed!"

# Wait for the taint to be removed from the workers
log_info "Waiting for the CCM to initialize the workers..."
for i in {1..30}; do
    if ! kubectl get nodes -o jsonpath='{.items[*].spec.taints[*].key}' 2>/dev/null | grep -q "node.cloudprovider.kubernetes.io/uninitialized"; then
        log_success "Workers initialized!"
        break
    fi
    echo -n "."
    sleep 5
done
echo ""

############################################################
# CLUSTER AUTOSCALER INSTALLATION
############################################################

echo ""
echo "============================================"
echo "  Installing Cluster Autoscaler"
echo "============================================"
echo ""

# Get the settings from OpenTofu
# ("|| echo" keeps set -e from aborting before the check below can report the error)
log_info "Getting settings from OpenTofu..."

WORKER_CONFIG_BASE64=$(tofu output -raw autoscaler_worker_config 2>/dev/null || echo "")
TALOS_IMAGE_ID=$(tofu output -raw autoscaler_image_id 2>/dev/null || echo "")
CLUSTER_NAME=$(tofu output -raw cluster_name 2>/dev/null || echo "")
FIREWALL_ID=$(tofu output -raw firewall_id 2>/dev/null || echo "")
SSH_KEY_NAME=$(tofu output -raw ssh_key_name 2>/dev/null || echo "")

if [ -z "$WORKER_CONFIG_BASE64" ]; then
    log_error "Could not obtain the worker configuration!"
    exit 1
fi

log_success "Settings obtained"
echo "  - Cluster: $CLUSTER_NAME"
echo "  - Image ID: $TALOS_IMAGE_ID"
echo "  - Network ID: $NETWORK_ID"
echo ""

# Create the namespace
log_info "Creating namespace cluster-autoscaler..."
kubectl create namespace cluster-autoscaler --dry-run=client -o yaml | kubectl apply -f -
kubectl label namespace cluster-autoscaler pod-security.kubernetes.io/enforce=privileged --overwrite

# Create the secret
log_info "Creating autoscaler secret..."
kubectl create secret generic hcloud-autoscaler \
    --namespace cluster-autoscaler \
    --from-literal=token="$HCLOUD_TOKEN" \
    --from-literal=cloud-init="$WORKER_CONFIG_BASE64" \
    --dry-run=client -o yaml | kubectl apply -f -

log_success "Secret created"

# Apply the manifest, substituting the placeholders
log_info "Applying cluster-autoscaler manifest..."
sed \
    -e "s|\${TALOS_IMAGE_ID}|$TALOS_IMAGE_ID|g" \
    -e "s|\${NETWORK_NAME}|$CLUSTER_NAME-network|g" \
    -e "s|\${FIREWALL_NAME}|$CLUSTER_NAME-firewall|g" \
    -e "s|\${SSH_KEY_NAME}|$SSH_KEY_NAME|g" \
    cluster-autoscaler.yaml | kubectl apply -f -

# Wait for the pod to become ready
log_info "Waiting for the autoscaler pod..."
kubectl wait --for=condition=ready pod \
    -l app=cluster-autoscaler \
    -n cluster-autoscaler \
    --timeout=120s

log_success "Cluster Autoscaler installed!"

echo ""

############################################################
# HETZNER CSI DRIVER INSTALLATION
############################################################

echo "============================================"
echo "  Installing Hetzner CSI Driver"
echo "============================================"
echo ""

log_info "Installing CSI Driver via Helm..."

helm upgrade --install hcloud-csi hcloud/hcloud-csi \
    -n kube-system \
    -f "$SCRIPT_DIR/hcloud-csi-values.yaml" \
    --wait \
    --timeout 5m

log_success "Hetzner CSI Driver installed!"

# Check the StorageClass
log_info "Checking StorageClass..."
kubectl get storageclass hcloud-volumes

# Configure a PDB for StatefulSets (protection during drain)
log_info "Creating PodDisruptionBudget for StatefulSets..."
kubectl apply -f "$SCRIPT_DIR/statefulset-pdb.yaml"
log_success "PDB created"

echo ""

############################################################
# NGINX INGRESS CONTROLLER INSTALLATION
############################################################

echo "============================================"
echo "  Installing NGINX Ingress Controller"
echo "============================================"
echo ""

# Detect the cluster location for the LoadBalancer
CLUSTER_LOCATION=$(kubectl get nodes -o jsonpath='{.items[0].metadata.labels.topology\.kubernetes\.io/zone}' 2>/dev/null | cut -d'-' -f1)
if [ -z "$CLUSTER_LOCATION" ]; then
    CLUSTER_LOCATION="nbg1" # Default: Nuremberg
fi
log_info "Cluster location: $CLUSTER_LOCATION"

log_info "Installing NGINX Ingress via Helm..."

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx 2>/dev/null || true
helm repo update ingress-nginx

helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
    --namespace ingress-nginx \
    --create-namespace \
    --set controller.allowSnippetAnnotations=true \
    --set controller.config.annotations-risk-level=Critical \
    --set controller.admissionWebhooks.enabled=false \
    --set "controller.service.annotations.load-balancer\.hetzner\.cloud/location=${CLUSTER_LOCATION}" \
    --set "controller.service.annotations.load-balancer\.hetzner\.cloud/use-private-ip=true" \
    --wait --timeout 5m

log_success "NGINX Ingress Controller installed!"

# Wait for the LoadBalancer to get an IP
log_info "Waiting for the LoadBalancer to get an external IP..."
for i in {1..30}; do
    # "|| echo" keeps set -e from aborting the script if the lookup fails
    LB_IP=$(kubectl get svc -n ingress-nginx ingress-nginx-controller \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null || echo "")
    if [ -n "$LB_IP" ]; then
        log_success "LoadBalancer IP: $LB_IP"
        break
    fi
    echo -n "."
    sleep 5
done
echo ""

############################################################
# FINAL SUMMARY
############################################################

echo "============================================"
echo "  Cluster Provisioned Successfully!"
echo "============================================"
echo ""

# Show outputs
echo "Endpoints:"
tofu output -raw kubernetes_api_endpoint 2>/dev/null && echo "" || true
tofu output -raw talos_api_endpoint 2>/dev/null && echo "" || true

echo ""
echo "Installed components:"
echo "  - Hetzner Cloud Controller Manager (CCM)"
echo "  - Cluster Autoscaler (1-5 workers)"
echo "  - Hetzner CSI Driver (StorageClass: hcloud-volumes)"
echo "  - NGINX Ingress Controller + LoadBalancer"
echo ""

# Show the LoadBalancer IP ("pending" when none has been assigned yet)
LB_IP=$(kubectl get svc -n ingress-nginx ingress-nginx-controller \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null || echo "")
echo "LoadBalancer IP: ${LB_IP:-pending}"
echo ""

echo "Generated files:"
echo "  - kubeconfig  : kubectl configuration"
echo "  - talosconfig : talosctl configuration"
echo ""
echo "Useful commands:"
echo ""
echo "  # Use kubectl with this cluster"
echo "  export KUBECONFIG=$SCRIPT_DIR/kubeconfig"
echo "  kubectl get nodes"
echo ""
echo "  # View the autoscaler logs"
echo "  kubectl logs -n cluster-autoscaler -l app=cluster-autoscaler -f"
echo ""
echo "  # Destroy the infrastructure (CAREFUL!)"
echo "  ./cleanup.sh"
echo ""

log_success "Setup complete!"
echo ""