# Useful PromQL Queries

Ready-to-use queries for Grafana or for direct use against the VictoriaMetrics API.

## How to use

### Via Grafana

1. Open Grafana → Explore
2. Select the "VictoriaMetrics" datasource
3. Paste the query into the editor

### Via API

```bash
# Port-forward
kubectl port-forward -n monitoring svc/vmsingle-vm-victoria-metrics-k8s-stack 8429:8429

# Query
curl "http://localhost:8429/api/v1/query?query=up"
```

---

## Storage / PVC

### PVC usage (%)

```promql
kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes * 100
```

### PVCs above 80%

```promql
(kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes) > 0.8
```

### Available space per PVC (bytes)

```promql
kubelet_volume_stats_available_bytes
```

### Available space per PVC (GiB)

```promql
kubelet_volume_stats_available_bytes / 1024 / 1024 / 1024
```

### Free inodes (%)

```promql
kubelet_volume_stats_inodes_free / kubelet_volume_stats_inodes * 100
```

### PVCs predicted to fill up within 24h

```promql
predict_linear(kubelet_volume_stats_available_bytes[6h], 24 * 3600) < 0
```

---

## CPU

### CPU per pod (cores)

```promql
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod, namespace)
```

### CPU per namespace (cores)

```promql
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (namespace)
```

### CPU per node (%)

```promql
100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```

### Top 10 pods by CPU

```promql
topk(10, sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod, namespace))
```

### CPU usage vs request

```promql
sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (pod, namespace)
/
sum(kube_pod_container_resource_requests{resource="cpu"}) by (pod, namespace)
```

---

## Memory

### Memory per pod (bytes)

```promql
sum(container_memory_working_set_bytes{container!=""}) by (pod, namespace)
```

### Memory per namespace (GiB)

```promql
sum(container_memory_working_set_bytes{container!=""}) by (namespace) / 1024 / 1024 / 1024
```

### Available memory per node (%)

```promql
(node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100
```

### Top 10 pods by memory

```promql
topk(10, sum(container_memory_working_set_bytes{container!=""}) by (pod, namespace))
```

### Memory usage vs limit

```promql
sum(container_memory_working_set_bytes{container!=""}) by (pod, namespace)
/
sum(kube_pod_container_resource_limits{resource="memory"}) by (pod, namespace)
```

---

## Pods and Containers

### Pods that restarted in the last hour

```promql
sum(increase(kube_pod_container_status_restarts_total[1h])) by (pod, namespace) > 0
```

### Pods not Ready

```promql
kube_pod_status_ready{condition="false"} == 1
```

### Pods in CrashLoopBackOff

```promql
kube_pod_container_status_waiting_reason{reason="CrashLoopBackOff"} == 1
```

### Pending pods

```promql
kube_pod_status_phase{phase="Pending"} == 1
```

### OOMKilled containers

```promql
kube_pod_container_status_last_terminated_reason{reason="OOMKilled"} == 1
```

### Total pods per namespace

```promql
sum(kube_pod_info) by (namespace)
```

### Pods per node

```promql
sum(kube_pod_info) by (node)
```

---

## Deployments

### Deployments with unavailable replicas

```promql
kube_deployment_status_replicas_unavailable > 0
```

### Deployments not yet reconciled

```promql
kube_deployment_status_observed_generation != kube_deployment_metadata_generation
```

### Ratio of available replicas

```promql
kube_deployment_status_replicas_available / kube_deployment_spec_replicas
```

---

## Network

### Bytes received per pod (rate)

```promql
sum(rate(container_network_receive_bytes_total[5m])) by (pod, namespace)
```

### Bytes sent per pod (rate)

```promql
sum(rate(container_network_transmit_bytes_total[5m])) by (pod, namespace)
```

### Receive errors per interface

```promql
sum(rate(node_network_receive_errs_total[5m])) by (instance, device)
```

### Established TCP connections

```promql
node_netstat_Tcp_CurrEstab
```

---

## Nodes

### Nodes not Ready

```promql
kube_node_status_condition{condition="Ready",status="true"} == 0
```

### Memory pressure

```promql
kube_node_status_condition{condition="MemoryPressure",status="true"} == 1
```

### Disk pressure

```promql
kube_node_status_condition{condition="DiskPressure",status="true"} == 1
```

### Available disk per node (%)

```promql
(node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) * 100
```

### Load average (1 min)

```promql
node_load1
```

---

## Cluster Overview

### Total Running pods

```promql
sum(kube_pod_status_phase{phase="Running"})
```

### Total namespaces

```promql
count(kube_namespace_created)
```

### Total deployments

```promql
count(kube_deployment_created)
```

### Total PVCs

```promql
count(kube_persistentvolumeclaim_info)
```

### Cluster age (days)

```promql
(time() - min(kube_namespace_created{namespace="kube-system"})) / 86400
```

---

## VictoriaMetrics

### Series being collected (per job)

```promql
count by (job) ({__name__!=""})
```

### Ingestion rate

```promql
sum(rate(vm_rows_inserted_total[5m]))
```

### VM disk usage

```promql
vm_data_size_bytes
```

### Queries per second

```promql
sum(rate(vm_http_requests_total{path="/api/v1/query"}[5m]))
```

---

## Tips

### Filter by namespace

```promql
# Add {namespace="my-namespace"} to any selector
sum(container_memory_working_set_bytes{namespace="gitlab"}) by (pod)
```

### Exclude system namespaces

```promql
# Matcher fragment: add it to the selector of an existing query
{namespace!~"kube-system|argocd|monitoring|gitlab"}
```

### Aggregate by label

```promql
# Assumes kube-state-metrics is configured to expose the pod label "app" as label_app
sum by (label_app) (kube_pod_labels)
```

### Sort results

```promql
sort_desc(sum(container_memory_working_set_bytes) by (namespace))
```

### Top N

```promql
topk(5, sum(rate(container_cpu_usage_seconds_total[5m])) by (pod))
```

### Value in the past (offset)

```promql
# Value from 1 hour ago
container_memory_working_set_bytes offset 1h
```
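
### Send any query through the API

Most queries in this document contain braces, quotes and spaces, so they cannot be pasted into a URL the way `up` can. The sketch below is one way to handle that, assuming the port-forward from "Via API" is active on `localhost:8429`. The helper name `vm_query` and the `VM_URL` variable are hypothetical, introduced only for illustration; the `curl` flags `-s`, `-G` and `--data-urlencode` are standard and leave the URL encoding to `curl`.

```bash
# Assumption: the port-forward from "Via API" is running; override VM_URL otherwise.
VM_URL="${VM_URL:-http://localhost:8429}"

# vm_query is a hypothetical helper, not part of kubectl or VictoriaMetrics.
# -G sends the data as URL query parameters instead of a POST body, and
# --data-urlencode escapes braces, quotes and spaces in the PromQL expression.
vm_query() {
  curl -sG "${VM_URL}/api/v1/query" --data-urlencode "query=$1"
}

# Example call (prints the JSON response from the API):
#   vm_query 'sum(rate(container_cpu_usage_seconds_total{container!=""}[5m])) by (namespace)'
```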

---

## References

- [PromQL Cheat Sheet](https://promlabs.com/promql-cheat-sheet/)
- [VictoriaMetrics MetricsQL](https://docs.victoriametrics.com/metricsql/)
- [Grafana Dashboards](https://grafana.com/grafana/dashboards/)