
[k8s] Fixing Prometheus "Error scraping target" (connect: connection refused) (controller-manager, etcd, proxy, scheduler)

by study4me 2025. 2. 13.

🔹 Environment

  • Master nodes: 3
  • Worker nodes: 10
  • kube-prometheus-stack version: 67.5.0

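🔹 Background

What is a Static Pod?

A static Pod is managed directly by the kubelet and, unlike a regular Pod, runs regardless of the API server.
It is not scheduled through the Kubernetes API server; instead, the kubelet on each node starts it from local manifest files (/etc/kubernetes/manifests/).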

kube-apiserver.yaml
kube-controller-manager.yaml
kube-scheduler.yaml
etcd.yaml

Editing a YAML file under /etc/kubernetes/manifests/ takes effect right away, because the kubelet watches that directory.

 

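When I edited a YAML file, the Pod was terminated and a new one was created.
Pods are created on each master node from that node's YAML.

So with 3 master nodes, there are 3 Pods each for the API server, controller manager, scheduler, and etcd.
Each Pod is created from the YAML on the node where it runs.

In other words, editing kube-apiserver.yaml only on master node 1 reportedly recreates the Pod, and applies the change, only on node 1, so the same edit has to be made on every master node.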
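To see which static Pods a given node should be running, the manifest directory can be compared with the Pods in kube-system. The kubelet publishes each static Pod to the API server as a mirror Pod named after the Pod plus the node name. The sketch below is only an illustration: the function name expected_mirror_pods is made up here, and it assumes each manifest file name matches its metadata.name, as in the four files listed above.

```shell
# Print the mirror Pod names expected for the static Pod manifests on one node.
# Assumption: each manifest file name (without .yaml) equals its metadata.name,
# as with the control-plane manifests listed above.
expected_mirror_pods() {
  manifest_dir="$1"   # e.g. /etc/kubernetes/manifests
  node_name="$2"      # the node name as shown by `kubectl get nodes`
  for f in "$manifest_dir"/*.yaml; do
    [ -e "$f" ] || continue   # skip when the directory has no manifests
    printf '%s-%s\n' "$(basename "$f" .yaml)" "$node_name"
  done
}
```

On a master node this could be run as `expected_mirror_pods /etc/kubernetes/manifests "$(hostname)"` and the result compared with `kubectl get pod -n kube-system -o wide`, assuming the node name equals the hostname.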

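🔹 The Problem

After deploying kube-prometheus-stack with helm, I opened Target health in Prometheus and found several targets marked Down.
The error was connect: connection refused, meaning Prometheus could not open a connection to the target at all.
Because Prometheus could not scrape these targets, Grafana showed No Data.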

# Targets: master nodes
serviceMonitor/monitoring/kube-prometheus-stack-kube-controller-manager/0
Error scraping target: Get "https://master-node-01-private-ip:10257/metrics": dial tcp master-node-01-private-ip:10257: connect: connection refused
Error scraping target: Get "https://master-node-02-private-ip:10257/metrics": dial tcp master-node-02-private-ip:10257: connect: connection refused
Error scraping target: Get "https://master-node-03-private-ip:10257/metrics": dial tcp master-node-03-private-ip:10257: connect: connection refused
# Targets: master nodes
serviceMonitor/monitoring/kube-prometheus-stack-kube-etcd/0
Error scraping target: Get "https://master-node-01-private-ip:2381/metrics": dial tcp master-node-01-private-ip:2381: connect: connection refused
Error scraping target: Get "https://master-node-02-private-ip:2381/metrics": dial tcp master-node-02-private-ip:2381: connect: connection refused
Error scraping target: Get "https://master-node-03-private-ip:2381/metrics": dial tcp master-node-03-private-ip:2381: connect: connection refused
# Targets: all nodes
serviceMonitor/monitoring/kube-prometheus-stack-kube-proxy/0
Error scraping target: Get "https://master-node-01-private-ip:10249/metrics": dial tcp master-node-01-private-ip:10249: connect: connection refused
Error scraping target: Get "https://master-node-02-private-ip:10249/metrics": dial tcp master-node-02-private-ip:10249: connect: connection refused
...
Error scraping target: Get "https://worker-node-09-private-ip:10249/metrics": dial tcp worker-node-09-private-ip:10249: connect: connection refused
Error scraping target: Get "https://worker-node-10-private-ip:10249/metrics": dial tcp worker-node-10-private-ip:10249: connect: connection refused
# Targets: master nodes
serviceMonitor/monitoring/kube-prometheus-stack-kube-scheduler/0
Error scraping target: Get "https://master-node-01-private-ip:10259/metrics": dial tcp master-node-01-private-ip:10259: connect: connection refused
Error scraping target: Get "https://master-node-02-private-ip:10259/metrics": dial tcp master-node-02-private-ip:10259: connect: connection refused
Error scraping target: Get "https://master-node-03-private-ip:10259/metrics": dial tcp master-node-03-private-ip:10259: connect: connection refused
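With 13 nodes the error list gets long, so it can help to group the refused connections by port: 10257 is the controller-manager, 10259 the scheduler, 10249 the kube-proxy metrics port, and 2381 the etcd metrics port. The filter below is a minimal sketch; refused_ports is a name made up for this post, and it assumes the error lines have been copied from the Prometheus UI into a text file or pipe.

```shell
# Read scrape errors on stdin and print "<port> <count>" for every port
# that answered with "connection refused", sorted by port number.
refused_ports() {
  grep 'connect: connection refused' |
    sed -n 's/.*dial tcp [^ ]*:\([0-9][0-9]*\): connect: connection refused.*/\1/p' |
    sort -n | uniq -c | awk '{ print $2, $1 }'
}
```

For example, `refused_ports < errors.txt` (errors.txt being a hypothetical file holding the pasted lines) prints one line per port, which shows at a glance that only these four components are affected.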

 

 

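🔹 Solution

On inspection, the default Kubernetes configuration for each of these components only accepted connections on 127.0.0.1.

I was told the bind address cannot take multiple values, so I opened it up with 0.0.0.0.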
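Before changing anything, the current bind address can be confirmed on a node from the list of listening sockets. The helper below is a rough sketch: listen_addrs is a made-up name, and it assumes the column layout printed by `ss -lnt`, where the first field is the state and the fourth is the local address and port.

```shell
# Read `ss -lnt` output on stdin and print the local address:port entries
# that are listening on the given port.
listen_addrs() {
  port="$1"
  awk -v p="$port" '$1 == "LISTEN" && $4 ~ (":" p "$") { print $4 }'
}
```

Running `ss -lnt | listen_addrs 10257` on a master node should print 127.0.0.1:10257 while the default is in place. After the change, the same command should show a wildcard address instead.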

# [Proxy]
kubectl edit cm kube-proxy -n kube-system -o yaml
---------------------------------------------------------------
metricsBindAddress: 0.0.0.0:10249    # changed
---------------------------------------------------------------
# kube-proxy only reads the ConfigMap at startup, so recreate its Pods
kubectl delete pod -l k8s-app=kube-proxy -n kube-system
kubectl get pod -n kube-system |grep proxy



# [Controller-manager] (repeat on every master node)
vi /etc/kubernetes/manifests/kube-controller-manager.yaml
---------------------------------------------------------------
    - --bind-address=0.0.0.0   # changed
---------------------------------------------------------------
# the kubelet recreates the static Pod on its own; check that it came back
kubectl get pod -n kube-system |grep controller



# [Scheduler] (repeat on every master node)
vi /etc/kubernetes/manifests/kube-scheduler.yaml
---------------------------------------------------------------
    - --bind-address=0.0.0.0   # changed
---------------------------------------------------------------
# the kubelet recreates the static Pod on its own; check that it came back
kubectl get pod -n kube-system |grep scheduler



# [etcd] (repeat on every master node)
vi /etc/kubernetes/manifests/etcd.yaml
---------------------------------------------------------------
    - --listen-client-urls=https://127.0.0.1:2379,https://10.0.4.180:2379  # unchanged, was already there
    - --listen-metrics-urls=http://127.0.0.1:2381,http://[this master node's IP]:2381   # append the second URL, or add http://0.0.0.0:2381 instead
---------------------------------------------------------------
# the kubelet recreates the static Pod on its own; check that it came back
kubectl get pod -n kube-system |grep etcd
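Once the Pods are back, each endpoint can be probed from another node to confirm the port is now reachable. The helper below is a sketch, not part of the original setup: metrics_url is a made-up name, and the scheme per component is my assumption (HTTPS for the secure ports 10257 and 10259, plain HTTP for 10249 and for the http:// metrics URL configured for etcd above).

```shell
# Print the metrics URL to probe for one component on one host.
# Assumption: HTTPS on the secure ports 10257/10259, plain HTTP on 10249/2381.
metrics_url() {
  component="$1"   # controller-manager | scheduler | proxy | etcd
  host="$2"        # node IP or hostname
  case "$component" in
    controller-manager) printf 'https://%s:10257/metrics\n' "$host" ;;
    scheduler)          printf 'https://%s:10259/metrics\n' "$host" ;;
    proxy)              printf 'http://%s:10249/metrics\n' "$host" ;;
    etcd)               printf 'http://%s:2381/metrics\n' "$host" ;;
    *) echo "unknown component: $component" >&2; return 1 ;;
  esac
}
```

For example, `curl -sk -o /dev/null -w '%{http_code}\n' "$(metrics_url scheduler 10.0.4.180)"` prints an HTTP status code if the port answers. Any status, even a 401 or 403 for a request without a token, means the connection is no longer refused. The Prometheus Target health page should then show the targets as Up after the next scrape.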

 
