Kubernetes CoreDNS 알아보기

By Bys on June 27, 2023

CoreDNS

CoreDNS는 쿠버네티스 클러스터의 DNS 역할을 수행할 수 있는, 유연하고 확장 가능한 DNS 서버이다.

CoreDNS is a DNS server. CoreDNS integrates with Kubernetes via the Kubernetes plugin, or with etcd with the etcd plugin. All major cloud providers have plugins too

1. DNS Policy

DNS Policy

kubelet은 spec.dnsPolicy를 확인하여 파드 내 /etc/resolv.conf 내용을 업데이트한다.

파드의 Spec을 살펴보면 dnsPolicy설정이 존재한다. 설정된 DNS Policy 정책에 따라 동작방식이 다르므로 각 특징을 먼저 살펴본다.

ClusterFirst
- DNS Policy의 Default 설정이다. Kubernetes에 존재하는 CoreDNS로 DNS resolve 요청을 하며 만약 일치하는 도메인이 없는 경우 upstream nameserver로 요청을 전달한다.
  
  Any DNS query that does not match the configured cluster domain suffix, such as “www.kubernetes.io”, is forwarded to an upstream nameserver by the DNS server
Default
- Default 설정은 파드의 DNS resolve 설정을 노드로 부터 상속 받는 방법이다.
  
  The Pod inherits the name resolution configuration from the node that the Pods run.
ClusterFirstWithHostNet
- hostNetwork설정이 true인 경우 파드는 노드의 IP를 사용하게 되며 DNS 정책에 명시적으로 ClusterFirstWithHostNet 설정을 하지 않으면 Default 설정과 같이 노드로 부터 name resolution 설정을 상속 받는다. 명시적으로 ClusterFirstWithHostNet 설정을 해야지만 ClusterFirst와 같은 동작을 한다.
  
  For Pods running with hostNetwork, you should explicitly set its DNS policy to “ClusterFirstWithHostNet”. Otherwise, Pods running with hostNetwork and “ClusterFirst” will fallback to the behavior of the “Default” policy.

None

DNS 설정이 존재하지 않으며 모든 DNS 설정은 파드의 dnsConfig 필드에 별도로 설정해주어야 한다.

It allows a Pod to ignore DNS settings from the Kubernetes environment. All DNS settings are supposed to be provided using the dnsConfig field in the Pod Spec.

apiVersion: v1
kind: Pod
metadata:
namespace: default
name: dns-example
spec:
containers:
    - name: test
    image: nginx
dnsPolicy: "None"
dnsConfig:
    nameservers:
    - 192.0.2.1 # this is an example
    searches:
    - ns1.svc.cluster-domain.example
    - my.dns.search.suffix
    options:
    - name: ndots
        value: "2"
    - name: edns0

2. 동작방식

ClusterFirst

coredns

Kubernetes 클러스터에서 실행 되는 파드는 별도의 DNS 정책이 설정되지 않는 경우 ClusterFirst 값이 설정된다.

# kubectl get po netutil-7d85b6bbd9-5bfvq -o jsonpath='{"dnsPolicy:"}{"\t"}{.spec.dnsPolicy}{"\n"}'
dnsPolicy:	ClusterFirst

이런 경우 각 파드의 /etc/resolv.conf 설정 파일의 nameserver는 kube-dns 서비스의 ClusterIP와 같다.

# kubectl exec -it netutil-7d85b6bbd9-5bfvq -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local ap-northeast-2.compute.internal
nameserver 172.20.0.10
options ndots:5

# kubectl get svc kube-dns -n kube-system -o wide
NAME       TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)         AGE    SELECTOR
kube-dns   ClusterIP   172.20.0.10   <none>        53/UDP,53/TCP   248d   k8s-app=kube-dns

즉, 파드는 DNS Resolve를 위해 nameserver인 172.20.0.10로 요청을 하게 되며 172.20.0.10 IP는 kube-dns 서비스다.
이 때 search 설정에 따라 순서대로 namespace.svc.cluster.local을 찾게 된다.

다시 kube-dns 서비스는 Selector로 ‘k8s-app=kube-dns’ 값을 가지고 있으며 CoreDNS 파드의 IP를 Endpoints로 갖는다.

# kubectl describe svc kube-dns -n kube-system
Name:              kube-dns
Namespace:         kube-system
Labels:            eks.amazonaws.com/component=kube-dns
                   k8s-app=kube-dns
                   kubernetes.io/cluster-service=true
                   kubernetes.io/name=CoreDNS
Annotations:       prometheus.io/port: 9153
                   prometheus.io/scrape: true
Selector:          k8s-app=kube-dns
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                172.20.0.10
IPs:               172.20.0.10
Port:              dns  53/UDP
TargetPort:        53/UDP
Endpoints:         10.20.10.149:53,10.20.10.84:53
Port:              dns-tcp  53/TCP
TargetPort:        53/TCP
Endpoints:         10.20.10.149:53,10.20.10.84:53
Session Affinity:  None
Events:            <none>


# kubectl get po -n kube-system -o wide | grep -i coredns
coredns-6b78c9b97-mtqnc                        1/1     Running   0          4d20h   10.20.10.149   ip-10-20-10-226.ap-northeast-2.compute.internal   <none>           <none>
coredns-6b78c9b97-r7r5m                        1/1     Running   0          4d20h   10.20.10.84    ip-10-20-10-207.ap-northeast-2.compute.internal   <none>           <none>

EKS에서 CoreDNS 파드의 /etc/resolv.conf 는 VPC DNS 서버로 설정되어 있으며 만약 resolve를 할 수 없는 도메인의 경우는 Upstream 서버로 요청되어 resolve가 된다.

Default

파드의 dnsPolicy를 Default로 설정한 후 배포한다.

apiVersion: v1
kind: Pod
metadata:
  name: busybox-default
spec:
  containers:
  - image: busybox:1.28
    command:
      - sleep
      - "7200"
    imagePullPolicy: IfNotPresent
    name: busybox
  dnsPolicy: "Default"
  restartPolicy: Always

Default 정책을 가진 파드의 /etc/resolv.conf 를 살펴보면 ClusterFirst와는 다르게 nameserver가 10.20.0.2로 AWS VPC에서 제공하는 DNS서버를 사용하고 있는 것을 확인할 수 있다.

# kubectl exec -it busybox-default -- cat /etc/resolv.conf
search ap-northeast-2.compute.internal
nameserver 10.20.0.2
options timeout:2 attempts:5

따라서 EKS 클러스터내 도메인에 대해서는 DNS 리졸브가 불가능하다.

# kubectl exec -it busybox-default -- nslookup kubernetes
Server:    10.20.0.2
Address 1: 10.20.0.2 ip-10-20-0-2.ap-northeast-2.compute.internal

nslookup: can't resolve 'kubernetes'
command terminated with exit code 1

하지만 외부에 등록된 DNS의 경우 리졸브가 가능하다.

# kubectl exec -it busybox-default -- nslookup www.amazon.com
Server:    10.20.0.2
Address 1: 10.20.0.2 ip-10-20-0-2.ap-northeast-2.compute.internal

Name:      www.amazon.com
Address 1: 2600:9000:2139:4800:7:49a5:5fd2:8621
Address 2: 2600:9000:2139:5600:7:49a5:5fd2:8621
Address 3: 2600:9000:2139:7000:7:49a5:5fd2:8621
Address 4: 2600:9000:2139:9e00:7:49a5:5fd2:8621
Address 5: 2600:9000:2139:ac00:7:49a5:5fd2:8621
Address 6: 2600:9000:2139:b800:7:49a5:5fd2:8621
Address 7: 2600:9000:2139:7c00:7:49a5:5fd2:8621
Address 8: 2600:9000:2139:3c00:7:49a5:5fd2:8621
Address 9: 99.86.147.122 server-99-86-147-122.icn51.r.cloudfront.net

ClusterFirstWithHostNet

먼저 hostNetwork이 true인 설정의 파드를 살펴본다.

apiVersion: v1
kind: Pod
metadata:
  name: busybox-clusterfirstwithhostnet
spec:
  containers:
  - image: busybox:1.28
    command:
      - sleep
      - "7200"
    imagePullPolicy: IfNotPresent
    name: busybox
  hostNetwork: true
  restartPolicy: Always

hostNetwork이 true인 경우 파드의 IP는 노드와 같은 IP를 사용하게 된다. 이 때 /etc/resolv.conf 또한 노드의 DNS 설정을 상속받는다.

# kubectl get po busybox-clusterfirstwithhostnet -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP             NODE                                              NOMINATED NODE   READINESS GATES
busybox-clusterfirstwithhostnet   1/1     Running   0          54s   10.20.11.149   ip-10-20-11-149.ap-northeast-2.compute.internal   <none>           <none>


# kubectl exec -it busybox-clusterfirstwithhostnet -- cat /etc/resolv.conf
search ap-northeast-2.compute.internal
nameserver 10.20.0.2
options timeout:2 attempts:5

따라서 hostNetwork이 true로 설정되어 있으면서 ClusterFirst 정책을 사용하기 위해서는 dnsPolicy를 ClusterFirstWithHostNet로 명시적 지정을 해주어야 한다.

apiVersion: v1
kind: Pod
metadata:
  name: busybox-clusterfirstwithhostnet
spec:
  containers:
  - image: busybox:1.28
    command:
      - sleep
      - "7200"
    imagePullPolicy: IfNotPresent
    name: busybox
  hostNetwork: true
  dnsPolicy: "ClusterFirstWithHostNet"
  restartPolicy: Always

위 와 같이 명시적으로 설정된 경우 CoreDNS를 nameserver로 사용할 수 있다.

# kubectl exec -it busybox-clusterfirstwithhostnet -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local ap-northeast-2.compute.internal
nameserver 172.20.0.10
options ndots:5

3. CoreDNS Config

CoreDNS는 설정 파일을 통해 어떤 플러그인을 사용할지와 요청을 어떻게 처리해야 할지 등 동작에 대해 정의 되어있다.

이 설정 파일은 coredns 이름으로 생성된 ConfigMap으로 정의 되어 있다.

# kubectl get cm coredns -n kube-system -o yaml
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        log
        health
        kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  labels:
    eks.amazonaws.com/component: coredns
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system

.:53을 통해 모든 요청을 처리하기 위한 53 포트를 Listening 하고 있는 하나의 서버에 대한 설정이 존재한다. .:port-num
errors, log, health, kubernetes 등의 플러그인 리스트를 확인할 수 있다.
- log의 경우 요청에 대한 로그를 활성화 하기 위해서 사용한다.
- errors의 경우 요청에 대한 오류들을 로깅하기 위해 사용한다.
- kubernetes의 경우 cluster.local 과 같은 도메인에 대한 서비스 디스커버리를 위해 사용한다.
- forward의 경우 kubernetes cluster domain에서 조회되지 않는 요청들은 /etc/resolv.conf 에 사전 정의된 resolver로 전달하기 위해 사용한다.
- cache의 경우 캐쉬 시간을 설정하기 위해 사용한다.
- autopath의 경우 쿼리 latency를 줄이기 위해 서버측에서 search를 수행하기 위해 사용한다.

10. Trouble Shooting

1. CoreDNS를 이용하는 파드는 CoreDNS파드의 보안그룹에 53번 TCP/UDP 포트가 허용되어 있어야 한다.

Connection 오류가 발생하는 경우 dnsutil 파드 등을 생성하여 각 CoreDNS 파드로의 ping, telnet 등을 통해 컨넥션을 확인한다.
bash ;; connection timed out; no servers could be reached

2. CoreDNS의 리소스 설정

To estimate the amount of memory required for a CoreDNS instance (using default settings), you can use the following formula:

MB required (default settings) = (Pods + Services) / 1000 + 54
Monitoring 툴 등을 통해 리소스 사용량을 모니터링 해야 한다.

3. EKS CoreDNS를 사용할 때 Throttling

- Hard limit of 1024 packets per second (PPS) set at the ENI level.
- 요청이 많은 경우 CoreDNS의 파드 수를 늘리며 노드 분산을 시켜야 한다. 

4. 간헐적인 DNS Query 실패

- 패킷 전송에 대한 확인. [ENA(bw_in_allowance_exceeded, bw_out_allowance_exceeded)지표](https://docs.aws.amazon.com/ko_kr/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-network-performance.html) 확인
  ```bash
  ## i-aaa
  Interface eth0
      bw_in_allowance_exceeded: 1978
      bw_out_allowance_exceeded: 4653
  Interface eth1
      bw_in_allowance_exceeded: 1929579
      bw_out_allowance_exceeded: 16206642

  ## i-bbb
  Interface eth0
      bw_in_allowance_exceeded: 116227
      bw_out_allowance_exceeded: 60238163
  Interface eth1
      bw_in_allowance_exceeded: 0
      bw_out_allowance_exceeded: 2

  ## i-ccc
  Interface eth0
      bw_in_allowance_exceeded: 608068
      bw_out_allowance_exceeded: 1787157
  Interface eth1
      bw_in_allowance_exceeded: 0
      bw_out_allowance_exceeded: 399
  ```
    - bw_in_allowance_exceeded: 인바운드 집계 대역폭이 인스턴스의 최댓값을 초과하여 대기열에 있거나 삭제된 패킷 수
      > The number of packets queued or dropped because the inbound aggregate bandwidth exceeded the maximum for the instance.
   
    - bw_out_allowance_exceeded: 아웃바운드 집계 대역폭이 인스턴스의 최댓값을 초과하여 대기열에 있거나 삭제된 패킷 수
      > The number of packets queued or dropped because the outbound aggregate bandwidth exceeded the maximum for the instance.

5. ndots 이란?

  search default.svc.cluster.local svc.cluster.local cluster.local dns.podman
  nameserver 10.96.0.10
  options ndots:5

If your requested domain name don’t have 5 dots in there, pod will append default.svc.cluster.local, svc.cluster.local, cluster.local and dns.podman with your requested domain name and send request one by one to core-dns

References
[1] https://github.com/coredns/coredns

이 중, 3개의 모든 노드에서 아래와 같이 ENA(bw_in_allowance_exceeded, bw_out_allowance_exceeded) 지표를 확인할 수 있었습니다.

i-0862318e0d7604794

Interface eth0 bw_in_allowance_exceeded: 1978 bw_out_allowance_exceeded: 4653 Interface eth1 bw_in_allowance_exceeded: 1929579 bw_out_allowance_exceeded: 16206642

i-0c5747e38cb1ae15c

Interface eth0 bw_in_allowance_exceeded: 116227 bw_out_allowance_exceeded: 60238163 Interface eth1 bw_in_allowance_exceeded: 0 bw_out_allowance_exceeded: 2

i-01d88d10e05ce87af

Interface eth0 bw_in_allowance_exceeded: 608068 bw_out_allowance_exceeded: 1787157 Interface eth1 bw_in_allowance_exceeded: 0 bw_out_allowance_exceeded: 399 ——————————————————–

문서[1]에 따라 각 지표는 다음을 의미합니다.

bw_in_allowance_exceeded: 인바운드 집계 대역폭이 인스턴스의 최댓값을 초과하여 대기열에 있거나 삭제된 패킷 수입니다.

The number of packets queued or dropped because the inbound aggregate bandwidth exceeded the maximum for the instance.
bw_out_allowance_exceeded: 아웃바운드 집계 대역폭이 인스턴스의 최댓값을 초과하여 대기열에 있거나 삭제된 패킷 수입니다.

The number of packets queued or dropped because the outbound aggregate bandwidth exceeded the maximum for the instance.

다만, 이 지표는 누적된 데이터로 과거의 데이터일 수도 있으며 CoreDNS만의 지표가 아니기 때문에 유의미한 분석을 위해서는 각 인스턴스에서 해당 지표가 현 시점으로부터 늘어나는지 모니터링할 필요가 있습니다. 또한 상세한 분석을 위해서는 인스턴스에 CWAgent(EC2 메트릭 수집을 위해)[2]를 설치해야 할 수 도 있습니다. 다만 CWAgent에 대한 부분은 당장의 설치보다 지표가 늘어나는지 수작업으로 몇 번의 데이터 수집을 통해 모니터링 후 추후 재논의가 필요합니다.

Tag: [ aws cloud eks coredns dns ]