Monitor DNS responses


This tool records DNS responses and exports them as Prometheus metrics at /metrics. It was created because “UnknownHostException” errors occurred in Kubernetes, and because monitoring via Grafana looks good ;-)

#cargo run --release -- -s 900 -w 4000 -d www.heise.de www.ka.ka

docker build . -t dnsmonitor:v1
docker run -p 8080:8080 dnsmonitor:v1

curl localhost:8080/metrics

# HELP yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_failurecounter Custom metric returning the failures per dns name.
# TYPE yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_failurecounter gauge
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_failurecounter{dns="www.notfound.ka"} 11
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_failurecounter{dns="www.heise.de"} 0
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_failurecounter{dns="localhost"} 0
# HELP yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_min Custom metric returning the min time in microseconds per dns name.
# TYPE yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_min gauge
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_min{dns="www.notfound.ka"} 3842
# HELP yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_min Custom metric returning the min time in microseconds per dns name.
# TYPE yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_min gauge
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_min{dns="www.heise.de"} 1842
# HELP yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_min Custom metric returning the min time in microseconds per dns name.
# TYPE yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_min gauge
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_min{dns="localhost"} 135
# HELP yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_max Custom metric returning the max time in microseconds per dns name.
# TYPE yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_max gauge
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_max{dns="www.notfound.ka"} 4884
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_max{dns="www.heise.de"} 3620
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_max{dns="localhost"} 396
# HELP yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_average Custom metric returning the average time in microseconds per dns name.
# TYPE yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_average gauge
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_average{dns="www.notfound.ka"} 4222.454545454545
# HELP yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_average Custom metric returning the average time in microseconds per dns name.
# TYPE yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_average gauge
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_average{dns="www.heise.de"} 2307.909090909091
# HELP yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_average Custom metric returning the average time in microseconds per dns name.
# TYPE yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_average gauge
yodaforce_application_de_codecoverage_crm_dns_monitor_dnsMonitor_average{dns="localhost"} 269.90909090909093
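
The four metrics above (failure counter, min, max, average) can be derived from a small running aggregate per DNS name. The following is only a minimal sketch and not the tool's actual implementation: the names `Stats`, `timed_lookup` and `sample_line` are hypothetical, and the lookup simply goes through the system resolver of the standard library. It assumes, as the sample output suggests, that failed lookups are timed as well.

```rust
use std::net::ToSocketAddrs;
use std::time::Instant;

/// Running statistics for one DNS name (times in microseconds).
/// Hypothetical type, used for illustration only.
#[derive(Debug, Default)]
pub struct Stats {
    pub failures: u64,
    pub count: u64,
    pub sum: u128,
    pub min: Option<u128>,
    pub max: Option<u128>,
}

impl Stats {
    /// Record one lookup: its duration and whether it succeeded.
    /// Failed lookups still contribute to min, max and average.
    pub fn record(&mut self, micros: u128, ok: bool) {
        if !ok {
            self.failures += 1;
        }
        self.count += 1;
        self.sum += micros;
        self.min = Some(self.min.map_or(micros, |m| m.min(micros)));
        self.max = Some(self.max.map_or(micros, |m| m.max(micros)));
    }

    /// Average duration, or None before the first lookup.
    pub fn average(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum as f64 / self.count as f64)
        }
    }
}

/// Time one lookup through the system resolver.
/// The port 0 is a dummy; only the name resolution matters here.
pub fn timed_lookup(name: &str) -> (u128, bool) {
    let start = Instant::now();
    let ok = (name, 0u16).to_socket_addrs().is_ok();
    (start.elapsed().as_micros(), ok)
}

/// Render one sample line in the Prometheus text format,
/// labelled with the DNS name as in the output above.
pub fn sample_line(metric: &str, dns: &str, value: f64) -> String {
    format!("{metric}{{dns=\"{dns}\"}} {value}")
}
```

A monitoring loop could call `timed_lookup` once per interval for each name, feed the result into `Stats::record`, and emit one `sample_line` per metric when /metrics is scraped.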


Plotted with Grafana:

[Figure: Grafana dashboard of the DNS metrics]

Inside a pod:

$ cat /etc/resolv.conf
search NAMESPACE.svc.cluster.local svc.cluster.local cluster.local HOST.SUB.SUB1.com openstacklocal novalocal
nameserver XXX.YYY.ZZZ.10
options ndots:5 

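Debian GNU/Linux 10 and 12 honor ndots:5: a name with fewer than five dots is first tried with each search domain appended and only then queried as-is.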

$ host -t A -d SOABP-PRD.de.X.com
Trying "SOABP-PRD.de.X.com.NAMESPACE.svc.cluster.local"
Trying "SOABP-PRD.de.X.com.svc.cluster.local"
Trying "SOABP-PRD.de.X.com.cluster.local"
Trying "SOABP-PRD.de.X.com.tc02.otc.X.com"
Trying "SOABP-PRD.de.X.com.openstacklocal"
Trying "SOABP-PRD.de.X.com.novalocal"
Trying "SOABP-PRD.de.X.com"

;; QUESTION SECTION:
;SOABP-PRD.de.X.com.   IN      A
 
;; ANSWER SECTION:
SOABP-PRD.de.X.com. 2  IN      CNAME   qde4aq.de.X.com.
qde4aq.de.X.com. 2     IN      A       1.2.3.97
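
The order of the "Trying" lines above follows from the search list and the ndots option. The sketch below reproduces that order under the assumption of glibc-style behaviour (other resolvers, such as the one in Alpine, may differ). The function name `candidates` is hypothetical.

```rust
/// Names a glibc-style resolver would try, in order, for `name`.
/// Hypothetical helper, used for illustration only.
pub fn candidates(name: &str, search: &[&str], ndots: usize) -> Vec<String> {
    // A trailing dot marks the name as absolute: the search list is skipped.
    if let Some(absolute) = name.strip_suffix('.') {
        return vec![absolute.to_string()];
    }
    let expanded = search.iter().map(|domain| format!("{name}.{domain}"));
    let dots = name.matches('.').count();
    if dots < ndots {
        // Fewer dots than ndots: walk the search list first, then the name as-is.
        expanded.chain(std::iter::once(name.to_string())).collect()
    } else {
        // Enough dots: try the name as-is first, then the search list.
        std::iter::once(name.to_string()).chain(expanded).collect()
    }
}
```

Called with the name from the example, the six search domains from /etc/resolv.conf and `ndots = 5`, this yields seven candidates with the plain name last. With a trailing dot it yields exactly one.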

With a trailing “.” (FQDN), the search list is skipped:

$  host -d -t A SOABP-PRD.de.X.com.
Trying "SOABP-PRD.de.X.com"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39829
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
 
;; QUESTION SECTION:
;SOABP-PRD.de.X.com.   IN      A

;; ANSWER SECTION:
SOABP-PRD.de.X.com. 11 IN      CNAME   qde4aq.de.X.com.
qde4aq.de.X.com. 11    IN      A       1.2.3.97


Example of an unqualified single-label domain name (Alpine):

  • A = IPv4
  • AAAA = IPv6

$ nslookup -debug -query=A yoda-party
Server:         XXX.YYY.ZZZ.10
Address:        XXX.YYY.ZZZ.10:53
 
Query #0 completed in 1ms:
authoritative answer:
Name:   yoda-party.NAMESPACE.svc.cluster.local
Address: 1.1.1.161
 
Query #1 completed in 1ms:
** server can't find yoda-party.svc.cluster.local: NXDOMAIN
 
Query #2 completed in 1ms:
** server can't find yoda-party.cluster.local: NXDOMAIN
 
Query #3 completed in 1ms:
** server can't find yoda-party.tc02.otc.X.com: NXDOMAIN
 
Query #4 completed in 1ms:
** server can't find yoda-party.openstacklocal: NXDOMAIN
 
Query #5 completed in 1ms:
** server can't find yoda-party.novalocal: NXDOMAIN

It was very nice to see in the graphs that DNS resolution performance more than doubles when DNS names are specified as Fully Qualified Domain Names (FQDN) with a trailing dot “.” (www.codecoverage.de.). Alpine queries the DNS if a “.” is in the name; see the “Advisory on Search List Processing”.

$ nslookup -debug -query=A collector-http-drax-cetus.prod.ZZ.YY.de.
Server:         XXX.YYY.ZZZ.10
Address:        XXX.YYY.ZZZ.10:53
 
Query #0 completed in 0ms:
authoritative answer:
collector-http-drax-cetus.prod.ZZ.YY.de        canonical name = collector-http-drax-guardians-prod.caas-p21.YY.de
collector-http-drax-guardians-prod.caas-p21.YY.de  canonical name = edey9t.de.X.com
Name:   edey9t.de.X.com
Address: 1.2.3.4
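
The speed-up from a trailing dot is plausible from a simple count of lookups. The sketch below is a rough model only, assuming glibc-style search list handling and one lookup per candidate name and record type. The names `to_absolute` and `lookups_until_plain_name` are hypothetical.

```rust
/// Append the trailing dot that marks a name as fully qualified.
/// Only meaningful for names that are already complete, not for
/// short in-cluster names that rely on the search list.
pub fn to_absolute(name: &str) -> String {
    if name.ends_with('.') {
        name.to_string()
    } else {
        format!("{name}.")
    }
}

/// Number of lookups sent, per record type, up to and including the
/// attempt with the plain name (glibc-style ordering assumed).
pub fn lookups_until_plain_name(name: &str, search_domains: usize, ndots: usize) -> usize {
    let dots = name.matches('.').count();
    if name.ends_with('.') || dots >= ndots {
        // Absolute name, or enough dots: the plain name is tried first.
        1
    } else {
        // Every search domain is tried before the plain name.
        search_domains + 1
    }
}
```

For the external name from the earlier example, six search domains and `ndots = 5`, the model gives seven lookups without the trailing dot and one with it, which matches the seven "Trying" lines in the `host -d` output.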