I've explored every nook and cranny I can imagine, and I can't figure out what's going wrong.
I've got a k3s setup on my server where I'm hosting Technitium in hostNetwork mode. Technitium is presently serving requests to my network and handling DHCP without problems.
I want to set up WireGuard to access services remotely. To resolve services, I've settled on setting Technitium as my CoreDNS upstream and using split-horizon resolution, so that a device connected over WireGuard behaves as much as possible like a device on the local network.
I attempted to set the upstream to my host IP (10.1.2.3 for the purpose of this post), and containers became unable to resolve DNS. I reverted the change and resorted to starting a one-off netshoot container and trying to resolve DNS against the host manually. Here's where the oddities happen.
netshoot$ dig google.com # works fine
host$ dig @10.1.2.3 google.com # works fine from the host
netshoot$ dig @10.1.2.3 google.com # times out as follows:
;; communications error to 10.1.2.3#53: timed out
;; communications error to 10.1.2.3#53: timed out
;; communications error to 10.1.2.3#53: timed out
; <<>> DiG 9.18.13 <<>> @10.1.2.3 google.com
; (1 server found)
;; global options: +cmd
;; no servers could be reached
Alright, let's see what tcpdump says. Note that 10.4.5.6 is the container IP and node-host is the name of the node host for the purpose of this post.
20:10:02.884944 veth640b06a5 P IP (tos 0x0, ttl 64, id 35863, offset 0, flags [none], proto UDP (17), length 79)
10.4.5.6.48995 > node-host.domain: 33766+ [1au] A? google.com. (51)
20:10:02.884956 cni0 In IP (tos 0x0, ttl 64, id 35863, offset 0, flags [none], proto UDP (17), length 79)
10.4.5.6.48995 > node-host.domain: 33766+ [1au] A? google.com. (51)
20:10:02.885180 cni0 Out IP (tos 0x0, ttl 64, id 6664, offset 0, flags [DF], proto UDP (17), length 83)
node-host.domain > 10.4.5.6.48995: 33766 1/0/1 google.com. A 142.251.46.206 (55)
20:10:02.885182 veth640b06a5 Out IP (tos 0x0, ttl 64, id 6664, offset 0, flags [DF], proto UDP (17), length 83)
node-host.domain > 10.4.5.6.48995: 33766 1/0/1 google.com. A 142.251.46.206 (55)
20:10:02.885192 veth640b06a5 P IP (tos 0xc0, ttl 64, id 57398, offset 0, flags [none], proto ICMP (1), length 111)
10.4.5.6 > node-host: ICMP 10.4.5.6 udp port 48995 unreachable, length 91
IP (tos 0x0, ttl 64, id 6664, offset 0, flags [DF], proto UDP (17), length 83)
node-host.domain > 10.4.5.6.48995: 33766 1/0/1 google.com. A 142.251.46.206 (55)
20:10:02.885195 cni0 In IP (tos 0xc0, ttl 64, id 57398, offset 0, flags [none], proto ICMP (1), length 111)
10.4.5.6 > node-host: ICMP 10.4.5.6 udp port 48995 unreachable, length 91
IP (tos 0x0, ttl 64, id 6664, offset 0, flags [DF], proto UDP (17), length 83)
node-host.domain > 10.4.5.6.48995: 33766 1/0/1 google.com. A 142.251.46.206 (55)
20:10:07.890371 veth640b06a5 P IP (tos 0x0, ttl 64, id 48767, offset 0, flags [none], proto UDP (17), length 79)
10.4.5.6.49549 > node-host.domain: 33766+ [1au] A? google.com. (51)
20:10:07.890398 cni0 In IP (tos 0x0, ttl 64, id 48767, offset 0, flags [none], proto UDP (17), length 79)
10.4.5.6.49549 > node-host.domain: 33766+ [1au] A? google.com. (51)
20:10:07.890867 cni0 Out IP (tos 0x0, ttl 64, id 7896, offset 0, flags [DF], proto UDP (17), length 83)
node-host.domain > 10.4.5.6.49549: 33766 1/0/1 google.com. A 142.251.46.206 (55)
20:10:07.890874 veth640b06a5 Out IP (tos 0x0, ttl 64, id 7896, offset 0, flags [DF], proto UDP (17), length 83)
node-host.domain > 10.4.5.6.49549: 33766 1/0/1 google.com. A 142.251.46.206 (55)
20:10:07.890921 veth640b06a5 P IP (tos 0xc0, ttl 64, id 58748, offset 0, flags [none], proto ICMP (1), length 111)
10.4.5.6 > node-host: ICMP 10.4.5.6 udp port 49549 unreachable, length 91
IP (tos 0x0, ttl 64, id 7896, offset 0, flags [DF], proto UDP (17), length 83)
node-host.domain > 10.4.5.6.49549: 33766 1/0/1 google.com. A 142.251.46.206 (55)
20:10:07.890935 cni0 In IP (tos 0xc0, ttl 64, id 58748, offset 0, flags [none], proto ICMP (1), length 111)
10.4.5.6 > node-host: ICMP 10.4.5.6 udp port 49549 unreachable, length 91
IP (tos 0x0, ttl 64, id 7896, offset 0, flags [DF], proto UDP (17), length 83)
node-host.domain > 10.4.5.6.49549: 33766 1/0/1 google.com. A 142.251.46.206 (55)
20:10:07.932176 cni0 Out ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.4.5.6 tell node-host, length 28
20:10:07.932185 veth640b06a5 Out ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.4.5.6 tell node-host, length 28
20:10:07.932228 veth640b06a5 P ARP, Ethernet (len 6), IPv4 (len 4), Request who-has node-host tell 10.4.5.6, length 28
20:10:07.932234 veth640b06a5 P ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.4.5.6 is-at 01:23:45:67:89:0a (oui Unknown), length 28
20:10:07.932239 cni0 In ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.4.5.6 is-at 01:23:45:67:89:0a (oui Unknown), length 28
20:10:07.932240 cni0 In ARP, Ethernet (len 6), IPv4 (len 4), Request who-has node-host tell 10.4.5.6, length 28
20:10:07.932253 cni0 Out ARP, Ethernet (len 6), IPv4 (len 4), Reply node-host is-at bc:de:f1:23:45:67 (oui Unknown), length 28
20:10:07.932258 veth640b06a5 Out ARP, Ethernet (len 6), IPv4 (len 4), Reply node-host is-at bc:de:f1:23:45:67 (oui Unknown), length 28
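To see at a glance which side emits the ICMP errors, the text of a capture like the one above can be summarized with a small filter. This is only a sketch: the function name icmp_unreachable_sources is made up for illustration, and it assumes tcpdump's verbose text format exactly as shown above. Running tcpdump with -n would additionally print raw addresses instead of names such as node-host.

```shell
# Hypothetical helper: summarize ICMP "unreachable" messages in tcpdump text.
# Reads capture text on stdin and prints "<count> <sender> -> <receiver>".
icmp_unreachable_sources() {
    awk '
        # Match detail lines such as:
        #   10.4.5.6 > node-host: ICMP 10.4.5.6 udp port 48995 unreachable, length 91
        /ICMP .* unreachable/ && $2 == ">" {
            dst = $3
            sub(/:$/, "", dst)          # drop the trailing colon after the receiver
            count[$1 " -> " dst]++
        }
        END { for (k in count) print count[k], k }
    '
}

# Example usage (capture.txt is an assumed file holding the saved tcpdump text):
#   icmp_unreachable_sources < capture.txt
```

The first address on each matching line is the sender of the ICMP message, so the summary shows which host generated the error rather than which host received it.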
It seems like the DNS ID matches up fine. The server is responding at least. Somehow I'm getting ICMP Unreachable, code 3, port unreachable. Okay, how about IPTables?
Chain KUBE-ROUTER-INPUT (1 references)
6 570 KUBE-POD-<ID> 0 -- * * 10.4.5.6 0.0.0.0/0 /* rule to jump traffic from POD name:netshoot namespace: default to chain KUBE-POD-<ID> */
Chain KUBE-NWPLCY-DEFAULT (18 references)
pkts bytes target prot opt in out source destination
3 237 MARK 0 -- * * 0.0.0.0/0 0.0.0.0/0 /* rule to mark traffic matching a network policy */ MARK or 0x10000
Chain KUBE-POD-<ID> (7 references)
pkts bytes target prot opt in out source destination
3 333 ACCEPT 0 -- * * 0.0.0.0/0 0.0.0.0/0 /* rule for stateful firewall for pod */ ctstate RELATED,ESTABLISHED
0 0 DROP 0 -- * * 0.0.0.0/0 0.0.0.0/0 /* rule to drop invalid state for pod */ ctstate INVALID
3 249 ACCEPT 0 -- * * 0.0.0.0/0 10.4.5.6 /* rule to permit the traffic traffic to pods when source is the pod's local node */ ADDRTYPE match src-type LOCAL
3 237 KUBE-NWPLCY-DEFAULT 0 -- * * 10.4.5.6 0.0.0.0/0 /* run through default egress network policy chain */
0 0 KUBE-NWPLCY-DEFAULT 0 -- * * 0.0.0.0/0 10.4.5.6 /* run through default ingress network policy chain */
0 0 NFLOG 0 -- * * 0.0.0.0/0 0.0.0.0/0 /* rule to log dropped traffic POD name:netshoot namespace: default */ mark match ! 0x10000/0x10000 limit: avg 10/min burst 10 nflog-group 100
0 0 REJECT 0 -- * * 0.0.0.0/0 0.0.0.0/0 /* rule to REJECT traffic destined for POD name:netshoot namespace: default */ mark match ! 0x10000/0x10000 reject-with icmp-port-unreachable
3 237 MARK 0 -- * * 0.0.0.0/0 0.0.0.0/0 MARK and 0xfffeffff
3 237 MARK 0 -- * * 0.0.0.0/0 0.0.0.0/0 /* set mark to ACCEPT traffic that comply to network policies */ MARK or 0x20000
Are these rules getting reached? Well, watch iptables -vn -L KUBE-POD-<ID> shows that the top ACCEPT rule for ESTABLISHED connections is incrementing whenever dig sends a query. It increments 3 times over the course of the dig request and then rests. The REJECT rule never increments. In fact, not a single REJECT rule increments its packet count for the entire duration of the dig request (verified by watch "iptables -vn -L | grep REJECT" and watch "iptables -vn -L | grep icmp-port-unreachable").
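Instead of eyeballing watch output, the REJECT counters could be snapshotted before and after a dig run and then diffed. The following is a rough sketch, not a definitive tool: reject_counts is a hypothetical helper name, and it assumes the column layout of iptables -vn -L shown above (pkts, bytes, target, ...). Adding -x makes iptables print exact counters instead of abbreviated ones such as 12K.

```shell
# Hypothetical helper: list every REJECT rule with its chain, its position
# in the chain, and its packet counter. Reads `iptables -vnx -L` output on stdin.
reject_counts() {
    awk '
        $1 == "Chain" { chain = $2; n = 0; next }   # new chain: reset rule index
        $1 == "pkts" || NF == 0 { next }            # skip column header and blank lines
        { n++ }                                     # every other line is a rule
        $3 == "REJECT" { print chain, n, $1 }       # target is the third column
    '
}

# Example usage (needs root; before.txt and after.txt are assumed file names):
#   iptables -vnx -L | reject_counts > before.txt
#   dig @10.1.2.3 google.com
#   iptables -vnx -L | reject_counts > after.txt
#   diff before.txt after.txt   # no output means no REJECT rule was hit
```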
Hm, maybe the iptables rules inside the other network namespaces are different.
Well, I ran this fun little script and looked at the iptables rules of the other namespaces. There were some for a few services I have, but none related to the netshoot container or the Technitium container. (Technitium is running in host mode anyway.) There's probably an easier way to do this lol, but it was alright. Wouldn't hate to see how someone else would handle this.
#!/bin/bash
# numbers.txt contains the PID of every running process, one per line
while IFS= read -r num; do
    echo "Output for PID $num:" >> output.txt
    # Run iptables inside the network namespace of that process
    nsenter --net="/proc/$num/ns/net" iptables -nv -L >> output.txt
    echo "" >> output.txt # Optional: adds an extra newline for readability
done < numbers.txt
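Since many processes share one network namespace, the loop above dumps the same rules many times. One possible shortcut is to pick a single PID per distinct namespace first. The sketch below assumes a Linux /proc layout where /proc/PID/ns/net is a symlink whose target names the namespace; unique_netns_pids is a hypothetical name. Where util-linux is installed, lsns -t net gives a similar overview.

```shell
# Hypothetical helper: print one "PID namespace-id" pair per distinct
# network namespace. The optional argument overrides the /proc location.
unique_netns_pids() {
    proc_root=${1:-/proc}
    for ns in "$proc_root"/[0-9]*/ns/net; do
        # Symlink target looks like net:[4026531840]; skip unreadable entries.
        id=$(readlink "$ns" 2>/dev/null) || continue
        pid=${ns#"$proc_root"/}
        pid=${pid%%/*}
        echo "$pid $id"
    done | awk '!seen[$2]++'    # keep only the first PID seen per namespace
}

# Example usage (needs root to read other users' namespaces):
#   unique_netns_pids | while read -r pid ns; do
#       echo "== $ns (via PID $pid) =="
#       nsenter --net="/proc/$pid/ns/net" iptables -nv -L
#   done > output.txt
```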
I also filtered this output for REJECT rules, and none of them incremented when running dig.
I also looked at the dig source code for the error message (and here); these are the only places where the search term for that error shows up.
On a hunch, ip route shows:
debug:~# ip route
default via 10.4.0.1 dev eth0
10.4.0.0/24 dev eth0 proto kernel scope link src 10.4.5.6
10.4.0.0/16 via 10.4.0.1 dev eth0
There is no explicit route to 10.1.2.3; only the default route via 10.4.0.1 would cover it.
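Whether an address falls inside one of the listed prefixes can be checked mechanically; anything that matches none of them goes out via the default route. The following is a minimal sketch using plain shell arithmetic, with hypothetical function names ip_to_int and in_cidr, and it assumes IPv4 and a shell with 64-bit arithmetic. On a live system, ip route get 10.1.2.3 asks the kernel directly which route it would use.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Hypothetical helper: succeed if address $1 lies inside prefix $2,
# e.g. in_cidr 10.4.0.1 10.4.0.0/24
in_cidr() {
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Build the netmask; a /0 prefix matches every address.
    if [ "$bits" -eq 0 ]; then
        mask=0
    else
        mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    fi
    [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Example usage with the prefixes from the routing table above:
#   for route in 10.4.0.0/24 10.4.0.0/16; do
#       if in_cidr 10.1.2.3 "$route"; then
#           echo "$route covers 10.1.2.3"
#       else
#           echo "$route does not cover 10.1.2.3"
#       fi
#   done
```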
However, it seems like the container is able to access the host somehow because of this series of commands:
node-host$ nc -l -p 12345
netshoot$ echo "hello" | nc 10.1.2.3 12345
# the above works fine
netshoot$ nc -l -p 12345
node-host$ echo "hello" | nc 10.4.5.6 12345
# Also works fine
Adding -d to the initial dig command results in the following. (10.6.7.8 is the CoreDNS IP.)
debug:~# dig -d @10.1.2.3 google.com
setup_libs()
setup_system()
create_search_list()
ndots is 5.
timeout is 0.
retries is 3.
get_server_list()
make_server(10.6.7.8)
dig_query_setup
parse_args()
making new lookup
make_empty_lookup()
make_empty_lookup() = 0x7f230e3dd050->references = 1
digrc (open)
main parsing -d
main parsing @10.1.2.3
make_server(10.1.2.3)
main parsing google.com
clone_lookup()
make_empty_lookup()
make_empty_lookup() = 0x7f230e3de590->references = 1
clone_server_list()
make_server(10.1.2.3)
looking up google.com
dig_startup()
lock_lookup dighost.c:4659
success
start_lookup()
setup_lookup(0x7f230e3de590)
resetting lookup counter.
using root origin
recursive query
AD query
add_question()
starting to render the message
add_opt()
done rendering
create query 0x7f230e64ccc0 linked to lookup 0x7f230e3de590
dighost.c:2177:lookup_attach(0x7f230e3de590) = 2
dighost.c:2690:new_query(0x7f230e64ccc0) = 1
do_lookup()
start_udp(0x7f230e64ccc0)
dighost.c:3301:query_attach(0x7f230e64ccc0) = 2
working on lookup 0x7f230e3de590, query 0x7f230e64ccc0
dighost.c:3346:query_attach(0x7f230e64ccc0) = 3
unlock_lookup dighost.c:4661
udp_ready()
udp_ready(0x7f230e64ce60, success, 0x7f230e64ccc0)
lock_lookup dighost.c:3188
success
dighost.c:3189:lookup_attach(0x7f230e3de590) = 3
dighost.c:3261:query_attach(0x7f230e64ccc0) = 4
recving with lookup=0x7f230e3de590, query=0x7f230e64ccc0, handle=0x7f230e64ce60
recvcount=1
have local timeout of 5000
dighost.c:3135:query_attach(0x7f230e64ccc0) = 5
sending a request
sendcount=1
dighost.c:1761:query_detach(0x7f230e64ccc0) = 4
dighost.c:3281:query_detach(0x7f230e64ccc0) = 3
dighost.c:3282:lookup_detach(0x7f230e3de590) = 2
unlock_lookup dighost.c:3283
send_done(0x7f230e64ce60, success, 0x7f230e64ccc0)
sendcount=0
lock_lookup dighost.c:2765
success
dighost.c:2769:lookup_attach(0x7f230e3de590) = 3
dighost.c:2787:query_detach(0x7f230e64ccc0) = 2
dighost.c:2788:lookup_detach(0x7f230e3de590) = 2
check_if_done()
list empty
unlock_lookup dighost.c:2792
recv_done(0x7f230e64ce60, timed out, 0x7f230e4f78d8, 0x7f230e64ccc0)
lock_lookup dighost.c:3955
success
recvcount=0
dighost.c:3960:lookup_attach(0x7f230e3de590) = 3
;; communications error to 10.1.2.3#53: timed out
I'm really lost. It seems like IPTables is the right place to look because that's the only origin I could think of for the icmp-port-unreachable message.