Is your feature request related to a problem? Please describe.
When executing a host-based network disruption, the resolved address doesn't give the result we need for the traffic to be successfully disrupted.
What we see is that when the pod nameserver is used to resolve the IPs, we end up with an address in the Class E subnet (240.xxx.xxx.xxx). This address is then used for the traffic disruption but doesn't impact the traffic as expected (no disruption at all).
However, if we don't use the pod nameserver and fall back to the node nameserver to get the IPs for the host, then build a spec using those IPs (rather than the host), the traffic is disrupted as expected.
As you can see below, the results differ depending on how the lookup is performed (address and IPs altered):
[root@ip-10-1-1-1 /]# nsenter -t 2903880 -n dig @172.20.0.10 exteranlservice.service.us-west-2.mycompany.test.net
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.amzn2.13.6 <<>> @172.20.0.10 exteranlservice.service.us-west-2.mycompany.test.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; >>HEADER<< opcode: QUERY, status: NOERROR, id: 42257
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;exteranlservice.service.us-west-2.mycompany.test.net. IN A
;; ANSWER SECTION:
exteranlservice.service.us-west-2.mycompany.test.net. 30 IN A 240.240.196.125
;; Query time: 0 msec
;; SERVER: 172.20.0.10#53(172.20.0.10)
;; WHEN: Mon Jul 15 10:07:45 UTC 2024
;; MSG SIZE rcvd: 182
[root@ip-10-1-1-1 /]# nsenter -t 2903880 -n dig exteranlservice.service.us-west-2.mycompany.test.net
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.amzn2.13.6 <<>> exteranlservice.service.us-west-2.mycompany.test.net
;; global options: +cmd
;; Got answer:
;; >>HEADER<< opcode: QUERY, status: NOERROR, id: 19732
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;exteranlservice.service.us-west-2.mycompany.test.net. IN A
;; ANSWER SECTION:
exteranlservice.service.us-west-2.mycompany.test.net. 2 IN A 10.218.218.211
exteranlservice.service.us-west-2.mycompany.test.net. 2 IN A 10.218.218.212
exteranlservice.service.us-west-2.mycompany.test.net. 2 IN A 10.218.218.213
;; Query time: 0 msec
;; SERVER: 10.123.123.1#53(10.123.123.1)
;; WHEN: Mon Jul 15 10:06:28 UTC 2024
;; MSG SIZE rcvd: 151
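To make the difference between the two answers concrete, here is a minimal Python sketch of the workaround described above: drop any resolved address that falls in the Class E block (240.0.0.0/4) and build one host entry per remaining IP. The helper names and the `host`/`port` keys are assumptions for illustration only, not the controller's actual code or a confirmed spec schema.

```python
import ipaddress

# 240.0.0.0/4 is the reserved "Class E" IPv4 block seen in the pod-side answer.
CLASS_E = ipaddress.ip_network("240.0.0.0/4")


def is_class_e(address: str) -> bool:
    """Return True if the address falls inside the Class E block."""
    return ipaddress.ip_address(address) in CLASS_E


def build_host_entries(addresses, port=None):
    """Build one entry per usable IP, skipping Class E addresses.

    The dict keys ("host", "port") are hypothetical and only illustrate
    the idea of targeting IPs instead of the hostname.
    """
    entries = []
    for address in addresses:
        if is_class_e(address):
            continue  # a proxy-allocated VIP would not match the real traffic
        entry = {"host": address}
        if port is not None:
            entry["port"] = port
        entries.append(entry)
    return entries


# Answers taken from the two dig outputs above.
pod_answer = ["240.240.196.125"]
node_answer = ["10.218.218.211", "10.218.218.212", "10.218.218.213"]

print(build_host_entries(pod_answer))
print(build_host_entries(node_answer))
```

With the pod-side answer nothing usable remains, while the node-side answer yields three IP targets, which matches the behaviour reported here.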
Describe the solution you'd like
The ability to control, on a per-disruption basis, whether the host is resolved by the pod or by the node.
The ability to set the default behaviour through the controller configuration, i.e. default to pod and override to node on a per-disruption basis, and vice versa.
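The requested semantics (a controller-wide default with an optional per-disruption override) could look roughly like the following Python sketch. The `Resolver` enum and `effective_resolver` function are hypothetical names used only to illustrate the precedence rule; they do not correspond to an existing option in the controller.

```python
from enum import Enum
from typing import Optional


class Resolver(Enum):
    """Where host resolution is performed (hypothetical setting)."""
    POD = "pod"
    NODE = "node"


def effective_resolver(controller_default: Resolver,
                       disruption_override: Optional[Resolver] = None) -> Resolver:
    """A per-disruption override wins; otherwise use the controller default."""
    if disruption_override is not None:
        return disruption_override
    return controller_default


# Default to pod, override to node for one disruption, and vice versa.
print(effective_resolver(Resolver.POD))
print(effective_resolver(Resolver.POD, Resolver.NODE))
print(effective_resolver(Resolver.NODE, Resolver.POD))
```

Keeping the override optional means existing disruptions that set nothing keep whatever default the controller is configured with.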
Additional context
In our environment, the DNS proxying described above is only applied on a per-namespace and per-host basis, so configuration on a per-disruption basis would be the most desirable solution.
The injector pod currently reads both /etc/resolv.conf files, the one inside the pod and the one on the node, with the pod configuration taking precedence over the node configuration. Ideally, that is where we would add a condition to pick one over the other.
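As a rough illustration of such a condition, the Python sketch below parses the `nameserver` lines of two resolv.conf contents and orders them by a preference flag. It is not the injector's implementation: the function names, the `prefer` parameter, and the choice to keep the other source as a fallback are all assumptions.

```python
def parse_nameservers(resolv_conf_text: str) -> list:
    """Extract nameserver addresses from resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Only "nameserver <address>" lines matter; comments and options are skipped.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def ordered_nameservers(pod_conf: str, node_conf: str, prefer: str = "pod") -> list:
    """Return nameservers with the preferred source first, the other as fallback."""
    pod = parse_nameservers(pod_conf)
    node = parse_nameservers(node_conf)
    first, second = (node, pod) if prefer == "node" else (pod, node)
    ordered = []
    for server in first + second:
        if server not in ordered:  # drop duplicates, keep order
            ordered.append(server)
    return ordered


# Nameserver addresses taken from the dig outputs in this issue.
pod_conf = "# pod resolv.conf\nnameserver 172.20.0.10\n"
node_conf = "nameserver 10.123.123.1\n"

print(ordered_nameservers(pod_conf, node_conf))
print(ordered_nameservers(pod_conf, node_conf, prefer="node"))
```

Whether the non-preferred source should remain as a fallback or be dropped entirely is a design choice left open here; dropping it would avoid ever resolving through the proxying nameserver.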
Missed the GitHub notification, sorry! This is a very reasonable request, but with August holidays, I don't think we'll be adding it until mid-September at the earliest. We'd accept PRs, of course.
The pod-side resolution returns these addresses because some of our services use Istio DNS proxying, specifically: https://istio.io/latest/docs/ops/configuration/traffic-management/dns-proxy/#external-tcp-services-without-vips
and
https://istio.io/latest/blog/2020/dns-proxy/#automatic-vip-allocation-where-possible