Implementation for minTTL #6808
base: main
Signed-off-by: Hemant <[email protected]>
pkg/config/agent/config.go
Outdated
// The minTTL setting helps address the problem of applications caching DNS response IPs indefinitely.
// The Cluster administrators should configure this value, ideally setting it to be equal to or greater than the maximum TTL
// value of the application's DNS cache.
MinTTL uint32 `yaml:"minTTL,omitempty"`

We avoid unsigned integers in the config as far as I can tell, so maybe use `int` (personally I think `int32` would be better, but we do use `int` consistently in the config apparently, so probably best to stick to `int`).
Signed-off-by: Hemant <[email protected]>
cmd/antrea-agent/options.go
Outdated
// If minTTL is greater than 1, it indicates that the value has been set externally by a cluster admin, and should be respected.
if o.config.MinTTL > 1 {

why 1 and not 0?
# The minTTL setting helps address the problem of applications caching DNS response IPs indefinitely.
# The Cluster administrators should configure this value, ideally setting it to be equal to or greater than the maximum TTL
# value of the application's DNS cache.

I would remove "indefinitely". If the application does indeed cache the DNS entry forever, there is not much we can do.
We should also mention that this is for FQDN policy enforcement.
So maybe something like this:
The minTTL setting helps address the problem of applications caching DNS response IPs beyond the TTL value for the DNS record.
It is used to enforce FQDN policy rules, ensuring that resolved IPs are included in datapath rules for as long as the application is caching them.
This value should ideally be set to the maximum caching duration across all applications.
cmd/antrea-agent/options.go
Outdated
@@ -90,6 +90,7 @@ type Options struct {
 	nplEndPort int
 	dnsServerOverride string
 	nodeType config.NodeType
+	minTTL uint32

this field doesn't seem necessary?
getMaxTTL := func(ttl1, ttl2 uint32) uint32 {
	if ttl1 > ttl2 {
		return ttl1
	} else {
		return ttl2
	}
}

there is a `max` built-in in Go (https://pkg.go.dev/builtin#max), introduced in Go 1.21
 currentTime := f.clock.Now()
 for _, ans := range msg.Answer {
 	switch r := ans.(type) {
 	case *dns.A:
 		if f.ipv4Enabled {
 			responseIPs[r.A.String()] = ipWithExpiration{
 				ip: r.A,
-				expirationTime: currentTime.Add(time.Duration(r.Header().Ttl) * time.Second),
+				expirationTime: currentTime.Add(time.Duration(getMaxTTL(f.minTTL, r.Header().Ttl)) * time.Second),

I feel like technically, `minTTL` should only apply to DNS responses which were not initiated by Antrea, but intercepted by Antrea (responses to DNS queries generated by the application). However, this code applies to responses to DNS queries sent by Antrea (when an override DNS server is configured). Maybe we need to introduce a flag to distinguish between the 2 cases?
We actually spent the whole of the last sync-up meeting discussing this, because I had the same suggestion, but @tnqn’s opinion was that we don’t have to differentiate between the two cases. For security purposes we advise users to use FQDN rules only in allowlists for Antrea-native policies, and he thinks it’s okay that the client’s TTL for an FQDN goes “out of sync” with the Antrea agent, since that’s not the guarantee we want: we only want to enforce that the client cannot access unintended addresses. So having an address for a domain with a longer TTL in the Antrea cache than in the client’s cache is OK. I’ll let Quan chime in to see if I’m summarizing this correctly, but the end result of the discussion was that we told Hemant not to worry about differentiating between Antrea-initiated and client-initiated DNS queries.
> For security purposes we advise users to use FQDN rules only in allowlists for Antrea-native policies

Got it, makes sense to me.
@hkiiita please ignore this comment.
- Remove the redundant minTTL field from Options struct.
- Use the built-in max function for comparison of max TTL values.
- Improve descriptive comment about minTTL in config file.
Signed-off-by: Hemant <[email protected]>
Signed-off-by: Hemant <[email protected]>
# The minTTL setting helps address the problem of applications caching DNS response IPs beyond the TTL value for the DNS record.
# It is used to enforce FQDN policy rules, ensuring that resolved IPs are included in datapath rules for as long as the application is caching them.
# This value should ideally be set to the maximum caching duration across all applications.
minTTL: {{ .Values.minTTL }}

the name is a bit generic, maybe we should go with `fqdnCacheMinTTL`?
cmd/antrea-agent/options.go
Outdated
// Ensure that the minTTL is not negative.
o.config.MinTTL = max(o.config.MinTTL, 0)

it would be better to return an error if `o.config.MinTTL < 0` IMO
while this code is in `validateK8sNodeOptions`, I am not sure this configuration parameter is specific to the "K8sNode" case. Maybe the check should be in the parent function? cc @tnqn
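The difference between clamping and rejecting can be sketched as follows. This is only an illustration under stated assumptions: `agentConfig` is a hypothetical, trimmed-down stand-in for the real configuration struct, and `validateMinTTL` is an invented name for a check that could live in a validation function shared by all node types.

```go
package main

import "fmt"

// agentConfig is a hypothetical stand-in for the agent configuration;
// only the field under discussion is included.
type agentConfig struct {
	MinTTL int
}

// validateMinTTL rejects a negative value instead of silently clamping
// it to 0, so that a misconfiguration is reported at startup.
func validateMinTTL(c *agentConfig) error {
	if c.MinTTL < 0 {
		return fmt.Errorf("minTTL must be greater than or equal to 0, got %d", c.MinTTL)
	}
	return nil
}

func main() {
	for _, v := range []int{-5, 0, 30} {
		err := validateMinTTL(&agentConfig{MinTTL: v})
		fmt.Printf("minTTL=%d -> err=%v\n", v, err)
	}
}
```

Returning an error makes a typo such as a stray minus sign visible to the administrator, whereas `max(value, 0)` would quietly disable the feature.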
@@ -160,7 +161,7 @@ type fqdnController struct {
 	clock clock.Clock
 }

-func newFQDNController(client openflow.Client, allocator *idAllocator, dnsServerOverride string, dirtyRuleHandler func(string), v4Enabled, v6Enabled bool, gwPort uint32, clock clock.WithTicker) (*fqdnController, error) {
+func newFQDNController(client openflow.Client, allocator *idAllocator, dnsServerOverride string, dirtyRuleHandler func(string), v4Enabled, v6Enabled bool, gwPort uint32, clock clock.WithTicker, minTTL int) (*fqdnController, error) {

I wonder if `minTTL` should be `uint32` here. This way we know it cannot be negative. The conversion from `int` to `uint32` can happen in `cmd/antrea-agent`.
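One way to read this suggestion: the config keeps an `int`, and the value is converted once at the boundary before being handed to the controller. The sketch below assumes a hypothetical helper `toTTLSeconds`; the upper-bound check is an extra precaution added here, not something requested in the review.

```go
package main

import (
	"fmt"
	"math"
)

// toTTLSeconds converts a configured int into the uint32 a controller
// constructor could accept. Hypothetical helper for illustration.
// The int64 conversion keeps the comparison valid on 32-bit platforms.
func toTTLSeconds(v int) (uint32, error) {
	if v < 0 || int64(v) > math.MaxUint32 {
		return 0, fmt.Errorf("TTL %d is out of range for uint32", v)
	}
	return uint32(v), nil
}

func main() {
	for _, v := range []int{-1, 0, 300} {
		ttl, err := toTTLSeconds(v)
		fmt.Printf("input=%d ttl=%d err=%v\n", v, ttl, err)
	}
}
```

Doing the conversion in one place means the controller can compare the value directly with the `uint32` TTL carried by DNS records, without casts at each use.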
Signed-off-by: Hemant <[email protected]>
 # It is used to enforce FQDN policy rules, ensuring that resolved IPs are included in datapath rules for as long as the application is caching them.
 # This value should ideally be set to the maximum caching duration across all applications.
-minTTL: {{ .Values.minTTL }}
+fqdnCacheMinTTL: {{ .Values.minTTL }}

Suggested change:
-fqdnCacheMinTTL: {{ .Values.minTTL }}
+fqdnCacheMinTTL: {{ .Values.fqdnCacheMinTTL }}

This is why the manifests are not generated correctly (`fqdnCacheMinTTL:` instead of `fqdnCacheMinTTL: 0`)
Sorry, my bad. Will correct that.
cmd/antrea-agent/options.go
Outdated
+switch o.config.NodeType {
+case config.ExternalNode.String():
 	o.nodeType = config.ExternalNode
 	return o.validateExternalNodeOptions()
-} else if o.config.NodeType == config.K8sNode.String() {
+case config.K8sNode.String():
 	o.nodeType = config.K8sNode
 	return o.validateK8sNodeOptions()
-} else {
+default:
 	return fmt.Errorf("unsupported nodeType %s", o.config.NodeType)

this is not a bad change, but I would avoid doing it in this PR as it is unrelated
cmd/antrea-agent/options.go
Outdated
if o.config.NodeType == config.ExternalNode.String() {
// validate FqdnCacheMinTTL
if o.config.FqdnCacheMinTTL < 0 {
	return fmt.Errorf("fqdnCacheMinTTL set to an invalid value, its must be a positive integer")

Suggested change:
-	return fmt.Errorf("fqdnCacheMinTTL set to an invalid value, its must be a positive integer")
+	return fmt.Errorf("fqdnCacheMinTTL must be greater than or equal to 0")
pkg/config/agent/config.go
Outdated
@@ -158,7 +158,7 @@ type AgentConfig struct {
 	// The minTTL setting helps address the problem of applications caching DNS response IPs indefinitely.
 	// The Cluster administrators should configure this value, ideally setting it to be equal to or greater than the maximum TTL
 	// value of the application's DNS cache.
-	MinTTL int `yaml:"minTTL,omitempty"`
+	FqdnCacheMinTTL int `yaml:"fqdnCacheMinTTL,omitempty"`

I think the field name here should be `FQDNCacheMinTTL` per our conventions
I think we should also add unit test cases to verify that an fqdnController with minTTL set records the correct dnsEntry expiration times in the cache.
… per review. Signed-off-by: Hemant <[email protected]>
Signed-off-by: Hemant <[email protected]>
@@ -306,6 +306,11 @@ kubeAPIServerOverride: {{ .Values.kubeAPIServerOverride | quote }}
 # 10.96.0.10:53, [fd00:10:96::a]:53).
 dnsServerOverride: {{ .Values.dnsServerOverride | quote }}

+# The fqdnCacheMinTTL setting helps address the problem of applications caching DNS response IPs beyond the TTL value for the DNS record.

s/The fqdnCacheMinTTL setting helps address/fqdnCacheMinTTL helps address
@hkiiita not addressed correctly, the current sentence is not grammatically correct
@@ -744,3 +747,140 @@ func TestOnDNSResponse(t *testing.T) {
 		})
 	}
 }
+func TestFQDNCacheMinTTL(t *testing.T) {

I feel like in theory we only need to test `parseDNSResponse`, because this is the only place where `minTTL` is actually used. We just need to make sure that `minTTL` can override the TTL included in the DNS response. cc @Dyanngg
However, I don't feel super strongly about it, so if others think it is better to test `onDNSResponseMsg` "end-to-end", it is fine by me.
I am waiting for feedback on this one before implementing the nit requested in the previous comment. In the meantime, I have pushed the commit below, which refactors the test to cover just parseDNSResponse, as suggested in the feedback.
Signed-off-by: Hemant <[email protected]>
name: "Response TTL less than FQDNCacheTTL",
expectedTTL: currentTime.Add(10 * time.Second),
fqdnCacheMinTTL: 10,
dnsMsg: &dns.Msg{

There is a lot of repetition: we use the same DNS message every time as far as I can tell (with the only change being the TTL value).
One option is to use a closure such as this one:
getDNSMsg := func(ttl int) *dns.Msg {
	return &dns.Msg{ ... }
}
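The closure idea can be sketched in a self-contained form. To keep the example runnable without the DNS library, `dnsMsg` and `aRecord` below are hypothetical stand-ins for the real message types, and `buildCases`, the domain name and the IP address are illustrative assumptions. The closure takes a `uint32` here so it can be compared directly with the minimum TTL.

```go
package main

import (
	"fmt"
	"net"
)

// aRecord and dnsMsg are hypothetical stand-ins for the DNS library
// types used by the real test; they carry only what the sketch needs.
type aRecord struct {
	Name string
	TTL  uint32
	IP   net.IP
}

type dnsMsg struct {
	Answer []aRecord
}

type testCase struct {
	name    string
	minTTL  uint32
	msg     *dnsMsg
	wantTTL uint32
}

// buildCases shows the closure: every case shares one message shape and
// differs only in the TTL passed to getDNSMsg.
func buildCases() []testCase {
	getDNSMsg := func(ttl uint32) *dnsMsg {
		return &dnsMsg{Answer: []aRecord{
			{Name: "example.com.", TTL: ttl, IP: net.ParseIP("192.0.2.1")},
		}}
	}
	return []testCase{
		{name: "response TTL below minimum", minTTL: 10, msg: getDNSMsg(5), wantTTL: 10},
		{name: "response TTL above minimum", minTTL: 10, msg: getDNSMsg(20), wantTTL: 20},
		{name: "minimum not set", minTTL: 0, msg: getDNSMsg(5), wantTTL: 5},
	}
}

func main() {
	for _, tc := range buildCases() {
		got := max(tc.minTTL, tc.msg.Answer[0].TTL)
		fmt.Printf("%s: got=%d want=%d\n", tc.name, got, tc.wantTTL)
	}
}
```

Each table entry then states only what varies (the two TTLs and the expected result), which makes a missing case easier to spot.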
fqdnCacheMinTTL uint32
dnsMsg *dns.Msg
}{
{

we should test the IPv6 case (`dns.AAAA`) as well, just because it is a different code path
fakeClock := newFakeClock(currentTime)
controller := gomock.NewController(t)
f, _ := newMockFQDNController(t, controller, nil, fakeClock, tc.fqdnCacheMinTTL)
require.Zero(t, fakeClock.TimersAdded())

is this useful in the context of this test? Please add a short comment if you think it is necessary.
f, _ := newMockFQDNController(t, controller, nil, fakeClock, tc.fqdnCacheMinTTL)
require.Zero(t, fakeClock.TimersAdded())
_, responseIPs, err := f.parseDNSResponse(tc.dnsMsg)
assert.NoError(t, err)

this one should be `require.NoError`
This PR introduces a `minTTL` setting which would help address the problem of applications caching DNS response IPs indefinitely. Cluster administrators should be able to configure this value, ideally setting it to be equal to or greater than the maximum TTL value of the application's DNS cache. This feature is a step towards resolving the issue of indefinite caching of DNS response IPs by certain applications.