
Don't unwrap when removing invalid TCP state from fastpath. #621

Merged · FelixMcFelix merged 3 commits into master from fix-618 on Dec 3, 2024

Conversation

FelixMcFelix (Collaborator) commented Nov 28, 2024

Now that the fastpath temporarily drops (and reacquires) the port lock when TCP state needs to be invalidated, we opened ourselves up to the possibility that another packet could have removed this state ahead of us. Equally, a packet could insert new TCP state which we might accidentally remove.

This PR removes the unwrap on removal to account for the race, and only removes a TCP flow if it is pointer-equal to the entry we intended to invalidate.

Closes #618, closes #624.
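
For illustration, a minimal sketch of the race being fixed. The guarded removal mirrors the diff reviewed below; the "before" line is an assumption based on the PR title, not a verbatim quote of the old code.

// Before (assumed): removal unconditionally unwrapped, so if another packet
// removed the same flow while the port lock was dropped, we would panic.
//     local_lock.tcp_flows.remove(ufid_out).unwrap();
//
// After: only remove the flow if it is still the exact Arc this packet
// decided to invalidate; a flow re-inserted under the same UFID by another
// packet is left untouched.
if let Some(found_entry) = local_lock.tcp_flows.get(ufid_out) {
    if Arc::ptr_eq(found_entry, &entry) {
        _ = local_lock.tcp_flows.remove(ufid_out);
    }
}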

FelixMcFelix (Collaborator, Author) commented Nov 28, 2024

I sadly haven't been able to locally reproduce the crash itself, using either zone-to-zone traffic or between local VMs via standalone omicron. Both this PR and master stand up against while true; do iperf -c 10.0.0.1 -P128 -t 2; iperf -c 10.0.0.1 -P128 -t 2 -R; done for around 10 minutes. This should slam the TCP flow state table pretty hard, and exposes us to a lot of leftover entries when data segments arrive later than expected -- I would have thought this would get us a trigger, but alas.

@FelixMcFelix FelixMcFelix self-assigned this Nov 28, 2024
rcgoodfellow (Contributor) replied, quoting the above:

I sadly haven't been able to locally reproduce the crash itself, using either zone-to-zone traffic or between local VMs via standalone omicron. Both this PR and master stand up against while true; do iperf -c 10.0.0.1 -P128 -t 2; iperf -c 10.0.0.1 -P128 -t 2 -R; done for around 10 minutes.

Maybe try this for many flows simultaneously with distinct address pairs? This would get a bit closer to rack traffic conditions.

FelixMcFelix (Collaborator, Author) commented Nov 29, 2024, quoting the suggestion above:

Maybe try this for many flows simultaneously with distinct address pairs? This would get a bit closer to rack traffic conditions.

I've bumped up the topology a bit (all instances running omicron's builtin alpine linux):

┌──────────────┐    ┌──────────────────────────────────────────────────────┐
│   Macbook    │    │Farme                                                 │
│(10.0.23.168) │    │(10.0.147.187)                                        │
└──────────────┘    │  ┌ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
        ▲           │   VPC                                             │  │
        └─────SSH───┼──┼─────────┬──────────────┬──────────────┐           │
                    │            │              │              │        │  │
                    │  │         ▼              ▼              ▼           │
                    │    ┌──────────────┬──────────────┬──────────────┐ │  │
                    │  │ │     cafe     │  restaurant  │     bar      │    │
                    │    │I: 172.30.0.5 │I: 192.168.0.5│I: 172.30.0.7 │ │  │
                    │  │ │E: 10.1.222.12│E: 10.1.222.13│E: 10.1.222.14│    │
                    │    └──────────────┴──────────────┴──────────────┘ │  │
                    │  │         ▲              │              │           │
                    │            │              │              │        │  │
                    │  │         └──────────────┴──────────────┘           │
                    │                  iperf3 client->server            │  │
                    │  └ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─   │
                    └──────────────────────────────────────────────────────┘

No crashes as yet on master under the above -P128 (×4 servers on cafe) rapid-fire iperf gauntlet, but I did come across #624. I've been attempting concurrent API operations (e.g., firewall rule add/delete) in case epoch bumps tell part of the story (as per oxide-dogfood on the most recent coredump). I'll leave it running, but we're missing something to make the original fault dead reproducible.

luqmana (Contributor) left a review comment:

LGTM. Just left one suggestion for coalescing the flow state lookup+removal.

Comment on lines +1405 to +1410
if let Some(found_entry) = local_lock.tcp_flows.get(ufid_out) {
    if Arc::ptr_eq(found_entry, &entry) {
        self.uft_tcp_closed(&mut local_lock, ufid_out, ufid_in);
        _ = local_lock.tcp_flows.remove(ufid_out);
    }
}
luqmana (Contributor) commented:
If it's ok to swap the order here, you could make use of the Entry API to look up and remove the TCP flow in one pass, saving a second search through the table:

Suggested change (replacing the block commented on above):

if let Entry::Occupied(found_entry) = local_lock.tcp_flows.entry(*ufid_out) {
    if Arc::ptr_eq(found_entry.get(), &entry) {
        _ = found_entry.remove_entry();
        self.uft_tcp_closed(&mut local_lock, ufid_out, ufid_in);
    }
}

FelixMcFelix (Collaborator, Author) commented Dec 3, 2024:

Thanks a lot for the review, Luqman.

I think this would be a good idea (the swap is valid as I read it), except that local_lock.tcp_flows here is a FlowTable and not a BTreeMap. To enable it, it'd be nice for FlowTable to forward the Entry API while still respecting the capacity constraints in FlowTable::add etc. I'll open a ticket.

EDIT: #627.
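
For illustration only, a hypothetical sketch of what such a forwarded, capacity-respecting entry helper on FlowTable might look like (FlowTable's fields, the limit name, and occupied_entry are assumed here, not the actual OPTE implementation). Removal can never exceed the capacity limit, so exposing only the occupied case keeps the constraint enforced solely by the insertion path.

use std::collections::{btree_map, BTreeMap};
use std::sync::Arc;

// Assumed shape: a capacity-limited map from flow keys to shared flow state.
struct FlowTable<K: Ord, V> {
    limit: usize,
    map: BTreeMap<K, Arc<V>>,
}

impl<K: Ord, V> FlowTable<K, V> {
    // Forward only the occupied case, so callers can check-and-remove with a
    // single lookup. No vacant entry is handed out, so an insertion path that
    // enforces `limit` (e.g. a FlowTable::add) cannot be bypassed.
    fn occupied_entry(&mut self, key: K) -> Option<btree_map::OccupiedEntry<'_, K, Arc<V>>> {
        match self.map.entry(key) {
            btree_map::Entry::Occupied(e) => Some(e),
            btree_map::Entry::Vacant(_) => None,
        }
    }
}

A caller could then write (assuming the flow key is Copy): if let Some(e) = table.occupied_entry(*ufid_out) { if Arc::ptr_eq(e.get(), &entry) { _ = e.remove_entry(); } } -- one search instead of two.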

@FelixMcFelix FelixMcFelix merged commit b56afee into master Dec 3, 2024
10 checks passed
@FelixMcFelix FelixMcFelix deleted the fix-618 branch December 3, 2024 10:42