Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update DPDK to 22.11 #385

Merged
merged 10 commits into from
Oct 10, 2023
Merged

Update DPDK to 22.11 #385

merged 10 commits into from
Oct 10, 2023

Conversation

PlagueCZ
Copy link
Contributor

@PlagueCZ PlagueCZ commented Sep 13, 2023

Updated the codebase and Dockerfile to use DPDK 22.11 LTS.

Changes done:

  • moved to Debian 12
  • PCI device structures are now hidden, thus the PCI address retrieval is done differently
  • ports need to be stopped before exitting to prevent errors (I think only the errors are new, DPDK was just silent before)
  • NUMA socket on my machine is not available now, but was previously. I changed the code to support both cases
  • removed conditional code from dp_rte_flow (offloading needs to be tested)
  • DPDK custom patches reduced to only the custom logging and gcc12 warning in debug, no backported bugfixes anymore
  • custom DPDK patch to raise restrictions on telemetry dictionary keys; also added telemetry tests

@github-actions github-actions bot added size/L enhancement New feature or request labels Sep 13, 2023
@PlagueCZ
Copy link
Contributor Author

@guvenc I implemented the changes needed for the -1 socket id. If you could just take a look at f79fd50 (Added NUMA socket id array check)

I actually changed the macro to unlikely((unsigned int)(SOCKETID) >= DP_NB_SOCKETS) ? 0 : (SOCKETID) because that way we can drop the RTE_VERIFY(socket_id < DP_NB_SOCKETS) that was used previously by all the callers as it would actually abort() the whole process.

Plus there was a bad order of init calls (by me I think), which is the other change in dp_service.c

And since I was touching so many lines, I also renamed some variables to reflect DPDK:

  • socketid to socket_id
  • vni_value to vni_data because this reflects the rte_hash_*(... void **data)
    Of course I can rename these back if that is not something you like in your code.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Sep 14, 2023
@PlagueCZ
Copy link
Contributor Author

@byteocean Not sure if the last commit that removes the ifdefs in dp_rte_flow are OK, as I cannot really test the hairpins, but offloading is working for the standard VF-VF and inter-host communication.

@guvenc If the above is OK, then I think this is all that is needed to move to DPDK 22. Florin should be back soon, so we should be able to go ahead with the Debian 12 soon which this is based on. But we are using OSC branches anyway, so the next steps regarding this PR are up to you.

@guvenc
Copy link
Collaborator

guvenc commented Sep 15, 2023

@byteocean Is the removed conditional compilation ok for the offloading path ?
@PlagueCZ
I will need some tests on my side as well before we make this draft a PR but it is in good shape for the testing.

@byteocean
Copy link
Contributor

byteocean commented Sep 15, 2023

Hi @PlagueCZ I got the following error/warning during the port starting. Any idea what could go wrong? DPDK version 22.11.3

olated_mode_ipip()]
2023-09-15 11:01:00.052 1(control) D SERVICE: Installed IPIP isolation flow rule, port_id: 0 [../src/rte_flow/dp_rte_flow_init.c:37:dp_install_isolated_mode_ipip()]
2023-09-15 11:01:00.052 1(control) W SERVICE: Cannot get port device info, port_id: 0 [../src/dp_port.c:505:dp_port_start()]
2023-09-15 11:01:00.088 1(control) D SERVICE: Trying to bind peer to port, peer_port_id: 0, port_id: 9 [../src/dp_hairpin.c:109:dp_hairpin_bind()]
2023-09-15 11:01:00.088 1(control) D SERVICE: Trying to bind port to peer, port_id: 9, peer_port_id: 0 [../src/dp_hairpin.c:116:dp_hairpin_bind()]
2023-09-15 11:01:00.088 1(control) I SERVICE: Init isolation flow rule for IPinIP tunnels [../src/dp_port.c:414:dp_port_install_isolated_mode()]
2023-09-15 11:01:00.108 1(control) D SERVICE: Installed IPIP isolation flow rule, port_id: 9 [../src/rte_flow/dp_rte_flow_init.c:37:dp_install_isolated_mode_ipip()]
2023-09-15 11:01:00.108 1(control) D SERVICE: Installed IPIP isolation flow rule, port_id: 9 [../src/rte_flow/dp_rte_flow_init.c:37:dp_install_isolated_mode_ipip()]
2023-09-15 11:01:00.108 1(control) W SERVICE: Cannot get port device info, port_id: 9 [../src/dp_port.c:505:dp_port_start()]

@byteocean
Copy link
Contributor

@PlagueCZ hairpin also works for simple tests (VM2VM, across hypervisor) under this branch. nice to get rid of these flags, and 22.11 seems to be improved a lot.

@PlagueCZ
Copy link
Contributor Author

Based on your guys' input, I removed the PCI address reporting in gRPC altogether (protocol is kept the same with a comment for the next breaking protocol change) and force-pushed the branch.

@guvenc guvenc marked this pull request as ready for review September 15, 2023 14:15
@github-actions github-actions bot added size/L and removed size/XL labels Sep 20, 2023
@PlagueCZ
Copy link
Contributor Author

I updated the port handling in another commit. We talked about the TAP device workaround and I also added a rollback for port startup errors, so that needed another function, etc.

@PlagueCZ PlagueCZ force-pushed the feature/dpdk_22_11 branch 3 times, most recently from 8429c13 to 83c1f9b Compare September 21, 2023 10:55
@PlagueCZ
Copy link
Contributor Author

@guvenc confirmed working on TSi staging (3AZ * 3nodes), there will be some load over the weekend, otherwise works perfectly.

This is before the VF-offloaded-packet dump branch was merged.

@PlagueCZ PlagueCZ force-pushed the feature/dpdk_22_11 branch 2 times, most recently from 7b76497 to 63e277a Compare September 27, 2023 11:01
@guvenc guvenc mentioned this pull request Sep 27, 2023
@guvenc guvenc linked an issue Sep 27, 2023 that may be closed by this pull request
@PlagueCZ PlagueCZ mentioned this pull request Sep 28, 2023
@PlagueCZ PlagueCZ force-pushed the feature/dpdk_22_11 branch 2 times, most recently from 30a1e57 to f6fb4cd Compare October 6, 2023 15:17
@PlagueCZ
Copy link
Contributor Author

PlagueCZ commented Oct 6, 2023

I think after the last addition (telemetry fix) this is ready and tested in production.

The checkpatch failure is because of the last addition in C++

Copy link
Collaborator

@guvenc guvenc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I am so glad that we can move to the latest DPDK LTS version.

@guvenc guvenc merged commit 9b55b4c into main Oct 10, 2023
6 of 7 checks passed
@guvenc guvenc deleted the feature/dpdk_22_11 branch October 10, 2023 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support DPDK 22.11
3 participants