
Update COORDINATION.md with Marvell review comments #216

Open · wants to merge 1 commit into main

Conversation

@sburla-marvell (Contributor)

Add Marvell comments and CRS details

Signed-off-by: Satananda Burla <[email protected]>
sburla-marvell requested a review from a team as a code owner — March 20, 2023 22:02
@glimchb (Member) left a comment

Thanks @sburla-marvell, great comments.
Do you want to merge this content or just collect feedback? Let me know.

@glimchb (Member) commented Mar 20, 2023

@ballle98 @RezaBacchus @dandaly @Gal-Zaidman please review

@dandaly (Contributor) commented Mar 21, 2023

Hi,
I can merge the Option 2 feedback into another PR:

  • I will add that Driver Ready Check works for Multi-Host.
  • I will add as a Pro that this solution already works without interdependencies with other standards bodies (PCI, DMTF, etc.).
  • I will add that it would be easier to adopt if we could standardize the virtio solutions (either a standard OROM signaling to the device, or adopting the approaches from NVMe and idpf).

For Option 2, did I catch it all?

@RezaBacchus (Contributor)
Greetings Dan:
The abstraction you proposed is highly desirable. However, OROMs run at the end of POST, and this conflicts with the need for the OROM to run when the card is discovered so that it may stall UEFI until all the Bus/Dev/Fun on the card are discoverable. Rewriting UEFI to make the OROM run when the card is discovered is a heavy lift, and risky because the system is not stable during discovery.

@sburla-marvell (Contributor, Author)

> Thanks @sburla-marvell, great comments. Do you want to merge this content or just collect feedback? Let me know.

Most of it is just feedback. I can create a separate pull request for the CRS part if needed.

@glimchb (Member) commented Mar 21, 2023

> Most of it is just feedback. I can create a separate pull request for the CRS part if needed.

Yes, please do, so I can merge the CRS part.

@dandaly (Contributor) commented Mar 21, 2023

> Greetings Dan: The abstraction you proposed is highly desirable. However, OROMs run at the end of POST, and this conflicts with the need for the OROM to run when the card is discovered so that it may stall UEFI until all the Bus/Dev/Fun on the card are discoverable. Rewriting UEFI to make the OROM run when the card is discovered is a heavy lift, and risky because the system is not stable during discovery.

Hi Reza,
This OROM solution solves the coordination problem between a PCI-compatible device and the host. If the device is not PCI compatible, you need some other solution. This driver-based solution is working today; it doesn't require any UEFI changes. I think we need to split Coordination into solving these two separate problems:

  • Coordinating between host and infrastructure device within the confines of existing specs and standard solutions (PCI, UEFI, BIOS, all unchanged). Make it work without any assumptions.
  • Proposing changes to specs and standard solutions to work around specific issues, for example if certain DPUs cannot be PCI compliant.

This way we don't gate IPU/DPU provisioning on requiring a new PCI, UEFI, or BIOS change in order for it to work the OPI way. We can also specify the OPI way to work around specific incompatibilities, whatever they may be.

@glimchb (Member) commented Mar 21, 2023

> This way we don't gate IPU/DPU provisioning on requiring a new PCI, UEFI, or BIOS change in order for it to work the OPI way. We can also specify the OPI way to work around specific incompatibilities, whatever they may be.

We should try to find a way that works for all vendors. If we can't, we can't... but we need to try first.

@dandaly (Contributor) commented Mar 21, 2023

> This way we don't gate IPU/DPU provisioning on requiring a new PCI, UEFI, or BIOS change in order for it to work the OPI way. We can also specify the OPI way to work around specific incompatibilities, whatever they may be.

> We should try to find a way that works for all vendors. If we can't, we can't... but we need to try first.

I propose we answer the OPI provisioning question within the constraints of what exists today, and see where we land. I also haven't heard from any other vendor that they can't work within the PCI spec, or that they can't work in existing servers.

@glimchb (Member) commented Mar 21, 2023

> I also haven't heard from any other vendor that they can't work within the PCI spec, or that they can't work in existing servers.

I got feedback from one DPU vendor that strongly advocates for Option 3 with OOB management via BMC in order not to rely on PCIe timing, even though theirs is a PCIe-compliant card.

@@ -36,10 +38,13 @@ In-Band refers to PCIe config access to the xPU from UEFI running on the server
- Booted
- Stalled or locked-up
- Halted
- ***SB> what does OS refer to here? there could be multiple software subsystems on the XPU running multiple OS. what is the difference between Halted and Not started ? When does the transition take place?***
Contributor:

Halted is an OS_STATUS which I imagine would correspond to the Linux definition of halted, similar to Graceful Shutdown.

Not Started is a different status, in the CRASHDUMP register, similar to "Normal": software termination is not detected.
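
For illustration, a minimal C sketch of how these two hypothetical registers might be laid out; the enum names and values below are assumptions based on this thread, not taken from any spec:

```c
/* Hypothetical register encodings, illustrating the distinction above. */

/* OS_STATUS: lifecycle of the OS running on the xPU. */
enum os_status {
    OS_STATUS_NOT_BOOTED,
    OS_STATUS_BOOTED,
    OS_STATUS_STALLED,     /* stalled or locked up */
    OS_STATUS_HALTED,      /* graceful shutdown, like Linux "halted" */
};

/* CRASHDUMP: whether abnormal software termination was detected. */
enum crashdump_status {
    CRASHDUMP_NOT_STARTED, /* dump facility not running; no termination detected */
    CRASHDUMP_NORMAL,      /* running, no software termination detected */
    CRASHDUMP_CAPTURED,    /* abnormal termination detected, dump taken */
};
```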

@dandaly (Contributor) commented Mar 21, 2023

> I got feedback from one DPU vendor that strongly advocates for Option 3 with OOB management via BMC in order not to rely on PCIe timing, even though theirs is a PCIe-compliant card.

Can we get this rationale into the document? We need a clear reason why we should create a dependency like the one Option 3 proposes, since that dependency will limit adoption.

@@ -87,6 +125,7 @@ meeting the timing constraints around enumeration. Dynamic adding and
deleting devices on the PCI bus via hot plug requires BIOS configuration.

## 3: Out-band via platform BMC
***SB> How about multi Host support?***
Contributor:

To some degree I see an xPU as multi-host: the processor on the xPU and the host processor are sharing a network device. The key things are power domains and whether the multiple hosts need to be coordinated. It may be possible to have multiple BMCs on I2C in a multi-master configuration.
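
To make "OOB management via platform BMC" concrete, here is a minimal sketch of a provisioning host querying the BMC over Redfish for the xPU's state instead of relying on PCIe timing. The URL, resource path, and credentials are placeholders for illustration, not anything standardized in this document:

```c
#include <stdio.h>
#include <curl/curl.h>

/* Hypothetical Redfish resource on the platform BMC describing the xPU. */
#define BMC_XPU_URL "https://bmc.example.com/redfish/v1/Chassis/xPU1"

static size_t on_body(char *data, size_t size, size_t nmemb, void *userp)
{
    (void)userp;
    fwrite(data, size, nmemb, stdout);  /* dump JSON; a real client would parse Status/State */
    return size * nmemb;
}

int main(void)
{
    CURL *curl = curl_easy_init();
    if (!curl)
        return 1;

    curl_easy_setopt(curl, CURLOPT_URL, BMC_XPU_URL);
    curl_easy_setopt(curl, CURLOPT_USERPWD, "admin:password");  /* demo credentials */
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, on_body);

    CURLcode rc = curl_easy_perform(curl);  /* GET the chassis resource */
    curl_easy_cleanup(curl);
    return rc == CURLE_OK ? 0 : 1;
}
```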

- ***SB> How is this related to standard PCIe FLR? Looks like we are assuming XPU internal architecture here. What if the XPU's CPU complex is the one driving its PCIe interface?***


## 1.a: PCIe CRS (Config Retry Status)
Member:
@sburla-marvell let's submit this part as a separate PR so I can merge it...

- *“A Root Complex implementation may choose to limit the number of Configuration Request/CRS Completion Status loops before determining that something is wrong with the target of the Request and taking appropriate action, e.g., complete the Request to the host as a failed transaction.”*
- Implementation is complicated by the software-visibility option (if CRS is not visible to software, it is treated similarly to CA and UR, with a 1 sec timeout); some old IPs do not support software visibility.
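
As an illustration of the root-complex behavior quoted above, here is a minimal C model of a CRS retry loop with a bounded retry budget; `cfg_read32()` and `CRS_RETRY_LIMIT` are hypothetical names, not from any real driver:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified completion statuses for a config read. */
enum cfg_status { CFG_OK, CFG_CRS, CFG_CA_UR };

/* Hypothetical hardware hook: issue one Configuration Request. */
extern enum cfg_status cfg_read32(uint16_t bdf, uint16_t off, uint32_t *val);

#define CRS_RETRY_LIMIT 100   /* implementation-chosen cap on CRS loops */

/* Keep reissuing the Configuration Request while the device answers CRS;
 * once the retry budget is exhausted, complete it as a failed transaction. */
static bool cfg_read_with_crs(uint16_t bdf, uint16_t off, uint32_t *val)
{
    for (int i = 0; i < CRS_RETRY_LIMIT; i++) {
        switch (cfg_read32(bdf, off, val)) {
        case CFG_OK:
            return true;      /* device is ready */
        case CFG_CRS:
            continue;         /* device still initializing: retry */
        case CFG_CA_UR:
            return false;     /* hard error: abort immediately */
        }
    }
    return false;             /* retry limit hit: fail the request */
}
```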



## 2: Driver Ready Check

Member:

From Slack, by @samerhaj:

The UEFI spec has a watchdog that may exit the UEFI driver if it waits for too long. Also, some of the higher-level UEFI drivers used for booting (like PXE/HTTP) are not provided by the vendor/xPU; they are in the UEFI BIOS itself and use lower-level xPU-provided UEFI drivers (such as UEFI UNDI/SNP for Ethernet).

@dandaly fyi
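
To show why the watchdog matters for Option 2, here is a minimal UEFI C sketch of a driver-ready poll with an explicit bound, so the driver returns control well before the platform watchdog (5 minutes by default in the UEFI spec) can fire. `DeviceIsReady()` and the poll budget are placeholders, not part of any real xPU driver:

```c
#include <Uefi.h>
#include <Library/UefiBootServicesTableLib.h>

/* Placeholder for a vendor-specific readiness check (e.g., reading a
 * device register); not a real API from this repo. */
extern BOOLEAN DeviceIsReady (VOID);

#define READY_POLL_INTERVAL_US  (100 * 1000)  /* 100 ms between polls */
#define READY_POLL_MAX_TRIES    300           /* ~30 s total, well under the
                                                 default 5-minute UEFI watchdog */

EFI_STATUS
EFIAPI
WaitForDeviceReady (VOID)
{
  for (UINTN Tries = 0; Tries < READY_POLL_MAX_TRIES; Tries++) {
    if (DeviceIsReady ()) {
      return EFI_SUCCESS;       /* xPU signaled readiness: proceed with boot */
    }
    gBS->Stall (READY_POLL_INTERVAL_US);  /* stalling does not reset the watchdog */
  }
  return EFI_TIMEOUT;           /* give up instead of blocking boot until the
                                   watchdog kills the driver */
}
```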
