"New" TTDevice class #40
A step in the right direction, IMO.
I've commented on various parts of the diff, not all directly related to this PR's goals. These don't need fixing right now; they're just areas that could be opportunities for future improvement.
uint32_t *dest = reinterpret_cast<uint32_t*>(data);

while (word_len-- != 0) {
    uint32_t temp = *src++;
This looks like a workaround for something. I am not convinced it is needed.
A similar comment from me in the past resulted in similar code being removed from the write path.
static const char sys_pattern[] = "/sys/bus/pci/devices/%04x:%02x:%02x.%u/%s";
char buf[sizeof(sys_pattern) + 10];

// revision pattern = "/sys/bus/pci/devices/%04x:%02x:%02x.%u/revision"
std::snprintf(buf, sizeof(buf), sys_pattern, pcie_domain, pcie_bus, pcie_device, pcie_function, "revision");
I'm not sure if the code that does this in multiple places is still around, but constructing the path so the caller can append "revision" or "current_link_width" or whatever could be a useful utility function.
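A minimal sketch of what such a utility could look like. The function name `pci_sysfs_path` and its parameter types are assumptions for illustration, not something that exists in the codebase; only the sysfs path pattern is taken from the snippet above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not from UMD): builds
// "/sys/bus/pci/devices/DDDD:BB:DD.F/<attribute>" for one PCI function.
// The caller supplies the attribute name, e.g. "revision" or "current_link_width".
std::string pci_sysfs_path(uint16_t domain, uint8_t bus, uint8_t device,
                           uint8_t function, const std::string& attribute) {
    char buf[64];
    // Cast to unsigned so the arguments match the %x / %u conversions exactly.
    int n = std::snprintf(buf, sizeof(buf), "/sys/bus/pci/devices/%04x:%02x:%02x.%u/",
                          static_cast<unsigned>(domain), static_cast<unsigned>(bus),
                          static_cast<unsigned>(device), static_cast<unsigned>(function));
    // The device prefix has a bounded length, so truncation would indicate a bug.
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
    return std::string(buf) + attribute;
}
```

Returning a `std::string` avoids the `sizeof(sys_pattern) + 10` sizing arithmetic at each call site, since the attribute name is appended after the fixed-length prefix is formatted.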
@@ -3627,7 +2929,7 @@ void tt_SiliconDevice::write_mmio_device_register(const void* mem_ptr, tt_cxy_pa
 // Copy value from main buffer to aligned buffer
 std::memcpy(aligned_buf.local_storage, mem_ptr, size);
 }
-write_regs(dev, mapped_address, aligned_buf.block_size / sizeof(uint32_t), aligned_buf.local_storage);
+pci_device->write_regs(mapped_address, aligned_buf.block_size / sizeof(uint32_t), aligned_buf.local_storage);
Again, not really related to your change, just pointing it out for future consideration:
It seems that because this write_regs interface accepts a block of data (as opposed to a single u32), it was natural to hook it up to UMD's device write interface (which accepts a block of data and conflates register vs memory access) instead of giving UMD a sane interface for register writes.
I regard this area of code as broken because if the buffer does get fixed up on account of (size % 4 != 0), it looks like hardware receives (aligned_buf.block_size - size) bytes of garbage, which is almost certainly not what the caller wants.
I am not sure that the buffer here should be getting fixed up at all.
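To make the concern concrete, here is a small sketch of the arithmetic and of one possible mitigation. The names `padding_bytes` and `stage_for_word_write` are hypothetical, and this is not how UMD handles the case. Zero-filling only makes the extra bytes deterministic; it does not settle whether a non-word-sized write should be padded at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: how many bytes beyond the caller's data get sent when
// `size` is rounded up to a whole number of 32-bit words.
constexpr std::size_t padding_bytes(std::size_t size) {
    return (sizeof(uint32_t) - size % sizeof(uint32_t)) % sizeof(uint32_t);
}

// Hypothetical staging step: copy the caller's bytes into a word-aligned
// buffer whose tail is zero-initialized rather than left uninitialized.
std::vector<uint32_t> stage_for_word_write(const void* src, std::size_t size) {
    assert(src != nullptr || size == 0);
    std::vector<uint32_t> words((size + padding_bytes(size)) / sizeof(uint32_t), 0u);
    if (size != 0) {
        std::memcpy(words.data(), src, size);
    }
    return words;
}
```

For a 5-byte write, `padding_bytes(5)` is 3, so three bytes the caller never supplied still reach the hardware. With an uninitialized staging buffer those would be whatever was left in memory.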
I really like to push towards small PRs. That said, this PR is currently small in the sense that it doesn't have many functional changes; most of the changed lines come from file moves/renames. I'd rather this PR go in as it is, and we can create an issue to go back to all the helpful comments you left on this PR and fix those things later. That way we can split the work into more pieces, parallelize and prioritize better (for instance, this PR as a whole has a higher priority than some minor additional refactorings), and also have more focused code changes, which makes changes less error-prone. @joelsmithTT does that sound fine to you?
Please make the PR target "main".
Closing this PR cuz rebasing is not worth the time. Please see new PR #64.
This PR aims to separate/clean up tt_SiliconDevice by separating all functions that interact directly with the PCIe into the TTDevice class.
End goal:
host_api.h from metal, where the essential functions left in tt_SiliconDevice (write_to_device, read_from_device, tlb setup, etc.) will be exposed to TT-metal
Next steps:
tt_silicon_driver.cpp and tt_device.h into something more appropriate (open to suggestions)
pcie_device.cpp .hpp into tt_device.cpp .hpp
UMD post commit: https://github.com/tenstorrent/tt-metal/actions/runs/10636550474
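As a rough illustration of the separation this PR describes, the sketch below splits a PCIe-facing class from a higher-level device that delegates to it. Both class names and the single-word `write_register` / `read_register` methods are hypothetical, and a plain vector stands in for the mapped BAR; only the `write_regs(address, word_count, data)` call shape is taken from the diff in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the PCIe-facing class; a vector plays the role
// of the memory-mapped register space.
class FakePcieDevice {
public:
    explicit FakePcieDevice(std::size_t words) : bar_(words, 0u) {}

    void write_regs(std::size_t word_offset, std::size_t word_len, const uint32_t* data) {
        assert(data != nullptr);
        for (std::size_t i = 0; i < word_len; ++i) {
            bar_.at(word_offset + i) = data[i];  // .at() throws on out-of-range access
        }
    }

    void read_regs(std::size_t word_offset, std::size_t word_len, uint32_t* data) const {
        assert(data != nullptr);
        for (std::size_t i = 0; i < word_len; ++i) {
            data[i] = bar_.at(word_offset + i);
        }
    }

private:
    std::vector<uint32_t> bar_;
};

// Hypothetical higher-level device: owns the PCIe object and exposes a
// register interface that takes a single 32-bit value.
class FakeSiliconDevice {
public:
    explicit FakeSiliconDevice(std::size_t words)
        : pci_device_(std::make_unique<FakePcieDevice>(words)) {}

    void write_register(std::size_t word_offset, uint32_t value) {
        pci_device_->write_regs(word_offset, 1, &value);
    }

    uint32_t read_register(std::size_t word_offset) const {
        uint32_t value = 0;
        pci_device_->read_regs(word_offset, 1, &value);
        return value;
    }

private:
    std::unique_ptr<FakePcieDevice> pci_device_;
};
```

Keeping the register path word-sized at the upper layer means no alignment fix-up is needed there, which is the direction the review comments above suggest.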