Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Gen 4] Optimize QSPI flash writes and erasures #2841

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from

Conversation

avtolstoy
Copy link
Member

@avtolstoy avtolstoy commented Nov 12, 2024

IMPORTANT: Make sure to upgrade prebootloader-part1 before system or bootloader.

Problem

  1. We are unable to write anything over 8 bytes at a time to the QSPI flash.
  2. QSPI flash clock speed and IO mode (2IO/4IO) are not correctly calibrated on boot
  3. Erasures do not use 32K and 64K operations where possible

Solution

  1. Manually perform write transactions (in appropriate IO mode depending on flash type and hardware) and write up to 256 bytes at a time which is what flashes normally support
  2. Change drive mode on QSPI flash pins and 'guess' correct dummy cycles of the flash in flash initialization
  3. For erasures spanning multiple 4k blocks use optimized 32K and 64K operations, which are quicker. This should help OTA times.

To validate

  1. Sleep needs to be double checked including power measurements (could be leakage due to changed pin modes of QSPI pins)
  2. OTP operations

Steps to Test

Make sure to upgrade prebootloader-part1 before system.

  • wiring/no_fixture_stress validates that the flash is in correct mode.

Example App

N/A

References

N/A


Completeness

  • User is totes amazing for contributing!
  • Contributor has signed CLA (Info here)
  • Problem and Solution clearly stated
  • Run unit/integration/application tests on device
  • Added documentation
  • Added to CHANGELOG.md after merging (add links to docs and issues)

…at a time. Erase optimize with 32K and 64K erasures
…brate flash to highest clock rate and best IO mode
@avtolstoy avtolstoy added this to the 6.2.1 milestone Nov 12, 2024
@avtolstoy avtolstoy marked this pull request as draft November 12, 2024 15:09
@avtolstoy avtolstoy marked this pull request as ready for review November 12, 2024 15:46
Copy link
Member

@scott-brust scott-brust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still working through the PR, but wanted to submit questions so far.
Some TODOs still

  • Finish reviewing some of the exflash_hal code changes
  • Actually test the quad IO read/writes on my GD25 MSOM board, with QSPI traces (might not be most accurate but Id like to see all 4 IOs going)
  • Review + run the test cases

}

int hal_exflash_init(void) {
// ExFlashLock lk;
// FLASH_ClockSwitch(BIT_SHIFT_FLASH_CLK_XTAL, _FALSE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are doing this at the SDK level in flash_calibration_backup() right?
It looks like this is called by prebootloader when it turns on the KM4. So the KM4 will boot with the higher flash speed, correct?
(Im guessing these lines are leftover from attempting to switch speeds once KM4 has powered on)

SPIC->addr_length = 3;
}
~RtkSpicFlashConfigGuard() {
SPIC->ssienr = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This disables the SPI controller once were done with the read/write/erase, right?

SPIC->ssienr = 0;
SPIC->ctrlr0 = ctrl0_;
SPIC->auto_length = autoLen_;
SPIC->addr_length = addrLen_;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These restore the ctrlr0, auto_length and addr_length registers to what they were before the read/write/erase?
Is there a concern that those will change with the flash operations? (I assume the auto increment address register should change?)

SPIC->dr[0].word = RTK_SPIC_CMD_ADDR_FORMAT(cmd, addr);
}

class RtkSpicFlashConfigGuard {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This config guard wasnt being used before, correct?
And it should be in effect for all read/write/erase (and OTP) commands now?


if (flash_init_para.FLASH_cur_bitmode == SpicQuadBitMode) {
// Quad mode enabled
if (flash_init_para.FLASH_Id == FLASH_ID_GD) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is flash_init_para populated? in KM0 on boot? Then copied to KM4 in Set_System()?

    _memcpy((void *)&flash_init_para, &flash_init_para_km0, sizeof(FLASH_InitTypeDef));

How are FLASH_cur_bitmode and FLASH_Id populated? They must be valid by the time exflash_hal is run right?

rtkSpiFlashSendCmd(GENERIC_FLASH_CMD_WREN);
rtkSpiFlashTxCmdAddr(cmd, address, mode);
if (size > (RTK_SPIC_FIFO_HALF_SIZE - 4)) {
writeLen = RTK_SPIC_FIFO_HALF_SIZE - 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this an attempt to limit the number of bytes we write to the SPIC Data Register fifo?
It should have 64, but we only want to use up to 32 at a time just to be conservative?
There should be 1 byte of data, 3 bytes of address, then up to 28 bytes of data?

}
uint32_t data;
uint32_t cnt = writeLen / 4;
for (uint32_t i = 0; i < cnt; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is to write as much data in 4 byte word chunks first, then 2 byte half words, and then finally 1 byte to the SPIC data buffer. (This always writes to SPIC->dr[0] though?)

}

__attribute__((section(".ram.text"), noinline))
int hal_exflash_erase_block(uintptr_t start_addr, size_t num_blocks) {
return erase_common(start_addr, num_blocks, EraseBlock);
for (size_t i = 0; i < num_blocks; i++) {
CHECK(erase_common(start_addr, GENERIC_FLASH_CMD_BE64K));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually do 64k block erases at some point in normal operation?

uintptr_t addr = start_addr;
while (sizeInBytes > 0) {
hal_exflash_cmd_t cmd = GENERIC_FLASH_CMD_SE;
if (addr % (64*1024) == 0 && sizeInBytes >= 64*1024) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks if we are being asked to erase a 64k or 32k aligned series of sectors, and if so, coalesces them into a more efficient Block Erase command, right?
Is there a concern that start_addr might not be sector aligned in the single sector erase case?

rtkSpiFlashSendCmd(GENERIC_FLASH_CMD_WREN);
/* Disable SPI_FLASH */
SPIC->ssienr = 0;
rtkSpiSetTxMode();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that there is no data on the other IO lines? IE this is just a single line command + address transmission?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants