Flash corruption on SAMD51 #217
Comments
A bit further on I found the cause of this and was able to reliably avoid and reproduce the issue. Short description:
So the hypothesis is that brownout protection is not acting as it should during switch-off. In the screenshot below, the traces show voltage curves at turn-off with and without a 150 ohm load. |
@dhalbert, I saw your comment on the Adafruit forum. As you suggested, let's discuss further in this thread.
I'm pretty sure that corruption occurs on the power-down cycle, as I was able to reliably reproduce it without a bleed resistor and could not with it. Still, a bleed resistor is just a short-term workaround. One possible cause that I can think of is a current leak through one of the input pins; we'll remove it in the next iteration of the design and see if it has any effect. |
Your testing is interesting. What firmware are you using in normal operation? Is it CircuitPython, Arduino, or something else? Perhaps the firmware is changing the brownout detection settings.
I looked at the bootloader code again. It enables a BOD33 level at around 2.7V. It does not enable hysteresis, which maybe it should. We could try bumping up the BOD33 level and enabling hysteresis. On the SAMD21, hysteresis is just on or off; at around 2.7V BOD33, it looks like it's about 70mV. On the SAMD51, there is a 4-bit field with 6mV steps. I don't have any experience in choosing this value, but we could try about the same 70mV.
When I discussed this kind of problem with Microchip in the past, it was an issue about power glitches on power-up. That was the motivation for the current code, which is all about waiting enough time for the voltage to stabilize on power-up. Your problem seems to be on power-down. Your scope trace does not show any glitches on power-down, but I wonder if the longer timebase chosen is hiding something, though I don't see any evidence of that.
What kind of power supply are you using? Have you tried a different power supply to see if that makes any difference?
Microchip also said they had seen this flash erase problem when there was insufficient decoupling capacitance on Vddcore. Are your decoupling caps close to the SAMD51 chip?
Are the power pins on the SAMD51 wired the same way as on the Adafruit board, or are they somewhat different? We go by the reference designs in the datasheet.
Is it possible to test this on a board other than yours, with the same power supply and external connections? For instance, do you see this problem on the SAMD51 Feather CAN? Here is the Feather CAN power arrangement. There are more decoupling caps not shown here as well: |
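As a rough aid for the hysteresis suggestion above, the arithmetic can be sketched in Python. This assumes the SAMD51 hysteresis is simply the field value times the 6mV step mentioned above, which should be checked against the datasheet. The function name `hysteresis_field_for` is made up for illustration and is not part of the bootloader.

```python
# Assumption: hysteresis (mV) = field value * 6 mV, with a 4-bit field (0..15).
# Both numbers come from the comment above; the linear mapping is a guess.
STEP_MV = 6
FIELD_BITS = 4


def hysteresis_field_for(target_mv, step_mv=STEP_MV, bits=FIELD_BITS):
    """Return the field value whose hysteresis is closest to target_mv."""
    max_code = (1 << bits) - 1
    code = round(target_mv / step_mv)
    # Clamp to what the field can hold.
    return max(0, min(max_code, code))


code = hysteresis_field_for(70)
print(f"field value {code} -> about {code * STEP_MV} mV of hysteresis")
```

Under these assumptions, a 70mV target lands on field value 12 (72mV), and the largest setting, 15, would give 90mV.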
@dhalbert Thanks for your input!
I'm using CircuitPython 9.0.5 with the latest version of the UF2 bootloader (3.16, updated with ...
I'll record some longer traces, just to be sure.
I'm using two different lab supplies with the same results. It's important to note that I'm turning the device on and off in a rough manner, manually connecting and disconnecting the power wires.
Yes, but these may be smaller than on a Feather. The schematics are here btw.
Yes, we also try to follow the reference and Feather designs as closely as possible.
This should be possible, but I'd need to somehow simulate the power-down curve on the Feather. This would require some hacking, and I don't have a function generator at the moment that I could use for that. |
That could cause power glitches, though you didn't trace any. I've seen that myself just bobbling a USB plug a bit. The scope trace picture that you posted, is that TEST_VDDCORE1, or is it VCC3V3? There is a lot going on in your power supply circuitry, and there is opportunity for noise, maybe pins going out of range. Is it possible to supply just 3.3V to the SAMD51 and see if you can duplicate the problem?
At least measure the power-down curve, and see if you can reproduce the flash erasure problem. But we haven't had reports of flash erasure since we re-did the bootloader. CircuitPython does set BOD33, but it sets it to the same value as the bootloader setting. Another thing to try would be to write an equally simple Arduino program and see if you get the same problem. Probably yes, but that would eliminate CircuitPython itself as the cause. |
Another small possibility: the BOD12 calibration value is set at the factory. From the datasheet:
If you have accidentally erased this value when doing initial chip programming, that might cause a problem. |
@dhalbert thank you so much for these pointers. I'll definitely investigate these further when I get back to this issue. That will probably be in a couple of weeks from now, as I'm waiting for more boards to be made. |
A quick update - The positive news is that with the latest bootloader the board is not getting bricked, just cpy install gets corrupted. Dropping a new uf2 file fixes the issue. |
BTW, is it possible to protect flash memory where cpy resides? |
We haven't provided a mechanism to do that. But the |
Done with expected result (flash corruption), so it's not CircuitPython. Quick summary of issue occurrence (6 test boards):
|
@dhalbert could you please provide a pointer on how to set the register from CP? |
There's no way to do that from CircuitPython code. These are "fuse" bits, so there's special setup needed to change them. There is code in the bootloader to do this: see the code that corrects errors in the fuses. I meant that after the CircuitPython UF2 is loaded, you could connect to the board with, say, a hardware debugger and set those bits. For instance, I think you can write a script using a J-Link utility to do this. It's also possible to do it by hand from the Microchip IDEs. Or you could make a special build of CircuitPython that checks the bits and sets them if necessary. Undoing them is also needed if you want to be able to update CircuitPython. But I don't have a recipe for you off the bat.
It still sounds like there might be something marginal about the power supply or the decoupling capacitors, which is causing a power dip. Is there any difference in the date codes of the SAMD51s that indicates the one bad board has a different rev chip?
I think this is also something you could bring up with Microchip as a support case. They might have some advice for you. Also read the datasheet errata carefully. |
This issue is similar (and probably related) to #170
The main difference is that in this case it occurs on SAMD51 and memory corruption occurs at 0x4000, with the result that the board loses the CircuitPython install.
Hardware used is based on Feather M4 CAN, schematics are here
I haven't done measurements of the power-up and power-down voltage curves, but I suspect it's a manifestation of the brownout issue.
Note: before I ran update-bootloader-feather_m4_can-v3.16.0.uf2 on the v3.16 bootloader, multiple devices were bricking with memory corruption at 0x000000, the same behavior as described in #170. After running the update, one board shows no problems after ~200 power cycles, while this one particular board fails within 20. I'm not sure what exactly changed between 3.16..bin and update-.. .uf2
Symptoms
update_....uf2
I have reproduced this several times, the issue usually occurs within 20 power cycles.
Clearing and setting BOD33_DIS bit with the programmer did not change anything.
Analysis
I've compared a corrupted memory dump to a working one. There is a difference at address 0x4000. Just one line in the Intel HEX files is different.
Summarized, data diff at address 0x4000
Below is output from my analysis script with details:
working
broken
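A comparison like the one described above can be sketched as follows. This is not the analysis script from this report, just a minimal, hypothetical Python example: the function names (`parse_intel_hex`, `diff_images`, `make_record`) are invented here, and the two images are synthetic test data, not dumps from the affected board.

```python
def make_record(offset, rtype, data=b""):
    """Build one Intel HEX record line (used here only to create test data)."""
    body = bytes([len(data)]) + offset.to_bytes(2, "big") + bytes([rtype]) + bytes(data)
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def parse_intel_hex(text):
    """Return {absolute_address: byte} for all data records in an Intel HEX string."""
    memory = {}
    base = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"not an Intel HEX record: {line!r}")
        record = bytes.fromhex(line[1:])
        if len(record) < 5 or len(record) != record[0] + 5:
            raise ValueError(f"bad record length: {line!r}")
        if sum(record) & 0xFF != 0:
            raise ValueError(f"bad checksum: {line!r}")
        offset = int.from_bytes(record[1:3], "big")
        rtype = record[3]
        data = record[4:-1]
        if rtype == 0x00:    # data
            for i, value in enumerate(data):
                memory[base + offset + i] = value
        elif rtype == 0x01:  # end of file
            break
        elif rtype == 0x02:  # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif rtype == 0x04:  # extended linear address
            base = int.from_bytes(data, "big") << 16
        # Start-address records (0x03, 0x05) carry no memory content.
    return memory


def diff_images(working, broken):
    """Return sorted (address, working_byte, broken_byte); None means 'not present'."""
    addresses = sorted(set(working) | set(broken))
    return [(a, working.get(a), broken.get(a))
            for a in addresses if working.get(a) != broken.get(a)]


# Synthetic example: 16 bytes at 0x4000, with one byte changed in the "broken" image.
good = bytes(range(16))
bad = bytearray(good)
bad[4] = 0xFF
working_hex = "\n".join([make_record(0x4000, 0x00, good), make_record(0, 0x01)])
broken_hex = "\n".join([make_record(0x4000, 0x00, bad), make_record(0, 0x01)])

diff = diff_images(parse_intel_hex(working_hex), parse_intel_hex(broken_hex))
show = lambda v: "--" if v is None else f"{v:02X}"
for address, w, b in diff:
    print(f"0x{address:06X}: working={show(w)} broken={show(b)}")
```

For real dumps, the two strings would be read from the working and corrupted HEX files instead of being generated.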
Follow-up
I've taken a look at main.c; there seems to be a section for brownout protection for SAMD51.
I'm willing to invest some time to fix this, but fiddling with bootloaders is not something that I've done before...
I'd like to discuss possible solutions here before I start (randomly) changing stuff.