Atomic operations in PSRAM (esp. with 3rd party libraries) #2027
Comments
I don't see a simple way out of this without introducing our own, "PSRAM-safe" atomics. Which will be a usability and interoperability nightmare, because library developers will not likely use it, or even know about it. Turning portable-atomic into a shim of sorts for us is at least technically possible, but still doesn't solve every issue - for MCUs that can use the core atomic types, those ones can't really be shimmed for us to provide our own implementation. I don't think we can extend portable-atomic to know about our particular memory mappings. |
Do we know if an exception is raised whilst accessing atomics in psram, or does it just silently fail? If we can detect it, that at the very least opens up some options for us. |
I doubt any exception is raised. From the APP/PRO CPU POV, accessing PSRAM is like accessing regular internal RAM. And it ends up accessing regular internal RAM after all - thanks to the caching MMU. That said, I'm not in a position to discuss a possible fix, as I do not understand the operation of the MMUs in enough detail. All I know is that ESP-IDF provides a special xchg atomic implemented in software, which is to be used when PSRAM is enabled, as pointed out by @ProfFan. Whoever wrote and maintains this is probably in that position. (@bjoernQ - is that you?). I would be happy to know though why hardware atomics cannot work with PSRAM:
Anyway, I guess me and @ProfFan are just raising the flag that there is an issue with atomics and PSRAM, and somebody needs to look into it. For both ESP-IDF and baremetal by the way, as the ESP-IDF multicore targets are also happily using only hardware atomics ATM. (Once we know of a fix, we should probably change the definition of the upstream multicore STD targets from just using hardware atomics to emitting calls to the atomic intrinsic functions, so that user code can use whatever ESP-IDF has put in the definition of these intrinsic functions, depending on user settings.) Ah, one more question, couldn't resist:
|
I'll try to find those answers and get back to you. In the meantime, the only truly bullet-proof way to solve this right now is to have a target variant where we disable hardware atomics, falling back to portable-atomic or something else to ensure atomicity (note: this is exactly what esp-idf does when the workaround is enabled). I'd really like to avoid doing this though if we can help it. |
The question is, is The critical-section one is tempting, but do you remember why our support for Also if you notice, the software PSRAM-safe xchg impl in the ESP-IDF does something different... yet similar. They seem to implement the "fallible" xchg variant only, which can sporadically fail. But then the question is, what would the user do when presented with such an xchg impl, other than re-trying in a loop, which suspiciously looks like a cousin to spinlocks anyway... |
I think the answer currently for the S3 is most likely true but I am not sure. It's probably good to open an issue in
If we have a custom target (for example
I wonder what they did for Cortex-M0/M0+ and such? See https://doc.rust-lang.org/std/sync/atomic/index.html#portability |
If Cortex-M0/M0+ is always single-core (is it?) then disable-all-interrupts-one-way-or-another as a strategy does work fine and since it is a single core, you don't have the "is a spinlock a lock or just waiting" issue. By the way... by reading the opening paragraph in the "Portability section"... namely: I'm now even more puzzled as to why they kicked out our
|
No, see RP2040 |
Ah so you mean for rp2040 they use a spinlock+disable-all-interrupts software impl? If yes, then this only reinforces my suspicion, that our I guess at the time (and now still) I was not understanding the subject matter well enough to object their decision on technical ground... |
...and if spinlocking+disable-all-interrupts is fine as a strategy, then the |
Beaten me in 5 secs lol
Probably no, see rust-lang/rust#99668
For real-time applications like what we use ESPs for I think this is worse than the IDF way of "best effort" weak CAS, but can live without if it's too complicated to make something similar in Rust. |
TL;DR :(
Because the weak cas esp idf impl does not disable interrupts (which is... slow?) or if not why then? Because a strong cas impl on top of a weak cas would also require a loop anyway... |
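For reference, the loop in question looks roughly like this. It is a minimal sketch on top of the standard `compare_exchange_weak`, not the ESP-IDF implementation, and `strong_cas` is a hypothetical name. It retries only when the weak CAS fails while the observed value still equals the expected one, so a genuine mismatch is reported immediately:

```rust
use core::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical helper: a "strong" CAS built from a "weak" (fallible) one.
/// Returns Ok(previous) on success, Err(actual) on a genuine mismatch.
pub fn strong_cas(target: &AtomicU32, expected: u32, new: u32) -> Result<u32, u32> {
    loop {
        match target.compare_exchange_weak(expected, new, Ordering::SeqCst, Ordering::SeqCst) {
            // The swap happened.
            Ok(previous) => return Ok(previous),
            // The value really differs from what the caller expected: give up.
            Err(actual) if actual != expected => return Err(actual),
            // The weak CAS failed even though the value matched: retry.
            Err(_) => core::hint::spin_loop(),
        }
    }
}
```

So yes, a strong CAS on top of a weak one is a retry loop either way; the difference is only who owns the loop and whether interrupts stay enabled while spinning.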
Whilst it's related to the cache/MMU setup, the reason it doesn't work isn't PSRAM is treated too much like flash. Unfortunately, the appropriate signals to provide the required locking (which aren't needed when caching flash) were also not implemented for caching PSRAM, meaning the CPU will happily do atomic modifications on PSRAM memory without actually ensuring it's an atomic operation. As you mentioned, PSRAM is treated just like DRAM in the esp32 and esp32s3, and because of this, we can't use the
Hmm, I'm not sure if this would work, consider the case of running one core but sharing an atomic variable between an interrupt. I believe these would be two different accesses, but I suppose the CPU (being a single core in this case) can only do one thing at a time so access would be sequential? So in summary, we have no way of detecting this, we need to be proactive when using PSRAM. Essentially for every atomic access, we need to
I guess the question is where we do this. I already suggested a different target as a nuclear option, but we can probably do better, perhaps with something like |
From our discussion in the meeting, we need to determine whether enabling This is not ideal though as it incurs a cost to all atomics, even ones stored in internal memory. Longer term it would be better to have a solution like esp-idf's where it checks where the atomic is before deciding whether to guard access with a spinlock. |
As we're using atomic compare_exchange in our multi-core critical section implementation, can we enable |
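To make the concern concrete, here is a rough sketch of the kind of lock a multi-core critical section is typically built on. This is an illustration only, not the actual esp-hal code; `SpinLock` and its methods are hypothetical names. The point is that the lock itself depends on a working `compare_exchange`, so whatever replaces the atomics must not in turn depend on this lock:

```rust
use core::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical spinlock; the lock word must live in internal RAM so that
/// the native compare_exchange on it is actually atomic.
pub struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    pub const fn new() -> Self {
        Self { locked: AtomicBool::new(false) }
    }

    /// Single attempt to take the lock; returns false if it is already held.
    pub fn try_acquire(&self) -> bool {
        self.locked
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Spin until the lock is taken.
    pub fn acquire(&self) {
        while !self.try_acquire() {
            core::hint::spin_loop();
        }
    }

    pub fn release(&self) {
        self.locked.store(false, Ordering::Release);
    }

    /// Run `f` with the lock held. A real critical section would also mask
    /// interrupts on the current core before spinning; omitted here.
    pub fn with<R>(&self, f: impl FnOnce() -> R) -> R {
        self.acquire();
        let result = f();
        self.release();
        result
    }
}
```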
Ah I forgot about that, good catch. Fortunately, I think we can control this by using |
I'm open to adding a new cfg to adjust the behavior here, but do not intend to change the default or add a new Cargo feature because:
Footnotes
|
Thanks @taiki-e ❤️!
I think this is entirely reasonable. I have a couple of questions:
|
I think there are two options on what implementation to use:
(The former is more efficient but more complex to implement (although there is already code related to interrupt disabling).) And even two options on when to use it:
(The latter is more efficient but more complex as well.) Of course the cfg name will be different depending on which option is selected. Could you tell me which of these options you would actually like? |
It sounds like using the
For our specific use case, the latter is more attractive to me. Note that we only need to do this on four targets, |
Is it safe to call interrupt disable instruction from user code in
I think this depends on the actual spec and guarantees. If it is something that is clearly guaranteed in the spec, then I guess you can hard code it, but if it is something like “vendor may change it in the future” or “different for each chip or firmware configuration” then we can't hard code the value into the portable-atomic code. |
@ivmarkov Could you clarify this, the details are a bit fuzzy for me.
The silicon has a hardcoded address space for PSRAM; it will never change. It depends on the chip configuration (at time of production), which means the esp32 and esp32-s3 have different address spaces for PSRAM, so we'll need a solution that accounts for that (I believe we can achieve this with cfg's). Firmware configuration has no impact. |
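A rough sketch of what such a check could look like. The address windows below are assumptions for illustration and must be verified against each chip's technical reference manual before use; the constant and function names are hypothetical. In a real build a `cfg` would select exactly one range per target; here the range is passed in so the snippet runs anywhere:

```rust
use core::ops::Range;

// ASSUMED data-bus windows for external RAM; check the TRM for each chip.
pub const ESP32_PSRAM: Range<usize> = 0x3F80_0000..0x3FC0_0000;
pub const ESP32S3_PSRAM: Range<usize> = 0x3C00_0000..0x3E00_0000;

/// Hypothetical helper: does `addr` fall inside the given PSRAM window?
/// The range is half-open, so `range.end` itself is outside.
pub fn in_psram(addr: usize, range: &Range<usize>) -> bool {
    range.contains(&addr)
}

/// Hypothetical helper: address of any value, for feeding into `in_psram`.
pub fn addr_of<T>(value: &T) -> usize {
    value as *const T as usize
}
```

Since the ranges are fixed in silicon, the check compiles down to two comparisons per access.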
AFAIK the current atomic operations on the S3 use native atomics. However, this creates a problem: a 3rd party library could be using atomics on a data structure placed in the SPI PSRAM.
In the IDF, they handle it by first checking for a PSRAM address, then trying to acquire the external RAM lock, then doing the CAS: https://github.com/espressif/esp-idf/blob/c9df77efbf871d4c3ae9fb828778ff8c4ab36804/components/esp_hw_support/cpu.c#L199
I wonder if a similar thing should be done for the S3 (for example, a PR to portable-atomic that adds a feature for a similar implementation)?
Based on discussions with @ivmarkov
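As a rough Rust sketch of the three steps described above (check the address, try the external RAM lock, do the CAS): this is not the IDF code and not an existing portable-atomic API. `compare_and_set`, `EXTERNAL_RAM_LOCK`, and the `is_external` predicate are hypothetical names, and the predicate is a parameter so the snippet can run on a host. Like the IDF version it is fallible: if the lock is busy it reports failure instead of spinning.

```rust
use core::sync::atomic::{AtomicBool, AtomicU32, Ordering};

/// Hypothetical lock word; it must live in internal RAM, where native
/// atomics work.
static EXTERNAL_RAM_LOCK: AtomicBool = AtomicBool::new(false);

/// Hypothetical PSRAM-aware CAS. Returns true if `target` held `expected`
/// and was set to `new`; may also return false when the lock is contended.
pub fn compare_and_set(
    target: &AtomicU32,
    expected: u32,
    new: u32,
    is_external: impl Fn(usize) -> bool,
) -> bool {
    let addr = target as *const AtomicU32 as usize;

    // Internal RAM: the native instruction is fine.
    if !is_external(addr) {
        return target
            .compare_exchange(expected, new, Ordering::SeqCst, Ordering::SeqCst)
            .is_ok();
    }

    // A real implementation would mask interrupts on this core here.
    if EXTERNAL_RAM_LOCK
        .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
        .is_err()
    {
        return false; // lock busy: "weak" failure, caller may retry
    }

    // With the lock held, a plain compare-then-store is safe against other
    // users of this function.
    let matched = target.load(Ordering::Relaxed) == expected;
    if matched {
        target.store(new, Ordering::Relaxed);
    }

    EXTERNAL_RAM_LOCK.store(false, Ordering::Release);
    matched
}
```

Note this only protects accesses that go through the same function; code using the native atomics directly on the same PSRAM location is not covered, which is the 3rd party library problem.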