Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tone() not correct working while BLE is active #525

Closed
fgaetani opened this issue Jul 6, 2020 · 17 comments
Closed

Tone() not correct working while BLE is active #525

fgaetani opened this issue Jul 6, 2020 · 17 comments
Assignees
Labels

Comments

@fgaetani
Copy link

fgaetani commented Jul 6, 2020

I noticed that the Tone () function does not work well when the BLE is active, the buzzer sound is not clean, because the PWM signal is not correct.
Probably, tone and bluefruit.cpp share a timer, is it possible to change the timer?
Using analogWrite() by appropriately modifying the frequency prescaler everything works well, but I need to have different prescaled because I use analogWrite() for other actuators.

I noticed in the Tone.ccp file the comment:
//Use fixed PWM2, TODO could conflict with other usage.

Maybe it was already reported?
How can I solve it?
Thank you!

@fgaetani fgaetani added the Bug label Jul 6, 2020
@hathach
Copy link
Member

hathach commented Jul 8, 2020

They didn't share timer, tone() also use PWM hardware like analogWrite(). Though seem like it use IRQ function to adjust the pulse width more frequently. When BLE in active, it will preempt everything else (being the highest priority). That cause the tone() irq delayed.

To really solve this issue, we will need to rework/update the tone() to use more SRAM for DMA so that the IRQ is triggered less often and hopefull the SRAM buffer can hold long enough for BLE event to complete. Unfortunately, I am too busy for now. But I am happy to explain things if you are willing to do it yourself.

@henrygab henrygab self-assigned this Jul 27, 2020
@henrygab
Copy link
Collaborator

Hi @fgaetani ,

First, thank you for filing this issue.

I have been working on fixing tone() as part of #496 (see also, #459, #447). The current implementation does not use the nRF52 hardware effectively for DMA output, and actually causes an interrupt for each and every pulse (!), and stops the PWM output for each and every pulse(!).

I am working to change tone() to (at a high level) do the following:

  1. Use the 'shortcuts' feature to avoid constantly stopping and starting PWM generation
  2. Significantly reduce interrupts (once per second is current goal) when a duration is specified
  3. Avoid interrupts altogether when there is no duration

I will now add this issue to the current WIP PR.

Thank you!

@henrygab
Copy link
Collaborator

henrygab commented Aug 2, 2020

@hathach -- FYI, I have a method to enable tone() which avoids the use of interrupts altogether.

This is done by configuring the PWM peripheral with appropriate refresh and loop settings, and starting at the appropriate sequence. That single tasks causes the PWM peripheral to generate the tone without any further CPU involvement (including stopping the tone when done).

Result is 100% reliable tone generation, regardless of what else the device is doing (e.g., softdevice, interrupt storm, etc.).

I'm currently in the process of finalizing test code, and validation with 'scope, but it looks very promising. It also greatly simplifies the code for this API.

P.S. - I also removed all floating point math, converting it to integer math. Of course, this included not only formal verification of equivalence, but also test code that validated equivalence of inputs vs. outputs. While off-by-one-cycle differences existed, the integer math is more accurate than the floating point was. 😁

@hathach
Copy link
Member

hathach commented Aug 3, 2020

@hathach -- FYI, I have a method to enable tone() which avoids the use of interrupts altogether.

This is done by configuring the PWM peripheral with appropriate refresh and loop settings, and starting at the appropriate sequence. That single tasks causes the PWM peripheral to generate the tone without any further CPU involvement (including stopping the tone when done).

Result is 100% reliable tone generation, regardless of what else the device is doing (e.g., softdevice, interrupt storm, etc.).

I'm currently in the process of finalizing test code, and validation with 'scope, but it looks very promising. It also greatly simplifies the code for this API.

Wow, that is very impressive, once you submitted PR for it, I will pull out my scope to verify it (with and without BLE activities).

P.S. - I also removed all floating point math, converting it to integer math. Of course, this included not only formal verification of equivalence, but also test code that validated equivalence of inputs vs. outputs. While off-by-one-cycle differences existed, the integer math is more accurate than the floating point was.

This is not really needed, M4 has FPU, it is better to just keep it as it is. However, since you already done that, we could give it a try if the code is cleaner and easier to understand.

@henrygab
Copy link
Collaborator

henrygab commented Aug 3, 2020

FYI -- Full details on conversion from float to integer math is in the GIST.

@hathach
Copy link
Member

hathach commented Aug 3, 2020

FYI -- Full details on conversion from float to integer math is in the GIST.

very thoughtful, but it is mostly for one without FPU hardware. nRF52 M4 has dedicated FPU which can compute float operation in a cycle or 2, there is no much difference from float op and integer op with nRF52. Therefore I would op for simpler code (also faster as well).

@henrygab
Copy link
Collaborator

henrygab commented Aug 3, 2020

+INF, -INF, NAN, etc... not to mention floating-point exceptions (which I don't think are handled by this BSP). Floating point brings a host of dangers.

IMHO, use of floating point in an embedded system should only be done for very good reason, and with necessary precautions. But it's a shared view.

Performance is not an issue with this code. I also doubt the old code is faster. The old code specified floats without a suffix, such as time_per=per/0.000008; (time_per was integer, per was float). Therefore, as per C++ specification: (no suffix) defines double, this is dividing a float by a double. How would that division of a float by a double result in use of a (single-precision) FPU hardware?

In contrast to the extreme care needed when using floats, the integer math is provably lossless, and already written and tested. I do not recommend restoring the floating point version. At the same time, I will not stop you from doing so ... you have write access to the PR's branch.

@henrygab
Copy link
Collaborator

henrygab commented Aug 3, 2020

@fgaetani -- if you have a moment, would love your testing to confirm that PR #496 fixes your problem.

If you're not familiar with / comfortable with using a git-based version of the BSP ... no worries! It will eventually get into a release version, and I think you'll find it solves the problems with tone() and other ISRs entirely.

@hathach
Copy link
Member

hathach commented Aug 4, 2020

+INF, -INF, NAN, etc... not to mention floating-point exceptions (which I don't think are handled by this BSP). Floating point brings a host of dangers.

IMHO, use of floating point in an embedded system should only be done for very good reason, and with necessary precautions. But it's a shared view.

Performance is not an issue with this code. I also doubt the old code is faster. The old code specified floats without a suffix, such as time_per=per/0.000008; (time_per was integer, per was float). Therefore, as per C++ specification: (no suffix) defines double, this is dividing a float by a double. How would that division of a float by a double result in use of a (single-precision) FPU hardware?

As I said, the FPU hardware allows the simpler code and faster time, I don't see any reason to use more complicated code. Regarding the INF or NAN issue, I don't see any issue until there is one arisen with current code. I am sure the gcc + FPU could just handle these float/double op with ease. Even without FPU on , if the code is complicated I could trust gcc for generating code, since this is not time-critcial, the computation time is not a factor with this tone().

In contrast to the extreme care needed when using floats, the integer math is provably lossless, and already written and tested. I do not recommend restoring the floating point version. At the same time, I will not stop you from doing so ... you have write access to the PR's branch.

For this function, the error is not a factor, loseless or not, it does not change the output of the function. Please keep the floating changes in its own PR, I think we will probably keep it pending until there is a reason to re-consider it.

@henrygab
Copy link
Collaborator

henrygab commented Aug 5, 2020

@hathach -- I often spend extra effort, changing items to match your feedback. Much of your feedback is excellent. Here, I cannot.

nRF52 M4 has dedicated FPU which can compute float operation in a cycle or 2, there is no much difference from float op and integer op with nRF52.

Did you notice that I also pointed out that, while the FPU is single-precision, the old code was using double-precision math (because it declared constants as doubles, and thus the single-precision floats were converted to doubles)? In other words, the FPU isn't used by the old code.

Not only did I remove the double-precision, lossy calculations, I took the time to separately write unit tests to exhaustively test the allowable input combinations, and confirm they matched the prior version (including visual inspection of differences ... all of which were more accurate using the integer based code). I had a feeling you would be uncomfortable, and did this to remove all fear of incorrect results. The functions are also separate from the main body of code now, simplifying unit testing if you had any doubts. Therefore, I've proven the new code is correct and more accurate.

  • You don't like it? OK.
  • You think floating point usage is OK to use in embedded projects like this? OK. We will disagree.
  • You want to decline the PR? OK. Your choice.
  • You found a technical issue (e.g., incorrect behavior)? Great! Let me know and I'll see if I can figure out a technical fix.
  • You ask me to spend my time removing an accurate and correct solution, because you like the concept of "using the FPU"?
    No.

I will of course be disappointed if you reject the PR on this ground. I will not understand your choice. I will respect your choice.

@hathach
Copy link
Member

hathach commented Aug 5, 2020

@henrygab I am very thankful for your PRs so far, you spent much more time and have done lots of work and contribution to the repo. Most if not all are excellent and spot-on however, I think this float vs integer approach based too much on assumption and theory so far.

Did you notice that I also pointed out that, while the FPU is single-precision, the old code was using double-precision math (because it declared constants as doubles, and thus the single-precision floats were converted to doubles)? In other words, the FPU isn't used by the old code.

Have you checked the generated assembly of the existing code ? I don't see why FPU cannot be used for double, I am also not sure if the gcc would use double in this case as well. It is the theory, the truth is in the assembly code.

  • You don't like it? OK.
  • You think floating point usage is OK to use in embedded projects like this? OK. We will disagree.
  • You want to decline the PR? OK. Your choice.
  • You found a technical issue (e.g., incorrect behavior)? Great! Let me know and I'll see if I can figure out a technical fix.
  • You ask me to spend my time removing an accurate and correct solution, because you like the concept of "using the FPU"?
    No.

I will of course be disappointed if you reject the PR on this ground. I will not understand your choice. I will respect your choice.

I don't understand the meaning for this series questions of you. I always tried to look at PR as more point of view than I could write, and try to understand the reason why contributors would do that. However, as I said, so far there is no reason to have a more complicated code and possibly run slower than existing mcu.

Not only did I remove the double-precision, lossy calculations, I took the time to separately write unit tests to exhaustively test the allowable input combinations, and confirm they matched the prior version (including visual inspection of differences ... all of which were more accurate using the integer based code). I had a feeling you would be uncomfortable, and did this to remove all fear of incorrect results. The functions are also separate from the main body of code now, simplifying unit testing if you had any doubts. Therefore, I've proven the new code is correct and more accurate.

It is not about your implementation, I have no doubts that It works perfectly. The problem is I don't see the benefit of using a more complicated code. I am good with float precision, there is nothing to gain more with loesless in this tone().

@henrygab
Copy link
Collaborator

henrygab commented Aug 5, 2020

... however, I think this float vs integer approach based too much on assumption and theory so far. ...
Have you checked the generated assembly of the existing code ? I don't see why FPU cannot be used for double, I am also not sure if the gcc would use double in this case as well. It is the theory, the truth is in the assembly code.

@hathach -- I am disappointed ... I do not understand your need for this evidence, especially as you can easily see it in Ghidra, GDB, Ozone, ... . Sigh.

In a final attempt at good will, please see the following screenshot from Ghidra, showing the float-based code calls to __aeabi_f2d() and __aeabi_d2uiz. Those are software-based double handling, btw ... no FPU.

https://github.com/henrygab/ZZ_ImgSrc/raw/master/Bugs/HathachNotBelieve/Ghidra.PNG

@hathach
Copy link
Member

hathach commented Aug 5, 2020

@henrygab I am not too familiar with FPU, however since it exist, we should make use of it. If gcc does not generate the code to make use of FPU, it is usually an indication that we didn't make an full usage of hardware. What we should do is figuring out why gcc does not ( missing option, libmath etc..). The approach to use integer is with FPU_PRESENT mcu is not ideal solution.

I have a bit of experience with fixed point calculation when writing my own mp3 decoder a very long time ago. I don't hate fixed point calculation, I just don't think it is appropriate solution for M4F.

PS: I never looked at the assembly code and/or FPU with nRF52 so far, maybe I would check it out later on (when having time) to make sure FPU is used.

@jpconstantineau
Copy link
Contributor

jpconstantineau commented Aug 7, 2020 via email

@henrygab
Copy link
Collaborator

henrygab commented Aug 7, 2020

That applies also to the nRF52840 (and nRF52833, nRF52820, ...).. imho, it could be seen as an application note, rather than an errata, as I would expect an OS to save/restore floating point state with tasks/threads. Does FreeRTOS save/restore floating point state with each task switch?

Presuming FreeRTOS does not save/restore floating point state per task...

In a full OS, floating point error flags are stored/restored with a thread. I've not looked to see if FreeRTOS does this, but it seems unlikely as most embedded code would avoid floats except in special circumstances. Without OS support, any app code would need to:

  1. Disable context switching (if configuration enabled pre-emptive switching)
  2. Save any prior floating point flag state
  3. Perform the floating point calculation
  4. Convert back to integer
  5. Check for an handle error states (NAN, +INF, -INF, etc.)
  6. Restore prior floating point flag state (if configuration enabled pre-emptive switching)

Of course, ISRs should be even less likely to use floating point, unless it manually checks/saves/restores the error flags... but even then, a better design is often available ... such as signaling a waiting task ... to keep ISR short.

@henrygab
Copy link
Collaborator

Hi @fgaetani,

PR #496 looks like it's in essentially its final state. If you want to try it, I think you'll be very pleased with the solution ... which doesn't use any interrupts for tone(). Once the peripheral is setup and started, tone generation occurs (and stops) entirely without CPU involvement. Therefore, you should no longer see any interaction between it and softdevice usage. 👍

@henrygab
Copy link
Collaborator

@fgaetani ... Just wanted to give an update that the fix was merged. You should see this fixed in the next released version of the BSP.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants