-
Notifications
You must be signed in to change notification settings - Fork 5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spi-bcm2835: merge upstream patches allowing DMA transfers #1085
spi-bcm2835: merge upstream patches allowing DMA transfers #1085
Conversation
…jiffies The polling mode of the driver is designed for transfers that run less than 30us - it will only execute under those circumstances. So it should run comfortably without getting interrupted by the scheduler. But there are situations where the raspberry pi is so overloaded that it can take up to 80 jiffies until the polling thread gets rescheduled - this has been observed especially under heavy IO situations. In such a situation we now fall back to the interrupt handler and log the situation at debug level. Signed-off-by: Martin Sperl <[email protected]> Signed-off-by: Mark Brown <[email protected]> (cherry picked from commit a750b12)
Conditions per spi_transfer are: * transfer.len >= 96 bytes (to avoid mapping overhead costs) * transfer.len < 65536 bytes (limitaion by spi-hw block - could get extended) * an individual scatter/gather transfer length must be a multiple of 4 for anything but the last transfer - spi-hw block limit. (some shortcut has been taken in can_dma to avoid unnecessary mapping of pages which, for which there is a chance that there is a split with a transfer length not a multiple of 4) If it becomes a necessity these restrictions can get removed by additional code. Note that this patch requires a patch to dma-bcm2835.c by Noralf to enable scatter-gather mode inside the dmaengine, which has not been merged yet. That is why no patch to arch/arm/boot/dts/bcm2835.dtsi is included - the code works as before without dma when tx/rx are not set, but it writes a message warning about dma not used: spi-bcm2835 20204000.spi: no tx-dma configuration found - not using dma mode To enable dma-mode add the following lines to the device-tree: dmas = <&dma 6>, <&dma 7>; dma-names = "tx", "rx"; Tested-by: Noralf Trønnes <[email protected]> (private communication) Signed-off-by: Martin Sperl <[email protected]> Signed-off-by: Mark Brown <[email protected]> (cherry picked from commit 3ecd37e)
fixes several warnings/error emmitted by the kbuild system: * warn: cast from pointer to integer of different size using size_t instead of u32 * error: 'SZ_4K' undeclared moved to PAGE_SIZE and PAGE_MASK instead Review showed also a typo in the same code where tx_buff was checked twice instead of checking both rx and tx_buff. Reported by: Stephen Rothwell <[email protected]> Signed-off-by: Martin Sperl <[email protected]> Signed-off-by: Mark Brown <[email protected]> (cherry picked from commit 7e52be0)
Signed-off-by: Fengguang Wu <[email protected]> Signed-off-by: Mark Brown <[email protected]> (cherry picked from commit 29ad1a7)
replaced with #if to avoid issues in the future. Signed-off-by: Martin Sperl <[email protected]>
Signed-off-by: Martin Sperl <[email protected]>
Can you add an entry to arch/arm/boot/dts/overlays/README for spi-dma? |
Signed-off-by: Martin Sperl <[email protected]>
added documentation |
No objection here. |
spi-bcm2835: merge upstream patches allowing DMA transfers
I see that the spi node already have the dma properties, so DMA will be enabled by default: |
Not any more, it doesn't. |
kernel: BCM2835_V4L2: Add support for V4L2_EXPOSURE_METERING_MATRIX See: raspberrypi/linux#1068 kernel: dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer See: raspberrypi/linux#1072 kernel: BCM270X_DT: mz61581: Revert to spi-bcm2708 See: raspberrypi/linux#1077 kernel: bcm2708/2835-i2s: Fix for PCM register ranges in device trees See: raspberrypi/linux#1079 kernel: bcm2835-sdhost: Add the ERASE capability See: raspberrypi/linux#1076 kernel: bcm2835-sdhost: Ignore CRC7 for MMC CMD1 kernel: BCM270X_DT: Add unit address to gpio node name kernel: spi-bcm2835: merge upstream patches allowing DMA transfers See: raspberrypi/linux#1085 kernel: BCM270X_DT: Use i2c_arm for rtc and bmp085 overlays kernel: BCM2708_DT: CM dtparams for audio, watchdog and RNG firmware: video_decode: Don't wait for a valid timestamp to output frames See: raspberrypi/firmware#451
kernel: BCM2835_V4L2: Add support for V4L2_EXPOSURE_METERING_MATRIX See: raspberrypi/linux#1068 kernel: dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer See: raspberrypi/linux#1072 kernel: BCM270X_DT: mz61581: Revert to spi-bcm2708 See: raspberrypi/linux#1077 kernel: bcm2708/2835-i2s: Fix for PCM register ranges in device trees See: raspberrypi/linux#1079 kernel: bcm2835-sdhost: Add the ERASE capability See: raspberrypi/linux#1076 kernel: bcm2835-sdhost: Ignore CRC7 for MMC CMD1 kernel: BCM270X_DT: Add unit address to gpio node name kernel: spi-bcm2835: merge upstream patches allowing DMA transfers See: raspberrypi/linux#1085 kernel: BCM270X_DT: Use i2c_arm for rtc and bmp085 overlays kernel: BCM2708_DT: CM dtparams for audio, watchdog and RNG firmware: video_decode: Don't wait for a valid timestamp to output frames See: #451
this did not go into the current 4.1 kernels provided by the foundation - is this on purpose? |
Not deliberate. I'll see if I can cherry-pick them easily. |
In principle with 4.1 you can take the upstream version directly - only thing left behind is that "strange" fix to make CS work when moving SPI to a different place, but which is better handled by configuring the device-tree correctly to use "cs-gpios" instead of native CS... |
This PR applies with only a trivial conflict. I'm happy to push that (if compile test passes). |
The "strange" patch was the one you upstreamed that disables native CS for everyone, rather than using an overlay to enable gpio-cs for your application. |
well - take it as is then Because when you do that I would recommend switching to cs-gpios and then we do not need that out of stream "translation" patch any more... I guess we could even create an overlay for moving things arround... |
The "strange" patch from my part that you are refering to was an upstream requirement, where "device-tree" backwards-compatibility is a must - but only for things that are upstream... I was talking about that additional strange patch that made that also work with the alternate PIN location for SPI for CM, which I guess would be better handled via an overlay that configures cs-gpios plus the alternate location... I could look into how we could do this in a sane way - the question is if we still need to accomondate the spi-bcm2708 - say in 4.2? |
spi-bcm2708 is dropped in #1099 |
here the "basic" patch for the device-tree (assuming we do not support spi-bcm2708 any longer):
Needed for all variants that you produce (unless you move these to as dtsi)... Creating an overlay to change the cs-gpios for the CM to a different location would be simple... |
Yes there should be an overlay, but it should be to switch to gpio-cs on the chosen pins. The standard behaviour should be to use native CS, because that's what the DT says, it's what users expect, and because it is sometime necessary (yes, @notro?). |
Maybe it would work if the overlay for spi-bcm2708 contains a modifier to set up gpios as they are - But then spi-bcm2708 still sets the pins as it wants internally in bcm2708_init_pinmode, so that is not an issue... |
No it doesn't - that code isn't used if DT is enabled. |
The thing about native-cs is that there is a bug that just got merged into spi/for-next (ca861dd) As soon as that is in place that native-cs issue is gone when using reverse polarity clock or data levels. Here that patch for reference:
|
thinking about it - cs-gpios is not used with the spi-bcm2708 driver as it uses the old transfer interface that builds everything on its own, so it should work transparently. |
Why are we discussing bcm2708-spi? It hasn't been used since rpi-4.0.y. |
If it is still there, then you need to support it... |
It's going. Move on. |
Here a link to a patch to fix that DT settings that is - as of now - untested (still need to compile the kernel and I can not really test the overlay without a CM): Please review: It may also mean we can revert the following patch as it is no longer necessary with cs-gpios set: |
I've pushed this PR to 4.1 so it should match 4.0. It builds and doesn't break booting for me (but I have no spi device to test). How best to handle the cs/gpio issue can be sorted later. |
I can and will - Maybe @notro can also test with his displays... |
I have created a patch to modify the DT to use gpio-cs. The patch for the issue with regards to polarity is still not merged by Linus, but it went into several trees of the spi branch... |
I disagree with the reasoning behind the patch. Many people have been using SPI with native chip selects without issue. You found a problem with it, then pushed a patch upstream to workaround it for everybody (except it assumed it was using BANK0 GPIOs) when an overlay to provide a workaround for those affected would have been a cleaner solution. An overlay to switch to GPIO cs seems like a better approach, and we can revert the driver hack. |
The point is: spi-bcm2835 does not work with native-cs any longer - the reason behind it is that whenever you are touching the SPI-CS-Register there is a slight chance that the chip select will change for a very short period of time. And all those speed optimizations of the spi driver will write to this register while there is a transfer already happening. E.g: the "original driver" just enabled interrupts in the CS register and then we had to wait 14us for the interrupt to get really handled by our code. Now we first prefill the FIFO as long as possible (without interrupts enabled) and only when it is full we enable interrupts so that when we reach the HW-configured limits the interrupt will trigger - this means that we need one interrupt less to finish and we start the transfer faster - especially for short transfers, where with the above approach we are polling for short periods of time (<30us) and if we then exceed the timeout (say because of rescheduling of the kernel thread), then we move to interrupt driven to finish the process - and this again requires a modification to the CS register... As upstream needs backwards compatiblity for existing drivers that the kernel at that time supports, we had to either "fall back to interrupt only" for those "native" cases or to make the change in code. On top with the gpio-cs approach you are NOT limited to 2 Chip-selects any longer - I am running 5 SPI devices on my test setup right now with arbitrary GPIOs as chip selects, which is much of an advantage. The above device-tree patch also has the advantage of separating out the CS from the SPI pins, so that creating overlays to add additional chip-selects becomes easier. |
kernel: spi-bcm2835: merge upstream patches allowing DMA transfers See: raspberrypi/linux#1085 kernel: BCM270X_DT: Add pwm and pwm-2chan overlays See: raspberrypi/linux#756 kernel: rpisense-fb: add low-light mode and gamma control See: raspberrypi/linux#1104 firmware: video_decode: Fix up a vfw/avi timestamp hack firmware: arm_loader: Fix issue with hevc decoding See: http://forum.kodi.tv/showthread.php?tid=231092&pid=2057694#pid2057694
In kernel 4.1 SPI DMA is enabled by default, eating up precious dma channels. |
kernel: spi-bcm2835: merge upstream patches allowing DMA transfers See: raspberrypi/linux#1085 kernel: BCM270X_DT: Add pwm and pwm-2chan overlays See: raspberrypi/linux#756 kernel: rpisense-fb: add low-light mode and gamma control See: raspberrypi/linux#1104 firmware: video_decode: Fix up a vfw/avi timestamp hack firmware: arm_loader: Fix issue with hevc decoding See: http://forum.kodi.tv/showthread.php?tid=231092&pid=2057694#pid2057694
Dma channels are only used/assigned when the spi driver is loaded. |
The problem I'm facing here is that only 3 of the 5 dma channels are available by default (according to /sys/class/dma/dma0chan?/in_use), and both bcm2708-i2s and spi-bcm2835 each seem to want 2 DMA channels. In my Cirrus audio card build (which uses SPI for low-bandwidth audio device configuration) this results in a non-working driver since bcm2708-i2s can't allocate the second DMA channel. Either disabling SPI DMA by uncommenting the dma lines in bcm2708_common.dtsi or using spi-bcm2708 fixes this, leaving 1 free dma channel after the I2S module is loaded. For these cases, where SPI DMA isn't really needed, it would be better to have an option to disable it. BTW: I tried disabling the legacy DMA API, this results in a total of 7 dma channels but unfortunately also a nonworking framebuffer. |
I have now tried the following:
One thing that is surprizing is that we only have 5 dma channels - even if the mask for DMAs shows that we have more than 5 DMAs that are useable as per parameter:
The dmachans=0x7f35 says which ones are useable for the kernel: (the ones with no bit set) |
You have the sense of the dmachans parameter inverted - a bit set is a free channel, which makes more sense. I'm now investigating the channel usage. |
OK - I miss-read that - but we still have only 5 channels open (as per /sys/class/dma) and when we claim all then the system falls over without diagnostics... Maybe there is an off by one bug that does not register 6 channels but only 5? So maybe these are the dma-channels that are configured (and not any other) and for some reason the foundation DMA implementation has a bug? |
See #1113. |
See: raspberrypi/linux#756 kernel: spi-bcm2835: merge upstream patches allowing DMA transfers See: raspberrypi/linux#1085 kernel: rpisense-fb: add low-light mode and gamma control See: raspberrypi/linux#1104 kernel: bcm2708-dmaengine: Use more DMA channels (but not 12) See: raspberrypi/linux#1113 kernel: Add /dev/gpiomem device for rootless user GPIO access See: raspberrypi/linux#1112 kernel: Add RaspiDAC3 support kernel: Rpi 4.1.y spi bcm2835 patches clock-polarity issue See: raspberrypi/linux#1125 kernel: BCM270X_DT: Add SDIO overlay
See: raspberrypi/linux#756 kernel: spi-bcm2835: merge upstream patches allowing DMA transfers See: raspberrypi/linux#1085 kernel: rpisense-fb: add low-light mode and gamma control See: raspberrypi/linux#1104 kernel: bcm2708-dmaengine: Use more DMA channels (but not 12) See: raspberrypi/linux#1113 kernel: Add /dev/gpiomem device for rootless user GPIO access See: raspberrypi/linux#1112 kernel: Add RaspiDAC3 support kernel: Rpi 4.1.y spi bcm2835 patches clock-polarity issue See: raspberrypi/linux#1125 kernel: BCM270X_DT: Add SDIO overlay
kernel: BCM2835_V4L2: Add support for V4L2_EXPOSURE_METERING_MATRIX See: raspberrypi/linux#1068 kernel: dmaengine: bcm2708-dmaengine: Fix memory leak when stopping a running transfer See: raspberrypi/linux#1072 kernel: BCM270X_DT: mz61581: Revert to spi-bcm2708 See: raspberrypi/linux#1077 kernel: bcm2708/2835-i2s: Fix for PCM register ranges in device trees See: raspberrypi/linux#1079 kernel: bcm2835-sdhost: Add the ERASE capability See: raspberrypi/linux#1076 kernel: bcm2835-sdhost: Ignore CRC7 for MMC CMD1 kernel: BCM270X_DT: Add unit address to gpio node name kernel: spi-bcm2835: merge upstream patches allowing DMA transfers See: raspberrypi/linux#1085 kernel: BCM270X_DT: Use i2c_arm for rtc and bmp085 overlays kernel: BCM2708_DT: CM dtparams for audio, watchdog and RNG firmware: video_decode: Don't wait for a valid timestamp to output frames See: raspberrypi#451
kernel: spi-bcm2835: merge upstream patches allowing DMA transfers See: raspberrypi/linux#1085 kernel: BCM270X_DT: Add pwm and pwm-2chan overlays See: raspberrypi/linux#756 kernel: rpisense-fb: add low-light mode and gamma control See: raspberrypi/linux#1104 firmware: video_decode: Fix up a vfw/avi timestamp hack firmware: arm_loader: Fix issue with hevc decoding See: http://forum.kodi.tv/showthread.php?tid=231092&pid=2057694#pid2057694
See: raspberrypi/linux#756 kernel: spi-bcm2835: merge upstream patches allowing DMA transfers See: raspberrypi/linux#1085 kernel: rpisense-fb: add low-light mode and gamma control See: raspberrypi/linux#1104 kernel: bcm2708-dmaengine: Use more DMA channels (but not 12) See: raspberrypi/linux#1113 kernel: Add /dev/gpiomem device for rootless user GPIO access See: raspberrypi/linux#1112 kernel: Add RaspiDAC3 support kernel: Rpi 4.1.y spi bcm2835 patches clock-polarity issue See: raspberrypi/linux#1125 kernel: BCM270X_DT: Add SDIO overlay
merge of upstream patches allowing for DMA of spi transfers.
Adds an overlay to enable the DMA mode just on request.
See also: #1036 for a discussion.