High-speed streaming capacity of CDC class #920
This has been asked before, but can you post your board and setup, how you test the throughput, and the results? This allows others to join in and try their own setup for comparison. |
I've done some tests:
cdc_task2: with the FIFO exported from https://github.com/HiFiPhile/tinyusb/blob/60b41ffc1d35ceb8e7b25b9e6fb5202fe0778984/src/class/cdc/cdc_device.h#L108
xfer_fifo: simulation of edpt_xfer_fifo; in this case there is no more copy inside tinyusb/src/class/cdc/cdc_device.c Line 438 in 5a4fc11
Replace tu_fifo_write_n(&p_cdc->rx_ff, &p_cdc->epout_buf, xferred_bytes); with tu_fifo_advance_write_pointer(&p_cdc->rx_ff,xferred_bytes);
According to my tests, doing fewer copies improves performance and reduces execution time significantly. @mfp20 do you have any update on your project? |
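The difference between the two calls above can be illustrated with a minimal sketch. This is not the tu_fifo implementation: ring_t and every function name below are hypothetical, and the ring is assumed to have a single producer and a power-of-two size. The copy path stages data elsewhere and copies it in; the zero-copy path lets the producer (for example a DMA engine) fill the ring's own storage and then only moves the write index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ring buffer for illustration only; NOT tu_fifo. */
#define RING_SIZE 1024u /* power of two, so indices wrap with a mask */

typedef struct {
    uint8_t  buf[RING_SIZE];
    uint32_t wr; /* free-running count of bytes written */
    uint32_t rd; /* free-running count of bytes read */
} ring_t;

static size_t ring_free(const ring_t *r)
{
    return RING_SIZE - (size_t)(r->wr - r->rd);
}

/* Copy path: the data already sits in a staging buffer and is copied in. */
static size_t ring_write_n(ring_t *r, const uint8_t *src, size_t n)
{
    if (n > ring_free(r)) n = ring_free(r);
    size_t off   = r->wr & (RING_SIZE - 1u);
    size_t first = RING_SIZE - off;          /* bytes before the wrap point */
    if (first > n) first = n;
    memcpy(&r->buf[off], src, first);
    memcpy(&r->buf[0], src + first, n - first);
    r->wr += (uint32_t)n;
    return n;
}

/* Zero-copy path, step 1: expose the contiguous free region at the write
 * index so the producer can fill it directly. */
static uint8_t *ring_write_region(ring_t *r, size_t *len)
{
    size_t off    = r->wr & (RING_SIZE - 1u);
    size_t linear = RING_SIZE - off;
    size_t free_b = ring_free(r);
    *len = (linear < free_b) ? linear : free_b;
    return &r->buf[off];
}

/* Zero-copy path, step 2: publish n bytes that are already in place. */
static void ring_advance_write(ring_t *r, size_t n)
{
    assert(n <= ring_free(r));
    r->wr += (uint32_t)n;
}
```

In this sketch the zero-copy path only works when the transfer fits in the contiguous region returned by ring_write_region; a transfer that wraps around the end of the storage would have to be split in two.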
@HiFiPhile which one? |
Last month I saw you had some discussion about DMA and CDC; have you sorted it out? |
No. I wrote a "dma pump" that allocates DMA channels and hands them out on request to heterogeneous consumers, but performance was not as good as integrating DMA into the peripheral's own functions. Working a bit more on that route would probably bring good results, but it is too much work just to circumvent the lack of control over buffers. The easiest path would be the ability to supply your own buffers to peripherals. In the case of TinyUSB: use pointers, function pointers and pointers to pointers instead of allocating a tu_fifo by default, so that the developer can use their own buffer and supply their own buffer access routines to TinyUSB. |
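A rough sketch of what "supply your own buffer and access routines" could look like follows. None of these names exist in TinyUSB; rx_sink_t, capture_t and driver_receive are hypothetical, and the memcpy only stands in for whatever the hardware would do (a DMA transfer or a FIFO read).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical interface: the application owns the storage and hands the
 * driver two access routines instead of the driver allocating a FIFO. */
typedef struct {
    void *ctx;
    /* Return a writable region; *len receives its size in bytes. */
    uint8_t *(*acquire)(void *ctx, size_t *len);
    /* Report that n bytes of the acquired region now hold valid data. */
    void (*commit)(void *ctx, size_t n);
} rx_sink_t;

/* Example application-side storage: a flat capture buffer. */
typedef struct {
    uint8_t data[256];
    size_t  used;
} capture_t;

static uint8_t *capture_acquire(void *ctx, size_t *len)
{
    capture_t *c = ctx;
    *len = sizeof c->data - c->used;
    return &c->data[c->used];
}

static void capture_commit(void *ctx, size_t n)
{
    capture_t *c = ctx;
    assert(n <= sizeof c->data - c->used);
    c->used += n;
}

/* Stand-in for a driver's receive path: the packet goes straight into
 * application memory, with no intermediate driver-owned buffer. */
static size_t driver_receive(const rx_sink_t *sink,
                             const uint8_t *packet, size_t n)
{
    size_t room = 0;
    uint8_t *dst = sink->acquire(sink->ctx, &room);
    if (n > room) n = room;   /* drop what does not fit */
    memcpy(dst, packet, n);   /* placeholder for DMA / FIFO read */
    sink->commit(sink->ctx, n);
    return n;
}
```

An application would fill in an rx_sink_t with its own context and the two callbacks, then pass it to the driver; swapping the flat buffer for a ring or a DMA descriptor list would only change the two callbacks.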
@HiFiPhile wow, that is great detail. Currently queuing only 1 transfer per endpoint can also limit the actual throughput. Expanding it would increase code complexity, though, since more MCUs lack support for that than have it. But it is definitely something we could take a look at. |
Yes, I agree with you. Currently 30MB/s is close to the limit, so I think it's easier to focus on reducing CPU utilisation. I'll try to add edpt_xfer_fifo support for the CDC class and have a few questions, @hathach, @PanRe: |
Hi, for IN transfers, I am currently not sure what |
Sorry, that wasn't very clear; I mean between the class driver and the DCD.
I mean how dcd's |
Ah, OK. Yes, I would say it is the class driver's responsibility to check everything, and the DCD only queues the ordered number of bytes given by |
Maybe |
I've added cdc & dcd_transmission xfer_fifo support and updated the chart; the result seems pretty good. |
Hi @HiFiPhile, I'm relatively new to TinyUSB and just got it working on the STM32F429 Evaluation board running at 180MHz with the external High Speed PHY. I'm using an adaptation of the After using your Python script for testing the throughput on my Windows 10 machine, I get very different results from your chart. Since the STM32F429 MCU is about 24MHz slower, I would expect lower throughput, but I find it odd that it is about 5 times lower than what I see in your charts. Am I missing something here? I use a bare minimum configuration generated with CubeMX for the USB HS interface with DMA enabled but without enabling the Middleware from ST. I adjusted the usb_config.h file to look something like this:
|
Indeed, these numbers are not good for HS transfer... I think it comes from both hardware and software. The Synopsys DWC2 IP inside the STM32F4 is less capable than the Chipidea HS inside the LPC43XX. The Chipidea one has an embedded DMA that can transfer multiple packets without CPU intervention or buffer copy, while for DWC2 there is much more work to do. You can see the DWC2 driver is more than 2x larger. Software side: tinyusb/src/portable/synopsys/dwc2/dcd_dwc2.c Line 955 in aaff27d
|
Same test (reduced a little) on LPC55S69:
Honestly, I'm surprised the throughput is much lower than on the LPC4357! Both DCDs have built-in DMA, and a 150MHz Cortex-M33 should not perform much worse than a 200MHz Cortex-M4. For |
I believe enabling DMA in the dcd_dwc2 driver would help raise the transfer rates. In a throughput test with the standard ST middleware and DMA enabled, I get approximately 16MB/s, as long as the data size is limited. I'm afraid it will take me a considerable amount of time to get that working. Given the throughput rates you are reporting, I am considering using an NXP MCU in the future though ^^ |
Is your feature request related to a problem? Please describe.
Raw USB throughput can be high, but once you copy the data the speed drops significantly. Currently there is no way to skip the FIFO and buffer in the CDC class:
USB HW Buffer ==> ep_out_buf ==> rx_ff ==> User app
Besides, DMA is not supported.
We had some discussions on this subject, for example exposing the FIFO or adding xfer_fifo support.
Describe the solution you'd like
Sort out a solution to make the CDC class capable of high-speed streaming.