Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal to fix regression in 4c9fcf86651232c2104b57510a0ac86cf86123e4 #1057

Closed
wants to merge 4 commits into from

Conversation

Matioupi
Copy link
Contributor

@Matioupi Matioupi commented Mar 6, 2022

see : 4c9fcf8#r68042495

only 25f0d58 is actually requiered.

@straithe
Copy link
Member

straithe commented Mar 7, 2022

Can you describe the regression in more detail, please?

@Matioupi
Copy link
Contributor Author

Matioupi commented Mar 7, 2022

I will try to provide a C example, but using python examples example2.py or example3.py from #1058 exhibit the behaviour.
Without the extra device->streaming = false; the libhackrf state is such that hackrf_is_streaming(pDevice) will alway return "true"
and flow will get stuck in the
while theTestHackRF.isStreaming():

from the waiting loop after transfer callback has been launched.
even if the callback is done (i.e. has returned non 0 value) the state remain "is_streaming == true" and thus never returns.

result = theTestHackRF.startTX(dummyTXCB,None)
if (result != LibHackRfReturnCode.HACKRF_SUCCESS):
    print("Erreur :",result, ",", HackRF.getHackRfErrorCodeName(result))

while theTestHackRF.isStreaming():
    time.sleep(0)

@Matioupi
Copy link
Contributor Author

Matioupi commented Mar 8, 2022

Here is a minimal C example that can be used to reproduce:
hackrf_is_streaming_stuck.zip
compile with :
gcc hackrf_is_streaming_stuck.c -lhackrf -o hackrf_is_streaming_stuck

output without PR is :

mathieu@ZBook15G6:~/Dev/matioupi/test$ ./hackrf_is_streaming_stuck
Setting up HackRF...
Init HackRF...
Just before calling hackrf_start_tx
Just after calling hackrf_start_tx
    stuck in hackrf_is_streaming() wait loop for 5.000000s, TX LED is probably still ON
    stuck in hackrf_is_streaming() wait loop for 5.000000s, TX LED is probably still ON

which shows that is_streaming keeps returning true long after transfer is actually finished. While with patched PR the wait loop exit correctly :

mathieu@ZBook15G6:~/Dev/matioupi/test$ ./hackrf_is_streaming_stuck
Setting up HackRF...
Init HackRF...
Just before calling hackrf_start_tx
Just after calling hackrf_start_tx
    end of hackrf_is_streaming() wait loop after 0.064704s
    approx tx rate : 8102868.500000 Mech/s
Just before calling hackrf_stop_tx
Just after calling hackrf_stop_tx

by the way, I also noticed the block :

if (!resubmit || result < 0) {
    transfer_finished(device, usb_transfer);
}

right above the PR commit, that probably needs adding a device->streaming = false; too. What do you think ?

@mossmann mossmann requested a review from martinling March 8, 2022 16:30
@Matioupi
Copy link
Contributor Author

Hello, did you had time to review/reproduce the issue and test the proposed patch ?

Copy link
Member

@martinling martinling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Matioupi, thanks for reporting the issue and making this PR. I've reproduced the issue using your C program, which was very helpful.

I think your fix is correct, but it took me a while to understand how and why this line ended up being needed:

  • The problem crept in via PR Fixes for concurrency bugs in start/stop operations #1029, which removed the request_exit() call after the callback returns nonzero, and replaced it with transfer_finished() - I did this because we don't want to make the transfer thread exit until all the transfer cancellations are complete.

  • I was confused how this caused a problem with the streaming flag, because the request_exit() function doesn't touch the streaming flag, it only sets do_exit.

  • But hackrf_is_streaming() checks both streaming and do_exit, and it turns out it was depending on do_exit being set in order to return false in this particular scenario: when transfers are stopping because the RX callback has returned nonzero, yet hackrf_stop_rx() has not been called yet.

  • So in order to maintain the behaviour of hackrf_is_streaming(), we need to clear the streaming flag here instead of setting do_exit.

So I'm happy with the fix, but I do have a request before merging it: could you please rebase this branch to include just the commit with the fix. At the moment it includes several unrelated commits which are then undone later.

@martinling martinling linked an issue Mar 16, 2022 that may be closed by this pull request
@martinling martinling added the bug label Mar 16, 2022
@martinling
Copy link
Member

by the way, I also noticed the block :

if (!resubmit || result < 0) {
    transfer_finished(device, usb_transfer);
}

right above the PR commit, that probably needs adding a device->streaming = false; too. What do you think ?

If !resubmit, then we've started cancelling transfers, which means hackrf_stop_{tx|rx} was called, which will have already set streaming to false. So we don't need it in that case.

If result < 0, then libusb_submit_transfer failed. This shouldn't happen, so it's not really clear what we should do if it does. At the moment, we just record that transfer as finished, and it silently falls out of the transfer pool. That might not be the best choice, but since it shouldn't be happening, it also shouldn't be affecting the result of hackrf_is_streaming.

@Matioupi Matioupi force-pushed the master branch 2 times, most recently from 31b73b4 to 8fae12d Compare March 17, 2022 19:55
@Matioupi Matioupi deleted the branch greatscottgadgets:master March 17, 2022 20:03
@Matioupi Matioupi closed this Mar 17, 2022
@Matioupi Matioupi deleted the master branch March 17, 2022 20:03
martinling added a commit that referenced this pull request Mar 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

possible regression in 4c9fcf86651232c2104b57510a0ac86cf86123e4
3 participants