5e38e08
to
b39eaa3
Compare
Codecov Report
@@            Coverage Diff             @@
##           master    #1032      +/-   ##
==========================================
+ Coverage   82.31%   82.33%   +0.01%
==========================================
  Files         189      189
  Lines       13663    13671       +8
==========================================
+ Hits        11247    11256       +9
+ Misses       2416     2415       -1
} else {
  mt->pause_mutex->unlock(); // Unlock locked by try_lock() mutex.
}
mt->pause();
Can we add a comment here, to document that we expect to wait for the download to not be paused anymore?
This callback is passed through several method calls and constructors and the intent is a bit obfuscated here, IMO.
Agree, also the name is a bit confusing, as it will not pause on every call, but I wasn't able to come up with something better. I'll add a comment.
+1 for a comment, but even just `checkPause` would be a better name.
True, I'll rename it.
Sorry, a bit late again. But it probably wouldn't hurt to have a null check as well before making the call.
@lbonn I don't know, `std::function` will throw if it's empty. Do we want to proceed if the pause callback isn't provided?
Just having:
    if (checkPause != nullptr) {
      checkPause();
    }
Then, users of the class could just pass a null pointer if they don't need the pause functionality. It's only an added functionality, not a part of the core duty of the class.
Agreed, that would make sense.
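For reference, a minimal sketch of the guarded call discussed above. `maybeCheckPause` is a hypothetical wrapper used only for illustration, not a function from this PR, and the `void()` signature of the callback is an assumption. Invoking an empty `std::function` throws `std::bad_function_call`, which is what the guard avoids.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the spot in the download loop where the callback
// is invoked. Returns true if the callback was actually called.
bool maybeCheckPause(const std::function<void()>& checkPause) {
  // An empty std::function compares equal to nullptr; calling it anyway
  // would throw std::bad_function_call.
  if (checkPause != nullptr) {
    checkPause();
    return true;
  }
  // Pause support was not requested by the caller; just carry on.
  return false;
}
```

With this shape, callers that don't need pausing can pass `nullptr` and the download proceeds as if pause support didn't exist.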
Apart from what is already mentioned, looks good to me. I never liked that mutex.
A lot of good stuff here. Thanks!
  }
} while (retry);
throw Exception("image", "Could not download file, error: " + response.error_message);
}
This cleanup is definitely an improvement, but I think we are losing our retry functionality. We currently retry downloads at a high level if they fail for basically any reason. I think that's worth keeping. Originally, I think we gave it three tries before failing.
Thanks! I forgot to look into sotauptaneclient.cc. The retry indeed will not work. But does it work now? The code on master throws in exactly the same way.
My bad, the actual retry logic is in `SotaUptaneClient::downloadImage`. All is well, sorry!
The fetcher test currently uses the `large_interrupted` file for pause/resume. With `large_interrupted`, the server aborts after transferring half of the file (1000 bytes). This should lead to the `The target's calculated hash did not match the hash in the metadata` error, after which the upper layer may choose to retry. However, the test currently succeeds, because while committing the first half to the db, the fetcher gets a `pause` event, which makes it retry silently after `resume` is called. This behavior itself is not a problem; the problem is that it is also correct for the test to fail if `pause` comes after the first chunk is committed.

To solve this:
* Use `large_file`
* Launch the download in a separate thread, so we can time out, currently after 20 sec.
* Pause the fetcher after a certain percent is downloaded, rather than after 200 ms.

Signed-off-by: Eugene Smirnov <[email protected]>
We cannot get any progress report on a file of 2000 bytes, as we receive the whole file in a single curl callback.

Signed-off-by: Eugene Smirnov <[email protected]>
9d8ba4f
to
a16bb89
Compare
I've rebased this onto the merged sqlite changes, added a couple of small fixes from my previous abandoned PR, and resolved the comments, except for one that I'm not yet sure about.
a16bb89
to
16f331e
Compare
@@ -234,7 +234,7 @@ std::future<HttpResponse> HttpClient::downloadAsync(const std::string& url, curl
   curlEasySetoptWrapper(curl_download, CURLOPT_TIMEOUT, 0);
   curlEasySetoptWrapper(curl_download, CURLOPT_LOW_SPEED_TIME, speed_limit_time_interval_);
   curlEasySetoptWrapper(curl_download, CURLOPT_LOW_SPEED_LIMIT, speed_limit_bytes_per_sec_);
-  curlEasySetoptWrapper(curl_download, CURLOPT_RESUME_FROM, from);
+  curlEasySetoptWrapper(curl_download, CURLOPT_RESUME_FROM_LARGE, from);
Would it make sense then to change the type of `from` to `curl_off_t`?
Yes, absolutely. Thanks!
Using a mutex to pause the fetcher is not only non-idiomatic, it is also undefined behavior, as the thread which called `pause` may terminate and/or `resume` might get called from another thread. See [https://en.cppreference.com/w/cpp/thread/mutex] and [https://en.cppreference.com/w/cpp/thread/mutex/unlock]. It seems to work now, but it can easily break at any time.

Also, there was a race condition between `fetchVerifyTarget()` or `DownloadHandler()` and `setPause()`, which should be solved with this commit:

    http->download()
    setPause(true)
    DownloadHandler() { ... return written_size + 1 }
    setPause(false)
    http->download() == CURLE_WRITE_ERROR && pause_ == false
    throw OversizedTarget

Signed-off-by: Eugene Smirnov <[email protected]>
* Remove prints that are duplicated in hmi_stub/tests
* Don't send progress events with the same percentage.

Signed-off-by: Eugene Smirnov <[email protected]>
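The "same percentage" filtering in that commit message could look roughly like the sketch below. `ProgressReporter`, its `update` method and the callback signature are hypothetical names made up for this illustration; the actual change in the fetcher may be structured differently.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical helper: forwards a progress event only when the integer
// percentage differs from the last one that was reported.
class ProgressReporter {
 public:
  explicit ProgressReporter(std::function<void(int)> on_progress)
      : on_progress_(std::move(on_progress)) {}

  // Meant to be called from the download write callback after each chunk.
  void update(uint64_t downloaded, uint64_t total) {
    if (total == 0 || !on_progress_) {
      return;  // Nothing sensible to report.
    }
    const int percent = static_cast<int>(downloaded * 100 / total);
    if (percent == last_percent_) {
      return;  // Same percentage as last time: skip the duplicate event.
    }
    last_percent_ = percent;
    on_progress_(percent);
  }

 private:
  std::function<void(int)> on_progress_;
  int last_percent_{-1};  // -1 means nothing has been reported yet.
};
```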
Signed-off-by: Eugene Smirnov <[email protected]>
Signed-off-by: Eugene Smirnov <[email protected]>
16f331e
to
816c81e
Compare
There are several problems with the pause-resume fetcher API and the corresponding fetcher test that I see at the moment.

1. The fetcher test currently uses the `large_interrupted` file for pause/resume. With `large_interrupted`, the server aborts after transferring half of the file (1000 bytes). This should lead to the `The target's calculated hash did not match the hash in the metadata` error, after which the upper layer may choose to retry. However, the test currently succeeds, because while committing the first half to the db, the fetcher gets a `pause` event, which makes it retry silently after `resume` is called. This behavior itself is not a problem; the problem is that it is also correct for the test to fail if `pause` comes after the first chunk is committed.

2. There is a race condition between `fetchVerifyTarget()` or `DownloadHandler()` and `setPause()`.

3. Using a mutex to pause the fetcher is not only non-idiomatic, it is also undefined behavior, as the thread which called `pause` may terminate and/or `resume` might get called from another thread. See [https://en.cppreference.com/w/cpp/thread/mutex] and [https://en.cppreference.com/w/cpp/thread/mutex/unlock]. It seems to work now, but it can easily break at any time.

4. The `downloading_` guard in the fetcher is pretty useless IMO. If I understand correctly, its main purpose is to return `kNotDownloading` if the `Download` API hasn't been called yet. But the `downloading_` flag doesn't guarantee anything. The high-level `Download` API call is asynchronous, and the underlying `Fetcher::fetchVerifyTarget` will be executed at some point in the future on the command queue. `Pause`, on the other hand, is synchronous, so it's entirely possible that the caller gets a `kNotDownloading` even after they have called `Download`. That means the caller has to track the downloading state themselves and provide some retry logic for the aforementioned case. So why bother with `kNotDownloading` at all? A cleaner approach IMO would be to set `pause` regardless of whether `Download` was called or not. If `Download` is called after a `Pause`, it should just block until `Resume` is called.

This PR attempts to solve issues 1-3; issue 4 is out of scope currently, I just want to start a discussion on it.
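To make the mutex issue more concrete, here is a minimal sketch of the usual alternative: a flag guarded by a mutex plus a condition variable, where the mutex is only ever locked and unlocked within a single call on a single thread. `PauseGate` and its method names are hypothetical and chosen for this illustration; this is not the actual fetcher code from the PR.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <future>
#include <mutex>

// Hypothetical helper, not the real Fetcher API.
class PauseGate {
 public:
  // May be called from any thread, before or after a download has started.
  void setPause(bool pause) {
    {
      std::lock_guard<std::mutex> guard(mutex_);
      paused_ = pause;
    }
    cv_.notify_all();  // Wake up any download thread waiting in checkPause().
  }

  // Called by the download thread between chunks; blocks while paused.
  void checkPause() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return !paused_; });
  }

  bool isPaused() {
    std::lock_guard<std::mutex> guard(mutex_);
    return paused_;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool paused_{false};
};
```

Because the lock never outlives a single call, it no longer matters which thread calls `setPause(true)` and which calls `setPause(false)`. A download started while the gate is already paused simply blocks in `checkPause()` until it is resumed, which is also the behavior suggested for issue 4.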