-
Notifications
You must be signed in to change notification settings - Fork 61
feat/ota-4174/direct ostree update of IP Secondary #1500
Conversation
mike-sul
commented
Dec 23, 2019
•
edited
Loading
edited
- Initial version of an ostree direct update support for IP Secondary
- Decouple the secondary interface implementation (aktualizr_secondary) from the TCP server
- Decouple the secondary implementation common functionality from an update/install method specific (update agent)
- Encapsulate the manifest management functionality in a single component and reuse it for Primary and Secondary
- Remove some outdated tests for the secondary and add more relevant tests
Codecov Report
@@ Coverage Diff @@
## master #1500 +/- ##
=======================================
Coverage 80.78% 80.78%
=======================================
Files 184 184
Lines 11182 11182
=======================================
Hits 9033 9033
Misses 2149 2149 Continue to review full report at Codecov.
|
This pull request fixes 1 alert when merging 8828cf4 into 2403384 - view on LGTM.com fixed alerts:
|
8828cf4
to
b404658
Compare
This pull request introduces 1 alert and fixes 1 when merging b404658 into 2403384 - view on LGTM.com new alerts:
fixed alerts:
|
b404658
to
6fbbcc0
Compare
This pull request introduces 1 alert and fixes 1 when merging 6fbbcc0 into 2403384 - view on LGTM.com new alerts:
fixed alerts:
|
6fbbcc0
to
78752fd
Compare
This pull request fixes 1 alert when merging 78752fd into 2403384 - view on LGTM.com fixed alerts:
|
78752fd
to
4548419
Compare
This pull request fixes 1 alert when merging 4548419 into 2403384 - view on LGTM.com fixed alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is where I've gotten so far, more to come soon! So far, good at a high level, just some small details.
void AktualizrSecondary::stop() { /* TODO? */ | ||
if (rebootDetected()) { | ||
LOG_INFO << "Reboot has been detected, applying the new ostree revision: " << pending_target_.sha256Hash(); | ||
// TODO: refactor this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To achieve what goal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO: refactor this to make it simpler as we don't need to persist/store
// an installation status of each ECU but store it just for a given secondary ECU.
We can get rid of dependency on the sqlstorage at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// TODO: refactor this to make it simpler as we don't need to persist/store
// an installation status of each ECU but store it just for a given secondary ECU.
Yep, that makes sense.
We can get rid of dependency on the sqlstorage at all.
How else would you store it, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g. just a file or a simpler implementation that uses sqlite db
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're using libaktualizr, we might as well use the sqlite db, but you might be right about using a simpler implementation for this functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is way, way too big. There is a lot of good stuff here, but it's almost impossible to review, because there are too many things going on at once. This certainly could have been broken down into smaller chunks. At a minimum, the individual refactorings should be done in independent commits. As it is, it's extremely challenging to figure out what changes are related to which other changes. I've done a first pass over this, so we may as well keep it as is, but in the future, we cannot merge PRs this large. They will be rejected.
try { | ||
std::string ca, cert, pkey, server_url; | ||
extractCredentialsArchive(data, &ca, &cert, &pkey, &server_url); | ||
// TODO: why are qe loading this credentials at all ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Authentication with Treehub. See OstreeManager::addRemote()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not questioning the need in ca/cert/pkey, I don't understand why, at first, we put them into the key manager and then the ostree packman fetch them from the key manager, why not just pass ca/cert/pkey to the packman.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point. My guess is that it either came from mimicking the style of the Primary or that it's only done because it's a hack. If the Secondary is just reusing the same credentials as the Primary, it shouldn't need them at all if the Primary is doing the fetching and proxying the objects to the Secondary. But I know we aren't quite there yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, another item to improve implementation, in the future, but it's a minor at the moment, IMHO.
@@ -231,6 +232,7 @@ Json::Value KeyManager::signTuf(const Json::Value &in_data) const { | |||
} | |||
|
|||
Json::Value signature; | |||
// TODO: FIX the hardcoded value of a signature method/algorithm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lbonn do I recall that you said this was correct because it is still used even for ED25519? Or what was the story there? I'd like to clear this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll double check this one too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm I don't recall this particular discussion but it looks wrong in this case, if uptane.key_type is set to ed25519 for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not worry about that for this PR; there's already enough going on here. I made a ticket to track this, though: https://saeljira.it.here.com/browse/OTA-4276.
// TODO: it does not make much sense to read, pass via a function parameter to Virtual secondary | ||
// and store the file that has been already downloaded by Primary | ||
// Primary should apply ECU (primary, virtual, secondary) specific verification, download and installation logic in | ||
// the first place |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I agree with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why ? What's your argument(s) ?
I think, it make sense to download a file straight to a desired location/place or copy it from the primary place to the desired location eliminating passing a file content as a string parameter of a function/method call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think, it make sense to download a file straight to a desired location/place or copy it from the primary place to the desired location eliminating passing a file content as a string parameter of a function/method call.
I agree with that statement. We shouldn't really be passing strings around like this. It should either be a file handle or a read handle to the storage. Do we have a ticket for that already? We should if we don't.
Primary should apply ECU (primary, virtual, secondary) specific verification, download and installation logic in the first place
This is what I don't agree with. Specifically, the Primary cannot apply installation logic for a Secondary. Or what do you mean by "installation logic", actually? Perhaps I've misunderstood.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a ticket for that already? We should if we don't.
No, we don't a have a ticket specifically for this issue, but it could be part of the epic on the secondary and package manager refactoring as it most likely will require the secondary interface change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Can you make that ticket?
We successfully changed the API call for OpenStoredTarget
to use StorageTargetRHandle
, which it seems like we might be able do for virtual Secondaries. It probably won't be that simple for "real" Secondaries, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API/interface for real and virtual secondaries should be the same, effectively one of the questions here how to turn sendFirmware(const std::string& data) into something more flexible and robust so we can pass a file handler/name in one case, TLS certs and treehub in another and actual image data in some other...
4548419
to
d5af374
Compare
This pull request fixes 1 alert when merging d5af374 into e871cc7 - view on LGTM.com fixed alerts:
|
4cb1ac7
to
1a9be9a
Compare
This pull request fixes 1 alert when merging 1a9be9a into 3eeccc7 - view on LGTM.com fixed alerts:
|
1a9be9a
to
1446fbb
Compare
This pull request fixes 1 alert when merging 1446fbb into a6ee6b4 - view on LGTM.com fixed alerts:
|
I just tested this manually and got this when I tried to provision:
|
3df3e99
to
0f8a6c1
Compare
This pull request fixes 1 alert when merging 0f8a6c1 into 45f5416 - view on LGTM.com fixed alerts:
|
0f8a6c1
to
1a34721
Compare
This pull request fixes 1 alert when merging 1a34721 into 45f5416 - view on LGTM.com fixed alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checkpointing my re-review.
|
||
private: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Redundant private
, but not critical at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
// TODO: fake package manager | ||
const std::string fake_pacman_case = "fake_pacman"; | ||
installed_image_info.len = fake_pacman_case.size(); | ||
installed_image_info.hash = Uptane::ManifestIssuer::generateVersionHashStr(fake_pacman_case); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd probably prefer to leave the hash empty, although I'm open to discussion on what the right thing to do is. Either way, not enough to hold this up for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The backend will reject a manifest with an empty hash value.
|
||
#include "package_manager/ostreemanager.h" | ||
|
||
// TBD: consider moving this and SotaUptaneClient::secondaryTreehubCredentials() to encapsulate them in one place that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason to chose "TBD" over "TODO"? It makes grepping more challenging, unless that was the goal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will replace it with TODO.
|
||
static std::pair<bool, std::shared_ptr<Uptane::SecondaryInterface>> create(const std::string& address, | ||
unsigned short port, int con_fd); | ||
static SecondaryInterface::Ptr create(const std::string& address, unsigned short port, int con_fd); | ||
|
||
explicit IpUptaneSecondary(const std::string& address, unsigned short port, EcuSerial serial, | ||
HardwareIdentifier hw_id, PublicKey pub_key); | ||
|
||
// It looks more natural to return const EcuSerial& and const Uptane::HardwareIdentifier& | ||
// and they should be 'const' methods |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are const
methods now so that comment can be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of TODOs here, and historically we only allowed those if they were associated with a ticket to actually investigate them. (That's one of the things the LGTM check broadly helps with.) You don't need to do that before we merge this, but it would be good to follow up on those.
Most of my comments here are small things, so hopefully it won't take long to resolve them. Generally looks pretty good, and we've both tested it, so I know it's close.
// optional, if not specified nothing is updated, just image data are verified and delivered | ||
boost::filesystem::path filepath; | ||
// optional, if not specified any target can be applied | ||
std::string target_name; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm our conversation: I'd like to try to keep these out for now. target_name
can just be the device ID, and filepath
can basically be hardcoded for now.
if (!storage->loadEcuReportCounter(&ecu_cnt) || (ecu_cnt.size() == 0)) { | ||
LOG_ERROR << "No ECU version report counter, please check the database!"; | ||
// should we send manifest at all in this case ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might just have to be a TODO for now.
} else { | ||
LOG_ERROR << "Secondary manifest is corrupted or not signed, manifest: " << secmanifest; | ||
// should we report to BE about it ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a TODO linked to that ticket in some way.
// download an ostree revision just for Primary, Secondary will do it by itself | ||
// Primary cannot verify downloaded OSTree targets for Secondaries, | ||
// Downloading of Secondary's ostree repo revision to the Primary's can fail | ||
// if they differ signficantly as ostree has a certain cap/limit of the diff it pulls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain this bit about "ostree has a certain cap/limit of the diff it pulls"? I wasn't aware of that.
To me, the comment about "Downloading of Secondary's ostree repo revision to the Primary" is a bit confusing, because the larger conceptual problem is that the Primary shouldn't be downloading objects for the Secondary if it has no use for them and they would just waste space.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the Primary shouldn't be downloading objects for the Secondary if it has no use for them and they would just waste space.
That's exactly what we need this if statement if (update.IsForEcu(primary_ecu_serial) || !update.IsOstree())
for, i.e. to avoid downloading of an ostree revision aimed for a secondary's repo to a primary's ostree repo. The purpose of this comment is to justify this of statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough, I was just a bit thrown off. But is it really true that ostree limits the size of a diff it pulls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain this bit about "ostree has a certain cap/limit of the diff it pulls"? I wasn't aware of that.
`Dec 12 17:27:56 qemux86-64 aktualizr while pulling image: 0 Writing content object: min-free-space-percent '3%' would be exceeded, at least 12.2 MB requested
Error message while pulling secondary's ostree repo version/revision/commit-objects to the primary's one if there is big diff between two of them`
I ran into it a few times on qemu when was testing an ostree update on secondary at the initial stage of development of the ostree update on secondary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. That error is just because the primary is running out of space, though; that isn't specifically dependent on the size of the diff. I've seen something like "error 34 need more input" which I was hoping was related, but I still don't always really understand why I see that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That error is just because the primary is running out of space, though; that isn't specifically dependent on the size of the diff.
My understanding is that the error depends on both an available free space on a storage and a diff size (https://salsa.debian.org/debian/ostree/commit/1f5ce1a9f789d9c0de5d6fbdf79540bf71c5bc9b, "specifies a minimum
percentage of total space (in blocks) in the underlying filesystem to
keep free. The default value is 3 ")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, I get an error when I am trying to pull an ostree revision of the qemu secondary/primary to a local ostree generated by the makephysical tool.
src/libaktualizr/utilities/utils.cc
Outdated
@@ -275,7 +275,8 @@ std::string Utils::genPrettyName() { | |||
std::string Utils::readFile(const boost::filesystem::path &filename, const bool trim) { | |||
boost::filesystem::path tmpFilename = filename; | |||
tmpFilename += ".new"; | |||
|
|||
// that's kind of dangerous to include a specific use-case business logic into a generic function used by many use | |||
// cases |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would you prefer instead? If you see an actual problem, it should probably have a TODO associated with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just read a file in Utils::readFile
and introduce another function that removes the '.new' file and utilizes the "clean" version of Utils::readFile
. Probability that actual problem occurs is very low, but the checking and removing of the ".new" file is redundant/useless in most of the places this function is called from.
- Each target listed in Director's target file should have corresponding match in ImageRepo's one - Test for the use-case if IP Secondary Director's target file and ImageRepo's one have different number of targets - Separate the ostree specific tests from non-ostree one in the IP Secondary Verification test suite Signed-off-by: Mike Sul <[email protected]>
…r IP Secondary Signed-off-by: Mike Sul <[email protected]>
1a34721
to
b90c955
Compare
This pull request fixes 1 alert when merging b90c955 into 4c41de8 - view on LGTM.com fixed alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good once CI passes!
b90c955
to
3bcf5c5
Compare
Added corresponding record into CHANGELOG.md |
This pull request fixes 1 alert when merging 3bcf5c5 into 4c41de8 - view on LGTM.com fixed alerts:
|
@@ -27,7 +31,7 @@ class Bootloader { | |||
private: | |||
const BootloaderConfig config_; | |||
|
|||
INvStorage& storage_; | |||
std::shared_ptr<INvStorage> storage_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the reason for this change? shared_ptr makes it much harder to understand the lifetime of an object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the reason for this change?
- make it consistent with the way the other components "owns" a pointer to the storage instance
- it imposes an additional burden to the bootloader clients to take care about the storage instance lifetime
static std::vector<Campaign> campaignsFromJson(const Json::Value &json); | ||
static void JsonFromCampaigns(const std::vector<Campaign> &in, Json::Value &out); | ||
static std::vector<Campaign> fetchAvailableCampaigns(HttpInterface &http_client, const std::string &tls_server); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While it seems logical to make those static methods, do we need a namespace for a single class now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made them static members to make make docs
(doxygen) happy, for some reason it cannot see these three functions implementation if it is just in a scope of the namespace.
do we need a namespace for a single class now?
I think, it can be removed now.
Overall, it's very hard to review and to follow the changes. Things, like renaming a class or changing a class member, should be in separate commits with a clear description. I'm fine with merging new functionality as a single commit if it passes basic tests, but changes to the main codebase should be separated, so we can easily track them and revert if needed. |
fair enough, will try not to produce such huge PRs going forward and break changes into smaller commits. |
I briefly reviewed all the changes and apart from commit history my main question is about introducing shared pointers for bootloader and storage. I still don't understand the reason for that and, in general, I'd really like to avoid them. IMO, what we want to see in the future is a clear class hierarchy with transparent object ownership and lifetime model and not a situation when everybody has shared pointers to each other. |
- Decouple the secondary interface implementation (aktualizr_secondary) from the TCP server - Decouple the secondary implementation common functionality from an update/install method specific (update agent) - Encapsulate the manifest management functionality in a single component and reuse it for Primary and Secondary - Remove some outdated tests for the secondary and add more relevant tests Signed-off-by: Mike Sul <[email protected]>
3bcf5c5
to
0a74dda
Compare
This pull request fixes 1 alert when merging 0a74dda into 21c8595 - view on LGTM.com fixed alerts:
|
Replaced by #1518. |