This repository has been archived by the owner on May 21, 2024. It is now read-only.

Refact/ota 4814/install on secondary #1642

Merged
merged 32 commits into from
Jun 18, 2020

Conversation

mike-sul
Collaborator

@mike-sul mike-sul commented Apr 15, 2020

Verified this PR's aktualizr-primary against aktualizr-secondary built from this PR, from the current master, and from 2020.5. Both types of update were verified: ostree and binary/file (qemu).

Also, a basic test verifying backward compatibility has been added to this test suite:

std::make_pair(1024 * 3 + 1, false /* old messages/sendFirmware, fallback testing */),

@mike-sul mike-sul force-pushed the refact/OTA-4814/install-on-secondary branch 4 times, most recently from ad2d418 to d9c1f78 Compare April 15, 2020 11:01
@codecov-io

codecov-io commented Apr 15, 2020

Codecov Report

Merging #1642 into master will decrease coverage by 0.00%.
The diff coverage is 80.52%.

@@            Coverage Diff             @@
##           master    #1642      +/-   ##
==========================================
- Coverage   82.71%   82.71%   -0.01%     
==========================================
  Files         190      194       +4     
  Lines       12087    12266     +179     
==========================================
+ Hits         9998    10146     +148     
- Misses       2089     2120      +31     
Impacted Files Coverage Δ
src/aktualizr_secondary/secondary_tcp_server.h 100.00% <ø> (ø)
src/aktualizr_secondary/update_agent_ostree.h 100.00% <ø> (ø)
src/libaktualizr-isotp/isotpsecondary.cc 0.00% <0.00%> (ø)
src/libaktualizr-isotp/isotpsecondary.h 0.00% <ø> (ø)
src/libaktualizr-posix/ipuptanesecondary.h 100.00% <ø> (ø)
src/libaktualizr/primary/sotauptaneclient.h 100.00% <ø> (ø)
src/virtual_secondary/managedsecondary.h 100.00% <ø> (ø)
.../virtual_secondary/partialverificationsecondary.cc 67.34% <0.00%> (ø)
...c/virtual_secondary/partialverificationsecondary.h 63.63% <ø> (ø)
src/virtual_secondary/virtualsecondary.h 100.00% <ø> (ø)
... and 45 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 48dcb6e...d948dec. Read the comment docs.

@mike-sul mike-sul force-pushed the refact/OTA-4814/install-on-secondary branch 3 times, most recently from 13440a3 to 7c6ad01 Compare April 15, 2020 13:16
Contributor

@lbonn lbonn left a comment

Quite a lot to review! Here are mostly small comments on my first pass.

src/libaktualizr/uptane/secondaryinterface.h Outdated Show resolved Hide resolved
src/virtual_secondary/managedsecondary.cc Outdated Show resolved Hide resolved
src/virtual_secondary/managedsecondary.cc Outdated Show resolved Hide resolved
src/virtual_secondary/managedsecondary.cc Outdated Show resolved Hide resolved
Collaborator

@pattivacek pattivacek left a comment

I'm not done, but here's what I've got so far until I get another break.

src/aktualizr_secondary/main.cc Show resolved Hide resolved
src/aktualizr_secondary/update_agent_file.cc Show resolved Hide resolved
src/aktualizr_secondary/update_agent_file.cc Show resolved Hide resolved
src/aktualizr_secondary/update_agent_file.cc Outdated Show resolved Hide resolved
src/aktualizr_secondary/update_agent.h Outdated Show resolved Hide resolved
src/aktualizr_secondary/aktualizr_secondary.h Outdated Show resolved Hide resolved
src/aktualizr_secondary/aktualizr_secondary.h Show resolved Hide resolved
src/aktualizr_secondary/aktualizr_secondary_interface.h Outdated Show resolved Hide resolved
@mike-sul mike-sul force-pushed the refact/OTA-4814/install-on-secondary branch from 7c6ad01 to 308710c Compare April 16, 2020 13:18
Contributor

@eu-siemann eu-siemann left a comment

Overall looks good! Do I understand correctly that the main change in secondary iface is to remove sendFirmware step and to provide an image reader callback instead, that will be called internally in install?

src/aktualizr_primary/secondary.cc Outdated Show resolved Hide resolved
src/libaktualizr/crypto/crypto.h Outdated Show resolved Hide resolved
src/libaktualizr/primary/aktualizr.h Outdated Show resolved Hide resolved
- if (send_firmware_result) {
-   result = secondary.install(target.filename());
- }
+ data::ResultCode::Numeric result = secondary.install(target);
Contributor

Removing some code from sotauptaneclient.cc is nice!

src/libaktualizr/uptane/uptane_test.cc Outdated Show resolved Hide resolved
@mike-sul
Collaborator Author

the main change in secondary iface is to remove sendFirmware step and to provide an image reader callback instead, that will be called internally in install?

Yes, the main change in the secondary interface is getting rid of the sendFirmware() method.
The image reader callback is not exactly part of the interface (SecondaryInterface); it is rather an optional parameter of a specific secondary that might need it (e.g. the ostree secondary does not need it).

The whole point of removing sendFirmware is to let a specific secondary decide what exactly to do in order to install a target, and to keep its specifics out of the common interface. So, a concrete secondary needs to implement the install(const Target&) method; how exactly that is done, and what the internal protocol/communication between the primary and secondary ECUs looks like, is solely at its discretion. This is the next step after "freeing" the IP Secondary from its dependency on the secondary interface.

Before that, the secondary interface reflected a specific use case (file/binary update with full Uptane verification), so any other implementation had the headache of fitting itself into that interface and protocol.

The next logical step, IMHO, would be changing 'putMetadata(const RawMetaPack& meta_pack)' to 'putMetadata(const Target& target)'.

And the whole strategy here is to bring the update procedure on Primary and Secondary to a common denominator so the sotauptaneclient would perform the Uptane update on different entities (Primary ECU, different types of Virtual ECUs, different types of real ECUs) in a uniform way.
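
To make the shape of this change concrete, here is a minimal sketch of what is described above: a common interface that only exposes install(const Target&), with the image reader supplied to the one Secondary type that needs it. All type and class names below (Target, ResultCode, SecondarySketch, ImageReaderProvider, FileSecondarySketch, OstreeLikeSecondarySketch) are hypothetical stand-ins for illustration and are not the actual aktualizr declarations.

```cpp
#include <cassert>
#include <functional>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-ins; these are NOT the real aktualizr types.
struct Target {
  std::string filename;
};

enum class ResultCode { kOk, kInstallFailed };

// The common interface only knows "install this target".
class SecondarySketch {
 public:
  virtual ~SecondarySketch() = default;
  virtual ResultCode install(const Target& target) = 0;
};

// Assumed callback type: returns a stream over the image of a target.
using ImageReaderProvider = std::function<std::unique_ptr<std::istream>(const Target&)>;

// A file/binary Secondary needs the image bytes, so it receives the reader
// provider as a constructor argument instead of through the interface.
class FileSecondarySketch : public SecondarySketch {
 public:
  explicit FileSecondarySketch(ImageReaderProvider provider) : provider_(std::move(provider)) {
    assert(static_cast<bool>(provider_));  // this Secondary type cannot work without a reader
  }

  ResultCode install(const Target& target) override {
    std::unique_ptr<std::istream> image = provider_(target);
    if (!image || !*image) {
      return ResultCode::kInstallFailed;
    }
    // Delivery and installation are internal details of this Secondary.
    // For brevity the sketch just copies the bytes into a string.
    std::ostringstream received;
    received << image->rdbuf();
    installed_image_ = received.str();
    return ResultCode::kOk;
  }

  const std::string& installedImage() const { return installed_image_; }

 private:
  ImageReaderProvider provider_;
  std::string installed_image_;
};

// An ostree-like Secondary fetches its own content and needs no reader.
class OstreeLikeSecondarySketch : public SecondarySketch {
 public:
  ResultCode install(const Target& target) override {
    last_ref_ = target.filename;
    return ResultCode::kOk;
  }

  const std::string& lastRef() const { return last_ref_; }

 private:
  std::string last_ref_;
};
```

The caller only ever sees install(); whether a Secondary streams a file, pulls an ostree commit, or does something else stays inside the concrete class.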

Collaborator

@pattivacek pattivacek left a comment

Finally got all the way through. Generally does look quite good, but I have several comments and questions. We still need to decide whether we want to merge this soonish or wait until we have all of our intended refactorings and interface changes complete.

I do remember now why we had inherited from the SecondaryInterface on the Secondary before. It was so that you would immediately know if you changed something that could break the compatibility between them. Obviously that's not perfect but it was meant to help track those types of changes. Now we just need to be vigilant when making changes in these parts of the code.

src/aktualizr_secondary/aktualizr_secondary_test.cc Outdated Show resolved Hide resolved
src/aktualizr_secondary/aktualizr_secondary_test.cc Outdated Show resolved Hide resolved
src/aktualizr_secondary/secondary_rpc_test.cc Outdated Show resolved Hide resolved
src/libaktualizr-isotp/isotpsecondary.cc Show resolved Hide resolved
src/libaktualizr-isotp/isotpsecondary.cc Show resolved Hide resolved
src/libaktualizr-posix/ipuptanesecondary.cc Outdated Show resolved Hide resolved
src/libaktualizr/crypto/crypto.h Outdated Show resolved Hide resolved
src/virtual_secondary/partialverificationsecondary.h Outdated Show resolved Hide resolved
src/virtual_secondary/managedsecondary.h Outdated Show resolved Hide resolved
src/virtual_secondary/managedsecondary.cc Show resolved Hide resolved
@mike-sul
Collaborator Author

It was so that you would immediately know if you changed something that could break the compatibility between them.

First of all, that's what tests are for. Secondly, the previous design could not help with detecting changes in the RPC messages themselves. Bigger changes, e.g. a change in the RPC message format, will be detected by this PR's design too, and any change in the secondary interface will still require changes in the secondary implementation on Primary.

@mike-sul mike-sul force-pushed the refact/OTA-4814/install-on-secondary branch from 308710c to 7822603 Compare April 17, 2020 08:52
@eu-siemann
Contributor

I must admit I am not so sure about merging sendFirmware and install into one call.

Those are the separate steps in the Uptane spec:
5.4.2.7. Send images to Secondaries
and
5.4.3.x Installing images on Primary or Secondary ECUs

And I see at least one good reason for that: if primary doesn't have enough space to store the secondary image it can utilize secondary storage via the sendFirmware call, while still verifying the hash. If hash verification by the primary fails, it will not proceed to the installation step.

@mike-sul
Collaborator Author

mike-sul commented Apr 17, 2020

I must admit I am not so sure about merging sendFirmware and install into one call.

Those are the separate steps in the Uptane spec:
5.4.2.7. Send images to Secondaries
and
5.4.3.x Installing images on Primary or Secondary ECUs

And I see at least one good reason for that: if primary doesn't have enough space to store the secondary image it can utilize secondary storage via the sendFirmware call, while still verifying the hash. If hash verification by the primary fails, it will not proceed to the installation step.

That's not exactly what happens here. There are still two phases, firmware upload/delivery and installation; they just moved from sotauptaneclient to the Primary's part of the secondary implementation.

And I see at least one good reason for that: if primary doesn't have enough space to store the secondary image it can utilize secondary storage via the sendFirmware call, while still verifying the hash. If hash verification by the primary fails, it will not proceed to the installation step.

I think the functionality you are describing would be even easier to implement with the given API/interface than with the previous version of SecondaryInterface. Moreover, I actually think that neither the previous version nor the one in this PR is good enough to solve the given issue properly. IMHO, the right design here would be to introduce the notion of a logical ECU, so that a specific logical ECU would be responsible not just for sending firmware from Primary to Secondary but also for downloading the image from the backend. In that case, a specific Secondary implementation (its Primary part) would be able to implement the download mechanism in such a way that it streams the image directly to the Secondary. Otherwise, with this fake package manager and multiple types of Secondaries, it would be really hard to implement business logic within sotauptaneclient that accommodates all the use cases of each Secondary type.

@mike-sul mike-sul closed this Apr 17, 2020
@mike-sul mike-sul reopened this Apr 17, 2020
@mike-sul mike-sul force-pushed the refact/OTA-4814/install-on-secondary branch 2 times, most recently from 4c71ce3 to e1d2af7 Compare April 17, 2020 11:57
@pattivacek
Collaborator

Two more questions:

  1. How does this impact downloading large objects on the Secondary and storing them in memory? I know this gets rid of the std::string, though.
  2. Have you tested manually upgrading a Secondary from the old version to the new version with qemu or whatever?

@mike-sul
Collaborator Author

Two more questions:

  1. How does this impact downloading large objects on the Secondary and storing them in memory? I know this gets rid of the std::string, though.

It improves the uploading of large objects from Primary to Secondary, as it generally doesn't hold the whole image in RAM and doesn't pass it through the function call stack (passing a string with >1GB size as a parameter is not exactly what we wanna do). Image data is transferred from Primary to Secondary in pieces.

  2. Have you tested manually upgrading a Secondary from the old version to the new version with qemu or whatever?

Yes, details are in my first comment to this PR.
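
The "in pieces" transfer described above can be sketched roughly as follows. This is only an illustration under stated assumptions: sendInChunks and ChunkSender are hypothetical names, and the real transport between Primary and Secondary is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical callback that delivers one chunk to the Secondary and
// reports whether the Secondary accepted it.
using ChunkSender = std::function<bool(const char* data, std::size_t size)>;

// Reads the image in fixed-size pieces, so at most `chunk_size` bytes of
// image data are buffered at a time. Returns the number of bytes sent, or
// -1 if a chunk was rejected.
long long sendInChunks(std::istream& image, std::size_t chunk_size, const ChunkSender& send) {
  assert(chunk_size > 0);
  std::vector<char> buffer(chunk_size);
  long long total = 0;
  while (image) {
    image.read(buffer.data(), static_cast<std::streamsize>(buffer.size()));
    const std::streamsize got = image.gcount();
    if (got <= 0) {
      break;  // nothing left to send
    }
    if (!send(buffer.data(), static_cast<std::size_t>(got))) {
      return -1;  // the Secondary rejected this chunk
    }
    total += got;
  }
  return total;
}
```

With a 1024 * 3 + 1 byte image and a 1024-byte chunk size, this sends three full chunks followed by one single-byte chunk, and the whole image never has to exist as a single std::string.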

Collaborator

@pattivacek pattivacek left a comment

I've added a couple commits to clean things up and I have one lingering question to look into.

src/libaktualizr-posix/asn1/asn1_message.cc Outdated Show resolved Hide resolved
And restore the metadata RPC test.

Signed-off-by: Patrick Vacek <[email protected]>
RawMetaPack had a fixed size, which makes it impractical for using with
delegations. This way we can use whatever metadata we have, and whenever
we get around to supporting delegations for Secondaries or
Timestamp/Snapshot from the Director, it won't be a big deal to add them
in as needed.

Greater flexibility comes with the cost of verbosity, but I don't think
it's too bad.

Signed-off-by: Patrick Vacek <[email protected]>
Mostly due to changes in reading stored files via the package manager
instead of the storage class.

I really don't like "fixup" commits like this but it's way too tedious
to keep doing this at this point.

Signed-off-by: Patrick Vacek <[email protected]>
putRoot, putMetadata, sendFirmware, and install now all return an
InstallationResult. It's a bit of a misnomer but it allows us to return
an error code and a freeform string, which seems ideal.

Signed-off-by: Patrick Vacek <[email protected]>
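
As a rough illustration of returning "an error code and a freeform string" from each step, here is a minimal sketch. The names Code, OperationResult, putMetadataSketch, and installSketch are hypothetical, and the rule that a kOk code implies success in the two-parameter constructor is an assumption made for the example, not a statement about the real InstallationResult.

```cpp
#include <string>
#include <utility>

// Hypothetical result codes, not the real data::ResultCode values.
enum class Code { kOk, kVerificationFailed, kInstallFailed };

struct OperationResult {
  // Two-parameter form: success is derived from the code (assumed rule).
  OperationResult(Code code_in, std::string description_in)
      : success(code_in == Code::kOk), code(code_in), description(std::move(description_in)) {}

  // Three-parameter form: the caller controls the success flag explicitly.
  OperationResult(bool success_in, Code code_in, std::string description_in)
      : success(success_in), code(code_in), description(std::move(description_in)) {}

  bool success;
  Code code;
  std::string description;
};

// Each step reports a code plus a human-readable description.
OperationResult putMetadataSketch(bool metadata_valid) {
  if (!metadata_valid) {
    return OperationResult(Code::kVerificationFailed, "metadata verification failed");
  }
  return OperationResult(Code::kOk, "");
}

// A later step forwards an earlier failure unchanged, so the original
// description can travel all the way back to whoever reports it.
OperationResult installSketch(const OperationResult& metadata_result) {
  if (!metadata_result.success) {
    return metadata_result;
  }
  return OperationResult(Code::kOk, "installed");
}
```

Using one result type for every step keeps error forwarding uniform, at the cost of the name being a slight misnomer for non-installation steps.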
That does mean that it does not work for other types of Secondary, but
since it is only used for testing, that's a good thing. It is now
encapsulated solely in a class only used for testing and demonstration.

Signed-off-by: Patrick Vacek <[email protected]>
Seems like it doesn't really need to be in VirtualSecondary, and since
it is virtual it can still be overridden just like the other functions
if desired.

Signed-off-by: Patrick Vacek <[email protected]>
Also update the fault injection doc and add some better comments to
explain the custom ResultCode usage.

Signed-off-by: Patrick Vacek <[email protected]>
Now putMetadata, sendFirmware, and install all can return a return code
and a description. This should get forwarded all the way back to the
backend now.

Note that some of the other requests (like getInfor and getManifest) do not
yet use anything like that. I didn't see a need, but I'm open to
discussion.

Protocol negotiation is now only done during putMetadata, so that's the
only thing that gets retried if a Secondary is still using the old v1
protocol messages. This can still be improved a bit.

I also explicitly labeled some of the messages that are now deprecated
as part of the first version of the protocol.

Signed-off-by: Patrick Vacek <[email protected]>
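
The fallback described in this commit message could look something like the sketch below: try the newer message first in putMetadata, and retry with the v1 message only if the Secondary reports that it does not understand it. Everything here (ProtocolVersion, SendStatus, MetadataSender, NegotiatingSecondarySketch) is a hypothetical illustration; how the real code detects an old Secondary is not shown in this thread.

```cpp
#include <functional>
#include <string>
#include <utility>

enum class ProtocolVersion { kUnknown, kV1, kV2 };
enum class SendStatus { kOk, kUnsupported, kError };

// Hypothetical transport: sends metadata using the given message version.
using MetadataSender = std::function<SendStatus(ProtocolVersion, const std::string&)>;

class NegotiatingSecondarySketch {
 public:
  explicit NegotiatingSecondarySketch(MetadataSender sender) : sender_(std::move(sender)) {}

  // Negotiation happens only here: v2 is tried first unless the Secondary
  // is already known to speak v1.
  bool putMetadata(const std::string& metadata) {
    if (version_ != ProtocolVersion::kV1) {
      const SendStatus status = sender_(ProtocolVersion::kV2, metadata);
      if (status == SendStatus::kOk) {
        version_ = ProtocolVersion::kV2;
        return true;
      }
      if (status != SendStatus::kUnsupported) {
        return false;  // a real error, not a protocol mismatch
      }
      // The Secondary does not understand v2: fall through and retry with v1.
    }
    if (sender_(ProtocolVersion::kV1, metadata) != SendStatus::kOk) {
      return false;
    }
    version_ = ProtocolVersion::kV1;
    return true;
  }

  // Later calls can consult the negotiated version without retrying.
  ProtocolVersion version() const { return version_; }

 private:
  MetadataSender sender_;
  ProtocolVersion version_{ProtocolVersion::kUnknown};
};
```

Remembering the negotiated version means an old Secondary costs one extra round trip on the first putMetadata only; subsequent calls go straight to the v1 message.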
In the IpSecondary code, it was originally used for sendFirmware but
ended up in the install function due to refactoring mistakes. However,
it doesn't appear to be necessary.

In the ManagedSecondary code, the mutex was actually unused.

Signed-off-by: Patrick Vacek <[email protected]>
I refactored the RPC tests substantially but functionally only added
another test that verifies that errors reported by the Secondary are
correctly interpreted on the Primary (both result code and description).

We already cover that these errors are reported from the Secondary
implementations on the Primary back to the server via the installation
report thanks to the VirtualSecondary failure injection tests.

Signed-off-by: Patrick Vacek <[email protected]>
Now there is an explicit process for it. We have to check more often
than I'd like, but I don't see an easy way to avoid that since
Secondaries that need to reboot to complete installation can
unexpectedly change their version between any two manifest requests.

Signed-off-by: Patrick Vacek <[email protected]>
Signed-off-by: Patrick Vacek <[email protected]>
I actually caught a bug with this... but the bug was just in the mock.

Signed-off-by: Patrick Vacek <[email protected]>
I've yet to encounter a case when that status would not imply success,
and you can still control the success boolean if desired with the
three-parameter constructor.

Signed-off-by: Patrick Vacek <[email protected]>
@pattivacek pattivacek force-pushed the refact/OTA-4814/install-on-secondary branch from 1310543 to c8b6be5 Compare June 18, 2020 06:46
@pattivacek
Collaborator

Well, it's only taken two months, but this PR is finally ready for review. I think all of it has already been reviewed at least once at some point, and I feel comfortable merging it after having done substantial testing on it. However, I'd like to provide one more chance for others to take a look and provide their input. @mike-sul I'm also curious if you see any causes for concern.

@lgtm-com

lgtm-com bot commented Jun 18, 2020

This pull request introduces 3 alerts when merging c8b6be5 into f868186 - view on LGTM.com

new alerts:

  • 3 for Expression has no effect

@advancedtelematic advancedtelematic deleted a comment from lgtm-com bot Jun 18, 2020
@mike-sul
Collaborator Author

I'm also curious if you see any causes for concern

@patrickvacek At first glance, I don't see any reasons for concern; I will look into it more closely later today.
In any case, I suppose it makes sense to tell customers the correct procedure for deploying/switching to the new version (deploy on the Primary first and then on the Secondaries, iirc).

Contributor

@eu-siemann eu-siemann left a comment

Great job, @mike-sul , and @patrickvacek ! Everything looks good to me, let's finally merge it :)

@eu-siemann eu-siemann merged commit d089d8b into master Jun 18, 2020
@eu-siemann eu-siemann deleted the refact/OTA-4814/install-on-secondary branch June 18, 2020 15:21