Skip to content
This repository has been archived by the owner on May 21, 2024. It is now read-only.

Only send device data if it has changed #1673

Merged
merged 5 commits into from
May 25, 2020

Conversation

pattivacek
Copy link
Collaborator

We now stored hashes of the hardware/system information, list of installed packages, network information, and configuration in the database. The hardware and network data were previously stored locally in SotaUptaneClient, but this persists over restarts, so it saves even more bandwidth.

Note that at least on my laptop, one field in the output of lshw changes every time I call it, so the hardware info gets sent every time. Thankfully, I could not reproduce that problem on qemu.

I also removed sending the manifest as part of device data. There's no reason to; it is always sent as part of checking for updates, and there is never a time that device data is sent that isn't immediately followed by checking for updates.

There's no reason to; it is always sent as part of checking for updates,
and there is never a time that device data is sent that isn't
immediately followed by checking for updates.

Signed-off-by: Patrick Vacek <[email protected]>
@pattivacek pattivacek force-pushed the feat/OTA-4882/cache-device-data branch from d388961 to 15658ab Compare May 19, 2020 14:33
@eu-siemann
Copy link
Contributor

QEMU doesn't have frequency scaling, most of the real hardware will behave as your laptop.

Copy link
Contributor

@kbushgit kbushgit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that the lshw output is different only in section with id: cpu and the value size, is this value critical for us? Can we just ignore this field and remove it from the resulting json?

@pattivacek
Copy link
Collaborator Author

I see that the lshw output is different only in section with id: cpu and the value size, is this value critical for us? Can we just ignore this field and remove it from the resulting json?

No, not critical, and yes, we could just ignore it, but that requires parsing something that we previously left untouched. We've since agreed to only send the device data once during initialization and only again if a user provides custom data through the new API function, so we don't have to worry about it anymore. I still need to update the PR, though.

@pattivacek
Copy link
Collaborator Author

I also don't understand why the tests are failing. I haven't been able to reproduce locally yet.

@codecov-commenter
Copy link

codecov-commenter commented May 20, 2020

Codecov Report

Merging #1673 into master will increase coverage by 0.23%.
The diff coverage is 87.32%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1673      +/-   ##
==========================================
+ Coverage   82.61%   82.85%   +0.23%     
==========================================
  Files         190      190              
  Lines       12087    12142      +55     
==========================================
+ Hits         9986    10060      +74     
+ Misses       2101     2082      -19     
Impacted Files Coverage Δ
src/libaktualizr/primary/sotauptaneclient.h 100.00% <ø> (ø)
src/libaktualizr/storage/invstorage.h 97.95% <ø> (ø)
src/libaktualizr/storage/sqlstorage.h 50.00% <ø> (ø)
src/libaktualizr/storage/sqlstorage.cc 80.58% <75.00%> (+0.83%) ⬆️
src/libaktualizr/primary/sotauptaneclient.cc 91.35% <93.61%> (+0.66%) ⬆️
src/aktualizr_info/main.cc 92.34% <0.00%> (+0.85%) ⬆️
src/libaktualizr/storage/sqlstorage_base.cc 80.95% <0.00%> (+5.44%) ⬆️
src/libaktualizr/storage/sqlstorage_base.h 100.00% <0.00%> (+40.00%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c903991...42c6ce8. Read the comment docs.

-- Don't modify this! Create a new migration instead--see docs/ota-client-guide/modules/ROOT/pages/schema-migrations.adoc
SAVEPOINT MIGRATION;

CREATE TABLE device_data(unique_mark INTEGER PRIMARY KEY CHECK (unique_mark = 0), hardware_info_hash TEXT NOT NULL DEFAULT "", installed_packages_hash TEXT NOT NULL DEFAULT "", network_info_hash TEXT NOT NULL DEFAULT "", configuration_hash TEXT NOT NULL DEFAULT "");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to dislike this kind of one-row tables.

Maybe we can actually simplify things here with something like a key/value:

CREATE TABLE device_data(data_type TEXT PRIMARY KEY, hash TEXT NOT NULL);

And then, we don't even have to duplicate the load/store methods completely, they can call these with data_type as a constant:

void storeDeviceDataHash(const std::string &data_type, const std::string &hash);
bool loadDeviceDataHash(const std::string &data_type, std::string *hash);

Plus, each of these can just use INSERT OR REPLACE instead of checking if the table is empty first.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was debating whether I should add these new items to the existing device_info table. I decided against it just because it seemed like two separate concepts. This seems like it would be more flexible (as in, fewer migrations necessary to add new items). Should we migrate some of the other similar things to that model as well?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair question. Perhaps I was too vague about what felt wrong but it's probably that these hashes are pretty independent from each others and read/written at different times. And they have the same format, so it felt more natural to have them in different rows.

As far as other examples we have, the primary_keys table looks similar but I think the public/private keys are not so independent and we store both at once. It also looks like the other "one-row" tables are updated all at once, with fields of various types.

Also (again, opinion :D), the pain threshold we should have before changing the table schema should be slightly higher than changing regular code as they are not completely free to migrate (resources as much as risk). So I don't think we have anything urgent to change there right now.

@pattivacek pattivacek force-pushed the feat/OTA-4882/cache-device-data branch from 451482b to 7cdfadc Compare May 22, 2020 18:27
lbonn
lbonn previously approved these changes May 25, 2020
"INSERT OR REPLACE INTO device_data(data_type,hash) VALUES (?,?);", data_type, hash);
if (statement.step() != SQLITE_DONE) {
LOG_ERROR << "Can't set " << data_type << " hash: " << db.errmsg();
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we throw an exception here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, probably.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, to be fair, there are lots of other parts of the storage that don't throw given the same circumstances.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know. Still, will be fewer places to fix in the future ;)

@eu-siemann
Copy link
Contributor

LGTM, except one comment

We now stored hashes of the hardware/system information, list of
installed packages, network information, and configuration in the
database. The hardware and network data were previously stored locally
in SotaUptaneClient, but this persists over restarts, so it saves even
more bandwidth.

Note that at least on my laptop, one field in the output of lshw changes
every time I call it, so the hardware info gets sent every time.
Thankfully, I could not reproduce that problem on qemu.

Signed-off-by: Patrick Vacek <[email protected]>
Custom hardware info will be sent if it is provided and it has changed
from what was sent last.

Signed-off-by: Patrick Vacek <[email protected]>
The Primary is now more efficient and I guess the Secondary needs some
extra time to catch up now. I wish there was a better solution here.

I fixed a typo and tried to make the expiry tests a bit more consistent
with each other since that was originally what I thought might be a
cause of the failure.

Signed-off-by: Patrick Vacek <[email protected]>
@pattivacek pattivacek force-pushed the feat/OTA-4882/cache-device-data branch from 7cdfadc to 42c6ce8 Compare May 25, 2020 10:11
@eu-siemann eu-siemann self-requested a review May 25, 2020 10:37
@pattivacek pattivacek merged commit adadf80 into master May 25, 2020
@pattivacek pattivacek deleted the feat/OTA-4882/cache-device-data branch May 25, 2020 12:28
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants