-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Increase update check efficiency #83
Comments
TODO: open issue in the AppImageKit repo to first define a method how to embed and later read the UIDs from the AppImages. I guess we'll end up adding a section to the ELF header. |
As people want reproducible AppImages, I think we should use some sort of a checksum rather than a random UUID. What do you think? |
A checksum that depends on the contents won't work for various reasons. If I only found the original feature description... |
@TheAssassin is it this one? AppImage/AppImageKit#525 |
Yes, thanks a lot, @pamarcos. I was looking in the repos of AppImageUpdate and zsync2. |
Just to give more information about why I wanted to include custom information in zsync header, in my case it was to have the ability to have meta info that can be read in the client in order to distinguish whether we need to update or not. ATM zsync will always try to sync the file itself if it detects it's different (EDIT: added the bold words), which may not always be desired if the client version is newer than the other one. Also, one could even provide the changelog into the header itself and present that to the user. Basically, it was to give developers the means to extend to whatever they find useful the zsync header ;). Thanks a lot for implementing it. For instance, Sparkle framework checks the version and will only update if and only if, the client is older. |
That's wrong. zsync2 (the CLI) performs an update check before attempting any syncs. The method is expensive, though, which is why we are looking for alternatives. |
Let me rephrase it: "ATM zsync will always try to sync the file itself if it is differerent , instead of if it is newer" The process to check whether it's different or not is what's expensive since it's basically doing the checksums for all the blocks |
@pamarcos zsync2 is a file syncer not a file updater, so syncing is what you want. Not to mention there is a possibility to compare the timestamps of the files, which works great, but which you didn't want back when I introduced it. |
That's correct, and it serves it purpose really well. However, in the context of AppImageUpdate (since we're using it to update AppImages) and thinking of the UX, using just zsync without anything extra lacks the functionality to have a proper updater. My idea about extending the zsync header was to allow developers to decide what they need to check to know whether to download or not a version or even provide extra information to the user. We introduced the With the current implementation of the AppImageUpdate lib we can easily do that, since all we need to do is get the zsync file (which has embedded the extra parameters of version and changelog), parse it and retrieve the version. If version is newer than client's version, we can ask the user whether to update or not, providing the new features and bugfixes.
I didn't like comparing timestamps because I think it's not very realiable. Not many users are going to touch the file, but those who do will lose updating capabilities. |
The zsync header already contains a checksum of the whole file:
I can only presume that calculating the SHA-1 checksum may be slightly faster than the adler+md4 checksum for every block, but I don't think that would make a huge performance improvement to what we have now. |
@pamarcos How about semver in the header and the AppImage to find which is the latest one ? |
@pamarcos hashing the entire file will never be performant. Also, we cannot use a hash, see AppImage/AppImageKit#525 (comment). One thing that is never going to happen is that there will be any kind of version number involved in the update process. Version numbers are not at all reliable. And many projects don't use them. Also, they're bad for continuous releases. Why do you think it's so hard for a developer to maintain a single static file (or symlink to a file) on a webserver? |
I completely agree. I was not suggesting to do so. I was just justifying why adding custom parameters into the zsync header was useful. I do not have the answer to how reliable check whether a file should be updated or not based only on the file if we're not using a hash. Using a UUID seems logic and fair enough to me, but it makes me a little uneasy since it wouldn't even be based on the file itself but just some random and large enough ID that is statistically unlikely to clash with another one. For my use case, checking the version is more than enough to know whether there's need to update or not. Using this approach, we don't need to do the whole checksum for the file which is slower. There's no development that needs to be done to accomplish this since we can already do it. I understand this is not the case for a generic AppImage, so we need another solution that I don't have ATM.
I don't think is hard to maintain. We do have the zsync file updated for every single release. However, there are a few cons to using the "sync this file with this one" approach from the UX point of view:
|
@pamarcos please open separate issues, as your feature proposals are off topic regarding the new update check method. (We might not implement the method I suggested in the first post after discussing those, though.)
Hm, okay, I think we need a solution to support "multi channel updates" and ways to switch between these, or maybe something like prioritized lists of update information, which are checked sequentially, and the first that returns a "change" is used for the next update. The problem, however, is, we need to find a method how this can be implemented in a way so it doesn't depend on any kind of versioning scheme. Suggestions welcome. I still believe the The way I would design this: you could specify a list of update information, e.g., I still feel this is overengineering. It should be possible for update system maintainers to publish the stable version's zsync file under the beta one's URL when there's a stable release. This way, no changes would be necessary to the AppImageUpdate code, as this is possible already (although, to be fair, the beta URL would need to be put into that AppImage to make sure beta releases will be downloaded in the future). Looks like we need a conceptual discussion about the complexity we want in AppImageUpdate, there's a lot of pros and cons to consider. Same goes for the changelog "API". @probonopd once suggested that AppImageUpdate could fetch the changelog for releases on GitHub via the GitHub API without much hassle. If someone comes up with an idea for the other types, I'd work on that. Maybe we can just add a Anyway, that's off topic. We should return to discussing the update check efficiency. Let's open separate issues for the other two topics. |
I think I didn't explain myself properly. I was actually not suggesting any feature, everything I stated should be possible if custom information can be embedded and retrieved into/from the zsync2 header. You said you already implemented that, so nothing else to be done 👍 . I haven't tried it yet, though. I just justified the need for that and explained how I think it can be used to check for updates without running a hash check provided your app has a proper versioning scheme. In case someone else may find it useful. I should probably have given that piece of information in a different place because I see it caused confusion.
Every single time I have worked with times in different time zones it's been a true headache with lots of corner cases to cover. Computerfile has a funny video dedicated to this.
Agree, sorry for the confusion. |
@pamarcos I see, well, I think we need a better API to access the headers, I'll open an issue for that.
Cool. Now we just need to fix the issue @probonopd brought up regarding reproducibility of AppImages. I recently changed some minor detail in the way the squashfs image is generated, so that now AppImages that are built from the exact same AppDir will be bitwise equivalent. We should not break with that feature by using randomly generated UUIDs. I think we might really get away with a hash sum that does not depend on the AppImage but only the squashfs image. Potential caveats involve that changes of the runtime or ELF header fields won't be recognized by AppImageUpdate any more. We could mitigate them to some point by combining this check with the MTime based method, i.e., even if the hash value is the same, if the MTime of the AppImage on the server is newer, an expensive update check is performed. That way, at least a majority of runtime changes will be noticed. |
I didn't find the original description, so I'll describe the idea again below.
I think it was @pamarcos who's requested a "custom .zsync file header" feature for zsync2, which I implemented a while ago. Using this feature, @probonopd and I designed a way to highly increase update checks' efficiency by adding such a custom header.
The idea is to embed some kind of randomly generated (well, the actual constraint is that it doesn't depend on the contents of the AppImage, see
<insert link to original discussion>
) UID (unique identifier) in the AppImage before building the.zsync
file. The same UID is then added as a header field.When checking for an AppImage, zsync2 has the option to download the file header only. When it sees the header containing the UID, it could be compared to the one embedded in the file (AppImageUpdate would need to read it from the AppImage and pass it to the
ZSyncClient
).In case the values differ, a full update is triggered, even if the rest of the files are equal, at least the chunk with the UID needs to be updated.
The chance for collisions (i.e., the same value is used twice, although the files differ) is negligible, similar to the change for hash collisions.
I would suggest to use some "well known" algorithm to generate an identifier, e.g., UUID4. It can be stored efficiently in raw bytes in the AppImage, and stored hexadecimally in the .zsync file for readability.
The performance of the update check will improve a lot for large AppImage files, as hashing these files can be pretty slow, and they need to be hashed completely to ensure the file is up to date.
The text was updated successfully, but these errors were encountered: