Deployment is not atomic #226

Closed
weavejester opened this issue Jul 5, 2014 · 7 comments

Comments

@weavejester

Deploying to Clojars isn't atomic, so if any error occurs mid-deploy, the upload is only partially committed. Running the process again can therefore result in an inconsistent upload, and because redeployment is forbidden, it cannot easily be fixed.

Ideally Clojars should wait for a consistent project before locking the version.

@technomancy
Collaborator

Agreed; happy to take a patch for this.

@xeqi
Collaborator

xeqi commented Jul 5, 2014

Brain dump for anyone interested in this topic:

Deploying through lein/maven ends up using aether, which in turn uses wagon-http. A deploy is a series of PUTs for each artifact deployed (file, checksums), a GET for the maven-metadata.xml, and a PUT with an updated maven-metadata.xml.
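To make the request sequence concrete, here is a minimal Python sketch that lists the requests one deploy would issue. The function name `deploy_requests`, the `.md5`/`.sha1` checksum pair, and the ordering of the checksum PUTs relative to each file are assumptions for illustration, not something taken from aether's source.

```python
from typing import List, Tuple

# Assumption: each file is accompanied by the usual Maven checksum pair.
CHECKSUM_EXTS = (".md5", ".sha1")


def deploy_requests(group: str, name: str, version: str,
                    files: List[str]) -> List[Tuple[str, str]]:
    """Return the (method, path) pairs for one deploy, in order."""
    group_path = group.replace(".", "/")
    version_dir = f"{group_path}/{name}/{version}"
    metadata = f"{group_path}/{name}/maven-metadata.xml"

    requests = []
    for filename in files:
        # One PUT for the file itself, then one per checksum.
        requests.append(("PUT", f"{version_dir}/{filename}"))
        for ext in CHECKSUM_EXTS:
            requests.append(("PUT", f"{version_dir}/{filename}{ext}"))
    # The client fetches the current metadata, then uploads an updated copy.
    requests.append(("GET", metadata))
    requests.append(("PUT", metadata))
    return requests


if __name__ == "__main__":
    for method, path in deploy_requests(
            "org.example", "demo", "0.1.0",
            ["demo-0.1.0.jar", "demo-0.1.0.pom"]):
        print(method, path)
```

Every request in that list is independent, so a failure anywhere before the final PUT leaves some files stored and others missing.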

Note that an artifact is not a project. A project is made up of several artifacts such as a pom, signatures, and all jars that should be deployed, including ones with classifiers (such as -sources).

A couple of ways it could be done:

Artifact (file + checksums) as the atomic unit

  1. Allow overwriting of an artifact's files until a complete set of file and checksums exists.
  2. Could even perform additional validation on the file (valid jar, parsable pom, valid signature, valid checksums).
  3. Do not allow downloads of an artifact until it is complete.
  4. Do not allow overwriting a complete artifact.
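The four rules above can be sketched as a small in-memory store. This is only an illustration under stated assumptions: `ArtifactStore` is a hypothetical class, an artifact counts as "complete" once the file plus matching `.md5` and `.sha1` checksums are present, and the extra validation from rule 2 is limited to checksum verification (jar, pom, and signature checks are left out).

```python
import hashlib

# Assumption: checksum suffix -> hashlib algorithm name.
CHECKSUMS = {".md5": "md5", ".sha1": "sha1"}


class ArtifactStore:
    """In-memory sketch of 'artifact (file + checksums) as the atomic unit'."""

    def __init__(self):
        self._files = {}  # path -> bytes

    @staticmethod
    def _base(path):
        """Map a checksum path back to the file it describes."""
        for suffix in CHECKSUMS:
            if path.endswith(suffix):
                return path[: -len(suffix)]
        return path

    def is_complete(self, base):
        """Complete means: file present and every checksum present and valid."""
        data = self._files.get(base)
        if data is None:
            return False
        for suffix, algo in CHECKSUMS.items():
            stored = self._files.get(base + suffix)
            if stored is None:
                return False
            expected = stored.decode("ascii", errors="replace").strip()
            if expected != hashlib.new(algo, data).hexdigest():
                return False
        return True

    def put(self, path, data):
        # Rules 1 and 4: overwriting is allowed only while incomplete.
        if self.is_complete(self._base(path)):
            raise PermissionError(f"{path}: artifact is complete and locked")
        self._files[path] = data

    def get(self, path):
        # Rule 3: nothing is downloadable until the artifact is complete.
        if not self.is_complete(self._base(path)):
            raise FileNotFoundError(path)
        return self._files[path]
```

With this shape, a deploy that died after uploading a bad checksum can simply be retried, while a finished artifact rejects any further PUT.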

Possible downsides:

  1. Any issues with concurrent uploads?
  2. The clients currently send an updated maven-metadata.xml at the end of deploying a project. We'd need to accept it but discard it, then write a new one ourselves once an artifact is complete. I don't think there is currently any code in clojars to mutate it. I expect we could find a library for it.

Project as the atomic unit

HTTP-based deployments would no longer be acceptable. We'd have to write a custom wagon provider. This would allow complete control over the upload mechanism, such as notifications of locking and completion.

Possible downsides:

  1. Would require lein/maven/etc updates/plugins to use the custom wagon provider.
  2. Concurrent locking stuff?

I'd be happy to hear other ideas.

@technomancy
Collaborator

Fully atomic operations would be great, but a simple fix of making immutability contingent upon an uninterrupted deploy (jar/pom/sigs/checksums present) would go a long way too.

@niwinz

niwinz commented Nov 18, 2014

Any progress on it?

I currently have the last two versions of my library partially uploaded. I can't fix this by re-uploading, and I can't delete them because, as far as I know, Clojars does not support deleting uploaded versions...

@technomancy
Collaborator

Sorry about this. I was performing maintenance on Clojars's DB, which is
currently in SQLite and not concurrent. I've put the maintenance on hold
for the time being.

@tobias
Member

tobias commented Sep 22, 2015

Another approach would be to write the uploaded artifacts to a
session-scoped dir, validating the artifacts when the session is
complete, and moving them to the repo if validation passes.

Aether honors cookies, so we can use a session with deploys. Aether
also sends maven-metadata.xml last, so we can use that as a signal
to know when the deploy is done.

The process:

  1. An upload request comes in.
    • Check the session for an upload dir
    • If the dir doesn't exist, we create a unique one for this session
    • Make sure the upload dir ends up in the session
  2. Validate that the artifact doesn't already exist in the main
    repo or the upload dir; if it does, bail. Do the normal name, etc.
    validations as well
  3. Store the artifact in the upload dir
  4. If the artifact was maven-metadata.xml:
    • Validate the deploy, returning a useful failure status message if
      any check fails (see #367, "Use the http status message to relay more context on deploy failure")
      • Confirm we have a pom
      • Confirm the pom is readable
      • If the pom packaging is "jar", confirm a jar exists
      • If any signatures exist, require signatures for all artifacts
      • Verify all signatures & checksums
    • If validation passes, move the artifacts to the repo, rm upload-dir

Possible Issues

  • aether works, but are there other clients that are used that may not
    work with a session? do folks curl deploys?
  • does any client send maven-metadata.xml out of order?
  • existing maven-metadata.xml files that have incorrect signatures
    don't currently prevent deploys - new versions can be deployed. The
    metadata won't be correct, but versions can still be downloaded
    directly (see #369). This change would break that, requiring all of
    those signature/bad metadata issues to be resolved manually. It
    would be worth scanning the repo to find out how many bad metadata
    files there are.
  • need a cron job to clean up old upload dirs
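For the last point, a cleanup job could be as small as the following Python sketch. The function name `clean_stale_upload_dirs`, the idea of a single root directory holding all upload dirs, and the use of directory mtime as the "last activity" signal are assumptions for illustration.

```python
import shutil
import time
from pathlib import Path


def clean_stale_upload_dirs(uploads_root: Path, max_age_seconds: float,
                            now: float = None) -> list:
    """Remove upload dirs whose mtime is older than max_age_seconds.

    Returns the sorted names of the removed directories.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in uploads_root.iterdir():
        # A directory's mtime changes when files are added to it, so it
        # approximates the time of the last upload in that session.
        if entry.is_dir() and now - entry.stat().st_mtime > max_age_seconds:
            shutil.rmtree(entry)
            removed.append(entry.name)
    return sorted(removed)
```

Run from cron with a generous `max_age_seconds`, this would only ever touch sessions that have been abandoned, not deploys still in progress.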

@tobias
Member

tobias commented Mar 8, 2016

Implemented via 57fc478
