Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Volume Handle Checksum #112

Closed

Conversation

akutz
Copy link
Contributor

@akutz akutz commented Sep 20, 2017

Please note that this PR is dependent upon #111 -- a separate PR to isolate the changes to the spec.md file, removing the trailing whitespace that affects edits of that file.

This patch updates the VolumeHandle message with a new, required field - checksum - an MD5 checksum of the handle's ID and metadata (keys sorted lexicographically).

Value

A checksum for the handle provides both data integrity and a single, small, unique piece of data that can be used to quickly cache and retrieve volume information based on a handle. The handle's ID cannot be used for this purpose as the ID alone does not guarantee uniqueness.

Overhead

Because the checksums produced by the MD5 algorithm are 32 bytes in size, the checksum data does not add much overhead to the handle itself, easily allowing handles to stay within the suggested constraint of 1MiB.

Calculation

The checksum should be calculated by writing the handle's ID to a buffer followed by writing the metadata's key/value pairs to the same buffer where the keys are sorted lexicographically.

For example, take the following handle:

id =       "vol-00"
metadata = {"region":"US","alerts":"false","name":"MyVol"}

The buffer used to calculate the checksum for the above handle is:

vol-00alertsfalsenameMyVolregionUS

The MD5 checksum for the above handle is therefore:

81504c01d43ab67052fca49012875537

This patch updates the VolumeHandle message with a new, required field -
checksum. This field contains an MD5 checksum of the handle's ID and
metadata (keys sorted lexicographically).
@akutz akutz force-pushed the feature/handle-chksum branch from 6bfc292 to 7ec62fd Compare September 20, 2017 21:46
@jieyu
Copy link
Member

jieyu commented Oct 2, 2017

Can we instead require that id needs to be unique?

@akutz
Copy link
Contributor Author

akutz commented Oct 2, 2017

Hi @jieyu,

I mean, yeah. That works too. However, if the ID is unique then why the metadata? I also thought the ID cannot be unique thanks to the requirement that CSI work with existing platforms where ID+Metadata may be required to uniquely identify volumes.

@saad-ali
Copy link
Member

saad-ali commented Oct 3, 2017

I mean, yeah. That works too. However, if the ID is unique then why the metadata? I also thought the ID cannot be unique thanks to the requirement that CSI work with existing platforms where ID+Metadata may be required to uniquely identify volumes.

I agree, the problem @akutz is trying to work around is the fact that it is hard to use the VolumeHandle, as defined, as a unique key. Therefore, I think we should strongly consider removing metadata altogether. ID should be the unique string identifier. Storage providers can encode whatever they want inside it. Having both ID and metadata act as the handle makes it very difficult to use it as a handle because comparing a struct and map is much more difficult and not natively supported in most languages. Opened #116 to carry on this disucssion.

Closing this PR in the meantime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants