Data checksum #23

Open
ottuzzi opened this issue Jan 12, 2015 · 2 comments

Comments

@ottuzzi
ottuzzi commented Jan 12, 2015

Hi,

it would be interesting to be able to ask the target drive to return a checksum of the written data, but I do not see this possibility in the protocol: am I missing some detail?
I would like to be sure that when I ask to write some data, it is really written and can be read back as intended by the target disk. What I have in mind is a new call that writes the data and, in the same operation, returns the checksum of what was written to the disk.
The returned value can be compared with the host's own value, so we can have a good probability that everything is fine.
What do you think?

Thanks
Bye
Piero

@jphughes
Contributor

At this time we do not return a checksum of the written data, but the drive does check the value that is sent along with the data.

The way that it works now is:

  • Data is sent, and the tag/algorithm fields carry the checksum (hash, CRC, etc.) of the value.
  • If the drive knows the algorithm that was used, it checks that the tag and the data still match; if they do not, it reports the key that is not correct.
  • On read, the original tag is returned so that the reader can check it.
    At no time does the drive change the received tag.
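
The steps above can be sketched roughly as follows. This is only a toy in-memory model, not the real drive or client API: `ToyDrive`, `make_tag`, `put`, `get` and the `"SHA1"` algorithm label are hypothetical names chosen for illustration.

```python
import hashlib

# Hypothetical table of algorithms this toy drive "knows" how to verify.
KNOWN_ALGORITHMS = {"SHA1": lambda data: hashlib.sha1(data).digest()}


def make_tag(data: bytes, algorithm: str = "SHA1") -> bytes:
    """Host side: compute the tag BEFORE the data leaves the host."""
    return KNOWN_ALGORITHMS[algorithm](data)


class ToyDrive:
    """In-memory stand-in for a drive (assumption, not the real protocol)."""

    def __init__(self):
        self._store = {}

    def put(self, key: bytes, data: bytes, tag: bytes, algorithm: str = "SHA1"):
        check = KNOWN_ALGORITHMS.get(algorithm)
        # Only verify when the algorithm is known to the drive.
        if check is not None and check(data) != tag:
            raise ValueError(f"tag mismatch for key {key!r}")
        # The tag is stored exactly as received; the drive never rewrites it.
        self._store[key] = (data, tag, algorithm)

    def get(self, key: bytes):
        # The original tag comes back with the data so the reader can check it.
        return self._store[key]


drive = ToyDrive()
value = b"hello"
drive.put(b"k1", value, make_tag(value))

data, tag, algorithm = drive.get(b"k1")
assert make_tag(data, algorithm) == tag  # reader-side end-to-end check
```

Because the tag is computed on the host and stored untouched, the final assertion compares what was read against what the host originally intended to write.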

Since the tag is set before sending, you can be assured of complete end-to-end data integrity. If the drive calculated the checksum instead, there would be a risk of a data integrity failure between the host and the drive (i.e., TCP is not perfect, and TCP error detection is not perfect either), and the drive would return the checksum of the wrong information.

Hope this helps

Jim

@ottuzzi
Author

ottuzzi commented Jan 13, 2015

Hi,

thank you very much for your answer: everything you say is clear, but I was thinking of a more thorough check.
I'll try to show the differences between what I understand is implemented at the moment and what I have in mind.

NOW

  • HOST sends data with a checksum
  • DRIVE checks whether data and checksum match and, if OK, physically writes the data; this protects against network errors
  • on every read DRIVE returns the data and the original checksum, and HOST can then check whether they match

MY PROPOSAL

In my proposal you keep the same behaviour, but I'm asking to add a new workflow that works this way:

  • HOST sends data with a checksum
  • DRIVE checks whether data and checksum match and, if OK, physically writes the data
  • DRIVE immediately re-reads the data and checks whether data and checksum match; if OK it sends OK to HOST, and if the re-read data does not match the checksum it returns an error.
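
A minimal sketch of this proposed workflow, assuming a toy in-memory medium: `ToyMedia`, `put_with_verify`, `ReadBackError` and the `corrupt_on_write` switch are made-up names for illustration and are not part of the existing protocol.

```python
import hashlib


class ReadBackError(Exception):
    """Raised when data re-read from the medium no longer matches the checksum."""


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


class ToyMedia:
    """Stand-in for the physical medium; can simulate a bad write (assumption)."""

    def __init__(self, corrupt_on_write: bool = False):
        self._blocks = {}
        self.corrupt_on_write = corrupt_on_write

    def write(self, key: bytes, data: bytes) -> None:
        if self.corrupt_on_write and data:
            # Flip one bit of the first byte to mimic a subtle disk error.
            data = bytes([data[0] ^ 0x01]) + data[1:]
        self._blocks[key] = data

    def read(self, key: bytes) -> bytes:
        return self._blocks[key]


def put_with_verify(media: ToyMedia, key: bytes, data: bytes, checksum: bytes) -> str:
    # Step 2: check on arrival, as in the current workflow.
    if sha1(data) != checksum:
        raise ValueError("checksum mismatch on arrival")
    media.write(key, data)
    # Step 3: immediately re-read and check again before answering the host.
    if sha1(media.read(key)) != checksum:
        raise ReadBackError(f"read-back mismatch for key {key!r}")
    return "OK"


payload = b"some data"
print(put_with_verify(ToyMedia(), b"k1", payload, sha1(payload)))
```

With `ToyMedia(corrupt_on_write=True)` the same call raises `ReadBackError` instead of returning OK. In a real drive the re-read would presumably have to come from the medium itself rather than from a write cache for the check to mean anything.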

The whole point here is to catch a subtle disk error: in your workflow the last check happens when the data arrives at the disk frontend, while in my proposal the last check is on the data as written to disk. It can happen that data written to disk cannot be read back correctly. With your approach you will only find out on the next read (probably when you need the data); with my proposed additional workflow you know immediately that the data can be read... at least for now ;)

Hope I was clearer than in my first post :)

Thanks in advance
Bye
Piero
