Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Invalid XML-files #19

Open
MuellerSeb opened this issue Apr 27, 2020 · 7 comments
Open

Invalid XML-files #19

MuellerSeb opened this issue Apr 27, 2020 · 7 comments

Comments

@MuellerSeb
Copy link
Collaborator

MuellerSeb commented Apr 27, 2020

pyevtk stores the data associated with the given mesh in append data, that is given in a raw binary format.
Setting of raw:

encoding="raw"

Encoding as binary:
bin_pack = struct.pack(fmt, *dd)

This is producing invalid XML files and other software is may unwilling to read the files (e.g. meshio see: nschloe/meshio#786)

We should use base64 encoded data, where the binary data is encoded with ascii-characters as suggested by VTK. Some references:

When using appended data with base64, we have to keep in mind, that the offsets are changing, since they address the character in the base64 encoded string and not the binary offset like with raw data:

self.offset += (

meshio already implemented writing routines using base64 encoding. We should have a look there:
https://github.com/nschloe/meshio/blob/d9c05ae688858b5166630874f3ec875a43b8fd37/meshio/vtu/_vtu.py#L482

@renefritze
Copy link
Contributor

renefritze commented Apr 27, 2020

I've never even considered that the output files might not be valid XML since all my (only VTK standard following apparently) consumers happily accept them. I also didn't know that meshio doesn't fully support the VTK standard then, nor that Kitware managed to define a standard that uses xml and isn't fully compliant. Great stuff 🙄

So as far I understood, the only upside of using 'appended-raw' is that the implementation is less complex. Is that correct?

And aside from the implementation being trickier, there's no downside to using 'appended-binary', right?

@MuellerSeb
Copy link
Collaborator Author

You are right. In the end, it shouldn't be that hard to rewrite the appending routines. But one has to dig through all that serialization and base64 stuff...

@MuellerSeb
Copy link
Collaborator Author

My main aim would be to produce files, that meshio can read.

@renefritze
Copy link
Contributor

I cannot really justify pushing time into helping with the implementation, but if you can and want to, I could invest a little time in making the test setup better. I've already pushed a quick & dirty check to the generated examples into the xml_checks branch.

@MuellerSeb
Copy link
Collaborator Author

Another idea:
If we drop the use of appended and write the dataArray directly at their position of occurrence in the file, we don't have to estimate the offset in the appended data.

Only downside: the file isn't that readable anymore as before, where the data-metainfo was stored at the top of the file and the actual data was appended.

@renefritze , @xylar: would you have any problems with that approach?

@xylar
Copy link
Collaborator

xylar commented May 2, 2020

I think having the file be readable has saved me some debugging trouble in the past but I don't have strong feelings about it.

Regarding base64 encoding, won't it by definition have to take up 4 times as much space, since you're only allowed to use 64 of the 256 values of a given byte? If so, I would be opposed to just switching to base64 as opposed to making that the default option but continuing to support the existing functionality even if it is not valid XML (since it works great in ParaVeiw).

@renefritze
Copy link
Contributor

I agree with @xylar in that we should be careful to not degrade current use cases. Making the output mode configurable, defaulting to current mode, would be perfectly fine with me though.

While not space-efficient in itself, the base64 mode would also allow adding inline gzipping of the data, IIRC. That would be a real long-term benefit for us I think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants