Incorrect turtle serialization of decimal values #1043
Comments
I did some digging and this is not random; https://github.com/RDFLib/rdflib/blob/master/rdflib/term.py#L1243 is the culprit here.
If the datatype on
Here is what the link you posted says about
Anyway, we are talking about Python, which chooses to represent a number in scientific notation if it has more than 4 zeroes after the decimal point, irrespective of whether we cast it as
I can create a PR when I have time to spare.
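For reference, a minimal sketch of where plain Python switches to scientific notation. It uses only the standard library and says nothing about rdflib's own code path; the variable names are illustrative. Note that `float` and `Decimal` switch at different thresholds.

```python
from decimal import Decimal

# Floats: repr() switches to scientific notation once the decimal
# exponent drops below -4.
float_plain = repr(0.0001)     # '0.0001'
float_sci = repr(0.00001)      # '1e-05'

# Decimals: str() also switches, but only once the adjusted exponent
# drops below -6.
decimal_plain = str(Decimal("0.000001"))    # '0.000001'
decimal_sci = str(Decimal("0.0000001"))     # '1E-7'

# format(..., "f") forces fixed-point notation regardless of magnitude.
decimal_fixed = format(Decimal("0.0000001"), "f")   # '0.0000001'

print(float_plain, float_sci, decimal_plain, decimal_sci, decimal_fixed)
```

So casting to `Decimal` alone does not avoid the exponent form; the formatting call has to ask for fixed-point output.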
Or just not trying to produce a "pretty" output will also work :)
Right. I see the issue now. I would say that the "don't make it pretty" option is probably the right approach, and we should make sure that Python Decimal objects or things explicitly marked with xsd:decimal are not converted to floating point.
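A rough sketch of what the "don't make it pretty" approach could look like, assuming the goal is a fixed-point lexical form with no exponent and no round trip through `float`. The helper name `plain_decimal` is hypothetical and is not part of rdflib.

```python
from decimal import Decimal


def plain_decimal(value: Decimal) -> str:
    """Hypothetical helper: fixed-point text for a Decimal, never via float."""
    if not value.is_finite():
        # xsd:decimal has no NaN or infinity values.
        raise ValueError(f"not representable as xsd:decimal: {value!r}")
    text = format(value, "f")  # fixed-point, no exponent
    if "." not in text:
        # Turtle reads a bare integer token as xsd:integer, so keep a fraction.
        text += ".0"
    return text


# Going through float drags in binary rounding error; staying in Decimal does not.
via_float = Decimal(0.1)
exact = Decimal("0.1")

print(plain_decimal(Decimal("5E-16")))  # 0.0000000000000005
print(plain_decimal(Decimal("1E+3")))   # 1000.0
print(via_float == exact)               # False
```

The output is not pretty for very small values, but it stays a valid decimal token and keeps the exact value.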
Is anyone here going to work on this or is this a wontfix?
I do not expect to have bandwidth to work on it any time soon, but it definitely needs to be fixed. Also, I think it is part of a general issue around serialization of literals.
So I have been looking into this issue for the past couple of days.
Since the If I change the last line to
The conversion works properly: the number is indeed stored as Decimal, but the representation is no longer pretty (it has a large number of non-zero decimal digits). As of now, I have thought of the below two ways of fixing the serialization bug.
I have also added a test case for this issue, and the first two solutions are ready to go (open to suggestions as to which one will be better to push). I would like to know whether there are other suggestions, or whether I should prefer the 3rd option over the first 2.
This is what would be the best solution, according to @tgbugs.
I don't think there's really a correct answer for this. Personally, I think the best way to avoid this (and how I do it in my projects) is as @achaudhary997 wrote: As for this bug, I believe option 1 above is the best way forward. Still try to format it as before, but if it has
Agree! Right back to the early designs of RDFLib, the idea has always been that you should be aiming to do the right thing (you can always make up edge cases that break any software!) so use the appropriate Python
As of now the pull request (not merged yet) just returns the representation on an
rdflib prior to 5.0.0 serialises real literals syntactically wrong in turtle. See RDFLib/rdflib#1043.
@CraigMiloRogers are you saying that "1.9860001065575846E-7" is incorrect (looks fine to me) or are you perhaps just saying that there is inconsistency? We may not be able to deal with inconsistency, but we would hope that all different forms are, themselves, correct.
I am using rdflib to serialize to ttl files, rdflib = 5.0.0
Here is a sample output,
Notice that the line
wdt:P2020015 2.16E-7
is correct, while
wdt:P2020017 5E-16.0
is incorrect, as it has a .0 at the end, which is invalid.
Please take a look; this seems to be happening randomly.
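To make the difference between the two quoted values concrete, here is a small sketch that checks a token against the numeric literal productions of the Turtle grammar (INTEGER, DECIMAL, DOUBLE). The regular expressions are transcribed from those productions; the function name `turtle_number_kind` is illustrative and not an rdflib API.

```python
import re

# Numeric literal productions from the Turtle grammar.
EXPONENT = r"[eE][+-]?[0-9]+"
INTEGER = re.compile(r"[+-]?[0-9]+")
DECIMAL = re.compile(r"[+-]?[0-9]*\.[0-9]+")
DOUBLE = re.compile(
    rf"[+-]?(?:[0-9]+\.[0-9]*{EXPONENT}|\.[0-9]+{EXPONENT}|[0-9]+{EXPONENT})"
)


def turtle_number_kind(token):
    """Return 'double', 'decimal', 'integer', or None if the token is invalid."""
    for name, pattern in (("double", DOUBLE), ("decimal", DECIMAL), ("integer", INTEGER)):
        if pattern.fullmatch(token):
            return name
    return None


print(turtle_number_kind("2.16E-7"))   # double
print(turtle_number_kind("5E-16.0"))   # None
print(turtle_number_kind("5E-16"))     # double
```

Nothing may follow the exponent in a double, and a decimal may not contain an exponent at all, so `5E-16.0` matches none of the three productions.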