Use of xsd:decimal instead of xsd:float for conversion factors #42

Closed
jmkeil opened this issue Dec 7, 2020 · 9 comments

@jmkeil
Contributor

jmkeil commented Dec 7, 2020

While comparing unit ontologies with ABECTO, I became aware that OM uses xsd:float to represent conversion factors and offsets. I think xsd:decimal should be used instead, as binary floating point datatypes cannot exactly represent many of the conversion values. In particular, 0.1, 0.01, 0.001, … cannot be represented. Only the lexical representation appears exact; the lexical-to-value mapping maps each of them to a slightly different value. This runs the risk of numerical problems in applications. I outlined this in detail in arXiv:2011.08077.
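
To make the point concrete, here is a minimal Python sketch (not part of OM or ABECTO) that shows the exact value a 32-bit IEEE 754 number, which is what xsd:float denotes, ends up holding for a given lexical form. The helper name `as_float32` is made up for this illustration, and the lexical form is parsed via a Python double first, which is good enough for a demonstration.

```python
import struct
from decimal import Decimal


def as_float32(lexical: str) -> Decimal:
    """Exact value of the single-precision float nearest to `lexical`.

    Hypothetical helper for illustration: round-trips the number through a
    4-byte IEEE 754 encoding, then expands the stored value exactly.
    """
    nearest = struct.unpack("<f", struct.pack("<f", float(lexical)))[0]
    return Decimal(nearest)


# 0.1 has no finite binary expansion, so the stored value is slightly off.
print(as_float32("0.1"))  # 0.100000001490116119384765625
# 0.5 is a power of two and is stored exactly.
print(as_float32("0.5"))  # 0.5
```

With xsd:decimal, the value space contains 0.1 itself, so no such offset arises.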

@HajoRijgersberg
Owner

Thanks again so much, Jan Martin, for all your effort in evaluating OM and the other unit ontologies! The issue you raise here is also very important. I'll read the paper you wrote about it, and I'll keep you updated again!

@HajoRijgersberg
Owner

I agree that we should use xsd:decimal. I'll process that. But before I can do that, I'll have to convert the many values in OM that use scientific notation (an e notation, as in 3e4) to plain decimal numbers, which is required for xsd:decimal. This will take a long time. I'll keep you updated about this process!

@jmkeil
Contributor Author

jmkeil commented Dec 22, 2020

Maybe some RegEx might ease that?
Examples:

| Pattern | Replace by | Before | After |
| --- | --- | --- | --- |
| `rdf:datatype="&xsd;float">(\d+)(e\|E)3<` | `rdf:datatype="&xsd;decimal">$1000<` | `… rdf:datatype="&xsd;float">3141592653e3<…` | `… rdf:datatype="&xsd;decimal">3141592653000<…` |
| `rdf:datatype="&xsd;float">(\d+).(\d)(e\|E)3<` | `rdf:datatype="&xsd;decimal">$1$200<` | `… rdf:datatype="&xsd;float">314159265.3e3<…` | `… rdf:datatype="&xsd;decimal">314159265300<…` |
| `rdf:datatype="&xsd;float">(\d+).(\d\d)(e\|E)3<` | `rdf:datatype="&xsd;decimal">$1$20<` | `… rdf:datatype="&xsd;float">31415926.53e3<…` | `… rdf:datatype="&xsd;decimal">31415926530<…` |
| `rdf:datatype="&xsd;float">(\d+).(\d\d\d)(e\|E)3<` | `rdf:datatype="&xsd;decimal">$1<` | `… rdf:datatype="&xsd;float">3141592.653e3<…` | `… rdf:datatype="&xsd;decimal">3141592653<…` |
| `rdf:datatype="&xsd;float">(\d+).(\d\d\d)(\d*)(e\|E)3<` | `rdf:datatype="&xsd;decimal">$1.$2<` | `… rdf:datatype="&xsd;float">314159.2653e3<…`<br>`… rdf:datatype="&xsd;float">31415.92653e3<…` | `… rdf:datatype="&xsd;decimal">314159265.3<…`<br>`… rdf:datatype="&xsd;decimal">31415926.53<…` |
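
The example patterns each cover one exponent and one digit count, so a full set would need many variants. As an alternative, here is a minimal Python sketch of the same idea with a single pattern and a replacement callback. It assumes the literals appear in the text exactly as in the examples (`rdf:datatype="&xsd;float">…<`); the names `FLOAT_LITERAL`, `to_plain_decimal` and `convert_float_literals` are made up for this illustration and are not part of any OM tooling. The conversion works on the lexical form only, so no binary rounding is involved.

```python
import re
from decimal import Decimal

# Assumed literal layout, taken from the examples above: the datatype
# attribute, then the number, then the "<" of the closing tag.
FLOAT_LITERAL = re.compile(
    r'rdf:datatype="&xsd;float">'
    r'([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)<'
)


def to_plain_decimal(lexical: str) -> str:
    """Rewrite e.g. '3e4' as '30000' without passing through binary floats."""
    plain = format(Decimal(lexical), "f")  # fixed-point, no exponent
    if "." in plain:
        # Drop trailing fractional zeros and a dangling decimal point.
        plain = plain.rstrip("0").rstrip(".")
    return plain


def convert_float_literals(text: str) -> str:
    """Replace every matching xsd:float literal with an xsd:decimal one."""
    return FLOAT_LITERAL.sub(
        lambda m: 'rdf:datatype="&xsd;decimal">' + to_plain_decimal(m.group(1)) + "<",
        text,
    )


sample = '… rdf:datatype="&xsd;float">314159.2653e3<…'
print(convert_float_literals(sample))
# … rdf:datatype="&xsd;decimal">314159265.3<…
```

The special xsd:float values INF, -INF and NaN do not match the pattern and are left untouched, since xsd:decimal has no counterpart for them; such literals would need a manual decision.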

@dr-shorthair
Contributor

dr-shorthair commented Dec 23, 2020

Easier (and safer) to use SPARQL?

I just did similar for QUDT - see qudt/qudt-public-repo#311 (comment)
You should be able to adapt these queries.

@HajoRijgersberg
Owner

Thanks, Jan Martin and Simon. The problem is that I work with an organized ASCII file as source, so I have to perform these actions with string manipulation in a text file.
(I am working with an ASCII file to keep an overview of the ontology and to be able to perform complex copy and paste actions for defining "batch" quantities and units, which I still need at the moment due to lack of time for automating the various processes that come with maintaining and extending OM.)
So I would need the RegEx commands you suggest, Jan Martin, to be converted to replace actions in text.

@jmkeil
Contributor Author

jmkeil commented Aug 18, 2023

Doing #80 first might ease solving this issue.

@HajoRijgersberg
Owner

HajoRijgersberg commented Aug 19, 2023

Once again, thanks for your suggestions, Jan Martin and Simon. Apart from your suggestions, I have started to perform the conversions manually because of the urgency. Jan Martin and I had email contact about it (for me to fully understand the issue and what should be done, i.e., converting to decimals and integers), and I find the issue so important that I started immediately. I think it will work out in a few weeks (working on it in between other activities). I have done more elaborate things in the past. ;) Wish me luck, however. ;)

@jmkeil
Contributor Author

jmkeil commented Aug 28, 2023

> Easier (and safer) to use SPARQL?
>
> I just did similar for QUDT - see qudt/qudt-public-repo#311 (comment) You should be able to adapt these queries.

@dr-shorthair How did you manage to avoid the rounding issues? A compliant engine should have created lexical forms with a tiny offset from the correct value (e.g. 0.00000000999999993922529 for 1.0E-8).
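
The concern can be reproduced outside of any SPARQL engine. The following Python sketch contrasts two conversion routes under the assumption that a cast operates on the stored 32-bit value rather than on the original text; both function names are hypothetical, and the sketch does not claim to reflect how any particular engine behaves.

```python
import struct
from decimal import Decimal


def cast_by_value(lexical: str) -> str:
    """Mimic a value-based cast: store as 32-bit float, then expand exactly."""
    single = struct.unpack("<f", struct.pack("<f", float(lexical)))[0]
    return format(Decimal(single), "f")


def convert_by_lexical(lexical: str) -> str:
    """Reinterpret the written digits directly as a decimal number."""
    return format(Decimal(lexical), "f")


print(cast_by_value("1.0E-8"))       # starts with 0.000000009999999939...
print(convert_by_lexical("1.0E-8"))  # 0.000000010
```

Only the second route preserves the number as it was written, which is why a conversion based on the lexical form (by text replacement or otherwise) avoids the offset.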

@jmkeil
Contributor Author

jmkeil commented Oct 19, 2023

Solved by #89.

jmkeil closed this as completed on Oct 19, 2023