This repository has been archived by the owner on Aug 11, 2021. It is now read-only.

Numeric types #9

Closed
magik6k opened this issue Aug 23, 2017 · 7 comments

@magik6k

magik6k commented Aug 23, 2017

There seems to be a problem with handling of numeric types that I haven't seen mentioned yet:

JavaScript and JSON both default to float64 for number representation. This is a problem when handling CBOR or any other 'non-strict' format that can distinguish ints from floats. Currently, when an int encoded in CBOR is converted to JSON and back, it comes back as CBOR with a float.

Possible solutions are:

  • Constrain canonical ipld-cbor to one numeric type
  • Add type metadata to json/js serializations
  • asm.js seems to be trying to do something about this, I haven't read much into it, but it may give some hints: http://asmjs.org/spec/latest/#value-types
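
The second option (type metadata in the JSON/JS serialization) could look roughly like the sketch below. The `$float` wrapper key and the `Float` class are hypothetical names invented for illustration; they are not part of any IPLD spec or library.

```javascript
'use strict'

// Hypothetical convention: a float is written to JSON as { "$float": <number> }.
const FLOAT_KEY = '$float'

// Marker class so that JS code can say "this number is meant to be a float".
class Float {
  constructor (value) { this.value = value }
}

// Serialize, wrapping every Float instance in the metadata object.
function toJSON (value) {
  return JSON.stringify(value, (key, v) =>
    v instanceof Float ? { [FLOAT_KEY]: v.value } : v)
}

// Parse, turning every { "$float": n } object back into a Float instance.
function fromJSON (text) {
  return JSON.parse(text, (key, v) =>
    v !== null && typeof v === 'object' && !Array.isArray(v) &&
    Object.keys(v).length === 1 && FLOAT_KEY in v
      ? new Float(v[FLOAT_KEY])
      : v)
}

const text = toJSON({ number: new Float(5) })
console.log(text)
console.log(fromJSON(text).number instanceof Float)
```

Plain numbers pass through untouched, so only values that must stay floats pay the cost of the wrapper.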

@kevina

kevina commented Feb 1, 2018

Another option is to serialize to CBOR with the smallest type that will fit. I believe this approach is recommended in the standard. Part of that means that if a float has no fractional component, it is stored as an int.
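
A minimal sketch of that rule for a single JS number, assuming only safe integers and float64 need to be handled. `encodeNumber` and `hex` are hypothetical helpers, not functions from any CBOR library, and no attempt is made to shrink floats to 16 or 32 bits.

```javascript
'use strict'

// Encode one JS number as CBOR: integers use the shortest integer form,
// everything else falls back to a 64-bit float (head byte 0xfb).
function encodeNumber (n) {
  const buf = new DataView(new ArrayBuffer(9))
  if (Number.isSafeInteger(n)) {
    const major = n < 0 ? 0x20 : 0x00 // major type 1 (negative) or 0 (unsigned)
    const arg = n < 0 ? -1 - n : n    // CBOR stores -1 - n for negative ints
    if (arg < 24) return Uint8Array.of(major | arg)
    if (arg <= 0xff) return Uint8Array.of(major | 24, arg)
    if (arg <= 0xffff) {
      buf.setUint8(0, major | 25)
      buf.setUint16(1, arg)
      return new Uint8Array(buf.buffer, 0, 3)
    }
    if (arg <= 0xffffffff) {
      buf.setUint8(0, major | 26)
      buf.setUint32(1, arg)
      return new Uint8Array(buf.buffer, 0, 5)
    }
    buf.setUint8(0, major | 27)
    buf.setBigUint64(1, BigInt(arg))
    return new Uint8Array(buf.buffer)
  }
  buf.setUint8(0, 0xfb)
  buf.setFloat64(1, n)
  return new Uint8Array(buf.buffer)
}

// Format bytes the same way as the hexdump output in this thread (lowercase).
function hex (bytes) {
  return Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join(' ')
}

console.log(hex(encodeNumber(5)))   // 05
console.log(hex(encodeNumber(5.0))) // 05
console.log(hex(encodeNumber(5.5))) // fb 40 16 00 00 00 00 00 00
```

With this rule 5 and 5.0 produce the same single byte, which is exactly why the int/float distinction cannot survive it.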

@vmx vmx added the backlog label May 28, 2018
@daviddias daviddias added ready and removed backlog labels Jun 4, 2018
@warpfork

Having schemas would fix this unambiguously. Those aren't available today, however.

I also currently don't know of any things where we want floats, so, I'm inclined to suggest that any time that JS or JSON bouncing has made an ambiguous situation, our ipld tools should probably cast it back to int by default. (Add options, etc, etc, but by default we should probably steer things away from floats.)

My 2 cents.

@vmx
Member

vmx commented Oct 18, 2018

I'd like to post an update on the current state. If you run these commands:

echo '{"number": 5}' | ipfs dag put
echo '{"number": 5.0}' | ipfs dag put
ipfs block get zdpuB1E8Fyka9J7JeE5LJVLU7GdStPmYoJVGefpDeooK3vMX8|hexdump -v -e '/1 "%02X "'; echo

echo '{"number": 5}'|node dagput.js > /dev/null
echo '{"number": 5.0}'|node dagput.js > /dev/null
jsipfs block get zdpuArrVgd4Ey7K6Ld2Et8B2yBEGkU8pmA3bmXaxfjBMoW9p7|hexdump -v -e '/1 "%02X "'; echo

The output is:

echo '{"number": 5}' | ipfs dag put
zdpuB1E8Fyka9J7JeE5LJVLU7GdStPmYoJVGefpDeooK3vMX8
echo '{"number": 5.0}' | ipfs dag put
zdpuB1E8Fyka9J7JeE5LJVLU7GdStPmYoJVGefpDeooK3vMX8
ipfs block get zdpuB1E8Fyka9J7JeE5LJVLU7GdStPmYoJVGefpDeooK3vMX8|hexdump -v -e '/1 "%02X "'; echo
A1 66 6E 75 6D 62 65 72 FB 40 14 00 00 00 00 00 00 
echo '{"number": 5}'|node dagput.js > /dev/null
zdpuArrVgd4Ey7K6Ld2Et8B2yBEGkU8pmA3bmXaxfjBMoW9p7
echo '{"number": 5.0}'|node dagput.js > /dev/null
zdpuArrVgd4Ey7K6Ld2Et8B2yBEGkU8pmA3bmXaxfjBMoW9p7
jsipfs block get zdpuArrVgd4Ey7K6Ld2Et8B2yBEGkU8pmA3bmXaxfjBMoW9p7|hexdump -v -e '/1 "%02X "'; echo
A1 66 6E 75 6D 62 65 72 05

As you can see, it doesn't matter whether you input 5 or 5.0: both lead to the same hash. The hash does differ between the Go and the JavaScript implementations, though. If you paste the hexdumps into cbor.me, you'll see that Go stores the number as a float and JS as an integer.
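
The two hexdumps can also be checked by hand. The sketch below is a deliberately tiny decoder that only understands the byte patterns appearing in this thread (short maps, short text strings, small unsigned ints and float64); `decode` is a hypothetical helper, not a real CBOR library.

```javascript
'use strict'

// Decode a space-separated hexdump containing only the CBOR subset used here.
function decode (hexdump) {
  const bytes = Uint8Array.from(hexdump.trim().split(/\s+/), (h) => parseInt(h, 16))
  const view = new DataView(bytes.buffer)
  let pos = 0

  function item () {
    const head = bytes[pos++]
    const major = head >> 5  // top 3 bits: major type
    const info = head & 0x1f // low 5 bits: additional information
    if (major === 0 && info < 24) return { type: 'int', value: info }
    if (major === 3 && info < 24) { // text string with length < 24
      const text = new TextDecoder().decode(bytes.subarray(pos, pos + info))
      pos += info
      return text
    }
    if (major === 5 && info < 24) { // map with fewer than 24 pairs
      const map = {}
      for (let i = 0; i < info; i++) {
        const key = item()
        map[key] = item()
      }
      return map
    }
    if (head === 0xfb) { // major type 7, 64-bit float
      const value = view.getFloat64(pos)
      pos += 8
      return { type: 'float', value }
    }
    throw new Error('unsupported CBOR head byte 0x' + head.toString(16))
  }

  return item()
}

const goBlock = 'A1 66 6E 75 6D 62 65 72 FB 40 14 00 00 00 00 00 00'
const jsBlock = 'A1 66 6E 75 6D 62 65 72 05'
console.log(decode(goBlock)) // { number: { type: 'float', value: 5 } }
console.log(decode(jsBlock)) // { number: { type: 'int', value: 5 } }
```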

The CBOR spec has a section on JSON to CBOR conversion which says:

JSON numbers without fractional parts (integer numbers) are represented as integers

So I think JavaScript is doing the right thing.

And finally the source of the JS script I was using as there's no dag put in jsipfs yet:

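'use strict'

// Reads JSON from stdin, stores it with dag.put and prints the CID to stderr.
const IPFS = require('ipfs')
const node = new IPFS()

node.on('ready', async () => {
  process.stdin.resume()
  process.stdin.setEncoding('utf8')
  // Assumes the whole JSON document arrives in a single 'data' chunk.
  process.stdin.on('data', async (raw) => {
    const data = JSON.parse(raw)
    const cid = await node.dag.put(data)
    // The CID goes to stderr, so it stays visible with stdout sent to /dev/null.
    process.stderr.write(cid.toBaseEncodedString() + '\n')
    node.stop()
  })
})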

@warpfork

Oh wow. Go is probably getting it as a float because of... reasons that are definitely bugs.

I forgot that go-ipfs doesn't use refmt yet.

@mikeal
Contributor

mikeal commented Oct 30, 2018

echo '{"number": 5.0}'|node dagput.js > /dev/null

This is an issue with JavaScript and not easily fixed.

JavaScript's number type can't distinguish an integer-valued float such as 5.0 from the integer 5. We can get around the 64-bit/BigInt issues pretty easily because we own the serializer and deserializer and can emit whatever types we want, but the number type in JS simply won't allow us to force the representation of a float.

You'd have to map some kind of schema on top of the serializer to force the conversion if this was required.
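
A short illustration of the limitation, using only built-in JavaScript behaviour: once parsed, 5 and 5.0 are the same value, so a serializer has nothing left to tell them apart.

```javascript
'use strict'

const a = JSON.parse('{"number": 5}').number
const b = JSON.parse('{"number": 5.0}').number

console.log(Object.is(a, b))                     // true: the same value
console.log(Number.isInteger(b))                 // true: 5.0 counts as an integer
console.log(JSON.stringify({ number: 5.0 }))     // {"number":5}
```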

@Stebalien

Stebalien commented Oct 31, 2018

@warpfork go-ipfs does use refmt.

And yes, JavaScript is doing the right thing here. Canonical CBOR says "use the smallest representation". We could decide to do something different, but then we wouldn't be using canonical CBOR. (Not the right part of the spec.)

@vmx
Member

vmx commented Nov 21, 2018

I'm closing this issue in favour of ipld/specs#80 as this is really a DAG CBOR issue and not an interface-ipld-format one.
