Proquint prefix in RFC #83

Closed
sg495 opened this issue Oct 15, 2021 · 5 comments

@sg495
Contributor

sg495 commented Oct 15, 2021

According to [https://github.com/multiformats/multibase/blob/master/rfcs/PRO-QUINT.md], the "full" prefix for proquints is pro-, but the multibase prefix is p. Since the multibase spec dictates that encoded strings have the form <base-encoding-character><base-encoded-data>, the result of encoding the IP address 127.0.0.1 (as the 4-byte bytestring [0x7f, 0x00, 0x00, 0x01]) would be:

>>> import proquint
>>> ip = bytes([0x7f, 0x00, 0x00, 0x01])
>>> ip_int = int.from_bytes(ip, byteorder="big")
>>> encoded_data = proquint.uint2quint(ip_int)
>>> encoding_character = "p"
>>> encoding_character + encoded_data
'plusab-babad'

Of course, the multibase spec could be easily extended from a single code character to an arbitrary prefix code, as <base-encoding-prefix><base-encoded-data>. One could then legitimately use pro- as the multibase prefix for proquints, which would lead to the intended result:

>>> encoding_prefix = "pro-"
>>> encoding_prefix + encoded_data
'pro-lusab-babad'

@Stebalien I'm happy to make the necessary changes to the readme, table and proquint RFC, if this sounds sensible.

@sg495
Contributor Author

sg495 commented Oct 15, 2021

I think it is also worth mandating some formatting rules for acceptable prefixes. For example, one could require that prefixes be non-empty strings of printable ASCII characters (codepoints 0x20-0x7E) and that the only prefixes starting with '0x' be the hex strings "0x00"-"0x1F" and "0x7F", standing for single non-printable ASCII characters. This way, implementations of multibase can perform generic multibase prefix validation without having to know the table beforehand.
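
As a rough sketch of what such table-independent validation might look like in Python: the function name `is_valid_prefix` is hypothetical, and the sketch assumes that the escaped range covers the ASCII control characters 0x00-0x1F plus 0x7F, written with exactly two uppercase hex digits.

```python
import re

# Assumption: escapes are written as "0x" followed by two uppercase hex digits.
_HEX_ESCAPE = re.compile(r"0x([0-9A-F]{2})")


def is_valid_prefix(prefix: str) -> bool:
    """Check a candidate multibase prefix against the proposed format rules."""
    # Rule 1: prefixes are non-empty.
    if not prefix:
        return False
    # Rule 2: every character is printable ASCII (0x20-0x7E).
    if any(not (0x20 <= ord(ch) <= 0x7E) for ch in prefix):
        return False
    # Rule 3: a prefix starting with "0x" must be the escape of a single
    # non-printable ASCII character (assumed here: 0x00-0x1F or 0x7F).
    if prefix.startswith("0x"):
        match = _HEX_ESCAPE.fullmatch(prefix)
        if match is None:
            return False
        code = int(match.group(1), 16)
        return code <= 0x1F or code == 0x7F
    return True
```

Under these assumptions, `is_valid_prefix("pro-")` and `is_valid_prefix("0x00")` both hold, while `"0x20"` (a printable character written as an escape) and a raw `"\x00"` are rejected.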

Similarly, one could mandate a specific format for multibase names, e.g. the format "^[a-z][a-z0-9_-]+$" used by multicodec names.
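
For illustration, a check against that name format could be as small as the following (the helper name `is_valid_name` is hypothetical). It uses `re.fullmatch` rather than `^...$` with `re.match`, since `$` would also accept a trailing newline.

```python
import re

# The pattern quoted above, without the anchors: fullmatch anchors both ends.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9_-]+")


def is_valid_name(name: str) -> bool:
    """Return True if the name is a lowercase letter followed by one or more
    lowercase letters, digits, underscores or hyphens."""
    return NAME_PATTERN.fullmatch(name) is not None
```

Note that the `+` in the pattern means a one-character name would be rejected.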

@Stebalien
Member

The multibase prefix is p; the base encoding is defined to be ro-<proquint-encoded-data>. I would like to allow arbitrary prefixes, but prefixes need to be "prefix free" with respect to one another, so allowing arbitrary prefixes is a bit tricky.
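
To illustrate this reading, here is a minimal Python sketch in which the single-character code p is followed by a base encoding that itself starts with ro-. The helper names are hypothetical, the alphabet is the usual proquint one (16 consonants, 4 vowels), and rejecting odd-length input is a simplification of this sketch, not something taken from the RFC.

```python
CONSONANTS = "bdfghjklmnprstvz"  # 16 symbols, 4 bits each
VOWELS = "aiou"                  # 4 symbols, 2 bits each


def proquint_encode(data: bytes) -> str:
    """Encode bytes as hyphen-separated proquints, one per 16-bit word."""
    if len(data) % 2 != 0:
        # Simplification of this sketch: only whole 16-bit words are handled.
        raise ValueError("expected an even number of bytes")
    words = []
    for i in range(0, len(data), 2):
        w = int.from_bytes(data[i:i + 2], byteorder="big")
        words.append(
            CONSONANTS[(w >> 12) & 0xF]
            + VOWELS[(w >> 10) & 0x3]
            + CONSONANTS[(w >> 6) & 0xF]
            + VOWELS[(w >> 4) & 0x3]
            + CONSONANTS[w & 0xF]
        )
    return "-".join(words)


def multibase_proquint_encode(data: bytes) -> str:
    """Multibase code 'p', then the base encoding 'ro-' + proquint data."""
    return "p" + "ro-" + proquint_encode(data)


ip = bytes([0x7F, 0x00, 0x00, 0x01])
print(multibase_proquint_encode(ip))  # pro-lusab-babad
```

The visible result is the same string as with a pro- prefix; only the split between multibase code and base-encoded data differs.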

@sg495
Contributor Author

sg495 commented Oct 15, 2021

Ah! I see. However, that doesn't quite match the proquint encoding itself, which is a pity. What do you see as the main obstacle to adopting a more general prefix code for the multibase encoding? Checking the prefix property itself is easy, and we could make a validation script for the multibase table similar to that for the multicodec table. One could then additionally mandate that implementations of multibase only allow multibase codes that respect the prefix property, which is trivial to check in most programming languages.
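
As a sketch of how small such a check could be (the function name and the sample prefixes are illustrative, not an actual validation script for the table):

```python
def prefix_conflicts(prefixes):
    """Return all pairs (a, b) of distinct prefixes where a is a proper
    prefix of b, i.e. every violation of the prefix property."""
    unique = sorted(set(prefixes))  # duplicate entries are a separate check
    return [
        (a, b)
        for a in unique
        for b in unique
        if a != b and b.startswith(a)
    ]


# A handful of single-character codes never conflict with each other.
print(prefix_conflicts(["b", "f", "m", "p", "z"]))  # []

# Keeping "p" while also adding "pro-" would violate the prefix property.
print(prefix_conflicts(["b", "p", "pro-", "z"]))    # [('p', 'pro-')]
```

The pairwise comparison is quadratic in the number of prefixes, which should be harmless for a table of this size.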

@Stebalien
Member

These two solutions are effectively the same, just with slightly different semantics. I picked this route because the proquint encoding was a cool random idea and I didn't want to put too much effort into it.

Unfortunately, current multibase implementations already assume that these prefixes will be exactly one "character" (whatever that means in the host language) and changing that now is a significant amount of work that I'm not willing to do.
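
A hedged sketch of what that single-character assumption typically looks like on the decoding side: the decoder takes the first character as the code and passes the rest to the base decoder, so the ro- part has to be handled by the proquint decoder itself. All names here are hypothetical and only the code p is handled.

```python
CONSONANTS = "bdfghjklmnprstvz"  # 16 symbols, 4 bits each
VOWELS = "aiou"                  # 4 symbols, 2 bits each


def proquint_decode(text: str) -> bytes:
    """Decode hyphen-separated proquints back into bytes."""
    out = bytearray()
    for word in text.split("-"):
        if len(word) != 5:
            raise ValueError(f"invalid proquint word: {word!r}")
        w = 0
        for i, ch in enumerate(word):
            if i % 2 == 0:
                w = (w << 4) | CONSONANTS.index(ch)  # ValueError if unknown
            else:
                w = (w << 2) | VOWELS.index(ch)      # ValueError if unknown
        out += w.to_bytes(2, byteorder="big")
    return bytes(out)


def multibase_decode(encoded: str) -> bytes:
    """Dispatch on the first character only, as a one-character-code
    implementation would; this sketch knows just the code 'p'."""
    if not encoded:
        raise ValueError("empty multibase string")
    code, payload = encoded[0], encoded[1:]
    if code != "p":
        raise ValueError(f"unsupported multibase code: {code!r}")
    # For proquint, the base encoding itself is 'ro-' + proquint data.
    if not payload.startswith("ro-"):
        raise ValueError("proquint payload must start with 'ro-'")
    return proquint_decode(payload[len("ro-"):])


print(multibase_decode("pro-lusab-babad"))  # b'\x7f\x00\x00\x01'
```

Supporting multi-character prefixes would mean replacing the `encoded[0]` split with a longest-match lookup against the table, which is the part existing implementations would have to change.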

@sg495
Contributor Author

sg495 commented Oct 16, 2021

I can certainly sympathise with that, yes 😅. In this case, I think it's worth adding a brief clarification to both spec and table, in order to avoid confusion.

sg495 added a commit to sg495/multibase that referenced this issue Oct 20, 2021
sg495 closed this as not planned Jul 21, 2022