Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

special characters in channel names cause error #656

Closed
jchris opened this issue Feb 19, 2015 · 20 comments · Fixed by #1293
Closed

special characters in channel names cause error #656

jchris opened this issue Feb 19, 2015 · 20 comments · Fixed by #1293
Assignees
Milestone

Comments

@jchris
Copy link
Contributor

jchris commented Feb 19, 2015

this is a bug when you are trying to generate channel names from user data.

@deefactorial
Copy link

@deefactorial
Copy link

My work around is to base 64 encode the user data for the channel name.

@zgramana
Copy link
Contributor

We need to document this once fixed.

@zgramana zgramana added this to the 1.2.0 milestone Apr 28, 2015
@zgramana
Copy link
Contributor

As part of this, we need to canonicalize the set of valid chars for document ids and channel names, since the former are often consumed in the later. The charset also needs to be URL safe, as the REST GET/PUT/DELETE for a specific document includes the doc id as part of the URL.

@deefactorial
Copy link

Is it possible to URL encode special characters for document ids?
On Apr 28, 2015 2:16 PM, "Zachary Gramana" [email protected] wrote:

As part of this, we need to canonicalize the set of valid chars for
document ids and channel names, since the former are often consumed in the
later. The charset also needs to be URL safe, as the REST GET/PUT/DELETE
for a specific document includes the doc id as part of the URL.


Reply to this email directly or view it on GitHub
#656 (comment)
.

@zgramana zgramana added ready and removed backlog labels May 8, 2015
@zgramana zgramana self-assigned this May 15, 2015
@snej
Copy link
Contributor

snej commented Aug 4, 2015

It was also pointed out by a forum user that email addresses can't safely be used as channel names because they allow a lot of characters that aren't allowed in channels.

At a minimum, if we added '%' to the channel character set then arbitrary strings could be hex-encoded as per URLs. (The Wikipedia page I linked above says that this encoding can also be used in email addresses.)

This limited character set is my fault, but honestly I can't remember why i was so restrictive about it. The only problem I can think of with an arbitrary character set would be losing the ability to concatenate channel names in a plain string, since there wouldn't be any delimiter you could put between them; but I don't think we ever do that since a JSON array is such an obvious way to express that.

@deefactorial
Copy link

My work around was to base64 encode the channel names, although difficult
to read and not that efficient it works.
On Aug 4, 2015 2:23 PM, "Jens Alfke" [email protected] wrote:

It was also pointed out by a forum user that email addresses can't safely
be used as channel names because they allow a lot of characters
https://en.wikipedia.org/wiki/Email_address#Syntax that aren't allowed
in channels.

At a minimum, if we added '%' to the channel character set then arbitrary
strings could be hex-encoded as per URLs. (The Wikipedia page I linked
above says that this encoding can also be used in email addresses.)

This limited character set is my fault, but honestly I can't remember why
i was so restrictive about it. The only problem I can think of with an
arbitrary character set would be losing the ability to concatenate channel
names in a plain string, since there wouldn't be any delimiter you could
put between them; but I don't think we ever do that since a JSON array is
such an obvious way to express that.


Reply to this email directly or view it on GitHub
#656 (comment)
.

@jchris
Copy link
Contributor Author

jchris commented Aug 6, 2015

IMHO the allowed channel names should be all UTF-8 strings, otherwise sync functions fill with BS.

@jchris
Copy link
Contributor Author

jchris commented Aug 6, 2015

Or is it UTF-16? Whatever it is we should strive for unsurprising.

@jchris
Copy link
Contributor Author

jchris commented Nov 4, 2015

I am running into this again. I really do think the allowed set of channel names should be all unicode strings. The PR looks trivial.

@zgramana zgramana assigned jchris and unassigned zgramana Nov 6, 2015
@jchris
Copy link
Contributor Author

jchris commented Nov 12, 2015

For the immediate term, we can make the only character not allowed in channel names be ,. In the longer term we can change that but it will require coordination with Couchbase Lite, because the channels filter param is passed as a comma delimited string not a JSON array, in the rest API currently. There are no technical hurdles to changing that, except we need to be able to support the current client behavior for the rest of the release lifetime.

@jchris
Copy link
Contributor Author

jchris commented Nov 13, 2015

Questions:

  • what about the empty string? I'm coding it as invalid.
  • just double checking that we are ok with having these be valid: "*", "**", "a", "a ", "a b", "a*b", "FOO", "123", "-z", "foo_bar", "Éclær", "z7_", "!", "Z∫•", "*!"

@adamcfraser
Copy link
Collaborator

"*" and "!" should be prohibited (used for the admin and public channels), as well as empty string.

There's another ticket that proposes using the ! to identify named public channels (#947), so at minimum we should restrict that as an opening channel name.

Does this change expose us to any type of injection attack when the channel name gets used as part of a view query (e.g. when it's included as part of the startkey in a channel backfill request)?

@jchris
Copy link
Contributor Author

jchris commented Nov 13, 2015

View query keys are any string, so that should be fine. I think reserving the ! and * names is fine, but they are valid today.

I think !example should be valid, also *example and _foo

@jchris
Copy link
Contributor Author

jchris commented Nov 13, 2015

Basically, channel names should be subject to the same restrictions as view keys, (none) but for the short term we have to keep the comma restriction in there until we deprecate the current clients query format

@adamcfraser
Copy link
Collaborator

I guess I should clarify. ! and * are valid channel names (and should continue to be), but they have custom functionality associated with them. The same would be true for !example after #947 gets implemented.

So I guess my concern isn't whether these should be allowed, but the underlying implication of creating channels based on user data. App developers who don't want users to be able to make documents public will need to ensure they aren't allowing docs to be assigned to channel ! (or !example in future). That's not any different than where we are today, but might merit some documentation/recommendations around usage.

@jchris
Copy link
Contributor Author

jchris commented Dec 14, 2015

I think we decided that we wouldn't want to do channel namespaces using special character prefixes, because we want to allow user data as channel names.

I got bit by this again trying to use Stack Overflow api data tags as channel names, but some questions are tagged c#.

@jchris jchris assigned adamcfraser and unassigned jchris Dec 14, 2015
@zgramana zgramana added the docs label Jan 8, 2016
@zgramana zgramana modified the milestones: 1.3, 1.2.0 Jan 8, 2016
@tleyden
Copy link
Contributor

tleyden commented Feb 16, 2016

What are the currently allowed characters in channel names? (pre #1293, since it hasn't been merged)

Specifically are : and - allowed?

@pasin
Copy link

pasin commented Feb 17, 2016

Based on my test, '-' is allowed but not ':'.

@jamescooke
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants