Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MapBox Query with pound sign # has Unexpected Results #43

Closed
dernewt opened this issue Aug 13, 2022 · 5 comments · Fixed by #46
Closed

MapBox Query with pound sign # has Unexpected Results #43

dernewt opened this issue Aug 13, 2022 · 5 comments · Fixed by #46
Assignees
Labels
bug Something isn't working
Milestone

Comments

@dernewt
Copy link

dernewt commented Aug 13, 2022

When calling .BuildGeocodingRequest or .BuildReverseGeocodingRequest with a query that contains a pound sign (#) the resulting URI is truncated at the location of the # and characters after are not included. This is related to how UriBuilder and UriBuilder .Query interact, specifically the constructed URI is fine (line 125) but when UriBuilder.Query is assigned to at 158 it truncates everything after #. There may be other problematic characters related to the construction of URIs.

According to the MapBox api any such punctuation can be passed but it's treated as a word separator:
https://docs.mapbox.com/help/troubleshooting/address-geocoding-format-guide/

One possible fix is to replace all such punctuation with spaces.

A more ambitious fix would be to implement some sort of GeocodingParameters.Query builder pattern which could handle this and also enforce other MapBox query limitations such as a max "word" length of 20 and a max character length of 256. This pattern could also implement the "batch" feature where multiple queries are separated via semicolons (though batch queries also require mapbox.places-permanent)

@JustinCanton JustinCanton self-assigned this Aug 14, 2022
@JustinCanton JustinCanton added the bug Something isn't working label Aug 14, 2022
@JustinCanton JustinCanton added this to the 1.2.0 milestone Aug 14, 2022
@JustinCanton
Copy link
Owner

@dernewt, according to the documentation

A query that uses this secondary address information, as well as any associated special characters like commas (,) and pound symbols (#), will return the same coordinates as a query with this information stripped out.

So this information really has no meaning when it is passed. Could you give me a bit of background on your use case for this?

@dernewt
Copy link
Author

dernewt commented Aug 16, 2022

The background/use case is nothing special, I'm using your library to abstract myself from the specifics of how MapBox works as much as possible. I'm passing through addresses that are variable in their format but are generally "human readable" addresses thus a # didn't really stand out.

Of course now that I read the MapBox docs I'm stripping all non-alpha-numeric as an extra step. I do agree that there must be a line where it's out of scope of the library and I defer that to you. I really think handling # (somehow) is required as it breaks your library. After that if you're filtering some characters it's easy to argue you should filter all of them considering that MapBox doesn't consider them.

Even if you did a full on query builder (which is out of scope of this bug) you'd still need to handle a raw query.

@JustinCanton
Copy link
Owner

I have done some playing around, and I don't really trust MapBox's statement that it doesn't consider them. Using the 4 examples they have on their website:

  • 123 Main St
  • 123 Main St #456
  • 123 Main St, Suite 7
  • 123 Main St, Building A

I get different results. Some of the queries give the same results, but others do not. Which calls into question whether they don't actually consider them and what that means.

The problem with stripping non alpha-numeric characters is that isn't really correct. You can have valid addresses with alpha-numeric characters, such as addresses in Hawaii.

Obviously the pound symbol has a special meaning within uri. Its the fragment identifier. Which is why everything after the pound symbol is moved to the end of the uri after formatting.

The way I see it, there are a couple of options I have:

  1. Strip the pound symbol from requests (as well as any other characters necessary)
  2. Encode the characters to respect uri requirements
  3. Do nothing and leave the requirements of handling and sanitizing data to the consumer

Of these, I think I am leaning towards option 2. I will think about it for a little while and get back to this.

@JustinCanton
Copy link
Owner

Digging into this some more, I have good news and bad news. The good news is that only the query in the url isn't being properly encoded. Meaning as long as I encode that, everything will work fine. The even better news is that none of the other libs aside from MapBox have this issue.

The bad news is about the query string building. Right now, it works correctly. The query string is properly encoded when used, and doesn't need to change. The problem is, the way I am currently building the query string isn't recommended. It has the possibility of an exploit. See the remarks under here.

So the fix you want is quick, but I need to invest a bit of time fixing up the query string building. I am hoping to have this completed soon.

@JustinCanton
Copy link
Owner

@dernewt. An alpha package has been created with the fix for this issue. Geo.MapBox.1.2.0-alpha.4.nupkg. Feel free to grab this and test it out to see if it fixes your issue.

I will be creating a full release version (1.2.0) at some point this weekend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
2 participants