Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow down taginfo requests #3955

Closed
bhousel opened this issue Apr 10, 2017 · 13 comments
Closed

Slow down taginfo requests #3955

bhousel opened this issue Apr 10, 2017 · 13 comments
Labels
core An issue with one of the core iD components

Comments

@bhousel
Copy link
Member

bhousel commented Apr 10, 2017

We're hitting the taginfo service too frequently and should slow down the requests when the user is typing.

Currently these calls are debounced - but only at 100msec - which seems like it would not be very effective at preventing api calls except for fairly quick typing.

We can delay the calls a bit more, and also consider requiring 3-4 characters of typing before sending a query. We had this before and removed it for some reason - I can't remember why right now.

@bhousel bhousel added the core An issue with one of the core iD components label Apr 10, 2017
@joto
Copy link

joto commented Apr 10, 2017

The biggest problem seems to be that people misuse this feature as a name search. With 50 million names in the database, doing a substring search will never be fast. I am seeing query times of 10s to 20s in the log on these queries. Running several of these at the same time due to the search-as-you-type is really problematic.

Is it possible to cancel running requests when the next character is typed? I am not sure, though, whether the server supports that canceling.

@bhousel
Copy link
Member Author

bhousel commented Apr 10, 2017

I'm not sure why people would ever do a name search. It would only happen if they are using the raw tag editor to add a name, but I guess some people might do it.

@joto Is searching certain keys slower than others? I guess what I am wondering is - should we blacklist, or treat differently, certain keys like name that just have a lot of values and are unlikely to produce useful autocomplete results?

@slhh
Copy link
Contributor

slhh commented Apr 10, 2017

I'm not sure why people would ever do a name search.

Autocompletion of names would be quite useful if iD were searching the right things:

For highways:

  1. Names of connected highways
  2. Names of nearby highways
  3. Names from nearby addr:street

For POI (less important, and more difficult):
Names of same POI type, weighted by quantity in same country and/or region.

It would only happen if they are using the raw tag editor to add a name, but I guess some people might do it.

Intermediate or advanced users do likely edit much more features than new users do, and I would expect that many intermediate or advanced users work with all fields section closed when maintaining feature types which they are used to edit. All fields section doesnt't show all present information, and it doesn't show all supported information at a single view. Working with all tags seems to be much more efficient to use in this case. All field has its benefits for new users, or in case of tagging rare feature types where the user is not aware of the tagging rules.

@joto
Copy link

joto commented Apr 11, 2017

@bhousel Yes, searching is much slower for keys with more values. Just blacklisting name would probably solve a large part of the problem.

@bhousel
Copy link
Member Author

bhousel commented Apr 11, 2017

@bhousel Yes, searching is much slower for keys with more values. Just blacklisting name would probably solve a large part of the problem.

That's good to know. Could you get a list of the "keys with most values" that we might want to blacklist? I looked on taginfo.openstreetmap.org but don't think there is a way to do it - this probably requires a database query.

@althio
Copy link
Contributor

althio commented Apr 11, 2017

get a list of the "keys with most values"

https://taginfo.openstreetmap.org/keys
and sort by column "Values"

@slhh
Copy link
Contributor

slhh commented Apr 11, 2017

https://taginfo.openstreetmap.org/keys
and sort by column "Values"

There are many keys which are unlikely to be edited by iD, these can be blacklisted, but the effect is likely small.

We can sort the remaining keys with many keys into categories how to handle autocompletion

  • No autocompletion required:
    addr:housenumber
    name:en
    phone
    note
    addr:full
  • Use nearby objects (e.g. iD's context) for autocompletion:
    addr:street
    ref
    addr:postcode
  • Search in the destination database for autocompletion:
    wikipedia
    wikidata
    website

Keys like wikipedia have nearly as much values as objects are tagged. It is quite unlikely that a Wikipedia link which is already existing in OSM is desired when tagging a new object. Therefore, searching Taginfo doesn't make sense in this case. We should search Wikipedia itself for existing articles instead.

@1ec5
Copy link
Collaborator

1ec5 commented Apr 12, 2017

We should search Wikipedia itself for existing articles instead.

This is what the Wikipedia field under “All fields” does. Likewise, various address fields autocomplete based on nearby features. Perhaps we could special-case the underlying addr:street, addr:city, addr:state, and addr:postcode tags in the “All tags” section. wikipedia is a little trickier because the tag value is prefixed with a wiki’s language code. (Like the Wikipedia field, it might start out assuming the user’s current language, but not every wiki has every article, which is why the Wikipedia field provides a convenient language switcher.)

@bhousel bhousel added priority A top priority issue that has a big impact or matter most to new mappers wip Work in progress labels Apr 19, 2017
@bhousel
Copy link
Member Author

bhousel commented Apr 19, 2017

Perhaps we could special-case the underlying addr:street, addr:city, addr:state, and addr:postcode tags in the “All tags” section

My plan is to just blacklist the top 100 keys from
https://taginfo.openstreetmap.org/api/4/keys/all?sortname=values_all&sortorder=desc&page=1&rp=100&format=json_pretty

(exception: opening_hours)

I think it's safe to just exclude those keys from value lookup or combobox autocomplete.

bhousel added a commit that referenced this issue Apr 19, 2017
@bhousel bhousel mentioned this issue Apr 19, 2017
5 tasks
@pnorman
Copy link
Contributor

pnorman commented Apr 19, 2017

(exception: opening_hours)

I'd add source, fixme, and note to the exclusion list, and blacklist name:*. I was undecided on addr:city, but I think I'm against blacklisting it. When you've typed in 3-4 letters of most cities you get some meaningful results.

Some of the tags like ref, name, etc benefit from an autocomplete based on what's nearby, but not one based on worldwide tag stats.

@slhh
Copy link
Contributor

slhh commented Apr 19, 2017

I'd also exclude operator and maybe destination, but destination would likely benefit more from an autocomplete based on what's nearby.

Blacklisting 100 keys seems to be much more than required in view of server load, but I agree that autocomletion based on taginfo doesn't make sense for most of these keys.

An estimation of the generated server load seems to be the product of three factors:

  • the number of objects with the key (Probability of the chance to edit the tag)
  • a tag type specific constant (How likely is the tag manually edited? E.g. nearly zero for import specific tags.)
  • a function of the number of values (load of a single request)

@bhousel
Copy link
Member Author

bhousel commented Apr 20, 2017

@pnorman, @slhh, the address fields in the All Fields section do autocomplete based on what's nearby, not taginfo.

As far as I can tell, this change would only affect the raw tag editor. I don't think any of the other top 100 fields are things that we use combo boxes for in the All Fields section.

Anyway I don't want to overthink this. @joto says that "searching is much slower for keys with more values." so iD should just avoid doing searches like that.

@bhousel
Copy link
Member Author

bhousel commented Apr 20, 2017

done in #3975

@bhousel bhousel closed this as completed Apr 20, 2017
@bhousel bhousel removed the wip Work in progress label Apr 20, 2017
@bhousel bhousel removed the priority A top priority issue that has a big impact or matter most to new mappers label Jul 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core An issue with one of the core iD components
Projects
None yet
Development

No branches or pull requests

6 participants