
[on hold] add hard distance filter to short focus.point queries #1215

Closed
wants to merge 1 commit

Conversation

@orangejulius (Member) commented Oct 19, 2018

Background

Short autocomplete inputs are very difficult to serve in a performant and low-latency way. With shorter inputs, many more documents match for just about any input string.

In our testing, one to three character input texts generally match up to 100 million documents out of a 560 million document full planet build.

There's really no way to make scoring 100 million documents fast, so in order to achieve acceptable performance (ideally, <100ms P99 latency), it's worth looking at ways to either avoid querying Elasticsearch entirely or reduce the scope of autocomplete queries.

Short autocomplete queries without a focus.point parameter can be cached. There are only 47,000 possible 1-3 character alphanumerical inputs. At this time, caching is outside the scope of Pelias itself but can easily be implemented with Varnish, Nginx, Fastly, Cloudfront, and lots of other tools and services.

Queries with a focus.point, however, are effectively uncacheable, since the coordinate chosen will often be unique.

Changes

This PR is a prototype of using the focus.point coordinate to build a hard filter limiting the search to documents only within a certain radius of the coordinate. This can reduce the number of documents searched and improve performance, while still returning results that are useful.

As it stands, the code essentially takes two values into consideration:

  • the input text length below which results will be filtered (currently 4)
  • the base filter distance (in kilometers) to be used, which is multiplied by the text length to get the actual filter distance.

In other words, results are currently filtered as follows:

| text length | max distance |
| ---- | ---- |
| 1 | 50km |
| 2 | 100km |
| 3 | 150km |
| 4+ | unlimited |

These parameters will eventually be configurable.
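
For illustration, here's a minimal sketch of that logic, following the shape of the change in this PR (`vs` is the query VariableStore and `clean` holds the sanitized request parameters; the constant and function names are placeholders, not final API):

// Sketch only: maps input text length to a hard boundary circle radius.
// FILTER_TEXT_LENGTH and BASE_DISTANCE_KM stand in for the values that
// will eventually come from configuration.
const FILTER_TEXT_LENGTH = 4; // inputs with fewer characters get a hard filter
const BASE_DISTANCE_KM = 50;  // multiplied by the text length

function setFocusDistanceFilter(vs, clean) {
  const hasFocus = clean.hasOwnProperty('focus.point.lat') &&
                   clean.hasOwnProperty('focus.point.lon');

  if (!hasFocus || clean.text.length >= FILTER_TEXT_LENGTH) {
    return; // no focus point, or the input is long enough to skip filtering
  }

  vs.set({
    'boundary:circle:lat': clean['focus.point.lat'],
    'boundary:circle:lon': clean['focus.point.lon'],
    'boundary:circle:radius': `${BASE_DISTANCE_KM * clean.text.length}km`
  });
}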

Testing, thoughts, and benchmarks appreciated.

@orangejulius (Member, Author) commented Oct 19, 2018

Here's some benchmark data showing p50 and p99 latency for autocomplete queries of various input lengths as it stands now.

[benchmark chart: p50 and p99 autocomplete latency by input text length]

@missinglink (Member) commented:

This looks great; we should add one more commit to make the constants 4 and 50 variable via a config value.

Another option to consider alongside this code is to add a layer filter which excludes the address layer up to a certain character count; since addresses are ~90% of the index, it would have a huge impact.

Maybe we can enable that for one-token autocomplete queries regardless of whether they have a focus point or not.

@orangejulius (Member, Author) commented:

Yeah, I'm thinking about how we could make this even more powerful in the future.

I think the first thing we need to do is create a layer or other identifying property unique to airports, and then we could build logic something like this:

  • filter all records in layers other than admin and airports (in other words: address, street, venue, custom) that exceed the focus.point distance currently calculated in this PR

That would allow us to always return admin areas and airports from anywhere, as well as nearby records of all types. That would be the ultimate filter I think, and would allow us to use it on character lengths well beyond 2.
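
For illustration only, a rough sketch of what that filter could look like as an Elasticsearch bool clause. The `layer` and `center_point` fields exist on Pelias documents, but the admin layer list and the 150km radius below are placeholders:

// Sketch: always allow admin layers (and eventually airports),
// but only allow other layers within the focus.point radius.
const layerAwareFilter = {
  bool: {
    minimum_should_match: 1,
    should: [
      // admin-ish layers pass regardless of distance (list is illustrative)
      { terms: { layer: ['locality', 'region', 'country'] } },
      // everything else must be near the focus point
      {
        geo_distance: {
          distance: '150km', // e.g. 50km * text length, as in this PR
          center_point: { lat: 40, lon: -70 }
        }
      }
    ]
  }
};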

Thoughts?

@orangejulius orangejulius force-pushed the focus-point-hard-distance-filter branch from 701f4e9 to c020889 Compare November 3, 2018 15:54

This PR uses the `focus.point` coordinate to build a
hard filter limiting the search to documents only within a certain
radius of the coordinate. This can reduce the number of documents
searched and improve performance, while still returning results that are
useful.

It takes two parameters, driven by `pelias-config`:

- `api.autocomplete.focusHardLimitTextLength`: the maximum length of text
for which a hard distance filter will be constructed
- `api.autocomplete.focusHardLimitMultiplier`: the length of the input
text will be multiplied by this number to get the total hard filter
radius in kilometers.

For example, with `focusHardLimitTextLength` 4, and
`focusHardLimitMultiplier` 50, the following hard filters would be
constructed:

| text length | max distance (km) |
| ---- | ---- |
| 1 | 50 |
| 2 | 100 |
| 3 | 150 |
| 4+ | unlimited |
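
A hedged example of how those keys could look in pelias.json; the nesting is inferred from the dotted names above and may not match the final implementation:

{
  "api": {
    "autocomplete": {
      "focusHardLimitTextLength": 4,
      "focusHardLimitMultiplier": 50
    }
  }
}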
@orangejulius orangejulius force-pushed the focus-point-hard-distance-filter branch from c020889 to 52ec00a Compare November 3, 2018 15:59
@orangejulius orangejulius changed the title WIP: add hard distance filter to short focus.point queries [on hold]add hard distance filter to short focus.point queries Nov 5, 2018
@orangejulius orangejulius changed the title [on hold]add hard distance filter to short focus.point queries [on hold] add hard distance filter to short focus.point queries Nov 5, 2018
vs.set({
  'boundary:circle:lat': clean['focus.point.lat'],
  'boundary:circle:lon': clean['focus.point.lon'],
  'boundary:circle:radius': `${50 * clean.text.length}km` // hardcoded 50km base distance
});
A member reviewed this diff and commented:

Shouldn't 50 be `distanceMultiplier`?

@missinglink (Member) commented Jun 17, 2019

After our discussion last week, we should totally revive this PR!

We discussed varying the geo filter based on layer (which can get a bit messy when dealing with filter combinations), but it could actually be much simpler and more powerful if based on population and popularity.

An example ES query would be something like this (note: it is in the filter clause, so it's non-scoring and cacheable):

{
    "query": {
        "bool": {
            "filter": [
                {
                    "bool": {
                        "minimum_should_match": 1,
                        "should": [
                            {
                                "range": {
                                    "population": {
                                        "gte": 1000000
                                    }
                                }
                            },
                            {
                                "range": {
                                    "popularity": {
                                        "gte": 1000000
                                    }
                                }
                            },
                            {
                                "geo_distance": {
                                    "distance": "200km",
                                    "center_point": {
                                        "lat": 40,
                                        "lon": -70
                                    }
                                }
                            }
                        ]
                    }
                }
            ]
        }
    }
}

There are two variables which can be varied depending on the length of the text: one is the population/popularity threshold, and the other is the geo distance (only applied when a focus point is provided).

I think it would be a fairly simple task to create a new view which considers the length of the input text and sets these variables accordingly.
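
As a rough sketch (not a working implementation), such a view might look something like this. The function shape and variable names follow pelias-query conventions as I understand them, and the thresholds are made up for illustration:

// Sketch of a pelias-query style view: builds a non-scoring filter clause
// whose thresholds depend on the input text length. All numbers are illustrative.
function popularityOrProximityFilter(vs) {
  const textLength = vs.var('input:name').get().length;
  const minScore = textLength <= 3 ? 1000000 : 5000;

  const should = [
    { range: { population: { gte: minScore } } },
    { range: { popularity: { gte: minScore } } }
  ];

  // only add a geo_distance clause when a focus point was supplied
  if (vs.isset('focus:point:lat') && vs.isset('focus:point:lon')) {
    should.push({
      geo_distance: {
        distance: `${200 * textLength}km`,
        center_point: {
          lat: vs.var('focus:point:lat').get(),
          lon: vs.var('focus:point:lon').get()
        }
      }
    });
  }

  return { bool: { minimum_should_match: 1, should: should } };
}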

One other thing we discussed last week, before I forget, was that we don't currently ask for a 'zoom' with our focus point, so it's impossible to know if the user is zoomed out looking at the whole of Australia or zoomed in on a small town in the center.

So I think the distance parameter steps should be fairly large; at the largest, it could cover an area as large as 1/5 of the planet (or thereabouts), which would still offer a massive performance benefit.
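
(For rough scale: Earth's surface is about 510 million km², so 1/5 of the planet is roughly 100 million km²; a circular geo_distance filter covers πr², so that corresponds to a radius of about √(100,000,000/π) ≈ 5,600 km.)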

@missinglink (Member) commented:
It would also be great if it were easy to configure the 'steps' and the conditions under which they apply; input length might not be the only factor to consider.

We should also emit a warning in the geojson response to let users know that a filter has been applied and which variables were used.

@orangejulius (Member, Author) commented:
This was a little too strict in its filtering, and will be replaced either by #1323 or another filtering improvement PR

orangejulius added a commit that referenced this pull request Jul 19, 2019
This change uses an extra filter clause that takes into account a focus
point (if used) and the input text length to filter out very far away
street and address records, improving performance.

Replaces #1215
jsvrcek added a commit to EventKit/pelias-api that referenced this pull request Oct 8, 2019
* chore(package): update ciao to version 2.0.0

* chore(ciao): Update ciao tests

Amazingly in over a year since running these, only two things had to be
changed.

* fix(errors): Only return 400 status on known parameter errors

Pelias has for a long time returned 400 as a default status whenever
anything goes wrong, as well as when a user has passed invalid
parameters.

By using a new exception class, it is now possible to differentiate
between known parameter errors, and unexpected errors that truly
represent an HTTP 500.

* feat(trimByGranularity): refactor code to add logging and debug output, modify loop to break after first match

* feat(trimByGranularity): refactor to use an index instead of multiple loops over the results

* feat(address_search_using_ids): add postcode to variable store if we have it

* Update pelias-query to 9.7.0

This is required for the other half of postalcode matching

* fix(log): Log for each interpolation request

The groundwork for this was laid in pelias#1220, but it wasn't completed.

By logging information about each individual interpolation request (a
single API response may require many calls to the interpolation service
if several streets were returned from a query), we can get a good
overview of the interpolation service's performance.

* feat(log): Add full clean context to parser logs

Without the unmodified text input, it's hard to tell what cases are
causing these messages.

* feat(max_character_count_layer_filter): filter layer=address records for character counts <= 2

* feat(autocomplete): drive address filter by config

* feat(unit_number_only_queries): do not consider parses with only the unit number as "admin only"

* feat(libpostal_numeric_inputs): recast entirely numeric parses to label "house"

* feat(focus.point): use exponential instead of linear scoring

Linear scoring, by design, gives all records the same score past a
certain point.

This has the disadvantage that identical records that are very far away
cannot be sorted by distance.

By using exponential scoring, we can achieve decent sorting of even very
far away records. This is very helpful for cities and postalcodes.

Connects pelias#1206

* fix(autocomplete focus.point): Use 50km scale parameter

The `scale` parameter controls how quickly scores decrease from the
maximum as the distance from the `center_point` to the record in
question increases.

Set this to 50km, which is the same as search.

Connects pelias#1206
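
For reference, a hedged sketch of an exponential decay function on the focus point; `center_point` is the Pelias schema field, while the decay and weight values here are illustrative:

// Sketch: exponential distance decay around the focus point. Unlike linear
// decay, the score keeps shrinking as distance grows, so even very distant
// records with identical names remain sortable by proximity.
const focusDecayFunction = {
  exp: {
    center_point: {
      origin: { lat: 40, lon: -70 }, // the focus.point
      scale: '50km',                 // matches the scale used by search
      decay: 0.5                     // illustrative
    }
  },
  weight: 2 // illustrative
};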

* Use test description consistent with fixture file

* feat(test): add test cases for autocomplete token matching permutations

* feat(refactor): refactor addressit sanitizer for readability

* feat(refactor): refactor addressit sanitizer - bugfix

* Update service/configurations/Interpolation.js

Change Language to Interpolation to fix copy/paste error.

* feat(docs): update gitter link

Most of our chatting happens in pelias/pelias, not pelias/api

* chore(package): update dependencies

* travis: reflect new best-practice travis-ci configuration

Travis-CI has made, or will shortly make in early December 2018, a number of beneficial
changes to their Linux continuous integration testing infrastructure.

Changes that impact pelias/api are:
* Linux infrastructure combined into one (virtualized), from two previously
  (virtualized and container-based). [0][1]
* Offering a more modern, supported Ubuntu Xenial (16.04 LTS) [2]

Projects using "sudo: false" (container-based infrastructure) have been
recommended to remove that configuration soon. In any case, the transition
will happen regardless for all projects by December 7, 2018.

[0] https://blog.travis-ci.com/2018-10-04-combining-linux-infrastructures
[1] https://blog.travis-ci.com/2018-11-19-required-linux-infrastructure-migration
[2] https://docs.travis-ci.com/user/reference/xenial/

* Add boundary.wof parameter to search, search/structured and autocomplete endpoints

* fix(dedupe_query_size_1): check size when deduping

* feat(autocomplete_empty_tokens): remove empty tokens

* Change name from boundary.wof to boundary.gid and update param format

* Add boundary.gid to /reverse endpoint

* Add filter function for Placeholder results

* Cleanup

* feat(params): Add boundary.gid parameter

Connects pelias#1249

* feat(stable_sort): fix stable sorting on node 11.6.0

* feat(dedupe): improved matching across languages

* feat(dedupe): consider the user agent language in deduplication

* feat(dedupe_improved_parent_matching): only check parent fields above the layer of the least granular match

* feat(inter_layer_deduping): dedupe between layers and prefer some layers over others

* feat(dedupe_improved_parent_matching): do not compare parent items at same level, use explicit list of synonymous layers

* feat(dedupe_improved_parent_matching): add Vientiane test case

* fix typos

* fix(dedupe): Compare parent layers above lowest record

This expands the layers checked for hierarchy difference to be any layer
higher than the _lower_ of the two records being compared. Previously,
it was any layer higher than the _higher_ of the two records (another way
to think about it: it was only comparing hierarchy elements that BOTH
records have).

This fixes the issue where the city of Pennsylvania, Illinois was being
considered a duplicate of the state of Pennsylvania.

* feat(query) Set cutoff_frequency in all queries

* feat(addendum): display addendum data in geojson

* docs: Add link to Pelias services documentation

The rules for when different Pelias services are required or recommended
are now too complex to fit in the API readme. Fortunately we have it all
written down in one place.

* fix(package): update addressit to version 1.7.0

* feat(search): enable fallback queries when placeholder returns 0 results

* feat(admin_ngram): query ngram fields for admin matching on autocomplete

* fix(jshint): change syntax to silence jshint error on commit

* feat(admin_ngrams): revert ccb9bc8

* fix(logging): Remove log of every query sent to addressit

This is pretty verbose as it's fairly common, and we can determine what
queries were sent to addressit in other ways.

* fix(logging): Clean up coarse reverse error log

It was logging both info and error level messages. While not commonly
called, this seems unnecessary.

* fix(logging): allow access log to be empty

This permits disabling access logs, which may or may not be useful in
different use cases.

* correct -> incorrect in documentation

This error completely changed the meaning of a line in documentation

* feat(deps): Remove `tmp` devDependency

This package hasn't been used since
33341b7, which was removed in favor of
the [microservice-wrapper](https://github.com/pelias/microservice-wrapper).

Closes pelias#1266

* feat(admin_ngrams): Restore admin ngrams functionality

This restores the functionality from pelias#1259
to allow autocomplete on partial administrative area names.

It's a solid improvement, and we are waiting to merge only to give the
community time to update their schema settings.

This reverts commit 43a4bed.

* feat(deps): Remove unused source-map dependency

Like pelias#1269, this dev dependency was
not being used. We added it long ago to try to fix issues with resolving
NPM dependencies, but those issues haven't been seen recently.

* fix(package): update check-types to version 8.0.0

* feat(categories): enable filter for autocomplete, show categories in geojson even when categories param is blank

* feat(categories): gracefully handle a mixture of valid and invalid categories

* Allow size parameter for autocomplete endpoint.

* Add support for boundary circle to autocomplete.

* Make sanitizer aware of boundary.circle.

* feat(multi boundary.country): Support for multi `boundary.country` query parameter

* fix(query): add code comments and sanity check to text_parser_addressit

* feat(performance): add single_token_address_filter sanitizer

* feat(performance): update single_token_address_filter to consider only layers relevant to clean.sources

* feat(performance): rename single_token_address_filter to _address_layer_filter, add numeric check, emit warning

* feat(performance): use parsed_text input to determine token length where available

* fix(package): update pelias-config to version 4.0.0

Closes pelias#1289

* feat(localNamingConventions): fix for ESP street number formatting

* Allow postalcode + admin area queries

We currently have very strict logic for which combinations of inputs the
fallback search query supports. Any input that is out of alignment with
those expectations will be deferred to the lower quality addressit query
system.

This changes some of that expectation logic. Where previously it ignored
inputs with a postcode and any other field, now a postcode in
combination with any administrative fields are allowed.

This is so, for example, a user can query for '90210, California' and
hopefully see postalcodes called 90210 that are in California first in
the results.

Obviously postalcodes are complex so there might be a few cases that
don't work, but it should help. As a bonus, it will remove fragility and
complexity from our query generation.

* feat(cutoff_frequency): enable cutoff_frequency for multi_match queries

* feat(docs): Add README section for custom sources/layers

This documentation exists only in the notes for a pull request, which
inevitably will get out of date over time and is not as readily
accessible.

Connects pelias#1131

* Add Romania to flipped street number countries.

* fix(package): update geolib to version 3.0.0

* Use new function names in geolib 3

* Update distance test expectation for precision changes in geolib 3

* fix(naming conventions): Colombian addresses should be flipped

Fixes pelias#972

* Add comment explaining purpose of helper/placeTypes.js

It's a list of _administrative_ placetypes.

* fix(search_with_ids): Skip query when not address related

The search_with_ids query (in some places more accurately referred to as
the `address_search_with_ids` query), can really ONLY return addresses
or related results such as streets, cities, etc. It will never return
venues.

However, if the user passed `layers=venue` as a query param, this query
was still executing and if any results were returned, then a non-venue
record would be returned as a result.

* feat(translation) Translate name to request language when available

* fix(package): update async to version 3.0.1

* feat(debug_population_popularity): display population & popularity in geojson when debug flag enabled

* feat(improved_es_debugging): display more information about the elasticsearch request when in debug mode

* fix(improved_es_debugging): remove unused statement

* feat(max_text_length): limit max text length to 140 characters

* feat(auto_discover_type_mapping): discover sources & layers by querying elasticsearch

* feat(auto_discover_type_mapping): remove _replica_first preference

* fix(predicate): Fix address related layers detection

This fixes an error in the logic from
pelias#1307.

The `address_search_with_ids` query can only return results in the
address or street layers. However, the predicate logic was allowing the
query to execute if _any_ admin layer was specified, which could allow
such a query to return street or address layers when not intended.

* feat(controller): Remove search_with_ids controller

I was attempting to use the new debug output from
pelias#1313 only to realize it was not added
to the `search_with_ids` controller, only the `search` controller.

These two controllers are very similar, so this PR combines them into
one that can be used in all cases.

* feat(query): Use more consistent names

The search endpoint uses multiple query types that are executed one
after the other: if the first does not return acceptable results, the
next is executed, and so on.

The last query in this list used to be the only query the search
endpoint had. Partially because it's been around for a long time, it is
referred to in several different ways across different parts of the
code.

This change unifies all of them so that the query is called
`search_addressit` everywhere.

* chore(package): update pelias-query to version 9.13.0

* feat(query) use full match_phrase query

This updates the fixture expectations for
pelias/query#100 which uses the full
`match_phrase` query format everywhere for consistency and to allow
adding future parameters easily.

* chore(package): update pelias-query to version 9.14.0

* feat(query): Remove deprecated optimize_bbox param

This parameter of the `geo_distance` query was deprecated in
Elasticsearch 2.2

Connects pelias/query#102

* chore(tests): Fix test description

* fix(package): update elasticsearch to version 16.0.0

* chore(tests): Ensure test ES documents have proper format

`source_id` should (currently) match `_id`, and should be present.

* feat(geojsonify): Use `source_id` rather than `_id` field

This change does not have any impact on the output of the API, but
allows us to make changes to the contents of the `_id` field in order to
correct the issue described in pelias/pelias#672

* fix(interpolation): Use street source_id if none found in interpolated result

This fix ensures the interpolation middleware always returns results
with a `source_id` field, fixing two issues:

1. It ensures the `source_id` property is always present in Pelias API
responses
2. It allows `helper/geojsonify.js` to rely on `source_id`

* feat(place): Query for old and new contents of `_id` field

_This change completes step 2 of the `_id` field contents migration
described in pelias/pelias#672 (comment) _

As part of the transition away from Elasticsearch types, we must change
the contents of the `_id` field in our Elasticsearch documents.

This change modifies the `/v1/place` endpoint so that it queries
Elasticsearch for documents based on both the old and new contents of
the `_id` field, allowing the same API code to work regardless of the
version of Pelias used to import data.

Connects pelias/pelias#672

* feat(auto_discover_type_mapping): update README

* fix(auto_discover_type_mapping): bugfix to allow existing references to the type mapping to continue functioning after update

* fix(auto_discover_type_mapping): update README

* Netherlands use 'street number'

The localized results of Pelias for the Netherlands are in the wrong order. The Dutch way is `street number`, not `number street`, e.g. `damstraat 1`.

* chore(tests): Ensure fixture name is in test output

This makes it faster to update the correct fixture

* feat(search): Use minimum_should_match to filter hits

This change utilizes the [minimum_should_match](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/query-dsl-minimum-should-match.html)
Elasticsearch query parameter to reduce the number of hits to search
queries.

Previously, only one token in an input had to match, regardless of the
input size. This allows for queries to potentially match a huge number
of documents, especially as the number of tokens grows.

Now, most `ngrams` queries have the parameter set to `1<-1 3<-25%`

Breaking this down, it means that for the given number of tokens in a
query, there is a certain number of optional tokens:

| token count | optional tokens |
| --- | --- |
| 1   | 0 (obviously) |
| 2   | 1   |
| 3   | 1   |
| 4   | 1   |
| 5   | 1   |
| 6+  | at least 75% of tokens must match |

This should help ensure quality results are returned where possible,
even if long inputs contain a few extraneous bits of information, while
reducing the chance of extremely expensive queries.
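
As a hedged illustration, an `ngrams` subquery with that setting might look like this; the field name and analyzer are typical of Pelias queries, but treat the exact values as illustrative:

// Sketch: a single token must always match; for 2-3 tokens one may be
// missing; for more than 3 tokens, up to 25% (rounded down) may be missing,
// i.e. at least ~75% must match.
const ngramsSubquery = {
  match: {
    'name.default': {
      query: 'union square san francisco',
      analyzer: 'peliasQuery',
      minimum_should_match: '1<-1 3<-25%'
    }
  }
};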

* feat(autocomplete): focus point distance filter

This change uses an extra filter clause that takes into account a focus
point (if used) and the input text length to filter out very far away
street and address records, improving performance.

Replaces pelias#1215

* feat(peliasQueryAnalyzer): replace usage of peliasQueryPartialToken and peliasQueryFullToken analyzers with the new peliasQuery analyzer

* feat(dedupe): Prefer OSM records over Geonames

In the case of a venue, we should prefer OSM records over Geonames.

* feat(geocodeJSON): remove geocoding.query.tokens array

* fix(package): update check-types to version 9.0.0

* feat(elasticsearch): do not reuse config for elasticsearch client

* Handle signals in a Docker friendly way

Similar to pelias/placeholder#160, this small
change allows `ctrl+C` to immediately and gracefully shut down a docker
container started with `docker run pelias/api`.

It's a nice user experience improvement for people running our
containers on the command line.

* chore(Node.js): Add Node.js 12 to CI

Node.js 12 (and especially Node.js 12.9) is supposed to be significantly
faster than previous versions, especially when it comes to parsing JSON, which
is often a bottleneck for Pelias importers.

https://nodejs.org/en/blog/release/v12.9.0/
https://v8.dev/blog/v8-release-76

Connects pelias/pelias#800

* feat(flip_housenumber_street): update config to cover more country codes where the label format is "{{street}} {{housenumber}}"

* fix(package): update pelias-model to version 6.0.0

* chore(tests): Update tests for GID changes

Connects pelias/pelias#672

* feat(deps): Move from joi to hapi/joi

This just keeps the output of `npm install` free of warnings

* fix(package): update pelias-model to version 7.0.0

* chore(tests): Update test expectations for pelias-model-7.0.0

* fix(package): update check-types to version 10.0.0

* feat(140_chars): restrict autocomplete inputs to 140 chars

* chore(test): Remove all instances of `_type` from ES docs

This will help ensure that in the future no code depends on the `_type`
value from Elasticsearch.

Connects pelias/pelias#719

* chore(travis): Pin to Node.js 12.10

Pending resolution of issues that may be related to the `geolib` module.

See manuelbieh/geolib#208

* feat(parser): replace addressit with pelias native parser

* feat(parser): improved postfix cursor position for text with no admin classification

* feat(parser): pelias/parser improvements

* feat(parser): bump pelias/parser version

* feat(parser): bump pelias/parser version

* feat(parser): bump pelias/parser version

* feat(parser): bump pelias/parser version

* feat(parser): updates to tokenizer sanitizer

* typo

* feat(parser): stricter tokenization of exact matching admin queries

* feat(parser): switch to using multi_match for admin subqueries

* feat(admin_subqueries): test cross_fields query

* feat(admin_subqueries): test operator:and query

* feat(admin_subqueries): set all boosts to 1

* feat(admin_subqueries): add locality_a and country_a to multi_match

* feat(admin_subqueries): revert to operator:or

* feat(admin_subqueries): remove cutoff_frequency

* feat(admin_subqueries): move admin matching to MUST condition

* feat(tokenizer): consider query as complete if the final char is a numeral

* feat(autocomplete): test removing exact_matching subquery

* feat(admin_subqueries): add cutoff_frequency

* feat(pelias_parser): admin queries - remove subject from admin subquery

* feat(admin_subqueries): remove cutoff_frequency

* feat(autocomplete): use phrase index for complete tokens

* feat(parser): remove parsed_text.name

* feat(parser): do not consider address parses as safe to use with an ngrams index due to parses potentially containing partial suffixes

* feat(parser): bump pelias/parser version

* feat(tokenizer): consider query as complete if the $subject is not at the end of $clean.text

* feat(parser): bump pelias/parser version

* feat(autocomplete): experiment adding name.default to admin multi_match

* feat(autocomplete): progress commit

* feat(autocomplete): typo

* feat(autocomplete): improved matching at the cusp

* feat(autocomplete): improved performance and reduced noise for admin matching

* feat(deps): bump parser dep version

* feat(deps): bump parser dep version

* feat(deps): bump parser dep version

* test: disable parserConsumedAllTokens for admin parses

* feat(deps): bump parser dep version

* feat(query): add should subquery for cross_street matching

* feat(logging): add summary logging for pelias parser

* feat(deps): bump parser dep version

* feat(pelias_parser): additional parser tests

* feat(deps): bump parser dep version

* feat(pelias_parser): fix tests

* feat(search_addressit): generate cross_street subquery where available

* feat(pelias_parser): limit input text to 140 characters

* feat(pelias_parser): replace peliasQueryPartialToken analyzer with peliasQuery

* feat(pelias_parser): disable "ngrams_last_token_only_multi" view when every "completed" token is numeric

* Add context to pelias parser logs

* Pin to pelias-parser-1.38.0 for now

* refactor(pelias_parser): add code comments relating to "add_name_to_multimatch", clean up related code

* refactor(pelias_parser): remove disused code/comments

* feat(pelias_parser): completely remove "addressit" and references to it

* refactor(pelias_parser): remove references to "original style queries"

* chore(Travis): Use string to specify minor version

Otherwise 12.10 is interpreted as a number and becomes version 12.1.

* chore(query): Remove empty pop_subquery view

`pelias-query` now includes basic views for standard Elasticsearch
queries, including `match_all`. The API essentially created a wrapper
around the `match_all` view, which was used in some cases and resulted
in an extra query view that didn't do much.

This change removes that boilerplate code and makes it a bit more clear
what the underlying functionality of a query will be.

* chore(tests): Remove unused fixture

* chore(query): Remove unused function parameter

At first glance, this code made it appear that the phrase query was
being used across the focus point filter functionality on libpostal
based search queries, but it turns out it's not!

The `focus_only_function` view constructor takes no parameters, so this
code wasn't having any effect.

* updated with latest api

* fix(local_naming_conventions): improve localNamingConventions middleware to support records with name/street aliases

* fixed lint errors

* added back placeholder polygon

* fixed geometry results