Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add optional popularity column to pelias csv importer #80

Merged

Conversation

blackmad
Copy link
Contributor

@blackmad blackmad commented Sep 29, 2020

To make external POIs be at the baseline scoring as OSM venues, they should have some baseline popularity, since many OSM POIs have a popularity of ~2000 which makes it hard for third party chain data to compete with it

Connects #71

Copy link
Member

@Joxit Joxit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we should throw errors for popularity....

const popularityString = getCaseInsensitive('popularity', record);
const popularity = parseInt(popularityString);
if (popularityString && (isNaN(popularity) || !/^\d+$/.test(popularityString))) {
throw new Error("Popularity must be an int, got " + popularityString)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that we should throw errors for popularity.... This is not a required field, so a default value is enough when the data is incorrect.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't recall, does throw-ing here kill the whole importer or does it print a message and skip just this one row?

In general, we've had this problem across all the importers, where different people have one of two needs:

  • they want the import to finish if at all possible, even if one file/record/etc fails
  • they want the import to be as complete as possible, and would rather the entire import fails than some invalid data makes its way in, or some data is missed

Where needed, we've introduced flags like missingFilesAreFatal in the OA importer to toggle between the two preferences.

So, if this throw causes the whole importer to fail, then we need to provide a way to allow invalid records to be skipped. If it prints a warning/error and continues on importing other records, then it's probably fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, looks like it does just skip that one record. Now you all know the background anyway :)

@blackmad
Copy link
Contributor Author

blackmad commented Sep 30, 2020 via email

@Joxit
Copy link
Member

Joxit commented Oct 2, 2020

Hum... I thought it would just print a message on failure (with the try/catch). But in fact there is also a report at the end with stats.badRecordCount++; so this is OK 👌.

orangejulius pushed a commit that referenced this pull request Oct 2, 2020
@orangejulius orangejulius force-pushed the blackmad/add-popularity-to-pelias-csv-importer branch from 943035b to 0cca254 Compare October 2, 2020 17:44
Copy link
Member

@orangejulius orangejulius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some quick docs and this now looks good to go. Thanks!

@orangejulius orangejulius force-pushed the blackmad/add-popularity-to-pelias-csv-importer branch from 0cca254 to 93b063e Compare October 2, 2020 17:46
@orangejulius orangejulius merged commit 0720985 into master Oct 2, 2020
@orangejulius orangejulius deleted the blackmad/add-popularity-to-pelias-csv-importer branch October 2, 2020 17:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants