Contemplate strategy for duplicate bills #196

Open
reginafcompton opened this issue Dec 29, 2017 · 3 comments

Comments

@reginafcompton
Contributor

Recently, a Chicago bill changed its identifier (i.e., its MatterFile) from "CL 2017-966" to "`CL 2017-966". The scraper treated the changed identifier as a new bill and created a second record with a different OCD ID, even though every other field was unchanged.

Do we want to consider adding a mechanism to clean unusual characters from the identifier?
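
For discussion, here is a minimal sketch of what such a cleaning step might look like. `clean_identifier` is a hypothetical helper, not an existing function in pupa or in our scrapers. It assumes that valid identifiers contain only ASCII letters, digits, spaces, and hyphens, which would need checking against real data from each jurisdiction before adopting anything like it.

```python
import re

# Hypothetical helper for illustration; not part of pupa or the scrapers.
# Assumption: a valid identifier contains only ASCII letters, digits,
# spaces, and hyphens. Anything else is treated as noise and dropped.
_DISALLOWED = re.compile(r"[^A-Za-z0-9 \-]")
_WHITESPACE = re.compile(r"\s+")


def clean_identifier(raw):
    """Strip unexpected characters and normalize spacing in an identifier."""
    without_noise = _DISALLOWED.sub("", raw)
    # Collapse runs of spaces and trim the ends so that spacing
    # differences alone do not produce a "new" identifier.
    return _WHITESPACE.sub(" ", without_noise).strip()


print(clean_identifier("`CL 2017-966"))  # prints: CL 2017-966
```

With something like this applied before the identifier is saved, "`CL 2017-966" and "CL 2017-966" would compare as equal. Whether that alone prevents a duplicate depends on how the importer matches existing bills, which this sketch does not address.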

@hancush
Collaborator

hancush commented Jan 3, 2018

FWIW, this seems to be in play here as well: datamade/nyc-council-councilmatic#87

@hancush
Collaborator

hancush commented Jan 3, 2018

And perhaps here: datamade/chi-councilmatic#227

@hancush
Collaborator

hancush commented Jan 3, 2018

Leaving these links here for posterity, though they may, in fact, be more related to opencivicdata/pupa#295
