Goiânia-GO Spider #6

Merged
merged 3 commits into from
Apr 20, 2018

Conversation

jonjoncardoso
Contributor

The landing page for Goiânia's official gazettes (http://www4.goiania.go.gov.br/portal/site.asp?s=775&m=2075) contains links to the list of gazettes for each year, which are easy to extract.

Each of those pages contains just one table with links to every gazette published that year. Example: http://www.goiania.go.gov.br/shtml//portal/casacivil/lista_diarios.asp?ano=2018.
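
To illustrate the extraction described above, here is a minimal standard-library sketch of pulling the links out of a yearly listing page and resolving them against the page URL. The sample markup, the PDF file names, and the `LinkCollector` and `extract_links` helpers are assumptions made for illustration; the real page markup may differ, and the spider in this PR uses Scrapy's `response.urljoin` rather than this code.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs for every <a> element in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only accumulate text while inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html, base_url):
    """Return absolute (url, text) pairs for the links found in html."""
    parser = LinkCollector()
    parser.feed(html)
    return [(urljoin(base_url, href), text) for href, text in parser.links]


# Hypothetical markup standing in for one yearly listing page.
SAMPLE_YEAR_PAGE = """
<table>
  <tr><td><a href="docs/2018/do_20180102.pdf">Diário 02/01/2018</a></td></tr>
  <tr><td><a href="docs/2018/do_20180103_suplemento.pdf">Suplemento 03/01/2018</a></td></tr>
</table>
"""

YEAR_URL = "http://www.goiania.go.gov.br/shtml//portal/casacivil/lista_diarios.asp?ano=2018"
gazette_links = extract_links(SAMPLE_YEAR_PAGE, YEAR_URL)
```

Keeping the link text next to each URL matters here, because the review discussion below relies on that text to tell regular gazettes from supplementary ones.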

url = response.urljoin(url)

#Apparently, Goiânia doesn't have a separate gazette for executive and legislative
power = 'executive'
Contributor

We've been using 'executive_legislature' in this case.

print(date.upper())

#Extra editions are marked either with 'suplemento' or 'comunicado'
is_extra_edition = 'suplemento' in link_text.lower() or 'comunicado' in link_text.lower()
Contributor

This field expects a True or False.

Contributor Author

This actually returns True whenever 'suplemento' or 'comunicado' appears in the link text. Goiânia doesn't have gazettes labelled "extra"; it uses these other names to flag additional gazettes published on the same day.

Contributor

Sorry, my mistake. For some reason I read it as 'suplemento' if link_text.lower() instead of 'suplemento' in link_text.lower().
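
For reference, a minimal sketch of why the expression under discussion already yields a boolean: the `in` operator on strings returns `True` or `False`, and `or` between two such checks returns one of those values. The wrapper function and the sample link texts are hypothetical and only serve to make the snippet runnable on its own.

```python
def is_extra_edition(link_text):
    """Return True when the link text marks a supplementary gazette."""
    text = link_text.lower()
    # `in` on strings yields a bool, so `or` of two such checks is also a bool.
    return "suplemento" in text or "comunicado" in text


# Hypothetical link texts, not taken from the real listing page.
regular = is_extra_edition("Diário Oficial 02/01/2018")
supplement = is_extra_edition("SUPLEMENTO 03/01/2018")
notice = is_extra_edition("Comunicado 04/01/2018")
```

Lowercasing once before both checks keeps the behavior of the original line while avoiding the repeated `link_text.lower()` call.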

Contributor Author

jonjoncardoso left a comment

Changed power of Goiânia's gazettes to executive_legislature

Irio merged commit 531e6a8 into okfn-brasil:master on Apr 20, 2018
alfakini mentioned this pull request on May 24, 2018
trevineju added this to the Capitais | Capital Cities milestone on Oct 10, 2022
trevineju added the spider label ("Adds a scraper bot for municipality(ies)") on Oct 24, 2022
3 participants