Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add grimgrains.com #846

Merged
merged 6 commits into from
Sep 15, 2023
Merged

add grimgrains.com #846

merged 6 commits into from
Sep 15, 2023

Conversation

shantih19
Copy link
Contributor

Add scraper for https://grimgrains.com

Copy link
Collaborator

@jayaddison jayaddison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me, thanks @shantih19

One annoying detail that isn't specific to this pull request, but a problem with the codebase at the moment: we seem to have a mix of types in the instructions field results.

In the past it had always returned a newline-separated (single) string value, but nowadays we have a few scrapers that return a list of strings.

I should raise a separate issue thread about that and figure out what to do This has been addressed by pull request #852. I'll merge this soon either way - this is mostly a heads-up that that field is a bit ambiguous type-wise at the moment.

@shantih19
Copy link
Contributor Author

shantih19 commented Sep 11, 2023

I see, if you prefer, I can change the format to the newline-separated string

@jayaddison
Copy link
Collaborator

I see, if you prefer, I can change the format to the newline-separated string

Yep, let's do that. I think the first step will be to make everything consistent, so that'll mean one less method to remember to update.

It's slightly annoying that mypy doesn't seem to detect this automatically. I guess if there's no type hint on the subclass methods, it treats them as untyped (the abstract class does define a return type...).

@jayaddison
Copy link
Collaborator

Relevant mypy bug seems to be: python/mypy#3903

@shantih19
Copy link
Contributor Author

I'll add the None return when the string is empty (like the edits in #852)

@jayaddison
Copy link
Collaborator

I see, if you prefer, I can change the format to the newline-separated string

Yep, let's do that. I think the first step will be to make everything consistent, so that'll mean one less method to remember to update.

This consistency mini-project should be complete: all of the scrapers in the codebase now return values of string type (str) from the instructions method.

Copy link
Collaborator

@jayaddison jayaddison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove the description method here - apart from that, this looks good to go 👍

self.assertEqual("4 servings", self.harvester_class.yields())

def test_image(self):
self.harvester_class.url = "https://grimgrains.com/site/okonomiyaki.html"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note-to-future: this url attribute is set because we return a fully-qualified image URL, and when the tests run (on HTML, outside of a URL context) the base URL is test.example.com, which looks wrong.

Copy link
Collaborator

@jayaddison jayaddison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @shantih19 - this looks great, and ready for merge. Thanks for your patience with the review and main-branch code edits during this!

@jayaddison jayaddison merged commit 4a1359f into hhursev:main Sep 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants