Add donna hay #1153

mlduff · 2024-06-19T03:50:16Z

Resolves #1150

No schema support. Most scraper functions are supported, except the time-related ones.

Worked on collaboratively by me, @heathrampazis, @a1831319 and @Mooree003.

jayaddison · 2024-06-24T18:43:00Z

Thanks all! And apologies for taking a while here; I plan to review this within the next 24h or so.

jayaddison

This is looking pretty good! I have two requests after reading through the code:

* Could we try retrieving the recipe `title` from one of the other elements on the page, and filtering out the pipe (`|`) and subsequent content from that? I think that would make for more readable recipe titles.
* To confirm that the `ingredient_groups` functionality works as expected, could we add another test case for a recipe that involves ingredient groupings?

a1831319 · 2024-06-25T13:16:37Z

> To confirm that the `ingredient_groups` functionality works as expected, could we add another test case for a recipe that involves ingredient groupings?

Can do, I'll use https://www.donnahay.com.au/recipes/snacks-and-sides/smoky-eggplant-dip-with-hand-cut-potato-chips as the target if that works?

jayaddison · 2024-06-25T14:33:38Z

Sounds good - thanks, @a1831319!

jayaddison · 2024-07-08T14:01:53Z

* To confirm that the `ingredient_groups` functionality works as expected, could we add another test case for a recipe that involves ingredient groupings?

Resolved, thank you @a1831319 @mlduff!

* Could we try retrieving the recipe `title` from one of the other elements on the page, and filtering out the pipe (`|`) and subsequent content from that?  I think that would make for more readable recipe titles.

This isn't completely resolved yet - could we use the HTML `<title>` element, or `og:title` from a meta tag, to retrieve the recipe title instead?

jayaddison · 2024-07-21T10:46:08Z

recipe_scrapers/donnahay.py

@@ -38,7 +38,7 @@ def site_name(self):
        return "Donna Hay"

    def title(self):
-        return self.soup.find("h1", class_="recipe-title__mobile").text
+        return self.soup.find("title").text.split("|")[0].strip()


Suggested change

-        return self.soup.find("title").text.split("|")[0].strip()
+        html_title = self.soup.find("title")
+        recipe_title, _, _ = html_title.text.partition("|")
+        return recipe_title.strip()

Edit: call `str.partition` instead of `str.rpartition`.

Ah, nope, not quite correct: `rpartition` would return an empty first element when `|` is not found in the string, so the recipe title would come back empty.

(Updated to use `str.partition` instead.)
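The difference between the two methods can be seen in a short, self-contained sketch. The sample title strings are made up for illustration (they are not taken from the Donna Hay site), and `first_segment` is a hypothetical helper name rather than anything in recipe-scrapers:

```python
def first_segment(text: str) -> str:
    """Return the text before the first "|", stripped of surrounding whitespace."""
    head, _, _ = text.partition("|")
    return head.strip()


# Hypothetical <title> contents, for illustration only.
with_pipe = "Smoky Eggplant Dip | Donna Hay"
without_pipe = "Smoky Eggplant Dip"

# partition: when the separator is missing, the whole string is the FIRST element.
assert without_pipe.partition("|") == ("Smoky Eggplant Dip", "", "")

# rpartition: when the separator is missing, the whole string is the LAST element,
# so taking the first element would yield an empty title.
assert without_pipe.rpartition("|") == ("", "", "Smoky Eggplant Dip")

assert first_segment(with_pipe) == "Smoky Eggplant Dip"
assert first_segment(without_pipe) == "Smoky Eggplant Dip"
```

With `partition`, a page title that lacks the pipe separator still yields the full title, which is the fallback behaviour the suggestion relies on.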

jayaddison

Looks good to me! Thank you @a1831319 @heathrampazis @mlduff @Mooree003!

Ready to merge once the merge conflict in `__init__.py` is resolved; the `str.partition` usage suggestion is optional.

jayaddison · 2024-08-01T14:19:05Z

recipe_scrapers/donnahay.py

+# mypy: allow-untyped-defs
+


This pull request is generally ready I think - just some merge conflicts to resolve.

There's a small cleanup opportunity here too - after #1174 we don't need these `allow-untyped-defs` mypy directives, so this can be removed from the file header.

jayaddison · 2024-08-01T14:21:34Z

tests/test_data/donnahay.com.au/donnahay_1.json

@@ -0,0 +1,62 @@
+{


Please note: we've begun checking for a preferred ordering of the JSON key names (not alphabetical; more like priority/review-aid based).

After merging recent changes into your branch, one of the unit tests may begin complaining about the JSON files because of that. There is however a script provided that can automatically fix them -- running `python scripts/reorder_json_keys.py` should do that for you.
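For readers unfamiliar with the idea, priority-based key ordering can be sketched roughly as follows. This is not the contents of `scripts/reorder_json_keys.py`; the `PRIORITY` list and the `reorder_keys` helper are hypothetical, chosen only to illustrate placing known keys first and leaving the remaining keys in their existing order:

```python
import json

# Hypothetical priority order, for illustration only; the project's real
# ordering may differ.
PRIORITY = ["author", "title", "ingredients", "instructions"]


def reorder_keys(data: dict) -> dict:
    """Return a copy of data with PRIORITY keys first, others in original order."""
    rank = {key: index for index, key in enumerate(PRIORITY)}
    # sorted() is stable, so keys sharing the fallback rank keep their order.
    ordered = sorted(data, key=lambda key: rank.get(key, len(PRIORITY)))
    return {key: data[key] for key in ordered}


# Made-up sample data, not taken from the test fixtures in this pull request.
sample = {"instructions": "Mix.", "image": "x.jpg", "title": "Dip", "host": "example.com"}
print(json.dumps(reorder_keys(sample), indent=2))
```

In practice, running the provided script is the supported route; the sketch only shows why the resulting order is neither alphabetical nor arbitrary.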

mlduff and others added 16 commits June 10, 2024 12:32

Initial class generation

9de6a98

Added functionality to scrap instructions

b7d3f4a

Added ingredients and ingredient grouping functions.

27105e9

Fixed servings sentence removal

4145fb4

Merge pull request #3 from heathrampazis/add-donna-hay-instructions

3512f47

Added Functionality to Scrape Instructions

image

f795fba

Title and Yield

6db5816

keywords and removing some redundant functions

7c5c5ee

all relevant fields captured

0b9caa6

updated yield method

5260fae

author method

3af0ab8

Merge pull request #5 from Mooree003/add-donna-hay

38899e8

Add donna hay

Merge pull request #4 from a1831319/add-donna-hay-ingredients

ba115e2

Added Functionality to Scrape Ingredients

Add test data and make minor adjustments

0329599

Use normalize_string and fix yields

3a4720b

Update readme

8cee672

jayaddison requested changes Jun 25, 2024


Mooree003 and others added 4 commits June 25, 2024 23:11

title method updated

12069c8

update test case

1715574

Added additional test to test the ingredient groupings function.

5a8d9d4

readd class identifier for title

9df73a5

mlduff and others added 5 commits July 3, 2024 12:05

Merge pull request #6 from Mooree003/add-donna-hay

70eec07

Update based on feedback

Update donnahay_2.json

3d23385

Update donnahay_1.json

97468b8

Merge branch 'add-donna-hay' of https://github.com/a1831319/recipe-sc…

edb8d39

…rapers-mlduff into add-donna-hay

Merge pull request #8 from a1831319/add-donna-hay

c7ff76e

Additional tests for testing the ingredient groups

a1831319 and others added 2 commits July 10, 2024 21:22

Updated recipe name acquisition and tests

d8603e5

Merge pull request #9 from a1831319/add-donna-hay

7fe672b

Retrieve recipe names from title element

jayaddison reviewed Jul 21, 2024


jayaddison approved these changes Jul 21, 2024


jayaddison reviewed Aug 1, 2024

