GitHub - mrdan/Slurp: Takes an RSS feed (designed for use with Google Reader feeds like "starred items") and downloads the first image found in each entry, generating a companion text file with the source URL and any text from the entry

## Slurp

### About
Takes an RSS feed (designed for use with Google Reader feeds like "starred items") and downloads the first image found in each entry, generating a companion text file with the source URL and any text from the entry.
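
As a rough illustration of the companion text file, here is a minimal sketch in Python 3. The function name `write_companion`, the `.txt` naming scheme, and the file layout (URL, blank line, entry text) are assumptions made for this example and are not taken from slurp.py.

```python
import os

def write_companion(filedir, image_name, source_url, text):
    """Write a text file next to the image holding the source URL and entry text.

    Hypothetical helper: the naming scheme and layout are assumptions.
    """
    # Reuse the image's base name so the two files sort together.
    stem, _ext = os.path.splitext(image_name)
    path = os.path.join(filedir, stem + ".txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(source_url + "\n\n" + text + "\n")
    return path
```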

### Usage
python slurp.py

### Options
There are a few variables you can set near the top of the script:
+ "_filedir" sets the directory to save files in (it must already exist)
+ "userid" is the Google Reader user id; you can get it from the URL on the Google Reader site
+ "retrieveNum" is the number of items to download from the Google Reader feed. The feed applies this limit, not the script
+ "blacklist" is an array of strings. Put in the domains you want to ignore; each one is used in a simple substring check against the URL
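
The options above might look something like the following sketch. The variable names come from this README, but every value shown is a placeholder, and the helper `is_blacklisted` is a hypothetical name for the substring check, not a function known to exist in slurp.py.

```python
# Placeholder values for illustration; the real defaults live in slurp.py.
_filedir = "images"        # directory to save into, must already exist
userid = "00000000000000000000"  # placeholder Google Reader user id
retrieveNum = 20           # number of feed items to request
blacklist = ["doubleclick.net", "feedburner.com"]  # domains to ignore

def is_blacklisted(url, blacklist):
    """Return True if any blacklist entry occurs anywhere in the URL."""
    return any(domain in url for domain in blacklist)
```

Because this is a plain substring check, an entry such as "example.com" would also match "notexample.com", so short entries can block more than intended.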

### Notes
+ Only the first image from each entry is downloaded
+ Images without an extension in the URL are left "as-is", i.e. the downloaded file won't have one either
+ A minimum-size option would be useful, to avoid downloading 1x1 images that exist only for tracking, but this apparently requires installing PIL
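
The first two notes can be sketched in Python 3 as follows. This is an illustration under assumptions, not the code in slurp.py: the names `first_image_url` and `local_filename` are hypothetical, and the regular expression is a simplified stand-in for however the script actually locates images.

```python
import os
import re
from urllib.parse import urlparse

# Simplified pattern for illustration; a real HTML parser is more robust.
IMG_SRC = re.compile(r'<img[^>]+src=["\']([^"\']+)["\']', re.IGNORECASE)

def first_image_url(entry_html):
    """Return the src of the first <img> tag in the entry, or None."""
    match = IMG_SRC.search(entry_html)
    return match.group(1) if match else None

def local_filename(image_url):
    """Use the last path segment as the filename, unchanged.

    If the URL has no extension, the saved file has none either.
    """
    return os.path.basename(urlparse(image_url).path)
```

For example, an entry containing two images yields only the first URL, and a URL ending in `/pics/tracker` produces a file named `tracker` with no extension.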