Add instagram scraper #1

Merged
merged 34 commits into from
Jul 27, 2022

Conversation


@Seongbuming Seongbuming commented Jul 27, 2022

Features

  • Scrapes all search results for a keyword passed as an argument.
  • Results can be saved as .csv or .json.
  • Also collects data on the users who uploaded the content included in the search results.
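As an illustration of the .csv/.json output feature, saving could look roughly like the sketch below. The function name `save_records` and the choice to pick the format from the file extension are assumptions made for this example; the PR text does not describe how the library actually writes its output.

```python
import csv
import json
from pathlib import Path
from typing import Any, Dict, List


def save_records(records: List[Dict[str, Any]], output_file: str) -> None:
    """Hypothetical helper: write scraped records as .json or .csv."""
    path = Path(output_file)
    suffix = path.suffix.lower()
    if suffix == ".json":
        with path.open("w", encoding="utf-8") as f:
            # ensure_ascii=False keeps non-ASCII captions readable.
            json.dump(records, f, ensure_ascii=False, indent=2)
    elif suffix == ".csv":
        # Union of all keys, in first-seen order, so that records
        # with missing fields still fit under one header.
        fieldnames = list(dict.fromkeys(k for r in records for k in r))
        with path.open("w", encoding="utf-8", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(records)  # missing keys become empty cells
    else:
        raise ValueError(f"Unsupported output format: {suffix!r}")
```

Nested values such as `user` would be written to CSV as their string representation in this sketch; a real implementation might flatten them first.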

Usage

Install

pip install git+https://github.com/bigpicture-kr/default-scraper.git

Installation may require authentication, since default-scraper is a private repository of the bigpicture-kr organization.

Scrape Instagram content by tag

Run the following command to scrape content from Instagram:

python main.py --platform {instagram} --keyword KEYWORD [--output_file OUTPUT_FILE] [--all]

Use the --all or -a option to also scrape unstructured fields.
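For reference, a parser that accepts the command line shown above could be declared with argparse roughly as follows. This is a sketch inferred from the usage string only: the function name `build_parser`, the help texts, and the default for `--output_file` are assumptions and may differ from the repository's actual main.py.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the usage line in this PR."""
    parser = argparse.ArgumentParser(prog="main.py")
    # The usage string lists instagram as the only platform choice.
    parser.add_argument("--platform", choices=["instagram"], required=True,
                        help="platform to scrape")
    parser.add_argument("--keyword", required=True,
                        help="search keyword (tag)")
    # Optional in the usage string; the default here is an assumption.
    parser.add_argument("--output_file", default=None,
                        help="path of the .csv or .json file to write")
    parser.add_argument("--all", "-a", action="store_true",
                        help="also scrape unstructured fields")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```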

Data description

  • Structured fields
    • pk
    • id
    • taken_at
    • media_type
    • code
    • comment_count
    • user
    • like_count
    • caption
    • accessibility_caption
    • original_width
    • original_height
    • images
  • Some fields may be missing depending on Instagram's response data.
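Because some fields may be absent from Instagram's response, code that consumes the structured fields probably should not assume every key exists. The sketch below shows one way to handle that; `select_structured` is a hypothetical helper written for this example, and filling missing fields with `None` is an assumption rather than the library's documented behavior.

```python
from typing import Any, Dict

# Field names copied from the list above.
STRUCTURED_FIELDS = [
    "pk", "id", "taken_at", "media_type", "code", "comment_count",
    "user", "like_count", "caption", "accessibility_caption",
    "original_width", "original_height", "images",
]


def select_structured(item: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the structured fields; missing ones become None."""
    return {field: item.get(field) for field in STRUCTURED_FIELDS}


# Usage: a partial response item still yields a complete record.
record = select_structured({"pk": 1, "code": "abc", "unlisted_key": "x"})
```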

Future work

  • Crawl tasks will be performed using this library.

@Seongbuming Seongbuming requested a review from k-gn July 27, 2022 07:25
@Seongbuming Seongbuming self-assigned this Jul 27, 2022

@k-gn k-gn left a comment


The beginning of a legend...

@Seongbuming Seongbuming merged commit 0088038 into main Jul 27, 2022
@Seongbuming Seongbuming changed the title from "Add instagram scrapper" to "Add instagram scraper" Jul 29, 2022