Scrapes content from a website, uses AI to paraphrase it, and then posts it on a blogging site.

prxshetty/AI-BlogBot

Blog Bot with AI Paraphraser

Fetches content from an API, or scrapes it from a website with BeautifulSoup, and paraphrases it with Pegasus.

Dependencies

  • BeautifulSoup
  • requests
  • urllib
  • torch
  • requests_html
  • transformers
  • Pegasus
  • Python 3

Functionality:

Fetching Headlines: The script fetches the headlines from the chosen category of the chosen website and displays each one with its index so that it can be selected.
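The headline listing described above can be sketched as follows. This is a minimal illustration, not the repository's code: it assumes headlines are links nested inside h1, h2, or h3 tags, and it uses the standard library's html.parser in place of BeautifulSoup so that it runs without extra installs. The names HeadlineParser, extract_headlines, format_headlines, and SAMPLE_HTML are hypothetical.

```python
from html.parser import HTMLParser


class HeadlineParser(HTMLParser):
    """Collects (title, href) pairs for links nested inside heading tags.

    Assumption: headlines are <a> elements inside <h1>, <h2>, or <h3>.
    Real sites differ, so the tag set usually needs adjusting.
    """

    HEADING_TAGS = {"h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.headlines = []       # list of (title, href) tuples
        self._in_heading = False  # True while inside a heading tag
        self._href = None         # href of the link being read, if any
        self._text = []           # text fragments of the current link

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._in_heading = True
        elif tag == "a" and self._in_heading:
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace inside the link text.
            title = " ".join("".join(self._text).split())
            if title:
                self.headlines.append((title, self._href))
            self._href = None
        elif tag in self.HEADING_TAGS:
            self._in_heading = False


def extract_headlines(html):
    """Return the (title, href) pairs found in an HTML string."""
    parser = HeadlineParser()
    parser.feed(html)
    return parser.headlines


def format_headlines(headlines):
    """Prefix each headline title with its index for selection."""
    return [f"{i}: {title}" for i, (title, _) in enumerate(headlines)]


# Hypothetical sample page used only for illustration.
SAMPLE_HTML = """
<h2><a href="/tech/first-story">First story</a></h2>
<p>Teaser text</p>
<h2><a href="/tech/second-story">Second   story</a></h2>
<a href="/about">About</a>
"""

if __name__ == "__main__":
    for line in format_headlines(extract_headlines(SAMPLE_HTML)):
        print(line)
```

Links outside a heading, such as the "About" link in the sample, are ignored. With BeautifulSoup, the same idea would likely be written with find_all over the heading tags.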

Fetching Article Content: The user selects a headline by its index, and the script fetches the content of the corresponding article and paraphrases or summarizes it.
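A rough sketch of the selection and content-extraction step is shown below, under the assumption that article text lives in p tags. It again uses the standard library's html.parser rather than BeautifulSoup, and it leaves out the network request and the paraphrasing call. ParagraphParser, select_headline, and extract_paragraphs are hypothetical names, not functions from the repository.

```python
from html.parser import HTMLParser


class ParagraphParser(HTMLParser):
    """Collects the text of every <p> element (assumed to hold the article body)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None while outside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            # Join fragments (including text of inline tags) and collapse whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None


def select_headline(headlines, index):
    """Return the (title, href) pair at index, rejecting out-of-range input."""
    if not 0 <= index < len(headlines):
        raise IndexError(f"index {index} is out of range (0 to {len(headlines) - 1})")
    return headlines[index]


def extract_paragraphs(html):
    """Return the non-empty paragraph texts found in an HTML string."""
    parser = ParagraphParser()
    parser.feed(html)
    return parser.paragraphs
```

The explicit range check matters because Python would otherwise accept a negative index and silently return a headline counted from the end of the list.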

How to Use:

Install the required libraries using pip:

  • pip install beautifulsoup4 requests requests-html torch transformers pegasus

Pegasus may depend on additional packages; install those as well.

Set the base_url parameter to the website you want to scrape, then set the relative_url parameter to the category you want. Run the script and choose a headline by its index, and the script will fetch and display the article content.
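As an illustration of how base_url and relative_url might combine into the category page address, here is a small sketch using urllib.parse.urljoin from the standard library. The helper name build_category_url and the example.com addresses are assumptions; the repository may join the two values differently.

```python
from urllib.parse import urljoin


def build_category_url(base_url, relative_url):
    """Join a site's base URL and a category path into one URL.

    Normalizing the slashes first keeps urljoin from discarding the last
    path segment of base_url or treating relative_url as root-relative.
    """
    return urljoin(base_url.rstrip("/") + "/", relative_url.lstrip("/"))


if __name__ == "__main__":
    # Placeholder values; substitute the site and category you want to scrape.
    print(build_category_url("https://example.com", "technology"))
```

The slash handling is a design choice: a plain urljoin("https://example.com/news", "/technology") would drop the "news" segment, which is rarely what a category lookup intends.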

Note

  • The paraphrased content might not always be accurate, and manual review might be necessary, depending on the element you extract data from.
  • You need to provide the API for the site to which your blog will be uploaded.
  • This script is for educational and demonstration purposes only. Ensure compliance with any website's terms of service when using its content.
  • The Pegasus model used for paraphrasing needs to be fine-tuned for better results in production scenarios.
  • In this project, I had to split the data into chunks and further divide it into paragraphs to maximize accuracy, so that the AI was less likely to hallucinate.
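The chunking mentioned in the note could look roughly like the sketch below. It is one possible reading, not the project's actual code: it assumes paragraphs are separated by blank lines and that each paragraph is grouped into sentence-aligned chunks under a word limit before being passed to the model. The function names and the max_words default of 60 are hypothetical.

```python
import re


def split_into_paragraphs(text):
    """Split text on blank lines and collapse whitespace inside each paragraph."""
    return [" ".join(p.split()) for p in re.split(r"\n\s*\n", text) if p.strip()]


def chunk_paragraph(paragraph, max_words=60):
    """Group whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words is kept intact as its own chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", paragraph)
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def chunk_text(text, max_words=60):
    """Return one list of chunks per paragraph, preserving paragraph order."""
    return [chunk_paragraph(p, max_words) for p in split_into_paragraphs(text)]
```

Keeping chunks aligned to sentence boundaries means the model never receives a sentence cut in half, which is one plausible way to limit distorted output on long articles.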
