Releases: assafelovic/gpt-researcher
Detailed reports 🤯🚀
Super excited to share the latest release, with major contributions from the one and only @proy9714 👏
Introducing long and detailed reports, with a completely new architecture inspired by the latest STORM paper.
In this method we do the following:
- Trigger Initial GPT Researcher report based on task
- Generate subtopics from research summary
- For each subtopic the headers of the subtopic report are extracted and accumulated
- For each subtopic, a report is generated, making sure that information under the headers accumulated so far is not regenerated.
- An additional introduction section is written along with a table of contents constructed from the entire report.
- The final report is constructed by appending these: Intro + Table of contents + Subsection reports
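The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the function names (`build_detailed_report`, `extract_headers`) and the callables passed in are hypothetical stand-ins for the LLM-backed steps, and the table of contents here is simply built from the accumulated headers.

```python
from typing import Callable, List


def extract_headers(report: str) -> List[str]:
    """Collect markdown-style headers (lines starting with '#') from a report."""
    return [
        line.lstrip("#").strip()
        for line in report.splitlines()
        if line.startswith("#")
    ]


def build_detailed_report(
    task: str,
    initial_report: Callable[[str], str],           # hypothetical: runs the initial research
    generate_subtopics: Callable[[str], List[str]],  # hypothetical: summary -> subtopics
    subtopic_report: Callable[[str, List[str]], str],  # hypothetical: (subtopic, seen headers) -> report
    write_intro: Callable[[str], str],               # hypothetical: task -> introduction
) -> str:
    # Step 1: initial report for the task.
    summary = initial_report(task)
    # Step 2: subtopics derived from the research summary.
    subtopics = generate_subtopics(summary)

    seen_headers: List[str] = []
    sections: List[str] = []
    for subtopic in subtopics:
        # Steps 3-4: each subtopic report receives the headers accumulated so far,
        # so it can avoid covering the same ground again.
        section = subtopic_report(subtopic, list(seen_headers))
        seen_headers.extend(extract_headers(section))
        sections.append(section)

    # Step 5: introduction plus a table of contents built from all headers.
    intro = write_intro(task)
    toc = "\n".join(f"- {header}" for header in seen_headers)
    # Step 6: Intro + Table of contents + Subsection reports.
    return "\n\n".join([intro, toc, *sections])
```

Passing the accumulated headers into each subtopic call is what lets later sections skip material that earlier sections already covered.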
In addition, this release includes support for Azure OpenAI (@norrisp90).
Gemini and Docx support! 🎉
New embeddings, pdf styling and newspaper3k support 👨👨👦👦
Another great release thanks to the amazing community! ❤️
Big shoutout to the following contributions:
@proy9714 for adding newspaper3k support for better article scraping #365
@jimmylin0979 for adding support for additional embeddings such as Mistral, Ollama and HuggingFace #375
@assafelovic for adding support for PDF styling of research reports #396
@WarrenTheRabbit for fixing a documentation typo #391
Thank you to everyone, and looking forward to more contributions!
Stability and Poetry support 🎉
Excited to introduce the latest version, which removes strict dependencies from requirements.txt, fixes some installation issues, and adds support for virtual environments and Poetry!
Big shoutout to contributors @aaaastark for the PR: #319
Quick installation fix
Releasing a new version that resolves dependency issues with the latest version.
New features and stability 🎉
Excited to kick off the new year with a long-awaited feature: research reports on specific URLs! 🎉
You can now skip the search by providing URLs directly to GPTResearcher and create a research report like so:
from gpt_researcher import GPTResearcher
import asyncio

urls = ["https://docs.tavily.com/docs/tavily-api/introduction",
        "https://docs.tavily.com/docs/tavily-api/python-sdk",
        "https://docs.tavily.com/docs/tavily-api/rest_api"]
query = "How can I integrate Tavily Rest API with my application?"

async def get_report(query: str, source_urls: list) -> str:
    # Skip the web search and research only the given URLs
    researcher = GPTResearcher(query=query, source_urls=source_urls)
    report = await researcher.run()
    return report

report = asyncio.run(get_report(query, urls))
print(report)
The release includes additional stability and performance improvements, along with updated library dependencies.
Performance boost 🚀
Excited to release the latest version aimed at improving overall research performance! 🎉
We're introducing a new approach to extracting relevant information from scraped sites using Contextual Compression. We now leverage embeddings to better store and retrieve information across the research lifecycle. This latest improvement reduces research task time by an average of 60%, increases quality by ~30% and reduces GPT costs by 50% (we don't summarize with GPT anymore).
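As a rough illustration of the idea, the sketch below keeps only the scraped chunks that are similar enough to the query, instead of summarizing every page with an LLM. It is an assumption-laden toy, not the project's code: `embed` is a bag-of-words stand-in for a real embedding model, and the `compress` helper and its `threshold` value are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def compress(query: str, chunks: List[str], threshold: float = 0.2) -> List[str]:
    """Keep only the chunks whose similarity to the query reaches the threshold."""
    query_vector = embed(query)
    return [c for c in chunks if cosine(query_vector, embed(c)) >= threshold]
```

With a real embedding model in place of `embed`, the same filter would pass only the relevant passages on to the report-writing step, which is where the time and cost savings would come from.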
In addition, thank you to our amazing contributors:
@reasonmethis for the serp retriever fix: #261
@devon-ye for the Chinese README addition: #254
Mega refactor for optimized modularity 🎉🎆
We’re excited to release the next generation of GPT Researcher! We’ve completely refactored the code base to be more modular, customizable, stable and accurate. We’ve added many new features, improvements and bug fixes.
Below is a list of the main changes:
- Improved report generation prompt for better accuracy and quality.
- Added a new config structure including support for external JSON files.
- Redesigned the GPT Researcher library so it can be used as a standalone agent in any project (see example in repo).
- Added new structure for retrievers, enabling a better developer experience for adding and modifying information retrievers.
- Optimized configuration for latest GPT-4 Turbo model.
- Fixed issues with scraping and improved overall stability and speed.
- Added support for scraping arXiv and PDF URLs.
- Added a rich documentation site with Docusaurus (see docs.tavily.com).
- Updated all Python packages to their latest versions for up-to-date performance and experience.
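To illustrate what an external JSON config can look like in practice, here is a minimal sketch that layers a JSON file over built-in defaults. The option names and values in `DEFAULTS` and the `load_config` helper are hypothetical and chosen only for illustration; the project's real config keys may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

# Hypothetical defaults; the real project's option names may differ.
DEFAULTS: Dict[str, Any] = {
    "retriever": "tavily",
    "max_search_results": 5,
    "report_format": "markdown",
}


def load_config(path: Optional[str] = None) -> Dict[str, Any]:
    """Return the defaults, overridden by any keys found in the JSON file."""
    config = dict(DEFAULTS)
    if path is not None:
        overrides = json.loads(Path(path).read_text(encoding="utf-8"))
        config.update(overrides)
    return config
```

A file containing `{"retriever": "searx"}` would then change only that one setting and leave the other defaults untouched.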
Just git pull the latest version and give it a run!
Next, we’re building embedding support and long term memory!
To see what’s next, check out our roadmap here: https://trello.com/b/3O7KBePw/gpt-researcher-roadmap
GPT-4 Turbo Integration 🚀
Excited to connect GPT Researcher with the latest GPT-4 Turbo (128K context window). This takes the research agent to a new level: it can feed much more retrieved online information into RAG and generate more detailed and comprehensive reports.
Added features for improved research quality
We're excited to announce the latest release of GPT Researcher! This release adds several powerful search engines for improved overall research quality and experience.
- Tavily API - An LLM-focused search engine for optimized, explicit and factual results. It's free to use (!)
- Serp API - Requires paid account
- Google search API - Requires paid account
- Searx - Meta search engine.
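One way to picture switching between the search engines listed above is a small registry keyed by name. The sketch below is hypothetical and does not reflect the project's actual retriever classes: `register`, `get_retriever`, and the stub functions are illustrative names, and the stubs return placeholder data instead of calling any search API.

```python
from typing import Callable, Dict, List

# In this sketch a retriever is any callable mapping a query to a list of results.
Retriever = Callable[[str], List[dict]]

RETRIEVERS: Dict[str, Retriever] = {}


def register(name: str) -> Callable[[Retriever], Retriever]:
    """Decorator that stores a retriever in the registry under the given name."""
    def decorator(fn: Retriever) -> Retriever:
        RETRIEVERS[name] = fn
        return fn
    return decorator


@register("tavily")
def tavily_stub(query: str) -> List[dict]:
    # Placeholder: a real retriever would call the search API here.
    return [{"source": "tavily", "query": query}]


@register("searx")
def searx_stub(query: str) -> List[dict]:
    # Placeholder: a real retriever would query a Searx instance here.
    return [{"source": "searx", "query": query}]


def get_retriever(name: str) -> Retriever:
    """Look up a retriever by name, failing loudly on unknown names."""
    try:
        return RETRIEVERS[name]
    except KeyError:
        raise ValueError(f"Unknown retriever: {name!r}") from None
```

With this shape, choosing a search engine reduces to a single configured name, for example `get_retriever("tavily")("my query")`.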
This release also includes performance improvements and bug fixes.