Scraper is unreliable: Some pages are not found at times. #75

Open
juhamust opened this issue Jan 2, 2025 · 0 comments

juhamust commented Jan 2, 2025

Description

The number of pages and records the scraper finds and processes varies greatly between subsequent runs against the same site.

Steps to reproduce

Reproduce the issue using the following repository, or see the screenshot below:
https://github.com/juhamust/docusaurus-typesense-search

Screenshot from subsequent runs:

[image]

Expected Behavior

The number of records should remain consistent.
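
To make this expectation checkable, here is a minimal sketch of how two runs could be compared. It assumes the list of page URLs each run indexed has been collected somehow (for example from the scraper's log output); the function names and sample paths are hypothetical and not part of the scraper or Typesense.

```python
def is_consistent(record_counts):
    """Return True if every run reported the same number of records."""
    return len(set(record_counts)) <= 1


def diff_runs(run_a, run_b):
    """Return the page URLs that appear in only one of two scraper runs."""
    a, b = set(run_a), set(run_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
    }


if __name__ == "__main__":
    # Illustrative placeholder paths, not real scraper output.
    first_run = ["/docs/intro", "/docs/setup", "/docs/usage"]
    second_run = ["/docs/intro", "/docs/usage"]

    print(is_consistent([len(first_run), len(second_run)]))
    print(diff_runs(first_run, second_run))
```

Diffing the URL sets rather than only comparing totals would show which pages go missing, which may help tell a crawling problem apart from an indexing problem.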

Actual Behavior

The number of pages and records differs between subsequent runs, and some pages are not found at times.

Metadata

Both the Typesense server and the scraper run in Docker containers on macOS. The target is a fairly vanilla Docusaurus website, built and served using Docusaurus.

Typesense Scraper Version: typesense/docsearch-scraper:0.11.0
Typesense Version: typesense/typesense:27.1
