Scrapybara Python Library

The Scrapybara Python library provides convenient access to the Scrapybara API from Python.

Installation

pip install scrapybara

Reference

A full reference for this library is available in reference.md.

Requirements

Python >= 3.8
requests >= 2.25.1
anthropic ^0.39.0
pydantic ^2.0.0

License

This project is licensed under the MIT License - see the LICENSE file for details.

Usage

Instantiate and use the client with the following:

from scrapybara import Scrapybara

client = Scrapybara(
    api_key="YOUR_API_KEY",
)
client.start()

Async Client

The SDK also exports an async client so that you can make non-blocking calls to our API.

import asyncio

from scrapybara import AsyncScrapybara

client = AsyncScrapybara(
    api_key="YOUR_API_KEY",
)


async def main() -> None:
    await client.start()


asyncio.run(main())
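To illustrate why non-blocking calls are useful, here is a minimal sketch that awaits several calls concurrently with asyncio.gather. It does not call the Scrapybara API: fake_start is a hypothetical stand-in for an awaitable SDK call such as client.start(), and the delays are made up for the example.

```python
import asyncio
import time
from typing import List


async def fake_start(name: str, delay: float) -> str:
    """Hypothetical stand-in for an awaitable API call such as client.start()."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"{name} started"


async def start_all() -> List[str]:
    # gather() runs the coroutines concurrently and returns
    # the results in the same order as the arguments.
    return await asyncio.gather(
        fake_start("a", 0.1),
        fake_start("b", 0.1),
        fake_start("c", 0.1),
    )


begin = time.perf_counter()
results = asyncio.run(start_all())
elapsed = time.perf_counter() - begin
print(results, f"{elapsed:.2f}s")
```

Because the three waits overlap, the total time should be close to the longest single delay rather than the sum of all three.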

Exception Handling

When the API returns a non-success status code (4xx or 5xx response), a subclass of the following error is raised.

from scrapybara.core.api_error import ApiError

try:
    client.start(...)
except ApiError as e:
    print(e.status_code)
    print(e.body)
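One way to act on status_code is to branch on its range. The sketch below is self-contained and does not import the SDK: the ApiError class here is a local stand-in that only mirrors the status_code and body attributes shown above, and describe_error is a hypothetical helper, not part of the library.

```python
from typing import Any, Optional


class ApiError(Exception):
    """Local stand-in for scrapybara.core.api_error.ApiError (illustration only)."""

    def __init__(self, status_code: Optional[int] = None, body: Any = None) -> None:
        super().__init__(f"status_code: {status_code}, body: {body}")
        self.status_code = status_code
        self.body = body


def describe_error(error: ApiError) -> str:
    """Map an error's status code to a coarse category (hypothetical helper)."""
    code = error.status_code
    if code is None:
        return "unknown"
    if code == 429:
        return "rate limited"
    if 400 <= code < 500:
        return "client error"
    if code >= 500:
        return "server error"
    return "unexpected"


try:
    # Simulated failure; with the real SDK this would be a call like client.start(...)
    raise ApiError(status_code=429, body={"detail": "slow down"})
except ApiError as e:
    print(describe_error(e), e.body)
```

With the real SDK, the same try/except shape applies; only the import and the call inside the try block change.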

Advanced

Retries

The SDK retries requests automatically with exponential backoff. A request is retried as long as it is deemed retriable and the number of retry attempts has not exceeded the configured retry limit (default: 2).

A request is deemed retriable when any of the following HTTP status codes is returned:

408 (Request Timeout)
429 (Too Many Requests)
5XX (Server Errors)

Use the max_retries request option to configure this behavior.

client.start(..., request_options={
    "max_retries": 1
})
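The retry rule described above can be sketched in a few lines. This is an illustration of the general technique, not the SDK's actual implementation: call_with_retries, is_retriable, and the 0.5 second base delay are assumptions made for the example, and send is a hypothetical callable that returns an HTTP status code.

```python
import time
from typing import Callable

RETRIABLE_STATUS_CODES = {408, 429}


def is_retriable(status_code: int) -> bool:
    """Return True for 408, 429, and any 5XX status code."""
    return status_code in RETRIABLE_STATUS_CODES or 500 <= status_code <= 599


def call_with_retries(
    send: Callable[[], int],
    max_retries: int = 2,
    base_delay: float = 0.5,  # assumed value, chosen for the example
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it returns a non-retriable status or retries run out."""
    attempt = 0
    while True:
        status = send()
        if not is_retriable(status) or attempt >= max_retries:
            return status
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * (2 ** attempt))
        attempt += 1


# Simulate two retriable failures followed by a success.
responses = iter([503, 429, 200])
delays = []
status = call_with_retries(lambda: next(responses), sleep=delays.append)
print(status, delays)
```

Passing sleep as a parameter keeps the sketch easy to test: the demo records the delays instead of actually waiting.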

Timeouts

The SDK defaults to a 60-second timeout. You can configure this with the timeout option at the client level or the timeout_in_seconds request option at the request level.

from scrapybara import Scrapybara

client = Scrapybara(
    ...,
    timeout=20.0,
)


# Override timeout for a specific method
client.start(..., request_options={
    "timeout_in_seconds": 1
})
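The relationship between the two levels can be pictured as a simple precedence rule. The sketch below assumes the request-level value wins over the client-level value, which in turn wins over the 60 second default; that ordering is a reading of the text above rather than a description of the SDK's internals, and effective_timeout is a hypothetical helper.

```python
from typing import Optional, TypedDict


class RequestOptions(TypedDict, total=False):
    """Subset of the request options used in this sketch."""

    timeout_in_seconds: int
    max_retries: int


DEFAULT_TIMEOUT = 60.0  # seconds, the default stated above


def effective_timeout(
    client_timeout: Optional[float] = None,
    request_options: Optional[RequestOptions] = None,
) -> float:
    """Pick the most specific timeout: request, then client, then default."""
    if request_options is not None and "timeout_in_seconds" in request_options:
        return float(request_options["timeout_in_seconds"])
    if client_timeout is not None:
        return client_timeout
    return DEFAULT_TIMEOUT


print(effective_timeout())
print(effective_timeout(client_timeout=20.0))
print(effective_timeout(client_timeout=20.0, request_options={"timeout_in_seconds": 1}))
```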

Custom Client

You can override the httpx client to customize it for your use-case. Some common use-cases include support for proxies and transports.

import httpx
from scrapybara import Scrapybara

client = Scrapybara(
    ...,
    httpx_client=httpx.Client(
        # httpx 0.26 and later use proxy=; earlier releases used proxies=
        proxy="http://my.test.proxy.example.com",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
)

Contributing

While we value open-source contributions to this SDK, this library is generated programmatically. Additions made directly to this library would have to be moved over to our generation code; otherwise, they would be overwritten by the next generated release. Feel free to open a PR as a proof of concept, but know that we will not be able to merge it as-is. We suggest opening an issue first to discuss it with us!

On the other hand, contributions to the README are always very welcome!
