felipefontoura/pyzerox-demo

PyZerox Demo

Python License: MIT

A demonstration project showcasing the use of the PyZerox library for processing PDF documents with GPT-4 models.

🚀 Features

  • PDF document processing
  • Integration with GPT-4 models
  • Asynchronous operation
  • Configurable page selection
  • Custom system prompts support

📋 Prerequisites

  • Python 3.7+
  • OpenAI API key
  • Internet connection for accessing remote PDF files

🔧 Installation

  1. Clone the repository:
git clone https://github.com/felipefontoura/pyzerox-demo.git
cd pyzerox-demo
  2. Install dependencies:
pip install -r requirements.txt
  3. Set your OpenAI API key in a .env file.
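
The README does not say how the .env file is read, so the following is only a minimal sketch of one way to do it with the standard library. The helper name `load_env_file` is hypothetical, and the variable name `OPENAI_API_KEY` is an assumption (it is the name OpenAI clients conventionally look for; check the repository for the name it actually expects).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy KEY=VALUE pairs from a .env-style file into os.environ.

    Returns the list of keys that were set. Variables that already exist
    in the environment are left untouched.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return []
    loaded = []
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding whitespace and optional quotes around the value.
        value = value.strip().strip("'\"")
        if key and key not in os.environ:
            os.environ[key] = value
            loaded.append(key)
    return loaded


# Example use (assumed variable name, not confirmed by this README):
#   load_env_file()
#   if "OPENAI_API_KEY" not in os.environ:
#       raise SystemExit("Set OPENAI_API_KEY in .env first")
```

A .env file for this sketch would contain a single line of the form `OPENAI_API_KEY=<your key>`. Keep that file out of version control.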

🎯 Usage

The main script demonstrates how to process a PDF file using PyZerox:

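import asyncio

from pyzerox import zerox


async def main():
    # zerox is a coroutine, so it has to be awaited inside an async function.
    result = await zerox(
        file_path="your_pdf_url",  # placeholder: URL or local path to a PDF
        model="gpt-4o-mini",
        output_dir="./tmp",  # directory for output files
    )
    return result


if __name__ == "__main__":
    # asyncio.run starts an event loop, runs main() to completion,
    # and returns whatever main() returned.
    result = asyncio.run(main())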
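
This README does not describe what `result` contains. As a hedged sketch, the helper below assumes the returned object exposes a `pages` list whose items carry a `page` number and a `content` string; those attribute names are an assumption to verify against the installed py-zerox version. The function `pages_to_markdown` and the stand-in object are hypothetical and exist only for illustration.

```python
from types import SimpleNamespace


def pages_to_markdown(result, separator="\n\n"):
    """Join per-page text from a zerox-style result into one string.

    Assumes `result.pages` is a list of objects with `page` and `content`
    attributes. Pages are sorted by page number; empty pages are skipped.
    """
    pages = getattr(result, "pages", None) or []
    ordered = sorted(pages, key=lambda p: getattr(p, "page", 0))
    chunks = [p.content for p in ordered if getattr(p, "content", None)]
    return separator.join(chunks)


# Stand-in object for illustration only. A real run would pass the value
# returned by `await zerox(...)` instead.
fake_result = SimpleNamespace(pages=[
    SimpleNamespace(page=2, content="## Page two"),
    SimpleNamespace(page=1, content="# Page one"),
])
print(pages_to_markdown(fake_result))
```

Reading attributes through `getattr` with defaults keeps the helper from failing outright if the result shape differs from what is assumed here.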

⚙️ Configuration

The following parameters can be configured:

  • file_path: URL or local path to the PDF file
  • model: The GPT model to use (default: "gpt-4o-mini")
  • output_dir: Directory for output files
  • custom_system_prompt: Optional custom system prompt
  • select_pages: Optional page selection (None for all pages)
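
The parameters above can be combined in a single call. The sketch below collects them in one place; `build_zerox_kwargs` and `run` are hypothetical helpers, not part of PyZerox, and the value format shown for `select_pages` (a list of page numbers) is an assumption that this README does not confirm.

```python
def build_zerox_kwargs(
    file_path,
    model="gpt-4o-mini",
    output_dir="./tmp",
    custom_system_prompt=None,
    select_pages=None,
):
    """Collect the parameters listed above into one keyword-argument dict."""
    if not file_path:
        raise ValueError("file_path must be a URL or a local path")
    return {
        "file_path": file_path,
        "model": model,
        "output_dir": output_dir,
        "custom_system_prompt": custom_system_prompt,
        "select_pages": select_pages,  # None means all pages
    }


async def run(**kwargs):
    # Imported lazily so the helper above can be used and tested
    # without py-zerox installed.
    from pyzerox import zerox

    return await zerox(**kwargs)


# Example use (needs py-zerox, an API key, and network access):
#   import asyncio
#   kwargs = build_zerox_kwargs(
#       "your_pdf_url",
#       custom_system_prompt="Convert each page to Markdown.",
#       select_pages=[1, 3],  # assumed format: list of page numbers
#   )
#   result = asyncio.run(run(**kwargs))
```

Keeping the argument assembly separate from the network call makes it possible to check the configuration without spending API tokens.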

📦 Dependencies

  • py-zerox==0.0.7

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to check the issues page.

📝 License

This project is licensed under the MIT License.
