doc-gpt is a Python CLI tool that processes document files (PDF, DOCX, PPTX, TXT, MD) and URLs using Large Language Models (LLMs). It generates content from those inputs, manages model configurations, and can process multiple files asynchronously.
- Support for multiple file types: PDF, DOCX, PPTX, TXT, MD
- URL processing and content scraping
- Configurable model settings with multiple providers (e.g., OpenAI, Azure OpenAI, Ollama, Claude, Google Generative AI)
- Batch processing of files with customizable batch size
- Flexible input options: process single files, entire directories, or URLs
- Customizable prompts and system instructions
- Automatic loading of default prompt from prompt.md in the current working directory
To install the package directly from the GitHub repository, use:
pip install git+https://github.com/ShinChven/doc-gpt.git
For updating the package to the latest version, run:
pip install --upgrade git+https://github.com/ShinChven/doc-gpt.git
doc-gpt provides several commands to manage configurations and generate content. Here's an overview of the available commands:
For a step-by-step guided configuration, use the following command:
doc-gpt config
This command will prompt you for each configuration setting, making the process easier and more user-friendly.
To configure a new model or update an existing one using a single command:
doc-gpt config --alias MODEL_ALIAS --model_name MODEL_NAME --provider PROVIDER --key API_KEY --api_base API_BASE
- --alias: A unique name for the model configuration
- --model_name: The name of the model (e.g., "gpt-4" for OpenAI, "claude-3-sonnet-20240229" for Claude, "gemini-1.5-pro" for Google Generative AI)
- --provider: The provider of the model (e.g., "openai", "azure-openai", "ollama", "claude", or "google-generativeai")
- --key: The API key (optional, depending on the provider)
- --api_base: The API base URL (optional, defaults to https://api.anthropic.com for Claude)
If you don't provide all options using the single command method, you'll be prompted to enter the missing information.
To set the default model:
doc-gpt set-default MODEL_ALIAS
To delete a model configuration:
doc-gpt delete-model MODEL_ALIAS
To display all configured models with their provider and masked API key:
doc-gpt show-models
To generate content quickly, use the following simplified command:
doc-gpt g <INPUT_PATH_OR_URL>
This command reads the prompt from prompt.md in the current directory. If prompt.md doesn't exist, it will guide you to input a prompt.
For more advanced options, use the following command:
doc-gpt g <INPUT_PATH_OR_URL> [OPTIONS]
Options:
- --output or -o: Designate the path for the output file (optional, default: input_file_name.doc-gpt.md or url_based_filename.doc-gpt.md)
- --model_alias or -m: Indicate the alias for the model (optional, defaults to the pre-set model)
- --prompt or -p: Provide the path to the prompt file (optional)
- --instructions or -s: Specify the path to the system instructions file (optional)
- --batch_size or -b: Define the number of tasks to process concurrently (default is 1)
Important: Default Prompt Loading
If a prompt file is not provided using the --prompt option, doc-gpt will automatically look for a file named prompt.md in the current working directory and use it as the default prompt. This feature allows you to maintain a consistent prompt across multiple runs without explicitly specifying it each time.
If no prompt file is provided and no prompt.md file exists in the current working directory, you'll be prompted to enter the prompt manually.
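The fallback order described above can be sketched in Python as follows. This is a minimal illustration, not doc-gpt's actual code: the function name resolve_prompt and its parameters are hypothetical, and the interactive step is passed in as a callable so the logic can be exercised without a terminal.

```python
from pathlib import Path
from typing import Callable, Optional


def resolve_prompt(
    prompt_path: Optional[str],
    cwd: Path,
    ask: Callable[[], str] = lambda: input("Enter prompt: "),
) -> str:
    """Return the prompt text (hypothetical helper, not part of doc-gpt)."""
    # 1. An explicit --prompt file takes priority.
    if prompt_path is not None:
        return Path(prompt_path).read_text(encoding="utf-8")

    # 2. Otherwise fall back to prompt.md in the current working directory.
    default_file = cwd / "prompt.md"
    if default_file.is_file():
        return default_file.read_text(encoding="utf-8")

    # 3. Otherwise ask the user to type a prompt.
    return ask()
```

For example, resolve_prompt(None, Path.cwd()) reads prompt.md when it exists and only falls through to the interactive question when it does not.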
Note: The g command now accepts the input path or URL as a required argument, similar to the text command. You no longer need to use the --input or -i flag.
To extract text from a document or directory of documents:
doc-gpt text <input_path> [OPTIONS]
- <input_path>: Specify the path to the input file or directory (mandatory).
- --output: Specify the output file path (optional). If omitted, the output will be written to a file with the same name as the input file, but with the extension .doc-gpt.txt.
doc-gpt supports the following file types:
- PDF (.pdf)
- Microsoft Word (.docx)
- Microsoft PowerPoint (.pptx)
- Text (.txt)
- Markdown (.md)
When processing a directory, doc-gpt will process all supported files in the directory.
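To illustrate what "all supported files in the directory" can mean in practice, here is a small Python sketch that filters a path by the extensions listed above. It is an assumption-laden example: collect_inputs is a hypothetical name, the scan is non-recursive, and extensions are matched case-insensitively. doc-gpt itself may behave differently on any of these points.

```python
from pathlib import Path
from typing import List

# Extensions taken from the supported file type list above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".pptx", ".txt", ".md"}


def collect_inputs(input_path: Path) -> List[Path]:
    """Return supported files for a file or directory (hypothetical helper)."""
    if input_path.is_dir():
        # Assumption: only the top level of the directory is scanned.
        candidates = sorted(p for p in input_path.iterdir() if p.is_file())
    else:
        candidates = [input_path]
    # Assumption: ".PDF" and ".pdf" are treated the same.
    return [p for p in candidates if p.suffix.lower() in SUPPORTED_EXTENSIONS]
```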
doc-gpt also supports processing URLs. When a URL is provided as input, the tool will scrape the content from the webpage and process it.
When processing a URL, doc-gpt converts the URL into a valid filename for the output. The conversion process:
- Removes the protocol (http:// or https://)
- Replaces invalid filename characters (including '=') with underscores
- Replaces slashes with hyphens
- Limits the filename length to 200 characters
- Appends ".doc-gpt.md" to the end of the filename
For example, the URL "https://www.example.com/some/long/path?param=value" would be converted to a filename like "www.example.com-some-long-path_param_value.doc-gpt.md".
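The conversion steps above can be sketched in Python like this. It is an approximation for illustration only: url_to_filename is a hypothetical name, the exact set of characters doc-gpt treats as invalid is an assumption (only '=' is confirmed above), and the sketch applies the 200-character limit before the suffix is appended, following the order of the list.

```python
import re

MAX_LENGTH = 200          # length limit from the list above
SUFFIX = ".doc-gpt.md"    # suffix from the list above


def url_to_filename(url: str) -> str:
    """Convert a URL to an output filename (hypothetical helper)."""
    # Remove the protocol (http:// or https://).
    name = re.sub(r"^https?://", "", url)
    # Replace invalid filename characters with underscores.
    # Assumption: this character set is illustrative, not doc-gpt's own.
    name = re.sub(r'[<>:"\\|?*=]', "_", name)
    # Replace slashes with hyphens.
    name = name.replace("/", "-")
    # Limit the length, then append the suffix.
    return name[:MAX_LENGTH] + SUFFIX


print(url_to_filename("https://www.example.com/some/long/path?param=value"))
# www.example.com-some-long-path_param_value.doc-gpt.md
```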
doc-gpt supports batch processing of files, allowing you to process multiple files concurrently. Use the --batch_size option to specify the number of files to process simultaneously. This can significantly speed up processing when dealing with multiple files.
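One common way to cap the number of tasks running at once is an asyncio semaphore. The sketch below shows that general technique with a batch_size parameter; it is not taken from doc-gpt's source, and process_in_batches and worker are hypothetical names. The tool may instead split files into fixed chunks or use another scheme.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def process_in_batches(
    items: Sequence[str],
    worker: Callable[[str], Awaitable[str]],
    batch_size: int = 1,
) -> List[str]:
    """Run worker on every item with at most batch_size running at once."""
    semaphore = asyncio.Semaphore(batch_size)

    async def guarded(item: str) -> str:
        # Each task waits here until one of the batch_size slots is free.
        async with semaphore:
            return await worker(item)

    # gather returns results in the same order as the input items.
    return await asyncio.gather(*(guarded(item) for item in items))
```

With batch_size set to 1 the items are processed one at a time, which matches the documented default; larger values let slow LLM calls overlap.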
doc-gpt stores its configuration in ~/.doc-gpt/config.json. You can manually edit this file if needed, but it's recommended to use the config command to manage your configurations.
doc-gpt now supports the following providers:
- OpenAI
- Azure OpenAI
- Ollama
- Claude (Anthropic)
- Google Generative AI
Each provider may have specific requirements for model names and API configurations. When configuring a new model, make sure to use the correct provider name and follow any provider-specific instructions.
If you encounter any errors while using doc-gpt, the application will provide informative error messages to help you troubleshoot the issue.
Contributions to doc-gpt are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License. See the LICENSE file for details.