Welcome to LlamaIndex! We’re excited that you want to contribute and become part of our growing community. Whether you're interested in building integrations, fixing bugs, or adding exciting new features, we've made it easy for you to get started.
If you're ready to dive in, here’s a quick setup guide to get you going:
- Fork the repo and clone your fork.
- Navigate to the project folder:
cd llama_index
- Set up a new virtual environment with Poetry:
poetry shell
- Install development (and/or docs) dependencies:
poetry install --only dev,docs
- Install the package(s) you want to work on:
pip install -e llama-index-core
or for specific integrations:
pip install -e llama-index-integrations/llms/llama-index-llms-openai
That’s it! If anything seems unclear, scroll down to the Development Guidelines for more details.
There are plenty of ways to contribute. Whether you’re a seasoned Python developer or just starting out, your contributions are welcome! Here are some ideas:
Help us extend LlamaIndex's functionality by contributing to any of our core modules. Think of this as unlocking new superpowers for LlamaIndex!
- New Integrations (e.g., connecting new LLMs, storage systems, or data sources)
- Data Loaders, Vector Stores, and more!
Explore the different modules below to get inspired!
Create new Packs, Readers, or Tools that simplify how others use LlamaIndex with various platforms.
Have an idea for a feature that could make LlamaIndex even better? Go for it! We love innovative contributions.
Fixing bugs is a great way to start contributing. Head over to our GitHub Issues page and find bugs tagged as "good first issue".
If you’ve used LlamaIndex in a unique or creative way, consider sharing guides or notebooks. This helps other developers learn from your experience.
Got an out-there idea? We’re open to experimental features—test it out and make a PR!
Help make the project easier to navigate by refining the docs or cleaning up the codebase. Every improvement counts!
A data loader ingests data from any source and converts it into Document objects that LlamaIndex can parse and index.
- Interface:
  - load_data: Returns a list of Document objects.
  - lazy_load_data: Returns an iterable of Document objects (useful for large datasets).
Example: MongoDB Reader
💡 Ideas: Want to load data from a source not yet supported? Build a new data loader and submit a PR!
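As a rough illustration of the two interface methods described above, here is a minimal sketch. It deliberately avoids importing LlamaIndex so it runs on its own: the Document dataclass is a stand-in (an assumption, not the library's real class), and InMemoryRecordReader along with its "id" and "body" record keys are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List


@dataclass
class Document:
    """Stand-in for LlamaIndex's Document (assumption: not the real class)."""

    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


class InMemoryRecordReader:
    """Hypothetical loader that turns dict records into Document objects."""

    def __init__(self, records: Iterable[Dict[str, str]]) -> None:
        self._records = list(records)

    def lazy_load_data(self) -> Iterator[Document]:
        # Yield one Document at a time instead of building the full list up front.
        for record in self._records:
            yield Document(text=record["body"], metadata={"source": record["id"]})

    def load_data(self) -> List[Document]:
        # The eager variant simply materializes the lazy iterator.
        return list(self.lazy_load_data())


reader = InMemoryRecordReader(
    [{"id": "a", "body": "hello"}, {"id": "b", "body": "world"}]
)
docs = reader.load_data()
```

A real loader would typically subclass the library's reader base class and read from an external source. The point here is only the relationship between the eager and lazy methods.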
A node parser converts Document objects into Node objects: atomic chunks of data that LlamaIndex works with.
- Interface:
  - get_nodes_from_documents: Returns a list of Node objects.
Example: Hierarchical Node Parser
💡 Ideas: Add new ways to structure hierarchical relationships in documents, like play-act-scene or chapter-section formats.
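To make the get_nodes_from_documents contract more concrete, the sketch below splits each document into one node per paragraph. The Document and Node dataclasses are simplified stand-ins (assumptions, not the library's classes), and ParagraphNodeParser is a hypothetical name. A real hierarchical parser would also record parent and child relationships.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """Stand-in for LlamaIndex's Document (assumption: not the real class)."""

    text: str
    doc_id: str


@dataclass
class Node:
    """Stand-in for LlamaIndex's Node (assumption: not the real class)."""

    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


class ParagraphNodeParser:
    """Hypothetical parser: one Node per blank-line-separated paragraph."""

    def get_nodes_from_documents(self, documents: List[Document]) -> List[Node]:
        nodes: List[Node] = []
        for doc in documents:
            # Drop empty paragraphs so stray blank lines do not create empty nodes.
            paragraphs = [p.strip() for p in doc.text.split("\n\n") if p.strip()]
            for position, paragraph in enumerate(paragraphs):
                # Keep a pointer back to the source document and the chunk order.
                metadata = {"doc_id": doc.doc_id, "position": str(position)}
                nodes.append(Node(text=paragraph, metadata=metadata))
        return nodes


parser = ParagraphNodeParser()
nodes = parser.get_nodes_from_documents(
    [Document(text="Act one.\n\nAct two.", doc_id="play")]
)
```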
A text splitter breaks down large text blocks into smaller chunks—this is key for working with LLMs that have limited context windows.
- Interface:
  - split_text: Takes a string and returns smaller strings (chunks).
Example: Token Text Splitter
💡 Ideas: Build specialized text splitters for different content types, like code, dialogues, or dense data!
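The split_text idea can be sketched in a few lines. This version counts whitespace-separated words rather than model tokens, which is a simplifying assumption. The chunk_size and chunk_overlap parameter names are illustrative and not taken from the library.

```python
from typing import List


def split_text(text: str, chunk_size: int = 5, chunk_overlap: int = 1) -> List[str]:
    """Split text into overlapping chunks of at most chunk_size words."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= chunk_overlap < chunk_size")

    words = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the window has reached the end, to avoid a redundant tail chunk.
        if start + chunk_size >= len(words):
            break
    return chunks


chunks = split_text("one two three four five six seven", chunk_size=3, chunk_overlap=1)
# chunks == ["one two three", "three four five", "five six seven"]
```

The overlap repeats the last word of each chunk at the start of the next, so context that straddles a boundary is not lost entirely.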
Store embeddings and retrieve them via similarity search with vector stores.
- Interface: add, delete, query, get_nodes, delete_nodes, clear
Example: Pinecone Vector Store
💡 Ideas: Create support for vector databases that aren't yet integrated!
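For intuition about what a vector store does, here is a toy in-memory version covering add, delete, query, and clear. The signatures are simplified assumptions (plain string ids and float lists) and do not match the library's actual interface. InMemoryVectorStore and _cosine are hypothetical names.

```python
import math
from typing import Dict, List, Tuple


def _cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Toy store keeping embeddings in a dict (not the real interface)."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    def add(self, node_id: str, embedding: List[float]) -> None:
        self._vectors[node_id] = embedding

    def delete(self, node_id: str) -> None:
        # Deleting an unknown id is a no-op rather than an error.
        self._vectors.pop(node_id, None)

    def clear(self) -> None:
        self._vectors.clear()

    def query(self, embedding: List[float], top_k: int = 2) -> List[Tuple[str, float]]:
        # Score every stored vector, then keep the top_k most similar.
        scored = [(nid, _cosine(embedding, vec)) for nid, vec in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


store = InMemoryVectorStore()
store.add("cat", [1.0, 0.0])
store.add("dog", [0.9, 0.1])
store.add("car", [0.0, 1.0])
results = store.query([1.0, 0.0], top_k=2)
```

A production integration would delegate storage and search to the external database instead of scanning every vector in Python.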
- Query Engines implement query to return structured responses.
- Retrievers retrieve relevant nodes based on queries.
💡 Ideas: Design fancy query engines that combine retrievers or add intelligent processing layers!
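The division of labor between the two can be sketched as follows: a retriever picks nodes, and a query engine wraps a retriever and returns a structured response. Everything here is a stand-in written for illustration. Node, Response, KeywordRetriever, and SimpleQueryEngine are hypothetical, and the keyword matching is a placeholder for real similarity search and LLM synthesis.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Stand-in for LlamaIndex's Node (assumption: not the real class)."""

    text: str


@dataclass
class Response:
    """Hypothetical structured response: an answer plus its sources."""

    answer: str
    source_nodes: List[Node]


class KeywordRetriever:
    """Hypothetical retriever: returns nodes sharing a word with the query."""

    def __init__(self, nodes: List[Node]) -> None:
        self._nodes = nodes

    def retrieve(self, query: str) -> List[Node]:
        terms = set(query.lower().split())
        return [n for n in self._nodes if terms & set(n.text.lower().split())]


class SimpleQueryEngine:
    """Hypothetical engine: retrieves nodes, then joins them into an answer."""

    def __init__(self, retriever: KeywordRetriever) -> None:
        self._retriever = retriever

    def query(self, query: str) -> Response:
        nodes = self._retriever.retrieve(query)
        # A real engine would pass the nodes to an LLM; here we just concatenate.
        answer = " ".join(n.text for n in nodes) if nodes else "No relevant nodes found."
        return Response(answer=answer, source_nodes=nodes)


engine = SimpleQueryEngine(
    KeywordRetriever([Node("llamas eat grass"), Node("pytest runs tests")])
)
response = engine.query("what do llamas eat")
```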
- Fork the repository on GitHub.
- Clone your fork to your local machine.
git clone https://github.com/your-username/llama_index.git
- Create a branch for your work.
git checkout -b your-feature-branch
- Set up your environment (follow the Quick Start Guide).
- Work on your feature or bugfix, ensuring you have unit tests covering your code.
- Commit your changes, then push them to your fork.
git push origin your-feature-branch
- Open a pull request on GitHub.
And voilà—your contribution is ready for review!
LlamaIndex is organized as a monorepo, meaning different packages live within this single repository. You can focus on a specific package depending on your contribution:
- Core package: llama-index-core/
- Integrations: e.g., llama-index-integrations/
- Install Poetry (if you don’t already have it):
curl -sSL https://install.python-poetry.org | python3 -
- Activate the environment:
poetry shell
- Install dependencies:
poetry install --only dev,docs
We use pytest for testing. Make sure you run tests in each package you modify:
pytest
If you’re integrating with a remote system, mock it to prevent test failures from external changes.
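One way to do this, sketched with the standard library's unittest.mock, is to pass a mock client into the code under test. The fetch_titles function, the client.get method, and the "/articles" path are all hypothetical and stand in for whatever your integration actually calls.

```python
from typing import Any, List
from unittest.mock import Mock


def fetch_titles(client: Any) -> List[str]:
    """Hypothetical integration code; client.get stands in for a remote API call."""
    payload = client.get("/articles")
    return [item["title"] for item in payload["items"]]


def test_fetch_titles_uses_mocked_client() -> None:
    # The mock replaces the remote system, so the test never touches the network.
    client = Mock()
    client.get.return_value = {"items": [{"title": "Intro"}, {"title": "Setup"}]}

    assert fetch_titles(client) == ["Intro", "Setup"]
    # Also verify the code asked the remote system for the expected resource.
    client.get.assert_called_once_with("/articles")
```

Because the response is fixed inside the test, changes or outages on the remote side cannot make the test fail.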
By default, CI/CD will fail if test coverage is less than 50%, so please add tests for your code!
We’d love to hear from you and collaborate! Join our Discord community to ask questions, share ideas, or just chat with fellow developers.
Join us on Discord: https://discord.gg/dGcwcsnxhU
Thank you for considering contributing to LlamaIndex! Every contribution—whether it’s code, documentation, or ideas—helps make this project better for everyone.
Happy coding! 😊