feat(loaders): CSV Loader to Document Loaders #30
This PR implements a CSV loader as part of the document loaders module in Rig. It allows users to easily load and process CSV documents for use in RAG systems and other document processing tasks.
Changes
Implementation Details
The CsvLoader uses the csv crate to parse CSV files and extract their content. It handles errors such as a missing file or a parsing failure. The extracted content is converted into a single DocumentEmbeddings object for further processing in Rig. Each row of the CSV is formatted as "header: value" pairs, separated by newlines.
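To illustrate the row formatting described above, here is a minimal sketch that uses only the standard library. It is not the PR's code: the function names `format_row` and `csv_to_document` are hypothetical, the comma splitting is a naive stand-in for the csv crate (no quoting or escaping), and separating rows with a blank line is an assumption.

```rust
/// Format one row as "header: value" lines, one pair per line.
/// Hypothetical helper for illustration, not part of Rig.
fn format_row(headers: &[&str], values: &[&str]) -> String {
    headers
        .iter()
        .zip(values.iter())
        .map(|(h, v)| format!("{h}: {v}"))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Turn CSV text into a single document string.
/// Assumption: a naive split on ',' stands in for the csv crate,
/// so quoted fields and embedded commas are NOT handled here.
fn csv_to_document(input: &str) -> Result<String, String> {
    let mut lines = input.lines().filter(|l| !l.trim().is_empty());

    // The first non-empty line is treated as the header row.
    let header_line = lines
        .next()
        .ok_or_else(|| "empty CSV input".to_string())?;
    let headers: Vec<&str> = header_line.split(',').map(str::trim).collect();

    let mut rows = Vec::new();
    for (i, line) in lines.enumerate() {
        let values: Vec<&str> = line.split(',').map(str::trim).collect();
        // Reject rows whose field count does not match the header.
        if values.len() != headers.len() {
            return Err(format!(
                "row {} has {} fields, expected {}",
                i + 1,
                values.len(),
                headers.len()
            ));
        }
        rows.push(format_row(&headers, &values));
    }

    // Assumption: rows are separated by a blank line in the final document.
    Ok(rows.join("\n\n"))
}
```

Returning a `Result` keeps malformed input visible to the caller, in the same spirit as the loader's handling of parsing errors.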
Testing
Ran tests to ensure the CsvLoader correctly loads CSV files and handles various edge cases. The tests covered:
Documentation
Code files are commented, and usage examples have been added to the documentation.
Related Issue
Closes #29
Checklist
Additional Notes
This implementation focuses on converting CSV data into a single document for embedding. Future enhancements could include options for creating separate embeddings for each row or handling more complex CSV structures.
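As a rough sketch of the per-row option, the following assumes the headers and records have already been parsed and produces one document per row. The `RowDocument` type, the `rows_to_documents` function, and the `source#row-N` id scheme are all hypothetical and are not part of Rig or this PR.

```rust
/// Hypothetical per-row document: an id plus the formatted row text.
#[derive(Debug, PartialEq)]
struct RowDocument {
    id: String,
    text: String,
}

/// Build one document per CSV row from already-parsed data.
/// `source` is only used to derive an id (assumed scheme: "<source>#row-<n>").
fn rows_to_documents(
    source: &str,
    headers: &[String],
    rows: &[Vec<String>],
) -> Vec<RowDocument> {
    rows.iter()
        .enumerate()
        .map(|(i, row)| RowDocument {
            // Row numbers start at 1 to match how people count data rows.
            id: format!("{source}#row-{}", i + 1),
            // Same "header: value" layout as the single-document case.
            text: headers
                .iter()
                .zip(row)
                .map(|(h, v)| format!("{h}: {v}"))
                .collect::<Vec<_>>()
                .join("\n"),
        })
        .collect()
}
```

Each `RowDocument` could then be embedded separately, so that a retrieval hit points at a specific row instead of the whole file.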