Exposes Kendra result item DocumentAttributes in the document metadata #7781

@wnleao wnleao commented Jul 16, 2023

  • Description: exposes the ResultItem DocumentAttributes as document metadata under the key 'document_attributes', and refactors AmazonKendraRetriever by introducing a ResultItem base class to avoid duplicate code;
  • Tag maintainer: @3coins @hupe1980 @dev2049 @baskaryan
  • Twitter handle: wilsonleao
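
For readers who want a picture of the shape of this change, here is a minimal, self-contained sketch. It is not the code in this PR: `Document` is a stand-in for LangChain's class, `QueryItem`, `RetrieveItem`, `attributes_as_dict` and every field name are hypothetical, and the attribute list is assumed to follow the `Key`/`Value` layout of a Kendra response. It only illustrates how a shared base class can hold the conversion logic once and put the attributes under 'document_attributes'.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Document:
    """Stand-in for LangChain's Document: text plus a metadata dict."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ResultItem(ABC):
    """Fields and conversion logic shared by all result item types."""

    document_uri: Optional[str] = None
    # Assumed shape: [{"Key": "k", "Value": {"StringValue": "v"}}, ...]
    document_attributes: List[Dict[str, Any]] = field(default_factory=list)

    @abstractmethod
    def get_title(self) -> str:
        """Return the document title."""

    @abstractmethod
    def get_excerpt(self) -> str:
        """Return the text used as page content."""

    def attributes_as_dict(self) -> Dict[str, Any]:
        """Flatten attributes to {key: first non-None typed value}."""
        flat: Dict[str, Any] = {}
        for attr in self.document_attributes:
            values = [v for v in attr.get("Value", {}).values() if v is not None]
            flat[attr["Key"]] = values[0] if values else None
        return flat

    def to_doc(self) -> Document:
        """Build a Document; written once here instead of in each subclass."""
        excerpt = self.get_excerpt()
        metadata = {
            "source": self.document_uri,
            "title": self.get_title(),
            "excerpt": excerpt,
            "document_attributes": self.attributes_as_dict(),
        }
        return Document(page_content=excerpt, metadata=metadata)


@dataclass
class QueryItem(ResultItem):
    """Hypothetical item type that stores a title and an excerpt."""

    title_text: str = ""
    excerpt_text: str = ""

    def get_title(self) -> str:
        return self.title_text

    def get_excerpt(self) -> str:
        return self.excerpt_text


@dataclass
class RetrieveItem(ResultItem):
    """Hypothetical item type that stores a title and a full passage."""

    title: str = ""
    content: str = ""

    def get_title(self) -> str:
        return self.title

    def get_excerpt(self) -> str:
        return self.content


item = RetrieveItem(
    document_uri="https://example.com/doc",
    document_attributes=[{"Key": "department", "Value": {"StringValue": "legal"}}],
    title="Policy",
    content="Full passage text",
)
doc = item.to_doc()
print(doc.metadata["document_attributes"])
```

Each subclass only says where its title and excerpt come from; the metadata layout lives in one place, so adding a new metadata key touches a single method.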

Why?

Some use cases depend on specific document attributes returned by the retriever, both to improve the quality of the overall completion and to adjust what is displayed to the user. For consistency, the DocumentAttributes should be exposed as document metadata, so that we can be sure we are using the values returned by the Kendra request issued by LangChain.
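
As a rough illustration of the kind of use case described above, the sketch below filters retrieved documents on one attribute before they are shown to the user. Everything here is an assumption made for the example: `Document` is a stand-in for LangChain's class, `filter_by_attribute` and the `department` attribute are hypothetical, and the value under 'document_attributes' is assumed to be a flat dict.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Stand-in for LangChain's Document: text plus a metadata dict."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def filter_by_attribute(docs: List[Document], key: str, expected: Any) -> List[Document]:
    """Keep documents whose 'document_attributes' has key == expected."""
    return [
        d
        for d in docs
        # Documents without the metadata key are treated as non-matching.
        if d.metadata.get("document_attributes", {}).get(key) == expected
    ]


docs = [
    Document("Contract template", {"document_attributes": {"department": "legal"}}),
    Document("Onboarding guide", {"document_attributes": {"department": "hr"}}),
    Document("Untagged note"),
]
legal_docs = filter_by_attribute(docs, "department", "legal")
print([d.page_content for d in legal_docs])
```

Because the attributes travel with each document, this kind of filtering works on the values from the same request that produced the documents, with no second lookup.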

I would appreciate your review @3coins @hupe1980 @dev2049. Thank you in advance!

@dosubot dosubot bot added the 🤖:improvement Medium size change to existing code to handle new use-cases label Jul 16, 2023
@wnleao wnleao force-pushed the kendra-expose-document-attributes-metadata branch from 6c45518 to 5738df0 Compare July 16, 2023 10:48
wnleao commented Jul 16, 2023

I will propose more points of refactoring soon in separate PRs/issues. Thank you in advance for your review and all the work moving this forward!


@3coins 3coins left a comment

@wnleao
Thanks for making these updates. I have one minor comment; otherwise the code looks good 🚀.

Review comment on langchain/retrievers/kendra.py (outdated, resolved)
wnleao commented Jul 17, 2023

@3coins
Thank you very much for the review!

wnleao and others added 3 commits July 18, 2023 21:45
- Refactors retriever by providing a ResultItem base class in order to
avoid duplicate code;
- Exposes the ResultItem DocumentAttributes as
document metadata with key 'document_attributes'.
@wnleao wnleao force-pushed the kendra-expose-document-attributes-metadata branch from 9237bd1 to 63c44a0 Compare July 18, 2023 20:17
@baskaryan baskaryan added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Jul 18, 2023
@baskaryan baskaryan merged commit 8bb33f2 into langchain-ai:master Jul 19, 2023
3 participants