Github integration #5257
Comments
Sounds interesting! I'm on it :) |
# Creates GitHubLoader (#5257) GitHubLoader is a DocumentLoader that loads issues and PRs from GitHub. Fixes #5257 --------- Co-authored-by: Dev 2049 <[email protected]>
I'm trying this now, but I'm failing to use it with chroma:
any ideas? |
@mudler it seems chroma only accepts str, int & float values for metadata, and not lists. GitHubIssuesLoader, however, also returns the metadata field labels as a list. As a quick fix, you could parse that metadata field and stringify it. @dev2049 To prevent this error, should all DocLoaders only return str/int/float for metadata, or should we add a parse method to chroma that stringifies (& de-stringifies) lists? |
Tried this with no luck:
fixed_texts = []
for text in texts:
    if 'metadata' in text and isinstance(text['metadata'], list):
        text['metadata'] = ','.join(text['metadata'])
    fixed_texts.append(text)
print(f"Creating embeddings. May take some minutes...")
db = Chroma.from_documents(fixed_texts, embeddings, persist_directory=PERSIST_DIRECTORY, client_settings=CHROMA_SETTINGS)
I guess I'll be waiting for a fix(?) or am I doing something wrong here? |
Almost correct :) It's not metadata itself that is a list, but metadata["labels"]. Here's a full working example:
|
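A minimal sketch of the fix described above, assuming each document exposes a `metadata` dict whose `labels` entry may be a list. The `Document` dataclass and the helper name `stringify_list_metadata` are hypothetical stand-ins so the snippet runs on its own; with LangChain you would pass the loader's documents instead.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for LangChain's Document, so the sketch is self-contained.
@dataclass
class Document:
    page_content: str
    metadata: dict = field(default_factory=dict)


def stringify_list_metadata(docs, sep=","):
    """Join every list-valued metadata entry (e.g. "labels") into one string.

    Modifies the documents in place and returns them for convenience.
    """
    for doc in docs:
        # Iterate over a snapshot so reassigning values is unambiguous.
        for key, value in list(doc.metadata.items()):
            if isinstance(value, list):
                doc.metadata[key] = sep.join(str(v) for v in value)
    return docs


docs = [Document("Issue body", {"labels": ["bug", "help wanted"], "number": 1})]
fixed_docs = stringify_list_metadata(docs)
print(fixed_docs[0].metadata)
```

The resulting `fixed_docs` can then be handed to `Chroma.from_documents(...)` as in the earlier attempt, since every metadata value is now a str, int or float.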
right! 🤦 thanks for the snippet, that seems to do the trick! |
When using GitHubIssuesLoader, I'm only getting the number of comments in the response, not the comments themselves. |
@banyalshipu that's currently not possible |
Is there any interest in enhancing this loader to support loading issue comments? How hard would that be to achieve in your opinion @UmerHA? I might give it a go. Also, a side question: can this loader load discussions as well as issues, or only issues and PRs? |
I don't think it's very complicated. As said above, you can get the comment URLs. You would then have to fetch each URL.
Discussions and issues are the same thing in GitHub, aren't they? |
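The "fetch each URL" idea could be sketched roughly as follows. This assumes you already have each issue's comments URL (in the GitHub REST API these have the form `https://api.github.com/repos/{owner}/{repo}/issues/{number}/comments`); the helper names `fetch_json` and `load_issue_comments` are made up for illustration and are not part of the loader.

```python
import json
import urllib.request


def fetch_json(url, token=None):
    """GET a GitHub REST API URL and decode the JSON response."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    if token:
        # Optional personal access token; unauthenticated calls are rate limited.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def load_issue_comments(comment_urls, fetch=fetch_json):
    """Map each comments URL to the list of comment bodies found there.

    `fetch` is injectable so the logic can be exercised without network access.
    """
    comments = {}
    for url in comment_urls:
        comments[url] = [item.get("body") or "" for item in fetch(url)]
    return comments
```

Note that this sketch reads only the first page of results; issues with many comments would need pagination handling on top of it.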
Thanks for the update. I don't think that discussions are grouped under issues in the API, I did a quick search and I don't think that the REST API offers support for discussions. It might be available in the GraphQL API. |
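If discussions are indeed only reachable through GraphQL, a request for them might look roughly like the sketch below. The query shape and field names (`discussions`, `number`, `title`, `body`, `url`) are written from memory of GitHub's GraphQL schema and should be checked against the official docs; `build_discussions_request` is a hypothetical helper that only builds the JSON payload, which you would POST to `https://api.github.com/graphql` with a token.

```python
import json

# Assumed query shape; verify field names against GitHub's GraphQL schema.
DISCUSSIONS_QUERY = """
query($owner: String!, $name: String!, $first: Int!) {
  repository(owner: $owner, name: $name) {
    discussions(first: $first) {
      nodes { number title body url }
    }
  }
}
"""


def build_discussions_request(owner, name, first=10):
    """Return the JSON body for a GraphQL request listing a repo's discussions."""
    payload = {
        "query": DISCUSSIONS_QUERY,
        "variables": {"owner": owner, "name": name, "first": first},
    }
    return json.dumps(payload)


print(build_discussions_request("some-owner", "some-repo", first=5))
```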
Feature request
It would be amazing to scan and get all the contents from the GitHub API, such as PRs, Issues and Discussions.
Motivation
This would allow users to ask questions about the history of the project, issues that other users might have run into, and much more!
Your contribution
I'm not really a Python developer, so it would take me a while to figure out all the changes required.