Fix file too large error #2799

Merged
merged 2 commits into from
Oct 14, 2024

Conversation


@Weves Weves commented Oct 14, 2024

Previously, the Google Drive connector would fail on .doc, .ppt, or spreadsheet files over 5MB due to an API limitation.

https://danswer.slack.com/archives/C056265VB1N/p1728895185750569


vercel bot commented Oct 14, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
internal-search ✅ Ready (Inspect) Visit Preview 💬 Add feedback Oct 14, 2024 9:33pm


@greptile-apps greptile-apps bot left a comment

PR Summary

This pull request addresses an issue in the Google Drive connector where large files (.doc, .ppt, or spreadsheets over 5MB) caused failures due to API limitations.

  • Modified _fetch_docs_from_drive method in backend/danswer/connectors/google_drive/connector.py to handle export-related errors
  • Implemented graceful skipping of files exceeding 5MB limit for specific file types
  • Enhanced error handling to prevent connector failure on oversized files
  • Improved robustness of the Google Drive connector for processing large document sets
  • Resolved user-reported issue from Slack (link provided in PR description)
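
The skip-and-continue behavior summarized above could look roughly like the sketch below. This is not the code from connector.py: `ExportError`, `should_skip`, and `index_files` are hypothetical names used only for illustration, and the real connector presumably inspects the error raised by the Google API client rather than a custom exception. Only the two reason strings come from the diff under review.

```python
# Reason strings quoted in the review comment on this PR.
ERRORS_TO_CONTINUE_ON = ["cannotExportFile", "exportSizeLimitExceeded"]


class ExportError(Exception):
    """Hypothetical stand-in for an API error that carries reason codes."""

    def __init__(self, reasons):
        super().__init__(", ".join(reasons))
        self.reasons = list(reasons)


def should_skip(error):
    """Return True if any reason marks the file as not indexable."""
    return any(reason in ERRORS_TO_CONTINUE_ON for reason in error.reasons)


def index_files(files, export):
    """Export each file, skipping those that fail with a skippable reason.

    `export` is a caller-supplied callable (an assumption of this sketch).
    Any other error is re-raised so genuine connector failures still surface.
    """
    indexed, skipped = [], []
    for f in files:
        try:
            indexed.append(export(f))
        except ExportError as e:
            if should_skip(e):
                skipped.append(f)
                continue
            raise
    return indexed, skipped
```

The point of the design is that an oversized file is recorded and passed over, while unrelated errors still stop the run.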

1 file(s) reviewed, 1 comment(s)

Comment on lines 479 to 484
# these errors don't represent a failure in the connector, but simply files
# that can't / shouldn't be indexed
ERRORS_TO_CONTINUE_ON = [
    "cannotExportFile",
    "exportSizeLimitExceeded",
]

style: Consider adding a constant or enum for these error types to improve maintainability
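
One way to act on this suggestion is sketched below, assuming a string-valued enum is acceptable in the codebase. The names `SkippableExportError` and `is_skippable` are hypothetical and do not appear in the PR; only the two reason strings are taken from the diff.

```python
from enum import Enum


class SkippableExportError(str, Enum):
    """Hypothetical enum of export error reasons that should not fail the connector."""

    CANNOT_EXPORT_FILE = "cannotExportFile"
    EXPORT_SIZE_LIMIT_EXCEEDED = "exportSizeLimitExceeded"


def is_skippable(reason):
    """Return True if `reason` matches one of the enum's string values."""
    return reason in {member.value for member in SkippableExportError}
```

Because the enum mixes in `str`, its members compare equal to the raw reason strings, so existing comparisons against plain strings would keep working.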

@Weves Weves enabled auto-merge October 14, 2024 21:30
@Weves Weves disabled auto-merge October 14, 2024 21:47
@Weves Weves merged commit f8a7749 into main Oct 14, 2024
6 of 7 checks passed
@Weves Weves deleted the fix-file-too-large-error branch October 14, 2024 21:47