Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

databricks: Add partner package directory and ChatDatabricks implementation #25430

Merged
merged 18 commits into from
Aug 22, 2024

Conversation

B-Step62
Copy link
Contributor

@B-Step62 B-Step62 commented Aug 15, 2024

Summary

Create langchain-databricks as a new partner packages. This PR does not migrate all existing Databricks integration, but the package will eventually contain:

  • ChatDatabricks (implemented in this PR)
  • DatabricksVectorSearch
  • DatabricksEmbeddings
  • UCFunctionToolkit (will be done after UC SDK work which drastically simplify implementation)

Also, this PR does not add integration tests yet. This will be added once the Databricks test workspace is ready.

Tagging @efriis as POC

Tracker

[✍️] Create a package and imgrate ChatDatabricks
[ ] Migrate DatabricksVectorSearch, DatabricksEmbeddings, and their docs
[ ] Migrate UCFunctionToolkit and its doc
[ ] Add provider document and update README.md
[ ] Add integration tests and set up secrets (after moved to an external package)
[ ] Add deprecation note to the community implementations.

Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
@efriis efriis added the partner label Aug 15, 2024
@efriis efriis self-assigned this Aug 15, 2024
Copy link

vercel bot commented Aug 15, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
langchain ✅ Ready (Inspect) Visit Preview 💬 Add feedback Aug 21, 2024 11:51pm

@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. 🤖:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features labels Aug 15, 2024
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: This is the major focus of this PR. Mostly a copy of the community ChatDatabricks. Aside from minor refactoring, one key difference is that the new class does not inherit from ChatMLflow to avoid depending on community package.

Signed-off-by: B-Step62 <[email protected]>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: Just copied from existing documentation (https://python.langchain.com/v0.2/docs/integrations/chat/databricks/) with import update

Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
"Release Notes" = "https://github.com/langchain-ai/langchain/releases?q=tag%3A%22databricks%3D%3D0%22&expanded=true"

[tool.poetry.dependencies]
# TODO: Replace <3.12 to <4.0 once https://github.com/mlflow/mlflow/commit/04370119fcc1b2ccdbcd9a50198ab00566d58cd2 is released
Copy link
Contributor Author

@B-Step62 B-Step62 Aug 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note: MLflow currently uses pkg_resources (removed in the PR in the comment) so does not support Python 3.12. Once we release a new version contains the PR, we can remove this pin.

@@ -0,0 +1 @@
__pycache__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this ignored in the root .gitignore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just auto-generated and other packages seems to have the same file. I'd just keep it untouched:)

name = "langchain-databricks"
version = "0.1.0"
description = "An integration package connecting Databricks and LangChain"
authors = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we fill in authors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ideally yes, probably it's tedious to maintain the list. Most of other partner packages leave this blank.

Copy link
Contributor

@harupy harupy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@serena-ruan serena-ruan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

"\n",
"First, create a new Databricks serving endpoint that proxies requests to the target external model. The endpoint creation should be fairy quick for proxying external models.\n",
"\n",
"This requires registering OpenAI API Key in Databricks secret manager with the following comment:\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"This requires registering OpenAI API Key in Databricks secret manager with the following comment:\n",
"This requires registering your OpenAI API Key within the Databricks secret manager as follows:\n",

Copy link
Contributor

@BenWilson2 BenWilson2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once remaining comments are addressed

Signed-off-by: B-Step62 <[email protected]>
@efriis efriis changed the title [databricks] [1] Add partner package directory and ChatDatabricks implementation databricks: Add partner package directory and ChatDatabricks implementation Aug 21, 2024
@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Aug 21, 2024
@efriis efriis merged commit 3981d73 into langchain-ai:master Aug 22, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:enhancement A large net-new component, integration, or chain. Use sparingly. The largest features lgtm PR looks good. Use to confirm that a PR is ready for merging. partner size:XXL This PR changes 1000+ lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants