-
Notifications
You must be signed in to change notification settings - Fork 16.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
databricks: Add partner package directory and ChatDatabricks implementation #25430
databricks: Add partner package directory and ChatDatabricks implementation #25430
Conversation
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: This is the major focus of this PR. Mostly a copy of the community ChatDatabricks. Aside from minor refactoring, one key difference is that the new class does not inherit from ChatMLflow to avoid depending on community package.
Signed-off-by: B-Step62 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: Just copied from existing documentation (https://python.langchain.com/v0.2/docs/integrations/chat/databricks/) with import update
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
Signed-off-by: B-Step62 <[email protected]>
"Release Notes" = "https://github.com/langchain-ai/langchain/releases?q=tag%3A%22databricks%3D%3D0%22&expanded=true" | ||
|
||
[tool.poetry.dependencies] | ||
# TODO: Replace <3.12 to <4.0 once https://github.com/mlflow/mlflow/commit/04370119fcc1b2ccdbcd9a50198ab00566d58cd2 is released |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: MLflow currently uses pkg_resources
(removed in the PR in the comment) so does not support Python 3.12. Once we release a new version contains the PR, we can remove this pin.
@@ -0,0 +1 @@ | |||
__pycache__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this ignored in the root .gitignore
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just auto-generated and other packages seems to have the same file. I'd just keep it untouched:)
name = "langchain-databricks" | ||
version = "0.1.0" | ||
description = "An integration package connecting Databricks and LangChain" | ||
authors = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we fill in authors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm ideally yes, probably it's tedious to maintain the list. Most of other partner packages leave this blank.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
"\n", | ||
"First, create a new Databricks serving endpoint that proxies requests to the target external model. The endpoint creation should be fairy quick for proxying external models.\n", | ||
"\n", | ||
"This requires registering OpenAI API Key in Databricks secret manager with the following comment:\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"This requires registering OpenAI API Key in Databricks secret manager with the following comment:\n", | |
"This requires registering your OpenAI API Key within the Databricks secret manager as follows:\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once remaining comments are addressed
Signed-off-by: B-Step62 <[email protected]>
Summary
Create
langchain-databricks
as a new partner packages. This PR does not migrate all existing Databricks integration, but the package will eventually contain:ChatDatabricks
(implemented in this PR)DatabricksVectorSearch
DatabricksEmbeddings
(will be done after UC SDK work which drastically simplify implementation)UCFunctionToolkit
Also, this PR does not add integration tests yet. This will be added once the Databricks test workspace is ready.
Tagging @efriis as POC
Tracker
[✍️] Create a package and imgrate ChatDatabricks
[ ] Migrate DatabricksVectorSearch, DatabricksEmbeddings, and their docs
[ ] Migrate UCFunctionToolkit and its doc[ ] Add provider document and update README.md
[ ] Add integration tests and set up secrets (after moved to an external package)
[ ] Add deprecation note to the community implementations.