core: Convert SimSIMD back to NumPy #19473

Merged
merged 1 commit into from
Mar 25, 2024

Conversation

ashvardanian
Contributor

This patch fixes issue #18022 by converting SimSIMD's internal zero-copy outputs to NumPy arrays.

I've also noticed that a `dtype=np.float32` conversion is often applied before passing data to SimSIMD. Which numeric types do LangChain users generally care about? We support `float64`, `float32`, `float16`, and `int8` for cosine distances, and `float16` seems reasonable for practically any kind of embeddings on any modern hardware, so we can change that part as well 🤗
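
For readers following along, here is a minimal sketch of the kind of conversion described above. It is not the actual diff from this PR. The helper name `cosine_similarity` is hypothetical, and the sketch assumes that `simsimd.cdist` returns a buffer-backed object that needs to be wrapped with `np.array` before NumPy arithmetic is applied to it. If SimSIMD is not installed, it falls back to plain NumPy so the example still runs.

```python
import numpy as np

try:
    import simsimd as simd  # optional dependency, assumed installed via pip
except ImportError:
    simd = None


def cosine_similarity(X, Y):
    """Row-wise cosine similarity between X (n x d) and Y (m x d).

    Hypothetical helper for illustration only.
    """
    # The float32 cast mirrors the conversion mentioned in the PR description.
    X = np.asarray(X, dtype=np.float32)
    Y = np.asarray(Y, dtype=np.float32)

    if simd is not None:
        # Assumption: cdist returns SimSIMD's own tensor type rather than
        # an ndarray, so it is converted to NumPy before subtracting.
        distances = np.array(simd.cdist(X, Y, metric="cosine"))
        return 1 - distances

    # Pure-NumPy fallback: normalized dot products.
    X_norm = np.linalg.norm(X, axis=1, keepdims=True)
    Y_norm = np.linalg.norm(Y, axis=1, keepdims=True)
    return (X @ Y.T) / (X_norm * Y_norm.T)


similarity = cosine_similarity([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 1.0]])
```

Either branch returns a regular `np.ndarray`, so downstream code can keep using ordinary NumPy operations on the result.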

@efriis efriis added the partner label Mar 23, 2024
@efriis efriis self-assigned this Mar 23, 2024
@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Mar 23, 2024
@dosubot dosubot bot added Ɑ: core Related to langchain-core 🔌: elasticsearch Primarily related to elastic/elasticsearch integrations 🤖:improvement Medium size change to existing code to handle new use-cases labels Mar 23, 2024
@baskaryan baskaryan merged commit d01bad5 into langchain-ai:master Mar 25, 2024
100 checks passed
gkorland pushed a commit to FalkorDB/langchain that referenced this pull request Mar 30, 2024
chrispy-snps pushed a commit to chrispy-snps/langchain that referenced this pull request Mar 30, 2024
hinthornw pushed a commit that referenced this pull request Apr 26, 2024
3 participants