fix: fix implicit ids in upload collection with paralell > 1 #460

Merged
merged 9 commits into from
Jan 31, 2024

Conversation

joein
Member

@joein joein commented Jan 24, 2024

No description provided.

netlify bot commented Jan 24, 2024

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit 4440a96
🔍 Latest deploy log https://app.netlify.com/sites/poetic-froyo-8baba7/deploys/65ba66de18ea6b00083e532a
😎 Deploy Preview https://deploy-preview-460--poetic-froyo-8baba7.netlify.app/qdrant_client.conversions.conversion

@joein
Member Author

joein commented Jan 24, 2024

#459

Contributor

@coszio coszio left a comment

I'm particularly interested in the suggested warning, otherwise it looks good to me

[13.0, 14.0, 15.0],
]
payload = [{"a": 2}, {"b": 3}, {"c": 4}, {"d": 5}, {"e": 6}]
ids = [1, 2, 3, 4, 5]
Contributor

We can make another test for locking in the behavior of auto-generating ids when ids = None

Member Author

It is actually already there: the first call to upload_collection passes only vectors, with no ids or payload.

The second call passes all three: vectors, ids and payload. I put the data in one place because if we change vectors, then ids and payload have to change too.
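The two call shapes described above can be illustrated without the client itself. This is only a sketch: `build_points` is a hypothetical stand-in for the upload call, and generating ids with `uuid4` is an assumption made for the example, not something this thread confirms about the library.

```python
import itertools
import uuid


def _random_ids():
    """Endless stream of random string ids (assumed id scheme, for illustration)."""
    while True:
        yield str(uuid.uuid4())


def build_points(vectors, ids=None, payload=None):
    """Hypothetical stand-in for an upload call: pair each vector with an id and a payload."""
    if ids is None:
        ids = _random_ids()  # implicit ids: one fresh id per vector
    if payload is None:
        payload = itertools.repeat(None)  # no payload supplied
    # zip stops once the vectors run out, so the endless defaults are safe here
    return [
        {"id": point_id, "vector": vector, "payload": point_payload}
        for vector, point_id, point_payload in zip(vectors, ids, payload)
    ]


# Shared test data kept in one place, as in the comment above
vectors = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
payload = [{"a": 2}, {"b": 3}, {"c": 4}]
ids = [1, 2, 3]

implicit = build_points(vectors)                             # first call: vectors only
explicit = build_points(vectors, ids=ids, payload=payload)   # second call: everything
```

Keeping `vectors`, `ids` and `payload` side by side means a change to one list is hard to make without noticing the other two.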

Comment on lines +274 to +280
vectors = [
[1.0, 2.0, 3.0],
[4.0, 5.0, 6.0],
[7.0, 8.0, 9.0],
[10.0, 11.0, 12.0],
[13.0, 14.0, 15.0],
]
Contributor

@coszio coszio Jan 31, 2024

Outside the scope of this PR, but right now the behavior is to stop at the shortest of the ids, vectors, or payload iterators. Is it possible to emit a warning when this happens? E.g., when it stops while any of the others is not yet exhausted.
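For illustration, the kind of warning being asked about could look roughly like the sketch below. It is plain Python, not the client's code: `zip_warn_on_leftovers` is a hypothetical helper that behaves like `zip()` and, once the shortest input ends, probes the remaining inputs for unconsumed items.

```python
import warnings

_SENTINEL = object()  # marks "iterator exhausted" without clashing with real items


def zip_warn_on_leftovers(**named_iterables):
    """Yield tuples like zip(), then warn if any input still had items left."""
    names = list(named_iterables)
    iterators = [iter(values) for values in named_iterables.values()]
    if not iterators:
        return
    while True:
        row = []
        for iterator in iterators:
            item = next(iterator, _SENTINEL)
            if item is _SENTINEL:
                break
            row.append(item)
        if len(row) == len(iterators):
            yield tuple(row)
            continue
        # The iterator at this index ran out first.
        exhausted_at = len(row)
        # Inputs before it already produced an item in this round, so they had leftovers.
        leftovers = names[:exhausted_at]
        # Inputs after it have not been asked yet; probe each one once.
        for name, iterator in zip(names[exhausted_at + 1:], iterators[exhausted_at + 1:]):
            if next(iterator, _SENTINEL) is not _SENTINEL:
                leftovers.append(name)
        if leftovers:
            warnings.warn(
                f"stopped at the end of {names[exhausted_at]!r}; "
                f"unconsumed items remain in: {', '.join(leftovers)}"
            )
        return


rows = list(
    zip_warn_on_leftovers(
        ids=[1, 2, 3],
        vectors=[[1.0], [2.0]],
        payloads=[{"a": 1}, {"b": 2}, {"c": 3}],
    )
)
# rows holds two tuples; a UserWarning names 'ids' and 'payloads' as having leftovers
```

A check like this would also fire for inputs that are longer on purpose, such as an unbounded id generator, so it would need a way to opt out.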

Member Author

Yeah, we can consider it as a separate issue

Member Author

I've just realised that having iterators of different lengths is a valid scenario, e.g. when the ids iterator is infinite.

We can only check the number of ids/payloads/vectors right before making a request; however, this check won't help when the length of the shortest iterator is divisible by batch_size, because then every batch that reaches the check is still full.
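The limitation can be seen in a small sketch. It assumes each input is split into batches independently and the batch streams are then zipped together; `iter_batch` and `find_mismatched_batches` are illustrative names written for this example, not the client's actual implementation.

```python
from itertools import count, islice


def iter_batch(iterable, size):
    """Split an iterable into lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def find_mismatched_batches(ids, vectors, batch_size):
    """Return (batch_index, n_ids, n_vectors) for every batch whose sizes differ."""
    mismatches = []
    batches = zip(iter_batch(ids, batch_size), iter_batch(vectors, batch_size))
    for index, (id_batch, vector_batch) in enumerate(batches):
        # This is the "right before making a request" check.
        if len(id_batch) != len(vector_batch):
            mismatches.append((index, len(id_batch), len(vector_batch)))
    return mismatches


four_ids = [1, 2, 3, 4]
six_vectors = [[float(i)] for i in range(6)]

# 4 is not divisible by 3: the second batch has 1 id but 3 vectors, so the check fires.
detected = find_mismatched_batches(four_ids, six_vectors, batch_size=3)  # [(1, 1, 3)]

# 4 is divisible by 2: both id batches are full, zip stops silently, two vectors are dropped.
missed = find_mismatched_batches(four_ids, six_vectors, batch_size=2)  # []

# An infinite ids iterator is valid input, yet it trips the check on the last partial batch.
false_alarm = find_mismatched_batches(count(), six_vectors[:5], batch_size=2)  # [(2, 2, 1)]
```

Under these assumptions a per-batch size check both misses real truncation and flags valid input, which is consistent with treating the warning as a separate issue.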

@joein joein merged commit 4adbabe into dev Jan 31, 2024
9 of 14 checks passed
joein added a commit that referenced this pull request Jan 31, 2024
* fix: fix implicit ids in upload collection with paralell > 1

* fix: fix type hints

* fix: remove redundant code, simplify type hints

* fix: remove redundant import

* fix: fix batching

* fix: replace generator with list comprehension

* fix: sorry, I was wrong

* tests: update tests

* fix: extend upload records and upload points tests
@generall generall deleted the fix-upload-collection-implicit-ids branch May 3, 2024 10:50
3 participants