-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: fix implicit ids in upload collection with paralell > 1 #460
Conversation
✅ Deploy Preview for poetic-froyo-8baba7 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm particularly interested in the suggested warning, otherwise it looks good to me
[13.0, 14.0, 15.0], | ||
] | ||
payload = [{"a": 2}, {"b": 3}, {"c": 4}, {"d": 5}, {"e": 6}] | ||
ids = [1, 2, 3, 4, 5] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make another test for locking in the behavior of auto-generating ids when ids = None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is actually already there, the first call to upload_collection does not provide ids and payload, only vectors
The second call provides all of them - vectors, ids and payload. I put the data into one place because if we change vectors, then ids and payload should also be changed
vectors = [ | ||
[1.0, 2.0, 3.0], | ||
[4.0, 5.0, 6.0], | ||
[7.0, 8.0, 9.0], | ||
[10.0, 11.0, 12.0], | ||
[13.0, 14.0, 15.0], | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outside the scope of this PR, but right now the behavior is to stop at the shortest iterator of any of ids
, vectors
, or ids
. Is it possible to emit a warning when this happens? E.g, for when it stopped with any of those un-exhausted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we can consider it as a separate issue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just realised that having iterators of different length is a valid scenario, e.g. it is valid when ids iterator is infinite.
We can only check the number of ids/payloads/vectors right before making a request, however this check won't help when the smallest iterator is divisible by batch_size
* fix: fix implicit ids in upload collection with paralell > 1 * fix: fix type hints * fix: remove redundant code, simplify type hints * fix: remove redundant import * fix: fix batching * fix: replace generator with list comprehension * fix: sorry, I was wrong * tests: update tests * fix: extend upload records and upload points tests
No description provided.