-
Notifications
You must be signed in to change notification settings - Fork 348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: enable feature store batch serve to Pandas DataFrame; fix: read instances uri for batch serve #983
feat: enable feature store batch serve to Pandas DataFrame; fix: read instances uri for batch serve #983
Conversation
morgandu
commented
Jan 28, 2022
- add batch_to_df
- add unit tests
- add integration tests
- fix bugs
…pdate: setup.py; fix: read_instances bug for batch_serve_to_*
project=self.project, credentials=self.credentials | ||
) | ||
|
||
featurestore_name_components = self._parse_resource_name(self.resource_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this method should include a self.wait
call because it's possible they could run methods like create
and create_entity_type
with sync=False
and then call this method. That may be the case else where in these classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. For create_entity_type
the sync
is on the EntityType
class, where the self.wait
will be needed for _parse_resource_name
for its entity_type_name_components
.
Checked and updated all methods in featurestore
module who may have this issue.
temp_bq_dataset_name = f"temp_{featurestore_id}_{uuid.uuid4()}".replace( | ||
"-", "_" | ||
) | ||
temp_bq_dataset_id = f"{initializer.global_config.project}.{temp_bq_dataset_name}"[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a TODO with with the project ID buganizer ID and description in locations where we reference the global config's project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
for message in reader.rows().pages: | ||
frames.append(message.to_dataframe()) | ||
|
||
return None if not frames else pd.concat(frames) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this use ignore_index=True
or is the SOT for the index in the table or possibly message.to_dataframe()
assigns incrementing index values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ignore_index=True
is needed. Updated.
for message in reader.rows().pages: | ||
frames.append(message.to_dataframe()) | ||
|
||
return None if not frames else pd.concat(frames) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This return type doesn't match the signature. I think we should return an empty pd.DataFrame
if frames is empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
…ex=True for pd.concat and return pd.DataFrame upon empty frames
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
2cad0d4
to
ca0d7bf
Compare
Merge-on-green attempted to merge your PR for 6 hours, but it was not mergeable because either one of your required status checks failed, one of your required reviews was not approved, or there is a do not merge label. Learn more about your required status checks here: https://help.github.com/en/github/administering-a-repository/enabling-required-status-checks. You can remove and reapply the label to re-run the bot. |
ca0d7bf
to
290b07d
Compare