Add UpdateFeatureGroup related APIs in sample notebook (aws#3515)
verayu43 authored Aug 4, 2022
1 parent 2e8c261 commit b31be03
Showing 2 changed files with 152 additions and 0 deletions.
@@ -0,0 +1,5 @@
customer_id,city_code,state_code,country_code,email,name
573291,1,49,2,[email protected],John Lee
109382,2,40,2,[email protected],Olive Quil
828400,3,31,2,[email protected],Liz Knee
124013,4,5,2,[email protected],Eileen Book
147 changes: 147 additions & 0 deletions sagemaker-featurestore/feature_store_introduction.ipynb
@@ -473,6 +473,152 @@
"all_records"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Add features to a feature group\n",
"\n",
"To add features to a FeatureGroup that has already ingested data, call the `UpdateFeatureGroup` API, then re-ingest using the updated dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sagemaker.feature_store.feature_definition import StringFeatureDefinition\n",
"\n",
"customers_feature_group.update(\n",
"    feature_additions=[StringFeatureDefinition(\"email\"), StringFeatureDefinition(\"name\")]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Verify that the FeatureGroup update succeeded."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import time\n",
"\n",
"\n",
"def check_last_update_status(feature_group):\n",
"    # LastUpdateStatus is a dict; its \"Status\" field is InProgress, Successful, or Failed.\n",
"    last_update_status = feature_group.describe().get(\"LastUpdateStatus\")[\"Status\"]\n",
"    while last_update_status == \"InProgress\":\n",
"        print(\"Waiting for FeatureGroup to be updated\")\n",
"        time.sleep(5)\n",
"        # Read the \"Status\" field on every poll so the comparison stays string-to-string.\n",
"        last_update_status = feature_group.describe().get(\"LastUpdateStatus\")[\"Status\"]\n",
"    if last_update_status == \"Successful\":\n",
"        print(f\"FeatureGroup {feature_group.name} successfully updated.\")\n",
"    else:\n",
"        print(\n",
"            f\"FeatureGroup {feature_group.name} update failed. The LastUpdateStatus is \"\n",
"            + str(last_update_status)\n",
"        )\n",
"\n",
"\n",
"check_last_update_status(customers_feature_group)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Inspect the new dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"customer_data_updated = pd.read_csv(\"data/feature_store_introduction_customer_updated.csv\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"customer_data_updated.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Add the `EventTime` feature to the new data frame, as was done for the original data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"customer_data_updated[\"EventTime\"] = pd.Series(\n",
"    [current_time_sec] * len(customer_data_updated), dtype=\"float64\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ingest the new dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"customers_feature_group.ingest(data_frame=customer_data_updated, max_workers=3, wait=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Call `batch_get_record` again with the customer IDs to check that all updated records were ingested into `customers_feature_group`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"updated_customers_records = sagemaker_session.boto_session.client(\n",
"    \"sagemaker-featurestore-runtime\", region_name=region\n",
").batch_get_record(\n",
"    Identifiers=[\n",
"        {\n",
"            \"FeatureGroupName\": customers_feature_group_name,\n",
"            \"RecordIdentifiersValueAsString\": [\"573291\", \"109382\", \"828400\", \"124013\"],\n",
"        }\n",
"    ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"updated_customers_records"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -530,6 +676,7 @@
"* `delete()`\n",
"* `create()`\n",
"* `load_feature_definitions()`\n",
"* `update()`\n",
"* `update_feature_metadata()`\n",
"* `describe_feature_metadata()`\n",
"\n",
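The polling loop in `check_last_update_status` can also be sketched as a standalone helper with a timeout, so that a stalled update does not block the notebook indefinitely. This is a minimal sketch and not part of the commit: `wait_for_update`, its parameters, and the stubbed responses are hypothetical names introduced for illustration. The only assumption about the service is the `LastUpdateStatus` / `Status` response shape that the notebook cell already reads.

```python
import time


def wait_for_update(describe, poll_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Poll describe() until LastUpdateStatus leaves "InProgress" or the timeout expires.

    `describe` is any zero-argument callable returning a dict shaped like the
    response read in the notebook (hypothetical helper, for illustration only).
    """
    waited = 0.0
    while True:
        # LastUpdateStatus may be absent before any update; treat that as an empty dict.
        status = (describe().get("LastUpdateStatus") or {}).get("Status")
        if status != "InProgress":
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"update still InProgress after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Stand-in for a feature group: reports InProgress twice, then Successful.
responses = iter(
    [
        {"LastUpdateStatus": {"Status": "InProgress"}},
        {"LastUpdateStatus": {"Status": "InProgress"}},
        {"LastUpdateStatus": {"Status": "Successful"}},
    ]
)
final_status = wait_for_update(lambda: next(responses), poll_seconds=0.0)
print(final_status)
```

In the notebook this could be called as `wait_for_update(customers_feature_group.describe)`, assuming `describe()` returns the same dictionary that `check_last_update_status` reads.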