datahub-project · jywadhwani · Oct 5, 2020 · Sep 21, 2020 · Oct 2, 2020
diff --git a/docs/how/search-over-new-field.md b/docs/how/search-over-new-field.md
@@ -6,8 +6,10 @@ For this exercise, we'll add a new field to an existing aspect of corp users and
 
 This document will also guide you on how to leverage an existing field for faceted search i.e. use the field in aggregations, sorting or in a script.
 
-## 1. Add field to aspect (skip this step if the field already exists in an aspect)
+## 1: Add field to aspect (skip this step if the field already exists in an aspect)
+
 For this example, we will add new field `courses` to [CorpUserEditableInfo](../../metadata-models/src/main/pegasus/com/linkedin/identity/CorpUserEditableInfo.pdl) which is an aspect of corp user entity.
+
 ```
 namespace com.linkedin.identity
 
@@ -18,16 +20,17 @@ namespace com.linkedin.identity
 record CorpUserEditableInfo {
 
   ...
-  
+
   /**
    * Courses that the user has taken e.g. AI200: Introduction to Artificial Intelligence
    */
   courses: array[string] = [ ]
-  
+
 }
 ```
 
-## 2. Add field to search document model
+## 2: Add field to search document model
+
 For this example, we will add field `courses` to [CorpUserInfoDocument.pdl](../../metadata-models/src/main/pegasus/com/linkedin/metadata/search/CorpUserInfoDocument.pdl) which is the search document model for corp user entity.
 
 ```
@@ -44,11 +47,12 @@ record CorpUserInfoDocument includes BaseDocument {
    * Courses that the user has taken e.g. AI200: Introduction to Artificial Intelligence
    */
   courses: optional array[string]
-  
+
 }
 ```
 
-## 3. Modify the mapping of search index
+## 3: Modify the mapping of search index
+
 Now, we will modify the mapping of corp user search index. Use the following Elasticsearch command to add new field to an existing index.
 
 ```json
@@ -62,7 +66,8 @@ curl http://localhost:9200/corpuserinfodocument/doc/_mapping? --data '
 }'
 ```
 
-If this field needs to be a facet i.e. you want to enable sorting, aggregations on this field or use it in scripts, then your mapping may be different depending on the type of field. For **text** fields you will need to enable *fielddata* (disabled by default), as shown below
+If this field needs to be a facet, i.e. you want to sort or aggregate on it or use it in scripts, then the mapping may differ depending on the type of the field. For **text** fields you will need to enable _fielddata_ (disabled by default), as shown below:
+
 ```json
 curl http://localhost:9200/corpuserinfodocument/doc/_mapping? --data '
 {
@@ -75,7 +80,8 @@ curl http://localhost:9200/corpuserinfodocument/doc/_mapping? --data '
 }'
 ```
 
-However *fielddata* enablement could consume significant heap space. If possible, use unanalyzed **keyword** field as a facet. For the current example, you could either choose keyword type for the field *courses* or create a subfield of type keyword under *courses* and use the same for sorting, aggregations, etc (second approach described below)
+However, enabling _fielddata_ can consume significant heap space. If possible, use an unanalyzed **keyword** field as a facet. For the current example, you could either choose the keyword type for the field _courses_, or create a subfield of type keyword under _courses_ and use that subfield for sorting, aggregations, etc. The second approach is shown below:
+
 ```json
 curl http://localhost:9200/corpuserinfodocument/doc/_mapping? --data '
 {
@@ -91,9 +97,11 @@ curl http://localhost:9200/corpuserinfodocument/doc/_mapping? --data '
   }
 }'
 ```
+
 More on this is explained in [ES guides](https://www.elastic.co/guide/en/elasticsearch/reference/current/fielddata.html).
 
-## 4. Modify index config, so that the new mapping is picked up next time
+## 4: Modify index config, so that the new mapping is picked up next time
+
 If you want corp user search index to contain this new field `courses` next time docker containers are brought up, we need to add this field to [corpuser-index-config.json](../../docker/elasticsearch-setup/corpuser-index-config.json).
 
 ```
@@ -119,9 +127,11 @@ If you want corp user search index to contain this new field `courses` next time
   }
 }
 ```
+
 Choose your analyzer wisely. For this example, we store the field `courses` as an array of string and hence use `text` data type. Default analyzer is `standard` and it provides grammar based tokenization.
 
-## 5. Update the index builder logic
+## 5: Update the index builder logic
+
 Index builder is where the logic to transform an aspect to search document model is defined. For this example, we will add the logic in [CorpUserInfoIndexBuilder](../../metadata-builders/src/main/java/com/linkedin/metadata/builders/search/CorpUserInfoIndexBuilder.java).
 
 ```java
@@ -133,9 +143,9 @@ public class CorpUserInfoIndexBuilder extends BaseIndexBuilder<CorpUserInfoDocum
   public CorpUserInfoIndexBuilder() {
     super(Collections.singletonList(CorpUserSnapshot.class), CorpUserInfoDocument.class);
   }
-  
+
   ...
-  
+
   @Nonnull
   private CorpUserInfoDocument getDocumentToUpdateFromAspect(@Nonnull CorpuserUrn urn,
       @Nonnull CorpUserEditableInfo corpUserEditableInfo) {
@@ -147,14 +157,15 @@ public class CorpUserInfoIndexBuilder extends BaseIndexBuilder<CorpUserInfoDocum
         .setSkills(corpUserEditableInfo.getSkills())
         .setCourses(corpUserEditableInfo.getCourses());
   }
-  
+
   ...
-  
+
 }
 
 ```
 
 ## 6: Update search query template, to start searching over the new field
+
 For this example, we will modify [corpUserESSearchQueryTemplate.json](../../gms/impl/src/main/resources/corpUserESSearchQueryTemplate.json) to start searching over the field `courses`. Here is an example.
 
 ```json
@@ -190,10 +201,12 @@ For this example, we will modify [corpUserESSearchQueryTemplate.json](../../gms/
   }
 }
 ```
+
 As you can see in the above query template, corp user search is performed across multiple fields, to which the field `courses` has been added.
 
-## 7: (*Optional*) For a field that is a facet, modify the search config.
-We define the list of facets in search config. If your field needs to be a facet, add it to the set of facets defined in method *getFacetFields*. For this example, we will add the logic in [CorpUserSearchConfig](../../gms/impl/src/main/java/com/linkedin/metadata/configs/CorpUserSearchConfig.java).
+## 7: (_Optional_) For a field that is a facet, modify the search config
+
+The list of facets is defined in the search config. If your field needs to be a facet, add it to the set of facets defined in the method _getFacetFields_. For this example, we will add the logic in [CorpUserSearchConfig](../../gms/impl/src/main/java/com/linkedin/metadata/configs/CorpUserSearchConfig.java).
 
 ```java
 package com.linkedin.metadata.configs;
@@ -210,6 +223,7 @@ public class CorpUserSearchConfig extends BaseSearchConfig<CorpUserInfoDocument>
 ```
 
 ## 8: Test your changes
+
 Make sure relevant docker containers are rebuilt before testing the changes.
 If this is a new field that has been added to an existing snapshot, then you can test by ingesting data that contains this new field. Here is an example of ingesting to `/corpUsers` endpoint, with the new field `courses`.
 
@@ -225,11 +239,11 @@ curl 'http://localhost:8080/corpUsers?action=ingest' -X POST -H 'X-RestLi-Protoc
             "AI100: Introduction to Artificial Intelligence"
           ],
           "skills": [
-            
+
           ],
           "pictureLink": "https://raw.githubusercontent.com/linkedin/datahub/master/datahub-web/packages/data-portal/public/assets/images/default_avatar.png",
           "teams": [
-            
+
           ]
         }
       }
@@ -241,7 +255,7 @@ curl 'http://localhost:8080/corpUsers?action=ingest' -X POST -H 'X-RestLi-Protoc
 
 Once the ingestion is done, you can test your changes by issuing search queries. Here is an example query with response.
 
-```
+```sh
 curl "http://localhost:8080/corpUsers?q=search&input=ai200" -H 'X-RestLi-Protocol-Version: 2.0.0' -s | jq
 
 Response:
@@ -251,22 +265,22 @@ Response:
       "urn:li:corpuser:datahub"
     ],
     "searchResultMetadatas": [
-      
+
     ]
   },
   "elements": [
     {
       "editableInfo": {
         "skills": [
-          
+
         ],
         "courses": [
           "Docker for Data Scientists",
           "AI100: Introduction to Artificial Intelligence"
         ],
         "pictureLink": "https://raw.githubusercontent.com/linkedin/datahub/master/datahub-web/packages/data-portal/public/assets/images/default_avatar.png",
         "teams": [
-          
+
         ]
       },
       "username": "datahub",
@@ -284,8 +298,57 @@ Response:
     "start": 0,
     "total": 1,
     "links": [
-      
+
     ]
   }
 }
 ```
+
+# Appendix: MidTier and UI changes
+
+## 1: Check if facets are enabled for the entity (optional)
+
+Inside the `PersonEntity` [render-props.ts](../../datahub-web/@datahub/data-models/addon/entity/person/render-props.ts), check the `search` settings:
+
+```json
+{
+  "search": {
+    "showFacets": true
+  }
+}
+```
+
+Make sure the `showFacets` property is set to `true`.
+
+## 2: Add fields to facets in MidTier if desired (optional)
+
+In [Search.java](../../datahub-frontend/app/controllers/api/v2/Search.java), add the desired fields to `CORP_USER_FACET_FIELDS`:
+
+```java
+private static final Set<String> CORP_USER_FACET_FIELDS = ImmutableSet.of("courses");
+```
+
+## 3: Add field in the Person entity
+
+In [person-entity.ts](../../datahub-web/%40datahub/data-models/addon/entity/person/person-entity.ts), add your new property:
+
+```ts
+@alias('entity.courses')
+courses?: Array<string>;
+```
+
+## 4: Add fields in the Person configuration json
+
+Inside the `PersonEntity` [render-props.ts](../../datahub-web/@datahub/data-models/addon/entity/person/render-props.ts), add your new property:
+
+```json
+{
+  "showInAutoCompletion": true,
+  "fieldName": "courses",
+  "showInResultsPreview": true,
+  "displayName": "Courses",
+  "showInFacets": true,
+  "desc": "Courses description of the field",
+  "example": "courses:value"
+}
+```
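
A note on step 3 of the guide (a **text** field with _fielddata_ versus a **keyword** subfield): the two choices bucket values differently when the field is used as a facet. The following is a minimal Java sketch, not DataHub or Elasticsearch code. The class and method names are hypothetical, and the tokenizer is only a rough approximation of the `standard` analyzer (lowercase, then split on non-alphanumeric characters). It illustrates that aggregating on an analyzed text field produces one bucket per token, while a keyword field produces one bucket per whole value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only; none of these names exist in DataHub or Elasticsearch.
final class FacetBucketSketch {

  // Rough stand-in for an analyzed text field: lowercase, split on non-alphanumerics.
  static List<String> approximateTokens(String value) {
    return Arrays.stream(value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
        .filter(token -> !token.isEmpty())
        .collect(Collectors.toList());
  }

  // Facet on the analyzed field: each distinct token of a value counts once for that value.
  static Map<String, Integer> textBuckets(List<String> values) {
    Map<String, Integer> buckets = new TreeMap<>();
    for (String value : values) {
      approximateTokens(value).stream()
          .distinct()
          .forEach(token -> buckets.merge(token, 1, Integer::sum));
    }
    return buckets;
  }

  // Facet on an unanalyzed keyword field: the whole value is the bucket key.
  static Map<String, Integer> keywordBuckets(List<String> values) {
    Map<String, Integer> buckets = new TreeMap<>();
    for (String value : values) {
      buckets.merge(value, 1, Integer::sum);
    }
    return buckets;
  }

  public static void main(String[] args) {
    List<String> courses = List.of(
        "AI200: Introduction to Artificial Intelligence",
        "AI100: Introduction to Artificial Intelligence");
    System.out.println("text buckets:    " + textBuckets(courses));
    System.out.println("keyword buckets: " + keywordBuckets(courses));
  }
}
```

If the facet should list whole course names, the keyword subfield is the closer match; token-level buckets from the text field are rarely what a facet panel should display.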