ECP-469: Finalize new search schema #368

Merged: 12 commits, Nov 17, 2020

kokoc (Member) commented Apr 9, 2020

Overview

This PR contains the GraphQL schema changes required to introduce new search functionality, such as:

  • highlights - parts of product attribute values that contain matched words. The matched words may differ slightly from the input, so they can't be calculated on the client side.
  • multi-facets - the ability to select multiple values per facet. This means that, in addition to the filtered results, we need to get the facets computed without any filters.
  • advanced range facets (prices) - support for a slider UX element. Basically, we need to extract additional information about price ranges, such as the min value, max value, etc.
  • suggestions, "did you mean?", etc. - not part of the proposed schema, but we need to make sure they will fit in well in the future.
  • preselected variations - the most relevant variation is preselected and rendered instead of the configurable product.

The main challenge is to find a proper place in existing/new queries for new fields.

Inject into one of the existing types/queries

products query

type Query {
    products (
        search: String,
        filter: ProductAttributeFilterInput,
        pageSize: Int = 20,
        currentPage: Int = 1,
        sort: ProductAttributeSortInput
    ): Products
}
type Products {
    items: [ProductInterface]
    page_info: SearchResultPageInfo
    total_count: Int
    filters: [LayerFilter]
    aggregations: [Aggregation]
    sort_fields: SortFields
}

The products query covers two main use cases: search by search phrase and catalog browsing with filtering by category.
This query is the most obvious choice for search-related customizations.
However, it returns a Products type with ProductInterface inside. ProductInterface is designed to
represent a product entity with support for complex product types.

Adding fields to ProductInterface or Products will
make those fields optional and useless in the "catalog browsing" scenario. In other words, our query handles two different
use cases with two different results. Mixing those results in the very common ProductInterface is messy,
inconvenient, and just strange.

The Products type is a slightly better place for new fields. We can put request-specific fields there, but we also need to introduce one more items container for
search results (it will contain highlights + products). This approach is still messy, but at least it doesn't affect other product use cases (shopping cart item, wishlist item, PDP, etc.).
Example for this option:

type Query {
    products (
        search: String,
        filter: ProductAttributeFilterInput,
        pageSize: Int = 20,
        currentPage: Int = 1,
        sort: ProductAttributeSortInput
    ): Products
}
type Products {
    items: [ProductInterface] @deprecated(???????)
    search_items: [ProductSearchItem]
    suggestions: [String]
    page_info: SearchResultPageInfo
    total_count: Int
    filters: [LayerFilter]
    aggregations: [Aggregation]
    sort_fields: SortFields
}
type ProductSearchItem {
    product: ProductInterface!
    highlights: [HighlightType]
}

other queries with product results

There are two more queries that return products and could be considered candidates for the new fields:
category and categoryList. Both return the CategoryInterface interface with a products field inside:

interface CategoryInterface {
    id: Int
    ...
    default_sort_by: String
    products(
        pageSize: Int = 20
        currentPage: Int = 1
        sort: ProductAttributeSortInput
    ): CategoryProducts
    breadcrumbs: [Breadcrumb]
}

type CategoryProducts {
    items: [ProductInterface]
    page_info: SearchResultPageInfo
    total_count: Int
}

This interface also returns ProductInterface, and these queries do not accept a search phrase. Thus they are not good candidates for the new search functionality.

Proper place

In order to find the proper place, we can look at the use cases we have and want to support:

  • Browsing by category - includes layered navigation (facets). Supports further filtering.
  • Searching by phrase - includes layered navigation (facets). Supports further filtering. Additional ordering by relevancy, highlights, variation preselect, suggestions
  • Advanced search - filtering by some values & full-text search by another
  • Product details page
  • Various product representations - wishlist item, shopping cart item
  • Category navigation

I think the following table is the desired assignment of queries per use case:

| Scenario | Current Query | Suggested Query | Required changes |
|---|---|---|---|
| Browsing by category | products, category, categoryList | category | Add layered navigation results; add filters to product sub-request |
| Searching by phrase | products | new query | Deprecate search-related fields in products query |
| Advanced search | n/a | TBD | TBD; probably can be omitted |
| PDP | products | products | Deprecate filtering except by product id |
| Category Navigation | categoryList | categoryList | Deprecate all non-category fields |

You can find a possible shape of the new query in the PR content.

From the table above we can see that the products query will be used mostly for retrieving products by their ids; other functionality will be transferred to the productSearch query.
Thus we may also deprecate the products query entirely and create a new query designed to get products by ids.

Here is a possible shape of the API if we decide to deprecate the old query:

type Query {
    "This query is deprecated but still works for backwards-compatibility purposes"
    products (
        search: String,
        filter: ProductAttributeFilterInput,
        pageSize: Int = 20,
        currentPage: Int = 1,
        sort: ProductAttributeSortInput
    ): Products @deprecated(reason: "See product or productSearch queries")

    productSearch(
        phrase: String!,
        "Desired size of the search result page"
        pageSize: Int = 20,
        currentPage: Int = 1,
        filter: [SearchClauseInput],
        sort: [ProductSearchSortInput]
    ): ProductSearchResponse!

    product(id: ID!): ProductInterface
}

Here are a few possible examples per use case:

Category Listing

category(id: 17) {
    products {
        items {
            id
            name
            sku
            thumbnail {
                url
            }
        }
    }
}

Search by phrase

productSearch(phrase: "Bags") {
    items {
       product {
            id
            name
            sku
            thumbnail {
                url
            }
       }
       highlights {
           attribute_code
           value
       }
    }
}

Product details page

product(id: 18) {
    id
    name
    sku
    media_gallery_entries {
        url
        media_type
        content
    }
}

Category navigation

categoryList {
    id
    name
    url_path
    children {
        id
        name
        url_path
        children {
            id
            name
            url_path
        }
    }
}

kokoc added 3 commits April 10, 2020 10:43
- fixed range type to input
- added support for custom sorting
- search direction is mandatory

kandy (Contributor) commented Apr 10, 2020

Can I ask you to describe the scenario that will be covered by this API?

- Added attribute name to aggregations
input ProductSearchSortInput
{
attribute: String!
direction: SortEnum!

Review comment (Contributor):

Why should direction be a mandatory attribute?
Most RDBMSs simply apply a default (ASC or DESC).

Reply (Member, Author):

Here is what the official Elasticsearch documentation says:

"The order defaults to desc when sorting on the _score, and defaults to asc when sorting on anything else."

I don't like this uncertainty.

ProductSearchSortInput is optional, so we can apply some default order, but if a developer asks for custom ordering, I think it's better to ask for the direction as well.

DrewML (Contributor) commented Apr 21, 2020

I like @maghamed's idea of having a default. If something isn't critical for an operation to succeed, (I think) it should be optional. Otherwise querying becomes tedious and verbose.

"I don't like this uncertainty"

I don't see much uncertainty in that quote, if we're only talking about Elasticsearch.

But are we planning to support only Elasticsearch? If we're ever going to swap search providers, I think it would make sense for the application to define the default so it's consistent.

"ProductSearchSortInput is optional so we can apply some default order"

I think an application-defined default sort order would be great 👍

Reply (Member, Author):

I think there is no need for any changes. We have the optional ProductSearchSortInput, so we can apply default sort. However, the user needs to specify direction for every custom sort. Does it work?
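
To make the two cases in this thread concrete, here is a minimal sketch of how the queries could look. It assumes the productSearch signature and the ProductSearchSortInput type proposed in this PR; the price attribute and the ASC value of SortEnum are assumptions used only for illustration.

```graphql
# Default ordering: sort is omitted, so the application picks the order.
query DefaultOrder {
  productSearch(phrase: "Bags") {
    items {
      product {
        sku
        name
      }
    }
  }
}

# Custom ordering: every sort entry has to spell out its direction.
query CustomOrder {
  productSearch(
    phrase: "Bags"
    sort: [{ attribute: "price", direction: ASC }]
  ) {
    items {
      product {
        sku
        name
      }
    }
  }
}
```

With direction declared as SortEnum!, a sort entry that omits it would be rejected at validation time rather than falling back to a provider-specific default.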

DrewML (Contributor) commented Apr 14, 2020

@kandy: "Can I ask you to describe the scenario that will be covered by this API?"

I would also like to see this, with a few example queries showing how the front-end would cover the scenarios.

kokoc added 4 commits April 15, 2020 09:39
- Removed attribute field duplicate from aggregations
- Removed multi-search
currentPage: Int = 1,
filter: [SearchClauseInput],
sort: [ProductSearchSortInput]
): ProductSearchResponse!

Review comment (Contributor):

The productSearch return type should be nullable. With a non-null type, an error in productSearch propagates up and nulls its parent, so all other operations under Query in the same request will fail as well. See #372
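
A minimal sketch of the suggested change, assuming the rest of the productSearch signature from this PR stays as it is: only the trailing `!` on the return type is dropped.

```graphql
type Query {
    productSearch(
        phrase: String!
        pageSize: Int = 20
        currentPage: Int = 1
        filter: [SearchClauseInput]
        sort: [ProductSearchSortInput]
    # Nullable on purpose: if this resolver fails, the response carries
    # productSearch: null plus an error entry, and sibling fields survive.
    ): ProductSearchResponse
}
```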

paliarush (Contributor) left a comment:

The table with the old-to-new query mapping says we preserve the products query with filtering by ID. However, in the example below, the query is called product(id).

design-documents/graph-ql/coverage/search.md
}

input SearchClauseInput {
attribute_code: String!

Review comment:

Inconsistent field name, attribute_code here vs. attribute on other type.


interface Bucket {
#Human readable bucket title
title: String!

Review comment:

All fields under the interface (e.g. Bucket) must be listed under any type that implements the interface. e.g. I don't see title listed under StatsBucket, ScalarBucket and RangeBucket.
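
As a small illustration of that rule, the sketch below redeclares title on one implementing type. The min and max fields on StatsBucket are assumptions made for the example, not taken from the spec under review.

```graphql
interface Bucket {
    # Human readable bucket title
    title: String!
}

type StatsBucket implements Bucket {
    # Repeated from Bucket: an implementing type has to redeclare it
    title: String!
    # Assumed fields, for illustration only
    min: Float!
    max: Float!
}
```

The same redeclaration would be needed on ScalarBucket and RangeBucket.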

type Highlight {
attribute: String!
value: String!
matched_words: [String]!

Review comment:

It is a common convention to use camelCase for field names (e.g. matchedWords instead of matched_words). I'm not sure whether snake_case is a Magento convention or something else.

count: Int!
}

interface Aggregation {

Review comment:

Is this supposed to be a type instead of an interface?

Review comment:

I think we should add a title field to store the aggregation's friendly name, so that the client (e.g. the UI) can display it instead of the attribute code.

type ProductSearchResponse {
items: [ProductSearchItem]
facets: [Aggregation]
facets_values: [Aggregation]

Review comment:

As discussed with @kokoc verbally, I think facets is sufficient for storing the facet aggregations, and I am not clear on the use case for facets_values.

paales commented Jun 16, 2020

Posted the following question earlier in Slack:

Question/feature request about product aggregations:

To get the layered navigation working properly pwa-studio is/we are relying on introspection queries because the aggregations don’t offer all the information required to build a proper layered navigation. This makes building frontends rather complex.

Wouldn’t it be better to actually have an AggregationInterface that will be implemented with:

  • AggregationEquals must implement options { label, value }. Should each option also implement swatch information?
  • AggregationMatch must not implement options
  • AggregationNumberRange must implement options { label, from, to } instead of the concatenated string 100_*, which requires custom handling. Can I just pass that value back to ProductAttributeFilterInput? The values there should be Int or Float, not String?
  • AggregationPriceRange can be implemented to indicate that a field should be rendered with a price in mind; otherwise it's the same as AggregationNumberRange
  • AggregationBoolean must not implement options

And @kokoc responded to me that this should be covered by this new search api and gave me the possibility to actually check it out! 🎉 So I've been trying out the proof-of-concept graphql endpoint (https://search-api-dev.magento-ds.com/graphql) and have some feedback / questions:

SearchClauseInput

Compared to the older products query, we're losing some functionality here: we are now unable to discover the attributes and their possible input types (equals, range, match(?)).

As a developer I'd like to be able to discover which attributes are available for filtering like it was possible with ProductAttributeFilterInput so that I don't need to 'guess' which ones are available.

A great alternative might be something we see in our GraphQL CMS (GraphCMS):

input ProductSearchFilterInput {
  category: ID
  category_not: ID
  category_in: [ID!]
  category_not_in: [ID!]
  createdAt: DateTime
  createdAt_not: DateTime
  createdAt_in: [DateTime!]
  createdAt_not_in: [DateTime!]
  createdAt_lt: DateTime
  createdAt_lte: DateTime
  createdAt_gt: DateTime
  createdAt_gte: DateTime
  handle: String
  handle_not: String
  handle_in: [String!]
  handle_not_in: [String!]
  price: Float
  price_not: Float
  price_in: [Float!]
  price_not_in: [Float!]
  price_lt: Float
  price_lte: Float
  price_gt: Float
  price_gte: Float
  size: Float
}

This keeps the query arguments relatively flat, which is nice.
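
As a rough illustration of how flat the arguments stay in practice, the sketch below assumes (hypothetically) that productSearch accepted the ProductSearchFilterInput above as its filter argument instead of [SearchClauseInput]. The category ids and price bounds are made-up values.

```graphql
query FlatFilterSketch {
  productSearch(
    phrase: "Bags"
    # Hypothetical: filter typed as ProductSearchFilterInput
    filter: { category_in: ["17", "18"], price_gte: 10, price_lt: 100 }
  ) {
    items {
      product {
        sku
        name
      }
    }
  }
}
```

Because every filterable attribute is a named input field, introspection and editor autocompletion can list the available filters without guessing.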

Aggregation and Bucket

When rendering an aggregation (should it be renamed to Facet?) on the frontend we should be able to see which type of facet we're actually rendering. Wouldn't something like the following be a better idea:

  • Rename Aggregation to Facet
  • Inline StatsBucket since there can never be more than one.
  • Bucket seems like an odd name to have? Maybe should be renamed to something different?
interface Bucket {
    title: String!
}
type ScalarBucket implements Bucket {
    # Interface fields have to be repeated on implementing types
    title: String!
    id: ID!
    count: Int!
}
type RangeBucket implements Bucket {
    title: String!
    from: Float!
    to: Float!
    count: Int!
}

interface Facet {
    attribute_code: String!
    buckets: [Bucket!]!
}
type StatsFacet implements Facet {
    attribute_code: String!
    min: Float!
    max: Float!
    # Non-null so that it stays compatible with Facet.buckets
    buckets: [ScalarBucket!]!
}
type ScalarFacet implements Facet {
    attribute_code: String!
    buckets: [ScalarBucket!]!
}
type RangeFacet implements Facet {
    attribute_code: String!
    buckets: [RangeBucket!]!
}
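
For what it's worth, here is a sketch of how a frontend could tell the facet types apart with such a schema. It assumes the facets field of ProductSearchResponse returns [Facet] as sketched above; the query name is made up.

```graphql
query FacetSketch {
  productSearch(phrase: "Bags") {
    facets {
      # Tells the client which renderer to use (slider, checkbox list, ...)
      __typename
      attribute_code
      ... on StatsFacet {
        min
        max
      }
      buckets {
        title
        ... on ScalarBucket {
          id
          count
        }
        ... on RangeBucket {
          from
          to
          count
        }
      }
    }
  }
}
```

Selecting buckets once at the interface level and branching on the bucket type keeps the selection set valid for every facet implementation.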

Demo implementation

There are some inconsistencies on the demo endpoint that I think will/should be resolved at some point. So not sure how relevant they are:

  • HighLight.matched_words is empty
  • HighLight.value contains HTML, probably shouldn't
  • ProductItem is something that isn't available in the written spec here, will become ProductInterface in the end?

I hope you find this valuable!

paales commented Jun 17, 2020

Oh, I forgot something: shouldn't ScalarBucket have additional implementations such as ImageBucket, ColorBucket and TextBucket to account for swatch information?

DrewML (Contributor) commented Jul 24, 2020

@kokoc @nrkapoor what's the status with this PR? Seems like @paales might have some useful input

melnikovi added the "needs update" label (Author should update the proposal based on the feedback) Sep 11, 2020
paliarush merged commit 8538211 into master Nov 17, 2020
Labels: GraphQL, needs update