Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pagination to Listings API #56

Closed
Tracked by #63
avaleske opened this issue Jun 16, 2021 · 15 comments
Closed
Tracked by #63

Add pagination to Listings API #56

avaleske opened this issue Jun 16, 2021 · 15 comments
Assignees
Labels
epic: filtering Tickets related to Filtering size: 5 A few days of work
Milestone

Comments

@avaleske
Copy link

Part of CityOfDetroit/affordable-housing-app#76

We need to add pagination to the listings API so we can don't just have a single page with all listings.

The partner portal currently has pagination for the lists of applications and the early work for listings, so we might be able to use some of that logic. This might be a starting point

This is Done when we have a listings API with pagination.

@avaleske avaleske self-assigned this Jun 16, 2021
@avaleske
Copy link
Author

Was digging into this to figure out filtering work. It looks like the applications list page fetches data using this hook:
https://github.com/bloom-housing/bloom/blob/845f2e6faeff244d4c64ec79a2c72d6db27834d8/sites/partners/lib/hooks.ts#L38
which is just adding page and limit params to the query string.

It's hitting this backend endpoint:

@Get()
@ApiOperation({ summary: "List applications", operationId: "list" })
async list(
@Query() queryParams: PaginatedApplicationListQueryParams
): Promise<PaginatedApplicationDto> {
return mapTo(PaginatedApplicationDto, await this.applicationsService.listPaginated(queryParams))
}

which does some params validation and passes the limit + page info to the query params here:
const result = await paginate(qb, { limit: params.limit, page: params.page })

@anders-schneider
Copy link

Just recording some things I'm learning about the listings controller/service. The /listings GET endpoint only cares about one query parameter: jsonpath. I hadn't heard of it and so followed it to this jsonpath library which the ListingsService calls here when handling a GET /listings. Apparently it's a query language/engine for JSON-structured data, letting you do things like (from the documentation):

image

So urls like http://localhost:3100/listings?jsonpath=$..specialNotes retrieve only the specialNotes field from each listing. (Bonus: Exygy just added this specialNotes field in their changes that we pulled in recently! It's like a free-form text field, at the level of an individual listing, so it seems helpful to us.)

I can only see one (non-test) place where the jsonpath parameter is set: in .env.template in the LISTINGS_QUERY environment variable:

LISTINGS_QUERY=/listings/?jsonpath=%24%5B%3F(%40.buildingAddress.county%3D%3D%22Alameda%22)%5D

(After URL decoding, that looks like /listings?jsonpath=$[?(@.buildingAddress.county=="Alameda")], so only listings that are in Alameda? Indeed, it does look like this snuck in when Exygy merged in the latest commits from housingbayarea:

image

In any case, jsonpath isn't useful for pagination/filtering at the database-level (why did you spend so much time writing about it then, Andy...) since it looks like it's being applied after the DB has returned its results. But it might be useful to us for some type of filtering that we'd prefer to do in the Next app, as opposed to at the database-level? I can't think what type of filtering that might be though.

@anders-schneider
Copy link

I got proof-of-concept pagination working using nestjs-typeorm-paginate, which is what we already use for Application pagination (thanks for the pointers @avaleske !). Here's a simplified version of ListingsService.list that does pagination:

  public async list(): Promise<Listing[]> {
    // Create the queryBuilder
    let queryBuilder = this.getQueryBuilder().orderBy({
      "listings.id": "DESC",
      "units.maxOccupancy": "ASC",
      "preferences.ordinal": "ASC",
    })

    // Call the imported paginate function, with a limit of 3 (this means we'll get *two* Listings)
    let paginated_listings = await paginate<Listing>(queryBuilder, { limit: 3, page: 1 })

    // Retrieve the array with two Listings by calling .items
    return paginated_listings.items
  }

This was mostly straightforward; the only tricky bit was typeorm/typeorm#6691: "OrderBy expects propertyNames instead of column names." All this meant is that I had to switch units.max_occupancy to units.maxOccupancy.

@anders-schneider
Copy link

Hmm, I'm hitting a bug with this paginate library (and I think it's nestjsx/nestjs-typeorm-paginate#6) - often the number of items returned is less than it should be. E.g. there are 7 total items, I request { limit: 3, page: 1 }, I should get the first three items, and instead I get only two items. The returned metadata happily reports this error:

{
  totalItems: 7,
  itemCount: 2,
  itemsPerPage: 3,
  totalPages: 3,
  currentPage: 1
}

Someone on that other bug found a subquery workaround (select * from ( <query> )) that I've been trying to duplicate, without success. I left nestjsx/nestjs-typeorm-paginate#6 (comment) to ask after their solution.

Note: I believe our existing ApplicationsService pagination is susceptible to this same bug (but we may not be triggering it?) - I'll open a separate ticket.

@avaleske
Copy link
Author

Are the pages 0-indexed?

@anders-schneider
Copy link

Are the pages 0-indexed?

No, they're 1-indexed; setting page: 0 gives QueryFailedError: OFFSET must not be negative. And when I remove the orderBy and it works correctly, { limit: 3, page: 1 } does indeed give the first three listings.

@anders-schneider
Copy link

Recording where I got to: I got the subquery SELECT * FROM ( <query> ) working, but that (a) increased the number of entries to 155 (which I believe is all the cardinalities of each of the tables multiplied together) and (b) didn't fix the pagination issue (now no entries show up).

  public async list(jsonpath?: string): Promise<Listing[]> {
    let queryBuilder = this.getQueryBuilder().orderBy({
      "listings.id": "DESC",
      "units.maxOccupancy": "ASC",
    })

    let wrappingQueryBuilder = getConnection().createQueryBuilder().select("*").from("(" + queryBuilder.getQuery() + ")", "listings")
    
    // I've confirmed that this new query is just 'SELECT * FROM (<old_query>) "listings"'
    console.log(wrappingQueryBuilder.getQuery())

    // This hack comes from https://github.com/typeorm/typeorm/issues/4015#issuecomment-691030543
    // (To avoid "Cannot get entity metadata for the given alias" error)
    wrappingQueryBuilder.expressionMap.mainAlias.metadata = wrappingQueryBuilder.connection.getMetadata(Listing)

    // Need to cast this to include the Listing type
    const newQueryBuilderRescoped = wrappingQueryBuilder as SelectQueryBuilder<Listing>
    let listings = await paginate<Listing>(newQueryBuilderRescoped, {limit: 3, page: 10})
    
    /*
    This logs:
    {
      totalItems: 155,
      itemCount: 0,
      itemsPerPage: 52,
      currentPage: 1
    }
    */
    console.log(listings.meta)

    return []
  }

This workaround might be a dead end...

@avaleske
Copy link
Author

I don't know if you saw this in the TypeORM documentation, but I just discovered that there are take() and skip() methods on the builder for pagination:
https://typeorm.io/#/select-query-builder/using-pagination

@avaleske
Copy link
Author

avaleske commented Jun 18, 2021

Although that throws a 500 unless you update the orderBy clause to be

      .orderBy({
        "listings.id": "DESC",
        "units.maxOccupancy": "ASC",
        "preferences.ordinal": "ASC",
      })

by changing max_occupancy (the name of the db field on the units table) to maxOccupancy (the name of the field on the entity in unit.entity.ts)

I'm curious how take is actually working. Does it do a limit on the join or does it just drop the extra results in code? If it's just dropping the extra results in code that's not ideal

@anders-schneider
Copy link

Thanks Austin for the pointer to https://typeorm.io/#/select-query-builder/using-pagination. I'm going to try that next, and will check if skip() and take() are reflected in the SQL query or not.

Before I try that, recording where I got to with the nestjs-typeorm-paginate library:

  • Using paginateRaw (instead of paginate) and the select * from ( <actual_query>) pattern, I can indeed make the pagination work correctly.
  • That approach gets the raw results (instead of parsing the results into Listing entities) and introduces two problems, though:
    • 155 items are returned, instead of the 7 listings that should be returned. What is happening: if there are 2 units corresponding to 1 listing, the raw results include two separate rows (and this is happening across the 7 tables this query joins together).
    • It's non-trivial to transform the raw results into Listing entities; we'd just be layering hack on top of hack.

Last state of my code using the nestjs-typeorm-paginate library:

  public async list(jsonpath?: string): Promise<Listing[]> {
    let listings = this.getQueryBuilder()
    .orderBy({
        "listings.id": "DESC",
        "units.maxOccupancy": "ASC",
        "preferences.ordinal": "ASC",
    })

    const wrappingQueryBuilder: SelectQueryBuilder<Listing> = getConnection()
    .createQueryBuilder().select("*").from("(" + listings.getQuery() + ")", "listings")

    // This hack comes from https://github.com/typeorm/typeorm/issues/4015#issuecomment-691030543
    // (To avoid "Cannot get entity metadata for the given alias" error)
    wrappingQueryBuilder.expressionMap.mainAlias.metadata = wrappingQueryBuilder.connection.getMetadata(Listing)

    let rawListings = await paginateRaw<Listing>(wrappingQueryBuilder, {limit: 5, page: 1})
    
    /*
    This logs:
    {
      totalItems: 155,
      itemCount: 5,
      itemsPerPage: 5,
      totalPages: 31
      currentPage: 1
    }

    Which shows that pagination is working correctly.
    But the cardinality is way off: there should only be 7 listings. I believe the 155 totalItems
    comes from joining all the different tables (and so one listing now spans multiple rows
    in the raw results.)
    */
    console.log("paginateRaw metadata:")
    console.log(rawListings.meta)

    /*
    This logs:
    {
      listings_id: 'aadf50b9-944d-4d3e-9494-585c8e04ca8b',
      listings_created_at: 2021-06-18T15:08:30.894Z,
      listings_updated_at: 2021-06-18T15:08:30.894Z,
      ...
    }

    Notice that the attributes in these raw results are in the form "listing_id" instead of just "id".
    This means we'd have to do some transformation to be able to cast this to an instance of Listing.
    */
    console.log("paginateRaw output (first item):")
    console.log(rawListings.items[0])

    return []
  }

@anders-schneider
Copy link

skip and take don't affect the underlying query - so it seems like they're taking effect in the backend rather than in the database :(

And they're applied to the raw results instead of the parsed-into-Listing results (so .take(20) e.g. only returns 3 listings) :( :(

@anders-schneider
Copy link

Okay this finally makes sense: it's the orderBy that doesn't play well with take and skip.

Background: take and skip rely on executing two queries, the first uses LIMIT and OFFSET to retrieve a list of distinct IDs and the second filters so that we only get entries with IDs that the first query turned up.

When we orderBy only listings.id (and set skip(2).take(4), query logging shows:

First query:

SELECT DISTINCT "distinctAlias"."listings_id" as "ids_listings_id", "distinctAlias"."listings_id"
FROM ( <full_query> ) "distinctAlias"
ORDER BY "distinctAlias"."listings_id" DESC, "listings_id" ASC
LIMIT 4
OFFSET 2

Second query:

<full_query>
WHERE "listings"."id" IN ($1, $2, $3, $4)
ORDER BY "listings"."id" DESC
-- PARAMETERS: ["<id1>","<id2>","<id3>","<id4>"]

But if we orderBy units.maxOccupancy and preferences.ordinal as well, the first query instead becomes:

SELECT DISTINCT "distinctAlias"."listings_id" as "ids_listings_id", "distinctAlias"."listings_id", "distinctAlias"."units_max_occupancy", "distinctAlias"."preferences_ordinal"
FROM ( <full_query> ) "distinctAlias"
ORDER BY "distinctAlias"."listings_id" DESC, "distinctAlias"."units_max_occupancy" ASC, "distinctAlias"."preferences_ordinal" ASC, "listings_id" ASC
LIMIT 4
OFFSET 2

(It's looking for distinct instances of (listing.id, units.maxOccupancy, preferences.ordinal) instead of distinct instances of listing.id.)

I think this gives us two options:

  • orderBy(listings.id) (at the DB level) and then sort units and preferences in our code
  • Create two separate query builders: one to retrieve only listing IDs (with skip and take applied here) and the second to retrieve all the joined data matching those IDs.
    • This would mean two separate calls to the DB, and either two or three total queries (depending on how clever take and skip are implemented)

@anders-schneider
Copy link

This work is close to ready to be reviewed; I just pushed what I have to feature/aeschneider-backend-pagination. What's left to do before sending it off for review:

  • Check that removing the default values in PaginationQueryParams doesn't affect pagination for applications; i.e. make sure that the applications frontend is setting the page and limit values it wants in the params when calling GET /applications.
  • Run Cypress tests and make sure there are no new failures.
  • Maybe add a test to listings.controller.spec.ts.

@avaleske
Copy link
Author

avaleske commented Jun 24, 2021

I merged my initial filters branch, so you can probably rebase yours off of main now

@willrlin willrlin transferred this issue from CityOfDetroit/affordable-housing-app Jun 26, 2021
@avaleske avaleske mentioned this issue Jun 26, 2021
35 tasks
@willrlin willrlin added the epic: filtering Tickets related to Filtering label Jun 26, 2021
@willrlin willrlin added this to the Build v1 milestone Jun 28, 2021
@willrlin willrlin added M7 size: 5 A few days of work labels Jun 28, 2021
@anders-schneider
Copy link

Thanks to Abbie for getting #133 committed!

main now has the backend pagination, but it looks like there's a small bug in the totalPages count:
image

I'll fix this and then close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
epic: filtering Tickets related to Filtering size: 5 A few days of work
Projects
None yet
Development

No branches or pull requests

3 participants