Apply max batch size to relation #7
Conversation
If I understand correctly, the goal of this PR is to use `maxBatchSizeConfig` to resolve only n relations at a time (instead of trying to resolve all of them at once).
We are building an `Iterable[Future[...]]`. Each element of the `Iterable` represents the n relations to resolve.
By using `Future.sequence(results)`, we trigger the execution of all the `Future`s at the same time.
In the end, we still resolve all relations at the same time.
Or do I misunderstand something?
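As a standalone illustration (not from the PR): Scala `Future`s start running as soon as they are constructed, so `Future.sequence` only collects results that are already being computed; it does not limit concurrency.

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Each Future starts running as soon as it is created, so by the time
// Future.sequence sees them, all of the work is already in flight.
val batches: Seq[Future[Int]] = (1 to 10).map { i =>
  Future {
    println(s"batch $i started") // all ten lines print almost immediately
    i
  }
}

// Future.sequence only collects the results; it does not serialize the work.
Await.result(Future.sequence(batches), 5.seconds)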
To avoid executing all Futures at the same time, one has to chain them sequentially, like:
results.foldLeft(Future.successful(...)) { case (acc, e) =>
  // use the `acc` and the `e` to trigger the execution of `e` only
}
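For illustration, a minimal self-contained sketch of that pattern (not from the PR). Since Scala Futures are eager, the fold only serializes the work if each batch's Future is created inside the fold; `fetchBatch` below is a hypothetical placeholder for the real per-group fetch, called lazily from within `flatMap`.

import scala.concurrent.{ExecutionContext, Future}

// Chain the batches so that each fetch only starts after the previous one
// has completed; only one batch is in flight at a time.
def runSequentially[Id, Res](batches: Iterable[Vector[Id]])(
    fetchBatch: Vector[Id] => Future[Seq[Res]])(
    implicit ec: ExecutionContext): Future[Seq[Res]] =
  batches.foldLeft(Future.successful(Seq.empty[Res])) { case (acc, batch) =>
    acc.flatMap(resolved => fetchBatch(batch).map(resolved ++ _))
  }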
Personal opinion: I find it very difficult to understand the logic. I added the types locally to help me:
val groupedRelIds: Map[Relation[Any, Any, Any], Iterator[Vector[Any]]] =
  ctx.fetcher.config.maxBatchSizeConfig match {
    case Some(size) => nonCachedIds.map { case (rel, ids) => (rel, ids.grouped(size)) }
    case None       => nonCachedIds.map { case (rel, ids) => (rel, Iterator.single(ids)) }
  }

val results: Iterable[Future[mutable.Map[Relation[Any, Any, Any], mutable.Map[Any, Seq[Any]]]]] =
  groupedRelIds.flatMap { case (rel: Relation[Any, Any, Any], groupIds: Iterator[Vector[Any]]) =>
    val t: Iterator[Future[mutable.Map[Relation[Any, Any, Any], mutable.Map[Any, Seq[Any]]]]] =
      groupIds.map { group =>
        if (group.nonEmpty)
          f.fetchRel(ctx, RelationIds(Map(rel -> group)))
            .map(groupAndCacheRelations(ctx, Map(rel -> group), _))
        else
          Future.successful(MutableMap.empty[Relation[Any, Any, Any], MutableMap[Any, Seq[Any]]])
      }
    t
  }
Maybe something to also consider?
Co-Authored-By: Yann Simon <[email protected]>
I agree, it's hard to understand in the current state. I tried to keep the same logic as
+1 for hard to understand. Not to derail this PR, but the codebase in general has little to no comments; some of the design choices would be much easier to grasp if there were more. Do you think that can change going forward?
So instead of making one query that fetches 1000 entities, we do 10 queries in parallel and each query fetches 100 entities? Can you explain the intention of this change in more detail?
My intention is to make this consistent with the current behavior of entity fetching: the max batch size should also be applied to relation fetching.
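As a rough standalone sketch of that intent (not the PR's code; `loadByRelation` is a hypothetical placeholder): with `maxBatchSize = 100`, a relation lookup for 1000 ids becomes ten fetch calls of 100 ids each, run in parallel and merged.

import scala.concurrent.{ExecutionContext, Future}

def fetchRelationBatched[Id, Res](ids: Vector[Id], maxBatchSize: Int)(
    loadByRelation: Vector[Id] => Future[Seq[Res]])(
    implicit ec: ExecutionContext): Future[Seq[Res]] = {
  // e.g. 1000 ids with maxBatchSize = 100 -> 10 groups of 100 ids
  val groups = ids.grouped(maxBatchSize).toVector
  // one fetch per group, executed concurrently, results flattened back together
  Future.sequence(groups.map(loadByRelation)).map(_.flatten)
}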
OK. The original intention of this batching still remains a bit obscure to me.
But I assume it's fine to have consistent behavior.
Migrated from sangria-graphql#441