Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC]Thoughts on an explicit way to define nested relationships? #1113

Closed
NullVoxPopuli opened this issue Sep 1, 2015 · 32 comments
Closed

Comments

@NullVoxPopuli
Copy link
Contributor

I've been working on sideloading (#1107) for the json adapter, and needed recursive relationship building of the json object.

The problem with doing a strictly recursive approach (independent of sideloading or not), is that if you have enough references on your objects to other objects, a single request (of any kind) is basically going to return the whole database.
Example: Post belongs_to an Author which has_many Posts, and each Post has_many Comments, and each Comment belongs_to an Author.... it and would just keep going. The code in #1107 at least doesn't allow for things to be serialized multiple times (like if a post has the same author, that author only gets serialized once).

So, a potential solution that I'm working on prototyping right now is something like this:

PostSerializer < #...
  has_many :comments, include: :author
end

So, instead of recursively serializing the comments, we'd serialize them like we do today, but because we have the author key in the include hash, we'd also serialize each of the coments' authors.

BlogSerializer < #...
  has_many :posts, include: { author: [:bio], comments: [:author]] }
end

resulting json may look like:

{
  blog: {
    posts: [
      {
        author: {
          bio: {...}
        },
        comments: [
          {  author: {...} }
        ]
      }
    ]
  }
}

or

AuthorSerializer < #...
  has_many :blogs, include: { posts: [:author, comments: :author] }
end

resulting json may look like:

{
  author: {
    blogs: [
      {
        posts: [
          {
            author: {..}
            comments: [
              {
                author: {..}
              }
            ]
          }
        ]
      }
    ]
  }
}

this gets in to a problem of potentially having redundant data in the resulting json, so caching would be super important here (which I think is implemented by default? or does a cache key need to be specified in the controller? if not, should there be a default? - or would this kind of cacheing be separate, and more like fancy memoization?)

Note, I found that there was a previous attempt at solving this problem here: #952, but I think it's not clear in #952 how deep the nested relationships go / it could imply that the same recursive problem exists.

@iMacTia
Copy link

iMacTia commented Sep 2, 2015

Hi @NullVoxPopuli, just to answer your last question (I'm the author of #952) the nesting can actually bring the recursive problem, but that's intended. It is actually a very similar feature to your side-loading, as you're actually asking to AMS to side-load all associations without explicitly declaring them. That's allow you to keep your code easier to maintain (if you add a new relationship later, that will be taken automatically).
The only situation a recursive loop could happen is when you have 2 serialisers like this:

TeamSerializer < #...
  has_many :players, include_nested_associations: true
end

PlayerSerializer < #...
  has_many :teams, include_nested_associations: true
end

This can actually fire the recursive problem, actually all many-to-many associations share this risk. But again, that's intended, because it depends on how you use the include_nested_associations option.
Moreover, remember that you can always define multiple serialisers for the same model to fit your needs.
Now, addressing your problem, that how I would solve it using my feature:

BlogSerializer < #...
  has_many :posts, include: include_nested_associations: true
end

PostSerializer < #...
  has_many :comments, include_nested_associations: true
  belongs_to :author, include_nested_associations: true
  # In this case, I want to have more info about the author because I want to display them after the blog post
end

AuthorSerializer < #...
  has_many :blogs, include_nested_associations: true
  has_one :bio
end

CommentSerializer < #...
  belongs_to :author
  # In this case, instead, I don't want anything else apart from basic informations (username, email, etc...) so no need for nested_associations
end

As I said before, you can always make even better defining different serialisers based on your needs.
So in this case, could be a good idea to have a BlogAuthorSerializer and a CommentAuthorSerializer :)

@NullVoxPopuli
Copy link
Contributor Author

thanks for explaining your your technique.

But yeah, I'm trying out this solution to the recursive problem: #1114
So, lets say you have these serializers:

CommentSerializer < #...
  belongs_to :author
end

PostSerializer < #...
  has_many :comments, include: :author
end

BlogSerializer < #...
  has_many :posts, include: [:comments]
end

when rendering a blog, you'd get posts and comments for each post, but not the author for the comments.

I think this is handy, in that you have a more explicit control of how deep the recursion goes

@iMacTia
Copy link

iMacTia commented Sep 2, 2015

Yes, that definitely makes sense, and if you want the author too you just need to change the BlogSerializer like this:

BlogSerializer < #...
  has_many :posts, include: [comments: :author]
end

Recursive loop problem should be fixed as well, the problem is that if you want a standard behaviour (like you always want the comment author when you get a comment), you have to copy-and-paste the same "include" option in all serialisers, where with the include_nested_associations you delegate this to the serializer itself and you just need to manage special cases.
Different solutions to the same problem, at the end. Having both of them allows you to choose the one that better fits your needs

@NullVoxPopuli
Copy link
Contributor Author

Yeah, I don't think either should conflict with each other (functionality-wise anyway, I'm sure there are code conflicts, as I refactored a bit too)

@NullVoxPopuli
Copy link
Contributor Author

@iMacTia so, lets say someone wants to always include the author on their comments, how do you feel about the following?

CommentSerializer < #....
  belongs_to :author, always_include: true
end

to notate that no matter who is rendering a comment, they also get the comment's author.

This makes for some interesting notation nuances for other seralizers:

BlogSerializer < #...
  has_many :posts, include: [comments: :author]
end

Here, the specification of author on comments is redundant and doesn't matter.

BlogSerializer < #...
  has_many :posts, include: [:comments]
end

I think this is what would be the most ideal for this scenario, even though it's not explicit as to what's being included in the resulting json.

@iMacTia
Copy link

iMacTia commented Sep 4, 2015

@NullVoxPopuli, you actually raised another interesting point.
I agree on this as it could actually address situations that both our solutions can't address without a copy-and-paste.
I think it could be a nice addition to the set of options for associations.
Maybe not needed in the first place, but if we want to have even greater flexibility, we can also later introduce an exclude option to override the always_include option from outside :)
But again, that comes later, let's do one thing at a time.
My PR for #952 is now obsolete and would need to be redesigned on top of your PR and the 0.10 version (I think it was still 0.9 when I started).
I would suggest we go by step and we start improving the gem with your PR, that will introduce the include option.
After that, we can start thinking together about all other options:

  • include_all, like a super include that automatically include all associations (similar to my PR)
  • always_include
  • exclude

@joaomdmoura, do you like the schedule?
Would really like to hear your thoughts on this guys :)

@jfelchner
Copy link
Contributor

@iMacTia @NullVoxPopuli glad I found this. 😀 I was just about to open an issue. This is exactly what we need out of AMS right now. I read through the issues. I think that include_nested_associations could be replaced by doing something like:

BlogSerializer < #...
  has_many :posts, include: :all_nested_associations
end

IMO this:

BlogSerializer < #...
  has_many :posts, always_include: true
end

Reads like "always include the posts in this output from the blog serializer". What that really means is "include all of the nested associations specified in this serializer". Also always_include sounds like it shouldn't be able to overridden, however the entire point of this is that the serializer's associations can be overridden.

Cons to this approach would mean that no one could have an association called all_nested_associations which I think is fairly safe.

@joaomdmoura we also need to think about how this would work for applying the JSON API spec which allows nested includes to be specified. See here.

@joaomdmoura
Copy link
Member

Hey everyone, I've being following it along and have just read through the last comments.
In really in favor of this implementation, really looking forward to get #1127 merged.

@iMacTia the schedule look great, indeed, let's take some baby steps, have the include will be awesome already and solve may of the existing needs. Then we can move forward with the other options.

Indeed @jfelchner, but the good news is that we already support neste associations following json-api conventions 😁.

I think we all agree that this is aiming for the right direction, at least the first step on #1127. Let's get it merged then close this issue and open a new one to deal with always_include 😄

@beauby
Copy link
Contributor

beauby commented Sep 14, 2015

One common pattern to avoid infinite loops in this case is to have a "light" serializer for non-primary resources (relevant for Json/FlattenJson, but not JsonApi). We could possibly make it so that primary resources are serialized according to ResourceSerializer, and by default, non-primary resources get serialized according to ResourcePreviewSerializer, if such a serializer is defined, and falls back to ResourceSerializer otherwise, with a warning/exception in case of infinite loop.

An other solution is to define the whole tree of included resources at the adapter level, JsonApi-style.

@malroc
Copy link

malroc commented Sep 15, 2015

Including all nested associations was the default option in 0.8.x.
All possible problems of this approach come from bad design and bad development practices. So why the interests of the bad developers should be respected more than the interests of the good ones?
As @beauby mentioned, there is a good way to prevent unnecessary nesting in 0.8.x approach: create a separate light serializer for the association. In this case, every serializer is responsible only for its direct nodes (unlike include approach), which is better from the single-responsibility principle perspective. For example, we add a new association to a model B that is itself an association of a model A. Using include approach, we have to rewrite ASerializer in that case, which makes no sense (model A itself is unchanged).
Explicit include ... is non-semantic. It's no different from using the same syntax in controllers (i.e. model.to_json(include: ...)) Adding even more options (exclude, always_include, etc) makes the syntax overcomplicated, comparing to the clean and simple approach of 0.8.x branch

@malroc
Copy link

malroc commented Sep 15, 2015

Also, include approach breaks 'convention over configuration' principle, which stands behind the whole ideology of RoR (and old AMS btw).

@malroc
Copy link

malroc commented Sep 15, 2015

Finally, include approach means that we have to handle atomic nodes and associations separately for an associated model: atomic nodes are handled by the serializer of the model itself, and associated nodes are handled by the serializer of the container model.
Having in mind that the serializer of the associated model can have its own associations defined, we get a mess in the code base.
Please take look at this structure (the simplest possible case):

class ASerializer
  attributes :d, :e, :f
  has_one :b, include: :c
end

class BSerializer
  attributes :g, :h, :i
  has_one :c
end

First of all, the code is duplicated (we include :c in both ASerializer and BSerializer). Second, for someone who is not familiar with the new syntax, it is unclear which piece of code is actually responsible for including c into a (i.e. has_one :c or include: :c). Last, if we include :c explicitly in ASerializer, why we don't do that for the atomic attributes of B (i.e. :g, :h, :i)?
If we add an attribute j to class B, we need to rewrite BSerializer (which is quite logical). But if j is an association, not an attribute, then we have to rewrite both ASerializer and BSerializer to get the same result.

@NullVoxPopuli
Copy link
Contributor Author

@malroc, I understand your concern, and that you may be upset with this change.

There has been quite a bit of discussion on this topic over the past months on various issues.
The primary issue being solved is the infinite nesting that comes with always including every relationship.

To counter your argument for creating a separate 'light' serializer, for the association would result in something like this:

Say, we have the objects comments, blogs, authors, and posts, as in the tests for AMS.

If we wanted each association, but associations were deeply evaluated, that means we would need the following serializers for all scenarios - (which may be many people's case)

CommentsSerializer
BlogSerializer
AuthorsSerializer
PostsSerializer
# everybody would at least have the above, regardless of implementation.
# but to have optionally nested associations in the way you describe, we would need:
CommentsWithAuthorSerializer
BlogWithPostsSerializer
PostsWithAuthorSerializer
AuthorsWithCommentsSerializer
AuthorsWithPostsSerializers
AuthorsWithPostsAndCommentSerializer
# and then for nesting associations
BlogWithPostsWithAuthorsAndWithCommentsWithAuthorsSerializer
PostsWithComentsWithAuthorsSerializer

As you can see, this gets out of hand, pretty quickly.

Explicit include ... is non-semantic. It's no different from using the same syntax in controllers (i.e. model.to_json(include: ...)) Adding even more options (exclude, always_include, etc) makes the syntax overcomplicated, comparing to the clean and simple approach of 0.8.x branch

include is the only option on the association that does any sort of manipulating on an association.
Additionally, the include options won't have to be on the serializers by the time official 0.10 is released. There will be the ability to set include from the controllers as well

render json: @objects, include: {...associations...}

this is consistent with the JSON API spec and adapter

Also, include approach breaks 'convention over configuration' principle, which stands behind the whole ideology of RoR (and old AMS btw).

have a look at this: http://jsonapi.org/format/#fetching-includes
In general, we are following the json api conventions for our adapters - at least for relationships. The JSON API adapter will be the most adherant to specification, and follow all of the conventions set forth by the jsonapi team.

for someone who is not familiar with the new syntax, it is unclear which piece of code is actually responsible for including c into a (i.e. has_one :c or include: :c).

We will be sure to document as much as possible before the release of 0.10

Last, if we include :c explicitly in ASerializer, why we don't do that for the atomic attributes of B (i.e. :g, :h, :i)?

we do include these, see this comment: #1127 (comment)

@malroc
Copy link

malroc commented Sep 15, 2015

If we wanted each association, but associations were deeply evaluated, that means we would need the following serializers for all scenarios - (which may be many people's case)

@NullVoxPopuli yes, you are right on that, and actually that's the only issue with the old (0.8.x) approach (please compare to the number of issues of the new approach I listed above).
But think about it from another perspective: if we need several representations of the same object for different cases, isn't it logical to create several serializers? After all, that's what serializers are for. If we need to include an attribute in one case and don't need it in another, we do exactly the same: create 2 different serializers (in some cases that can be avoided, but in general that is the rule). That's the price we pay for keeping our code readable and not making it a mess.
If include syntax were good enough, why would we need serializers at all if we can use the same syntax directly in our controllers?

@NullVoxPopuli
Copy link
Contributor Author

if we need several representations of the same object for different cases, isn't it logical to create several serializers?

Yes, but doing that for associations isn't the way to go. I'm working on some tests right now to demonstrate using flat serializers, and specifying nesting in the controller.

If include syntax were good enough, why would we need serializers at all if we can use the same syntax directly in our controllers?

valid point -- the reason it's in the seralizers now, is because it was easier - a stepping stone to get to fuller functionality. This feature isn't final, and I plan to implement sideloading before RC4

@NullVoxPopuli
Copy link
Contributor Author

@malroc here is the relevant commit that should address your issues: NullVoxPopuli@5806cb7

by default, nested associations are not rendered. So, specifying them in the controller would also allow for associations to be specified upon request from your frontend.

@NullVoxPopuli
Copy link
Contributor Author

@malroc
Copy link

malroc commented Sep 15, 2015

Yes, but doing that for associations isn't the way to go. I'm working on some tests right now to demonstrate using flat serializers, and specifying nesting in the controller.

So we step back to specifying nesting in the controllers? Again, why do we need serializers at all in that case?
I can specify all attributes and nesting in the controllers right now, and I don't need serializers for that.

The first line of the AMS documentation states that:

ActiveModel::Serializer brings convention over configuration to your JSON generation.

And the new approach for handling nested associations breaks that.

@NullVoxPopuli
Copy link
Contributor Author

I believe if we take convention over configuration in the way you want it, we just wouldn't have any serializers, and it would all be generated... -.-

@NullVoxPopuli
Copy link
Contributor Author

(you can continue to use 0.8, btw - you don't have to upgrade) :-)

@malroc
Copy link

malroc commented Sep 15, 2015

The support for 0.8.x is already dropped, so I can't continue using it in the long run.

@malroc
Copy link

malroc commented Sep 15, 2015

I believe if we take convention over configuration in the way you want it, we just wouldn't have any serializers, and it would all be generated... -.-

It worked perfectly for 0.8.x, and I see no reason why it can't work now.

@beauby
Copy link
Contributor

beauby commented Sep 15, 2015

I have a design in my head that could be pretty cool for nested JSON stuff: building on 0.8.x, and the idea I mentioned above, we could make it so that when a nested related resource gets serialized, let's say Post > Comment > Author, we would first look for PostCommentAuthorSerializer, then CommentAuthorSerializer, then RelatedAuthorSerializer (which would be some sort of catch-all for non-primary Author resources), and finally AuthorSerializer. Or even better: first PostSerializer::CommentSerializer::AuthorSerializer, then CommentSerializer::AuthorSerializer, then AuthorSerializer, that way people could easily define their serializers in a nested way, and keep all the stuff in the same place.
Example:

class PostSerializer < ActiveModel::Serializer
  attributes :id, :title
  has_many :comments

  class CommentSerializer < ActiveModel::Serializer
    attributes :id, :content
    belongs_to :author

    class AuthorSerializer < ActiveModel::Serializer
      attributes :id, :name
    end
  end
end

Thoughts everybody?

@beauby
Copy link
Contributor

beauby commented Sep 15, 2015

To be honest, I see use in both solutions:

  • @NullVoxPopuli's is terribly usefull when you want to conditionally include nested resources,
  • the one I sketched above or a variation of it is really cool when you statically know what you want serialized.

@malroc
Copy link

malroc commented Sep 15, 2015

@beauby +1 for your solution.

@NullVoxPopuli
Copy link
Contributor Author

@beauby that is a very interesting idea.
Only downside I can think is the one that @malroc mentioned in that you'd have to update attributes in many places if things change.

@beauby
Copy link
Contributor

beauby commented Sep 15, 2015

@NullVoxPopuli Not really, because if your Authors are always serialized the same way, you don't even need to define a nested serializer. And if you did define one, then it was because you had custom needs over the attributes/relationships you wanted to be included.
I'll make a PR for the solution sketched above, we'll see whether it gets traction.

@NullVoxPopuli
Copy link
Contributor Author

@beauby cool

@NullVoxPopuli
Copy link
Contributor Author

for clarification:

class PostSerializer < ActiveModel::Serializer
  attributes :id, :title
  has_many :comments

  # overrides the default / top level CommentSerializer
  class CommentSerializer < ActiveModel::Serializer
    attributes :id, :content
    belongs_to :author

    # by default - no nested relationships are rendered
    # unless a serializer is specified here?
    # this could also be class AuthorSerializer < ::AuthorSerializer, 
    # if we wanted it to be the same attributes, yeah?
    class AuthorSerializer < ActiveModel::Serializer
      attributes :id, :name
    end
  end
end

@beauby
Copy link
Contributor

beauby commented Sep 16, 2015

Some news:

  • Add support for wildcards in nested includes #1158 Allows for specifying on render which related resources to serialize along, with a wildcard syntax (path.to.resource.* for including all resources directly related to nested resource path.to.resource, path.to.resource.** for including all resources related to nested resource path.to.resource and all its descendants),
  • Add support for nested serializers. #1157 which is almost ready offers to optionally define nested serializers (the more nested the serializer, the higher the priority, like for an Author nested in a Comment, nested in a Post, we would use PostSerializer::CommentSerializer::AuthorSerializer if exists, otherwise CommentSerializer::AuthorSerializer if exists, otherwise AuthorSerializer). By default, all the defined relationships get serialized.

@beauby
Copy link
Contributor

beauby commented Sep 22, 2015

Let's continue this discussion in #1190.

@joaomdmoura joaomdmoura changed the title Thoughts on an explicit way to define nested relationships? [RFC]Thoughts on an explicit way to define nested relationships? Oct 2, 2015
@joaomdmoura
Copy link
Member

Nice, moving then, I'm closing this one then in favor of #1190 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants