Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor schema generation to allow per-view customisation #5354

Merged

Conversation

carltongibson
Copy link
Collaborator

@carltongibson carltongibson commented Aug 23, 2017

Closes #5337

Current Status: Fully functional but pending below. (i.e. Try, Use, Feedback, But don't merge. 🙂)

Tasks:

  • Move (per-view) schema generation logic to descriptor on APIView
  • Tidy code from feedback.
  • Consider/Add api extensions for declarative customisation for common cases.
  • Add decorator for FBVs
  • Adjust documentation.

Summary

This PR adds APIViewSchemaDescriptor (😱: Name!) on APIView which handles the per-view part of the schema generation. This will resolve the issues people have had trying to customise the schema generation.

The main access point is APIViewSchemaDescriptor.get_link(). Override this to return a coreapi.Link instance for you endpoint.

Provide your subclass by setting the schema attribute on your view:

class MyView(SomeSuperView):
  schema = MySchemaDescriptorSubclass()  

Other sub-methods are available customising just one particular aspect. e.g. get_path_fields for URL parameters etc.

Feedback

Please try it out. If you're wanting to customise Schema Generation now is your moment to speak up.

Small issues inline here. Larger issues (or particular use-cases) may merit a separate issue Ref #5354 (this PR) or Ref #5337 (the parent issue).

@encode/django-rest-framework-core-team If you have capacity, I'd be grateful for your input.

Particular points of interest:

  • Low-level gripes: this is a first-pass, with all that entails.
  • The inline comments. What are your thoughts there?
  • Function based views are converted to APIView under the hood, so this should just extend. But, what's the api we want on the decorator to specify an APIViewSchemaDescriptor subclass etc?
  • Big one: What's the API we want for declarative customisation?

On this last point: we want to allow some customisation without overriding anything or subclassing. I'd like to do this by passing kwargs to APIViewSchemaDescriptor, rather than adding attributes to APIView itself.

As per comment on #5337, I imagine it looking something like:

class MyView(SomeSuperView):
  schema = APIViewSchemaDescriptor(  
     fields = {
      "my_extra": coreapi.Field( ... ), 
    },
    ...
  )

But:

  • What are the core use-cases we want to cover?
  • What's the behaviour we want? (fields—does this replace or add to the auto-generated? Do we have this or additional_fields, say.)

... and so on.

In the meantime I'll make a start on the docs.

@Igonato
Copy link

Igonato commented Aug 23, 2017

Tried it. Seems good overall. Some feedback:

fields—does this replace or add to the auto-generated?

+1 for replace. It's the first that comes to mind after looking at it. But "add" will be a more common use case, probably.

Regarding the name. How about ViewInspector? Similar to the EndpointInspector you have below.

Also, how about implementing this as a Mixin? Then if you need to, let's say add a field, you can naturally do:

class MyView(SomeSuperView):

    def get_serializer_fields(self, *args, **kwargs):
        fields = super().get_serializer_fields(*args, **kwargs)
        fields.append(coreapi.Field(
            name='custom',
            # ...
        ))
        return fields

At the current state, you would need to extend the APIViewSchemaDescriptor or support extra parameters like additional_fields. The biggest downside I think is that it's not clear that this is about the schema (in case of serializers at least), maybe change the method name to something like get_serializer_schema or get_serializer_schema_fields

@woile
Copy link
Contributor

woile commented Aug 23, 2017

Hey @carltongibson could you provide a clearer use case/example?. I'm trying to wrap my mind around it.

Thanks!

@carltongibson
Copy link
Collaborator Author

carltongibson commented Aug 23, 2017

@woile: One example would be providing a description for a (non-model) URL path parameter on a view. This isn't currently available automatically, so overriding get_path_fields would let you return the required coreapi.Field instances. (You might want to call super and then adjust...)

(See also #5335, and others.)

In general this would allow you to replace the provided generation for individual views without loosing auto-generation everywhere else.

@Igonato: The idea of using a descriptor is to keep all the schema logic out of APIView itself. (It also allows a familiar API when using it.)

Given that method names are scoped to the descriptor class, and are not on APIView, we already know they're schema related, and so don't need to mention that in the naming. (This is another benefit of keeping the logic separate.)

kwargs will be introduced into an __init__ method. Yes, this doesn't yet exist, but it's pending the design thoughts that arise from the discussion here.

Other points are very welcome.

Thanks both!

@Igonato
Copy link

Igonato commented Aug 23, 2017

Hey, did you skip the name proposal? :) I feel like ViewInspector is an improvement over APIViewShemaDescriptor. Or how about ViewShemaGenerator?

I understand your motivation regarding keeping the schema logic out view, I'm just really not a fan of the inevitable four (and probably even five) level indentation:

class MyView(SomeSuperView):
    # 1
    schema = APIViewSchemaDescriptor(
        # 2
        fields={
            # 3
            "my_extra": coreapi.Field(
                # 4
                name='foo',
                location='bar',
                required=True,
                schema=coreschema.Something(
                    # 5
                    title=title,
                    description=description
                    # ...
                ),
            ),
        },
        # ...
    )

As an alternative, how about changing schema to shema_class or chema_descriptor_class similar to the serializer_class, permission_classes and authentication_classes. This can help prevent this type of nested horror and seems more consistent. Also

There should be one-- and preferably only one --obvious way to do it.

In this case: subclass and extend. No need for __init__ args

@carltongibson
Copy link
Collaborator Author

carltongibson commented Aug 23, 2017

Hey, did you skip the name proposal? :)

No 😀

Other points are very welcome.

I want to gather an amount of input on this PR before any final decisions are made. So everything else gets taken under consideration.

On the other points, I think you've highlighted why we need to consider the API we want here carefully. There's going to be an amount of nesting any time you write coreapi types by hand, so we'll live with some of that—the thing we need to think through is how much, and where.

The canonical way of customising the schema generation will be to subclass APIViewSchemaDescriptor and override the bits you need to. (No problem with that—it will be documented as such)

All this final point comes to is whether we can add declarative shortcuts for the most common use-cases.

From both Django REST Framework and Django Filter I have a few thoughts on such shortcuts:

  1. Users really like them.
  2. They can be a bit addictive — users try to stick to the extra kwargs (or Meta options) long after they should have moved to doing it by hand (i.e. "declaring fields/filters/etc explicitly)
  3. So it's important to document (first and highlighted) what the canonical method is, and that it's preferred.

If it weren't for 1. I'd say not to have them. (But it is 🙂) The contrary argument, that will come up, is "Why do I have to write a whole subclass just to add a single path parameter?" — it doesn't matter that that subclass would only be 5 lines.

If we can get the API right here we can satisfy the simple cases (without subclassing) whilst giving an easy route to more complex uses. (Or that's the hope.)

@Igonato
Copy link

Igonato commented Aug 27, 2017

Pitching in with another bit of input: there is already a way (nice one IMO) to override Metadata. It would be nice if schema overriding shared logic with it, or just worked in a similar way, what do you think?

@carltongibson carltongibson force-pushed the 5337-refactor-schema-generation branch 2 times, most recently from 65237d8 to d3620b1 Compare August 30, 2017 10:36
# TODO: coerce_method_names is used both here and by SchemaGenerator in
# get_keys. It is read straight from the setting (i.e. it's not dynamic.)
# ???: Can it be a module level constant (or callable)?
coerce_method_names = api_settings.SCHEMA_COERCE_METHOD_NAMES
Copy link
Collaborator Author

@carltongibson carltongibson Sep 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomchristie I'm half happy just to leave this for now. I think the next phase will involve various smaller adjustments to these introspection methods, so tidying the remainder here will be OK to do then. ?

# - APIView is only used by SchemaView.
# - ???: Make `schemas` a package and move SchemaView to `schema.views`
# - That way the schema attribute could be set in the class definition.
APIView.schema = AutoSchema()
Copy link
Collaborator Author

@carltongibson carltongibson Sep 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomchristie What do you think about this one?

If we adjust the layout, the dependency graph needs to go like this:

┌────────────┐       ┌────────┐   
│  schemas   │──────▶│ views  │   
└────────────┘       └────────┘   
       │                  │       
       │                  │       
       │                  ▼       
       │          ┌──────────────┐
       └─────────▶│schemas.views │
                  └──────────────┘

Copy link
Collaborator Author

@carltongibson carltongibson Sep 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Options:

  1. Leave it like this. It's a monkey patch — but all the schema stuff stays in one file — and with a nice comment explaining it's not an issue
  2. Make schemas a package:
    • More complex
    • But keeps APIView totally declarative.

Copy link
Collaborator Author

@carltongibson carltongibson Sep 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer Option 1. (With a better comment of course)

Copy link
Member

@tomchristie tomchristie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts here. I think there's probably enough stuff currently in schemas.py that splitting it into schemas.py vs inspectors.py would be more comprehensible. I'm not wild about introducing an extra module, but it's confusing enough with 2 different classes for the schema generation, 1 class for the schema view, and 3 classes for the view introspection, that I think a split between "view introspection stuff" and a "bring it all together and serve an API Schema" would be sensible.

Provide subclass for per-view schema generation
"""
def __get__(self, instance, owner):
self.view = instance
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a bit obscure. An alternative would be to expose get_link on the API View, that calls into the inspector, including the view.

Ie.

# Schema generation   (Probably introduce this at L363)

def get_link(self, path, method, base_url):
    """
    Return a `coreapi.Link()` for the given path template and method.

    This is the abstract representation that we use to build up a complete
    API Schema, that can be used for documentation and other tooling.
    """
    return self.schema.get_link(self, path, method, base_url)

Then use view as an explicit argument everywhere in the inspector methods, rather than an instance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just How Python Descriptors Work — So, is it obscure? Well, yes, but it's just one of the lesser used parts of the programming model. It's just Python. (It's exactly how all our declarative DSLs are built.)

The advantage (the whole point) of descriptors is that you get a reference back to the parent instance (or class) without needing to pass the extra reference by hand. You then get the nice familiar property dot-syntax to go with it.

If we don't use a descriptor we have (here) two API additions to APIView instead of one. schema and get_link instead of just schema.

I could add a docstring saying something about ViewInspector is a descriptor [link to how to guide](https://docs.python.org/3/howto/descriptor.html).

Who's looking at it? Two types of people:

  1. End users: They're never meant to even see this. They just declare their schema=ManualSchema(...) or whatever and then use it like every other object property they've ever seen. (And in Django almost all of those are descriptors too.)
  2. DRF contributors: We can cope. 😀

If don't use a descriptor here I would pull it off of APIView entirely. Instead instantiate an instance with a view parameter in the clear:

   inspector = AutoSchema(view)
   inspector.get_link(...)

But here we loose the ability to declare the customisations... so we can't just set a class on the view, we have to use an instance... so we might as well use the descriptor __get__ method...

Copy link
Member

@tomchristie tomchristie Sep 6, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If don't use a descriptor here I would pull it off of APIView entirely.

No, let's not go down that route.

Okay, alternate - let's just make sure that __get__ has a docstring, noting the usage.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"""
def __init__(self, link):
assert isinstance(link, coreapi.Link)
self._link = link
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is ManualSchema(link) what we want, or is ManualSchema({'GET': ..., 'POST': ...) more likely to actually be useful? How about when applied to viewsets - is there any way we could target per-action? Not necessarily a blocker, but worth considering before we pull this in.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. There's thought still to go into this one.

Related issues: #5382 #4685 ...

My thought would be to address this Next™. Is this the right framework? If so pull it in and I'll turn the effort to the other Schema Generation tickets before 3.7

# * ???: When would `APIView.schema` be needed and that NOT be the case?
# * The alternative is to import AutoSchema to `views`, make `schemas` a
# package, and move SchemaView to `schema.views`, importing APIView there.
APIView.schema = AutoSchema()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd really like to see us able to put this on the APIView class directly - I think that'd add some clarity. I guess the lowest impact way to do that is to have these new inspectors in inspectors.py, with a bit of docstring at the top of schemas.py and inspectors.py detailing that inspectors is for the view inspection, and schemas is for bringing it all together.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. No problem.

The import cycle is just caused by SchemaView being an APIView subclass. We can restructure and then import AutoSchema into views.

This is the only reason to break up the file. It's not anywhere near too big to comfortably handle.

Rather than create a whole load (more) files directly under rest_framework, we can make schemas a package. Something like:

rest_framework/
  ...
  schemas/
    __init__.py
    inspectors.py
    generators.py
    views.py

Then we expose the key items in the API as they are now, so a user would do something like:

from rest_framework.schemas import (
  AutoSchema, 
  ManualSchema, 
  SchemaGenerator, 
  get_schema_view,
)

This would be unchanged for anyone who isn't importing SchemaView directly (but I can't help them). I would imagine 99% of users being entirely unaffected.

Question: schema is already plural, so do we really want schemas with the extra s? (If so fine, if not a small breaking change to rename now would be appropriate — it would be easy to fix in user-projects.)

from rest_framework.schemas import AutoSchema

class CustomAutoSchema(AutoSchema):
def get_link(*args):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we oughta include the arg spec fully here(?) (Rather than just *args)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomchristie
Copy link
Member

tomchristie commented Sep 6, 2017

@Igonato

there is already a way (nice one IMO) to override Metadata. It would be nice if schema overriding shared logic with it, or just worked in a similar way, what do you think?

Yes, it's possible that we'd want to address that once this is in. I think we'd most likely want to tackle that by having the Metadata call into the schema generation info, rather than replicating its own slightly different set of logic.

(Tho let's treat anything along those lines as a separate bit of work if we do decide to take it on.)

carltongibson added a commit to carltongibson/django-rest-framework that referenced this pull request Sep 6, 2017
@@ -26,7 +26,8 @@
from rest_framework.compat import NoReverseMatch
from rest_framework.response import Response
from rest_framework.reverse import reverse
from rest_framework.schemas import SchemaGenerator, SchemaView
from rest_framework.schemas import SchemaGenerator
from rest_framework.schemas.views import SchemaView
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only point where we can't keep the external API of schemas the same.

┌────────────┐       ┌────────┐
│  schemas   │──────▶│ views  │
└────────────┘       └────────┘
       │                  │
       │                  │
       │                  ▼
       │          ┌──────────────┐
       └─────────▶│schemas.views │
                  └──────────────┘

Only option would be to move SchemaView out of schemas entirely.

In basic cases SchemaView is not directly employed by the end user anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Not a blocker, so long as the change is documented clearly in the release notes.

@carltongibson
Copy link
Collaborator Author

carltongibson commented Sep 6, 2017

I made schema a package, and moved the declaration of APIView.schema into views.py.

The API for the base cases (and docs) is unchanged. Users can still do:

    from rest_framework.schemas import (
        AutoSchema,
        ManualSchema,
        get_schema_view,
        SchemaGenerator,
    )

These are the four documented access points. The documented use-cases remain unchanged.

Most users will use only the first two or three, depending on whether they're using include_doc_urls, the DefaultRouter and such.

@carltongibson
Copy link
Collaborator Author

carltongibson commented Sep 6, 2017

it's confusing enough with 2 different classes for the schema generation, 1 class for the schema view, and 3 classes for the view introspection, that I think a split between "view introspection stuff" and a "bring it all together and serve an API Schema" would be sensible.

For now, I've put EndpointInspector in generators with SchemaGenerator.

This is rather than putting it in inspectors with AutoSchema and ManualSchema.

So, first point, we need to think about name changes. 😀

The reason I've divided it this way, rather than 3 and 1, is that there are two distinct jobs:

  1. Getting the Link information for each individual view and method etc.
  2. Compiling that in the correctly formatted tree (for coreapi.Document)

EndpointInspector, despite it's name is very much part of Job Number 2. With this refactoring it only has four methods:

  • get_api_endpoints
  • get_path_from_regex
  • should_include_endpoint
  • get_allowed_methods

None of these are Job Number 1.

A candidate re-naming would be EndpointEnumerator or something.

@carltongibson carltongibson force-pushed the 5337-refactor-schema-generation branch from 7eb7900 to a958a3e Compare September 14, 2017 08:12
@tomchristie
Copy link
Member

Okay, lets go!

@alexgavrisco
Copy link

Wow!
This is the feature I really need now.
Is there a chance that you'll publish those changes with a minor release soon?

I have some endpoints which have something like query_serializer_class which validates query params and currently there's no obvious way to make DRF's schema generation to work with this.
And this PR seems to introduce needed tools to do that.

@carltongibson carltongibson deleted the 5337-refactor-schema-generation branch September 26, 2017 11:51
@carltongibson
Copy link
Collaborator Author

@Alexx-G Glad you like it. 😄

We're working on v3.7 now. Target is next Monday (3rd October). If you want to get going you can install off of master until then — we hopefully won't merge anything that breaks anything too badly.

I'm actively interested in user experiences here. AutoSchema gives you full-control, so you can employ any measures to generate your schema, but I'm interested in what are common use-cases and sticking points.

@alexgavrisco
Copy link

Your work is highly appreciated!

This PR seems to allow me to write a reusable mixin which will add fields from query serializer to the generated schema. If so, I'll finally remove a workaround we're using to customize schema via YAML-like comments (it's not reusable and extensible, but it was only solution back then).
If I get a chance to test AutoSchema before release, I'll provide you my feedback. 👍

@carltongibson
Copy link
Collaborator Author

carltongibson commented Sep 26, 2017

@Alexx-G After release will be good too! It will be good to see solutions so we can develop the patterns that we'll (all) use here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow per-view Schema Generation customisation
5 participants