Implement third party auth ID mapping API #9842

Merged 1 commit on Oct 27, 2015

xcompass
Contributor

UPDATE Feb.16, 2016
Since the "client_credentials" grant type is now supported, there is no need to manually create an authorization code. The workflow below reflects this change.

Description:
This implements a new API that can be used to retrieve a list of mappings between third party auth remote IDs and edX usernames. For use cases and more details, please see proposal + discussion.

Background:
When an edX course is used in an on-campus environment and SAML, LTI or other SSO is enabled to allow students to authenticate to edX using their on-campus credentials, normally there will be an identifier sent from an on-campus IdP. This identifier is stored in edX and linked to the edX account. Later when students return from an on-campus IdP, edX is able to authenticate the student using their edX account.

Because students may create edX accounts that do not use their real identity, instructors have difficulty identifying the edX students in their courses. They also cannot upload grades from edX to the on-campus gradebook system, because the grades downloaded from edX contain only edX IDs, which are unknown to the on-campus system.

This new API provides the mapping between the edX user ID and the ID provided by the identity provider (IdP). It gives instructors or course support staff the information they need to identify the on-campus students who are using the edX course. It also allows grade information coming from edX to be associated with the correct campus students for uploading to an on-campus learning management system (LMS) or student information system (SIS).

This API is supposed to be consumed by on-campus middleware, which combines it with information from on-campus systems. The middleware would talk to the campus system to get information about courses and enrolment, then use that information to control instructor access. Instructors would then be able to get the on-campus identities of their edX students, upload grades back to their LMS or SIS, make groups based on on-campus activities, etc.
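
A minimal sketch of the join such a middleware might perform, assuming the API results have the `username`/`remote_id` shape shown in the test instructions. The function name `attach_campus_ids` and the grade-row layout (`username` and `grade` keys) are hypothetical and not part of this PR.

```python
def attach_campus_ids(mappings, grade_rows):
    """Re-key edX grade rows by the campus (IdP) identifier.

    mappings:   list of {"username": ..., "remote_id": ...} dicts, the shape
                of the API's "results" list.
    grade_rows: list of {"username": ..., "grade": ...} dicts (hypothetical
                layout for a grade export).
    Returns (matched, unmatched): rows with a "remote_id" added, and the
    usernames that have no mapping for this provider.
    """
    remote_id_by_username = {m["username"]: m["remote_id"] for m in mappings}
    matched, unmatched = [], []
    for row in grade_rows:
        remote_id = remote_id_by_username.get(row["username"])
        if remote_id is None:
            # No linked campus account for this edX user.
            unmatched.append(row["username"])
        else:
            matched.append({**row, "remote_id": remote_id})
    return matched, unmatched
```

Returning the unmatched usernames separately lets support staff follow up on students who never linked a campus account, rather than silently dropping their grades.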

This API also helps with troubleshooting cases where students entered the wrong email address and were unable to activate their accounts, and therefore could not log in to edX through Shibboleth because their accounts were inactive. With this API, on-campus course support staff or instructors can give edX support the student's real edX username instead of the ID that Shibboleth passed to edX, so edX support does not have to query the tables manually to work out the mapping between the Shibboleth ID and the edX username.

JIRA: N/A

Component Affected: LMS (Third Party Auth)

Users Affected: Currently users with EDX_API_KEY (EdX internal) or OAuth2 client authorization (external use)

Timeframe: Mid Oct.

Reviewers: @bradenmacdonald and TBD

Resources Needed: None

Database Table Changes:

  • This PR adds a new table called third_party_auth_apipermissions. So a paver update_db is required after merging.

Test Instructions:

  • Enable one or more third party auth providers (such as 'Dummy') and link them to users' accounts.
  • Set up authentication and authorization:
    • For using EDX_API_KEY: Edit ~/lms.auth.json and set EDX_API_KEY to some value, e.g. api-key, then restart the LMS.
    • For using OAuth2:
      • Set up an OAuth2 Client. In Django Admin, go to OAuth2 -> Client -> Add Client.
        • User: it is a good idea to create a dedicated user for this API
        • URL, Redirect URL: http://localhost// (not being used in this API)
        • Client Type: Confidential
        • Other fields: no specific requirements
      • Create an authorization code (only needed if you use the authorization_code grant; with the client_credentials grant, which django-oauth2-provider now supports, this step can be skipped). Go to OAuth2 -> Grants -> Add grant. Remember the code for later use.
      • Give the new client permission to this API. Go to Third_Party_Auth -> Provider API Permissions -> Add Provider API Permission
        • Select the new client and the provider created in the previous step. This will give the new client access to query that provider mapping.
  • Test this API with commands like the ones below, where {provider_id} should be replaced with the actual ID. For a SAML provider, the provider ID should be "saml-PROVIDER_SLUG", e.g., saml-ubc. For OAuth2 providers, it will be "oa2-BACKEND_NAME", e.g., oa2-google.
    • Using EDX_API_KEY

      curl --header "X-edx-API-key: api-key" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users"

      curl --header "X-edx-API-key: api-key" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users?username=USERNAME1,USERNAME2"

      curl --header "X-edx-API-key: api-key" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users?username=USERNAME1&username=USERNAME2"

      curl --header "X-edx-API-key: api-key" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users?remote_id=REMOTE_ID1,REMOTE_ID2"

      curl --header "X-edx-API-key: api-key" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users?remote_id=REMOTE_ID1&remote_id=REMOTE_ID2"
    • Using OAuth2

      • Using the authorization code created in the previous step to get the access token
      curl --data "grant_type=authorization_code&code=CODE&client_id=CLIENT_ID&client_secret=CLIENT_SECRET" http://localhost:8000/oauth2/access_token
      Return: {"access_token": "2983fbbc6c41947a2807946678d594b06b403dbf", "token_type": "Bearer", "expires_in": 31535999, "refresh_token": "909c2b0926d642283f762f7437c1a9cd12844b07", "scope": ""}
      • Using client credentials to get the access token
      curl --data "client_id=CLIENT_ID&client_secret=CLIENT_SECRET&grant_type=client_credentials" http://localhost:8000/oauth2/access_token
      Return: {"access_token": "c1efde84445b2f256e1c80886b3f6d46339b84ee", "token_type": "Bearer", "expires_in": 31535999, "scope": ""}
      • Using access token to issue request to API:
      curl --header "Authorization: Bearer ACCESS_TOKEN" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users"

      curl --header "Authorization: Bearer ACCESS_TOKEN" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users?username=USERNAME1,USERNAME2"

      curl --header "Authorization: Bearer ACCESS_TOKEN" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users?username=USERNAME1&username=USERNAME2"

      curl --header "Authorization: Bearer ACCESS_TOKEN" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users?remote_id=REMOTE_ID1,REMOTE_ID2"

      curl --header "Authorization: Bearer ACCESS_TOKEN" "http://localhost:8000/api/third_party_auth/v0/providers/{provider_id}/users?remote_id=REMOTE_ID1&remote_id=REMOTE_ID2"
  • The API should return a JSON object containing the mappings, e.g.:
{
  "page": 1,
  "page_size": 200,
  "count": 8,
  "results": [
    {"username": "USERNAME1", "remote_id": "REMOTE_ID1"},
    {"username": "USERNAME2", "remote_id": "REMOTE_ID2"}
  ]
}
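
For a consumer written in Python rather than curl, the OAuth2 flow above could be sketched roughly as follows. This is an illustration only: the function names are hypothetical, the URLs are the local development ones used in the test instructions, and the requests are only constructed here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(base_url, client_id, client_secret):
    """POST request for an access token using the client_credentials grant."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    # Supplying `data` makes urllib issue a POST.
    return Request(base_url + "/oauth2/access_token", data=body)


def build_mapping_request(base_url, provider_id, access_token,
                          usernames=(), remote_ids=()):
    """GET request for the mappings, filtered with repeated query parameters."""
    url = "{}/api/third_party_auth/v0/providers/{}/users".format(
        base_url, provider_id)
    params = ([("username", u) for u in usernames] +
              [("remote_id", r) for r in remote_ids])
    if params:
        url += "?" + urlencode(params)
    return Request(url, headers={"Authorization": "Bearer " + access_token})


def parse_mappings(response_body):
    """Turn the JSON response body into a {username: remote_id} dict."""
    payload = json.loads(response_body)
    return {item["username"]: item["remote_id"] for item in payload["results"]}
```

Each request object can be passed to `urllib.request.urlopen` to actually perform the call; the body of the mapping response can then be handed to `parse_mappings`.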

@openedx-webhooks

Thanks for the pull request, @xcompass! I've created OSPR-820 to keep track of it in JIRA. JIRA is a place for product owners to prioritize feature reviews by the engineering development teams.

Feel free to add as much of the following information to the ticket:

  • supporting documentation
  • edx-code email threads
  • timeline information ("this must be merged by XX date", and why that is)
  • partner information ("this is a course on edx.org")
  • any other information that can help Product understand the context for the PR

All technical communication about the code itself will still be done via the GitHub pull request interface. As a reminder, our process documentation is here.

@openedx-webhooks added the open-source-contribution and needs triage labels Sep 20, 2015
@xcompass
Contributor Author

ping @antoviaque

@sarina
Contributor

sarina commented Sep 20, 2015

Hi @xcompass - since this is a work in progress, could you let me know what resources, if any, you need from edX at this time?

Could you also flesh out your PR description with more detail, especially regarding timeline (when do you need this feature by, and please be very explicit), how to test the feature, etc? Check out our detailed guidelines here: http://edx-developer-guide.readthedocs.org/en/latest/process/cover-letter.html - making specific note of our brand new guidelines on how to contribute to the documentation for your feature.

@antoviaque
Contributor

@xcompass Thanks for the ping! We can schedule one of the two reviews on our side - likely over the coming week.

@xcompass
Contributor Author

@sarina I've updated the PR description. Let me know if I missed anything. Thanks.

@xcompass
Contributor Author

@sarina updated based on your comments. Is it easier for you to see the updates if I squash the commits?

@sarina
Contributor

sarina commented Sep 22, 2015

Separate commits are easier, I think. 👍 from me on test coverage and code quality. However, Braden and a platform team member will need to give thumbs up as well.

@bradenmacdonald
Contributor

@xcompass FYI, I will continue reviewing this for you once you add a model or mechanism for defining permissions that control access to this API (which is the missing piece needed for OAuth support, as we discussed).

@xcompass force-pushed the tpa-mapping-api branch 3 times, most recently from 812c112 to 1aeb52f (September 24, 2015 07:43)
@xcompass
Contributor Author

@bradenmacdonald I've done the authorization part. Could you take a look at it when you get some time? Thanks!

@xcompass
Contributor Author

@nasthagiri I'm having trouble getting an authorization_code from the command line (curl). What I'm doing right now is manually creating the code in Django admin. But once I use the code to get the token, the code expires (which is expected behaviour). Do you have any tips for getting the authorization code from the command line? Thanks.

@nasthagiri
Contributor

@xcompass To support server-to-server auth, we need to use the client_credentials grant (not the authorization_code grant - sorry for misspeaking earlier). This is described in http://tools.ietf.org/html/rfc6749#section-4.4 and http://alexbilbie.com/2013/02/a-guide-to-oauth-2-grants/.
Unfortunately, it doesn't seem that this grant_type is supported in the version of django-oauth2-provider that we use. Have you had any luck finding a Django implementation of this that we can incorporate into our OAuth provider?

By the way, the EDX_API_KEY is something that we want to deprecate. It provides an all-or-nothing access to our server APIs and something we want to replace in favor of OAuth2+JWTs in the long run.

@xcompass
Contributor Author

@nasthagiri Thanks. I thought I missed something. Yep, I figured current django-oauth2-provider doesn't support client_credentials.

It seems django-oauth-toolkit supports the client_credentials grant:
https://github.com/evonove/django-oauth-toolkit/blob/8b8e638d2fd0c5dd4a5595573542a476cd4a6c9d/oauth2_provider/oauth2_validators.py#L22. Also, the current version of Django REST Framework recommends it over django-oauth2-provider: http://www.django-rest-framework.org/api-guide/authentication/#django-oauth-toolkit

I've never used it before, so I'm not sure how much work is involved. Are you guys interested in migrating to django-oauth-toolkit?

@nasthagiri
Contributor

> It seems django-oauth-toolkit supports client_credentials grant.
> Are you guys interested in migrating to django-oauth-toolkit?

Migrating to django-oauth-toolkit may be a large effort, perhaps outside the scope of this PR.
However, I do strongly prefer using the client_credentials grant rather than a shared all-or-nothing API_KEY in the long run. We can also consider updating the django-oauth2-provider library to add support for client_credentials.

One thought is to address using the client_credentials grant in a separate PR. We will also want to deprecate the use of the EDX_API_KEY throughout our platform.

@bradenmacdonald
Contributor

@nasthagiri @xcompass Is it possible for us to somehow use the authorization_code grant in the meantime to authorize UBC's server to pull data from this API?

@nasthagiri
Contributor

@bradenmacdonald Yes, it's "possible". But it's not how the authorization_code grant type was intended to be used. It's possible by having UBC's server make the one-time call to get the access_token, store the access_token for all future calls, and then refresh the access_token whenever it expires.
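
The store-and-refresh pattern described here could look roughly like the sketch below. It is only an illustration under stated assumptions: the `CachedToken` class, the injected `fetch_token` callable and the 60-second safety margin are all hypothetical, and how `fetch_token` actually obtains a new token (for example with a refresh token) is left to the caller.

```python
import time


class CachedToken:
    """Hold an access token and fetch a new one shortly before it expires."""

    def __init__(self, fetch_token, clock=time.time, margin=60):
        # fetch_token() must return a dict with "access_token" and
        # "expires_in" (seconds), like the token endpoint's JSON response.
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = margin
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return the stored token, renewing it first if it is near expiry."""
        near_expiry = self._clock() >= self._expires_at - self._margin
        if self._token is None or near_expiry:
            response = self._fetch_token()
            self._token = response["access_token"]
            self._expires_at = self._clock() + response["expires_in"]
        return self._token
```

Injecting the clock and the fetch function keeps the class free of network code, so the expiry logic can be exercised without a running LMS.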

@bradenmacdonald
Contributor

@nasthagiri I understand. I'm just trying to figure out how we can get this merged on time so that UBC has something they can use within their timeframe. Do you think we should use authorization_code for now (and perhaps you could create some stories for edX to migrate to django-oauth-toolkit in the future), or do you see another path forward?

FYI @xcompass @nasthagiri @sarina I'm going to be away next week. So if we don't get this to the point where I can give +1 today, Eugeny will likely take over from me as a reviewer.
I'm also wondering, who will be the one to merge this PR when it is ready?

@xcompass
Contributor Author

> Yes, it's "possible". But it's not how the authorization_code grant type was intended to be used. It's possible by having UBC's server make the one-time call to get the access_token, store the access_token for all future calls, and then refresh the access_token whenever it expires.

@nasthagiri @bradenmacdonald This is what I'm doing when developing this authorization.

@nasthagiri
Contributor

Pan, that sounds good for now. We will then update edx-platform to support the client_credentials grant type. Thanks.


@xcompass
Contributor Author

@bradenmacdonald Updated based on your comments.

@xcompass force-pushed the tpa-mapping-api branch 2 times, most recently from 45b793c to 61a3bb6 (September 27, 2015 09:44)
@@ -52,6 +52,11 @@ def get(cls, provider_id):
         return None

     @classmethod
     def get_all_provider_ids(cls):
         """ Gets a list of all enabled provider ids """
         return [provider.provider_id for provider in cls._enabled_providers()]
Contributor

Please remove this method (get_all_provider_ids) and its test as well, if it's unused.

@xcompass force-pushed the tpa-mapping-api branch 2 times, most recently from cf7875a to db6fcc5 (October 26, 2015 03:21)
@xcompass
Contributor Author

@nasthagiri updated based on your comments.

usernames_str = self.request.QUERY_PARAMS.get('usernames', None)
remote_ids_str = self.request.QUERY_PARAMS.get('remote_ids', None)
username_list = self.request.QUERY_PARAMS.getlist('username', None)
remote_ids_list = self.request.QUERY_PARAMS.getlist('remote_id', None)
Contributor

This presents a confusing API if there are multiple ways to provide a list of usernames and a list of remote_ids.
The QueryDict implementation already supports both comma-separated and multi-value fields. Can we just support the last 2 queries above?

Contributor Author

Could you point me to where QueryDict supports comma-separated fields? I can't find any mention of commas in the docs. Also, the field name is slightly different (username vs usernames). Should I use the same field name (singular form)? (/user?username=user1,user2)

Contributor

@xcompass Yes, please use the same field name: singular form.

I see the problem you are running into. In a separate (but not yet merged) PR, I have created a common MultiValueField class that serves exactly this purpose. It supports both ways of passing in multiple values for a field.

That class is not yet merged and assumes one is using django Forms for parsing the query params. In the interim, though, you can use the same logic. Essentially:

  1. join all the fields into one comma-separated string, and then
  2. parse the comma-separated string into multiple values in a set/list.
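
The two steps above can be sketched in plain Python as follows. This is not the MultiValueField class mentioned in the comment; the helper name `parse_multi_value` is hypothetical, and the standard library's `parse_qs` stands in for Django's QueryDict so that the snippet runs on its own.

```python
from urllib.parse import parse_qs


def parse_multi_value(raw_values):
    """Normalise values given as 'a,b', as repeated fields, or a mix of both."""
    # Step 1: join every occurrence of the field into one comma-separated string.
    joined = ",".join(raw_values)
    # Step 2: split that string back into individual values, dropping blanks.
    return [value.strip() for value in joined.split(",") if value.strip()]


# "username" appears twice: once comma-separated, once as a plain repeat.
query = parse_qs("username=USERNAME1,USERNAME2&username=USERNAME3")
usernames = parse_multi_value(query.get("username", []))
```

In a Django view, the list passed to the helper would come from the query parameters' `getlist('username')` call instead of `parse_qs`.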

@nasthagiri
Contributor

@xcompass Thanks for updating the PR based on the last round of review. It looks great! I'm now waiting for the following items before giving my thumbs up:

  1. Updating the multi-value field code per suggestion (and only supporting the singular form).
  2. Negative test case for OAuth2.
  3. Squashing and final cleanup of the commits.

@xcompass
Contributor Author

@nasthagiri All done. Let me know if there is anything else. Looking forward to getting this merged :) Thanks for all your help.

({'username': [ALICE_USERNAME], 'remote_id': ['remote_' + STAFF_USERNAME]}, 200,
get_mapping_data_by_usernames([ALICE_USERNAME, STAFF_USERNAME])),
)
@ddt.unpack
Contributor

Are there any tests for testing invalid usernames and invalid remote_ids?

Contributor Author

Added

@nasthagiri
Contributor

👍

This mapping API enables the mapping between the edX user ID and the ID
provided by identity provider (IdP). For details, please see
https://github.com/edx/edx-platform/pull/9842
@xcompass
Contributor Author

@sarina Nimisha finished the review. I have squashed the commits and rebased on current master. Let me know if there is anything else that needs to be done.

@nasthagiri
Contributor

@xcompass Looks good. Thanks for all your work on this PR. It's a great contribution to open edX and our partners. Is there anything left on your end? If not, I'll go ahead and merge.

@xcompass
Contributor Author

@nasthagiri I don't think so. Thanks. Glad to make the contribution and I learnt a lot :)

nasthagiri added a commit that referenced this pull request Oct 27, 2015
Implement third party auth ID mapping API
@nasthagiri merged commit 698e542 into openedx:master Oct 27, 2015
@xcompass deleted the tpa-mapping-api branch October 27, 2015 23:40
@nasthagiri
Contributor

FYI - @clintonb has added support for the client_credentials grant type in our django oauth2 provider. See edx/django-oauth2-provider#25. Given this new support, we should now be able to update this feature to use client_credentials as well.

@xcompass
Contributor Author

@nasthagiri Great! I'll take a look at it and update the API in a new PR

@xcompass
Contributor Author

@nasthagiri It looks like I don't need to update anything :)

Is this in production already? (Edge and edx.org) I'll update the PR message to reflect the change.

@nasthagiri
Contributor

@xcompass It will be in production later today (hopefully by noon). You shouldn't need to update any code. However, you can (1) eliminate the use of the EDX_API_KEY and (2) update your OAuth call to use the client_credentials grant (assuming your OAuth client is already configured as a "Confidential" client).

@xcompass
Contributor Author

@nasthagiri Thanks! That's what I thought.

Labels
engineering review, open-source-contribution, waiting on author
7 participants